1.4.62 jdbc_response (Java Database Connectivity SQL Queries Response Monitoring) Release Notes
1.4.63 jdbcgtw (JDBC Gateway) Release Notes
1.4.64 jobqs (iSeries Job Queues Monitoring) Release Notes
1.4.65 jobs (iSeries Job Monitoring) Release Notes
1.4.66 jobsched (iSeries Job Schedule Monitoring) Release Notes
1.4.67 journal (iSeries Journal Message Monitoring) Release Notes
1.4.68 jvm_monitor (Java Virtual Machine Monitoring) Release Notes
1.4.69 logmon (Log Monitoring) Release Notes
1.4.70 lync_monitor (Microsoft Lync Server Monitoring) Release Notes
1.4.71 maintenance_mode Release Notes
1.4.72 selfservice_cm (Self Service Configuration Management) Release Notes
1.4.73 mongodb_monitor (MongoDB Monitoring) Release Notes
1.4.74 monitoring_services Release Notes
1.4.75 nas (Alarm Server) Release Notes
1.4.76 net_connect (Network Connectivity Monitoring) Release Notes
1.4.77 net_traffic (Network Traffic Monitoring) Release Notes
1.4.78 netapp (NetApp Storage Monitoring) Release Notes
1.4.79 nfa_inventory (NFA Inventory) Release Notes
1.4.80 notes_response (IBM Notes Server Response Monitoring) Release Notes
1.4.81 notes_server (IBM Domino Server Monitoring) Release Notes
1.4.82 nq_services (NQ Services) Release Notes
1.4.83 nsa (Script Agent) Release Notes
1.4.84 nsdgtw (CA Nimsoft Service Desk Gateway) Release Notes
1.4.85 ntevl (NT Event Log Monitoring) Release Notes
1.4.86 ntp_response (Network Time Protocol Response Monitoring) Release Notes
1.4.87 ntperf (Performance Collector) Release Notes
1.4.88 ntservices (Microsoft Windows NT Services Monitoring) Release Notes
1.4.89 oracle (Oracle Database Monitoring) Release Notes
1.4.90 outqs (iSeries Output Queue Monitoring) Release Notes
1.4.91 perfmon (System Performance Engine Monitoring) Release Notes
1.4.92 policy_engine Release Notes
1.4.93 ppm (Probe Provisioning Manager) Release Notes
1.4.93.1 ppm Known Issues
1.4.93.1.1 Probe Provisioning UIs Only Support Viewing Alarm Message Definitions
1.4.93.1.2 adevl/ntevl Probe UI Restricts Update Events Information
1.4.93.1.3 apache Probe UI Does Not Support Summary and Checkpoint Features
1.4.93.1.4 Adogtw Probe UI Known Issues
1.4.93.1.5 cdm Probe UI Known Issues
1.4.93.1.6 cluster UI Known Issues
1.4.93.1.7 controller Probe UI Known Issues
1.4.93.1.8 db2 Probe UI Known Issues
1.4.93.1.9 dirscan Probe UI Known Issues
1.4.93.1.10 dns_response Probe UI Known Issues
1.4.93.1.11 e2e_appmon Probe UI Known Issues
1.4.93.1.12 emailgtw Probe UI Known Issues
1.4.93.1.13 file_adapter Probe UI Known Issues
1.4.93.1.14 ica_response Probe UI Known Issues
1.4.93.1.15 iis Probe UI Known Issues
1.4.93.1.16 informix Probe UI Known Issues
1.4.93.1.17 Jboss Probe UI Known Issues
1.4.93.1.18 jvm_monitor Probe UI Known Issues
1.4.93.1.19 logmon Probe UI Known Issues
1.4.93.1.20 ntevl Probe UI Known Issues
1.4.93.1.21 ntservices Probe UI Known Issues
1.4.93.1.22 oracle Probe UI Known Issues
1.4.93.1.23 perfmon Probe UI Does Not Show Some Fields Under Status Section
1.4.93.1.24 printers Probe UI Does Not Support 'Add Print Watcher'
1.4.93.1.25 rsp Probe UI Known Issues
1.4.93.1.26 smsgtw Probe UI Does Not Provide Editable Drop Down for Sending Messages
1.4.93.1.27 sql_response Probe UI Known Issues
1.4.93.1.28 sqlserver Probe UI Known Issues
1.4.93.1.29 sybase Probe UI Known Issues
1.4.93.1.30 sybase_rs Probe UI Known Issues
1.4.93.1.31 url_response Probe UI Known Issues
1.4.93.1.32 webservicemon Probe UI Known Issues
1.4.93.1.33 Websphere Probe UI Known Issues
1.4.94 prediction_engine Release Notes
1.4.95 printers (Printer Monitoring) Release Notes
1.4.96 processes (Process Monitoring) Release Notes
1.4.97 qos_processor Release Notes
1.4.98 reboot Release Notes
1.4.99 rsp (Remote System Probe) Release Notes
1.5.84.8.5 get_alarms
1.5.84.8.6 get_ao_status
1.5.84.8.7 get_info
1.5.84.8.8 get_sid
1.5.84.8.9 host_summary
1.5.84.8.10 nameservice_create
1.5.84.8.11 nameservice_delete
1.5.84.8.12 nameservice_list
1.5.84.8.13 nameservice_lookup
1.5.84.8.14 nameservice_setlock
1.5.84.8.15 nameservice_update
1.5.84.8.16 note_attach
1.5.84.8.17 note_create
1.5.84.8.18 note_delete
1.5.84.8.19 note_detach
1.5.84.8.20 note_list
1.5.84.8.21 Reorganize
1.5.84.8.22 repl_queue_post
1.5.84.8.23 repl_queue_info
1.5.84.8.24 script_delete
1.5.84.8.25 script_rename
1.5.84.8.26 script_list
1.5.84.8.27 script_run
1.5.84.8.28 script_validate
1.5.84.8.29 set_loglevel
1.5.84.8.30 set_visible
1.5.84.8.31 transaction_list
1.5.84.8.32 trigger_list
1.5.84.9 The nas Script Editor
1.5.84.9.1 The nas Extensions to Lua (All Versions)
1.5.84.10 nas Metrics
1.5.85 net_connect (Network Connectivity Monitoring)
1.5.85.1 net_connect AC Configuration
1.5.85.1.1 net_connect AC GUI Reference
1.5.85.2 net_connect IM Configuration
1.5.85.2.1 net_connect IM GUI Reference
1.5.85.3 net_connect Versions 3.1-3.0
1.5.85.3.1 v3.1 net_connect AC Configuration
1.5.85.3.2 v3.1 net_connect IM Configuration
1.5.85.3.3 v3.0 net_connect AC Configuration
1.5.85.3.4 v3.0 net_connect IM Configuration
1.5.85.4 net_connect Troubleshooting
1.5.85.5 net_connect Metrics
1.5.86 net_traffic (Network Traffic Monitoring)
1.5.86.1 net_traffic IM Configuration
1.5.86.2 net_traffic Metrics
1.5.87 netapp (NetApp Storage Monitoring)
1.5.87.1 netapp Best Practices
1.5.87.2 netapp Metrics
1.5.87.3 netapp MIB Files Setup
1.5.87.4 netapp IM Configuration
1.5.87.5 netapp Version 1.3
1.5.87.5.1 v1.3 netapp IM Configuration
1.5.88 nfa_inventory (NFA Inventory)
1.5.88.1 v1.1 nfa_inventory AC Configuration
1.5.88.2 v1.0 nfa_inventory AC Configuration
1.5.89 notes_response (IBM Notes Server Response Monitoring)
1.5.89.1 notes_response Set Up IBM Notes Client
1.5.89.2 notes_response AC Configuration
1.5.89.3 notes_response IM Configuration
1.5.89.4 notes_response Metrics
1.5.90 notes_server64 (IBM Domino Server Monitoring)
1.5.90.1 notes_server64 AC Configuration
1.5.90.2 notes_server64 IM Configuration
1.5.91 notes_server (IBM Domino Server Monitoring)
1.5.91.1 notes_server Preconfiguration Requirements
1.5.91.2 notes_server AC Configuration
1.5.91.2.1 notes_server AC GUI Reference
1.5.91.3 notes_server IM Configuration
1.5.91.3.1 notes_server IM GUI Reference
1.5.91.4 notes_server Metrics
1.5.91.5 notes_server Troubleshooting
A probe is a small piece of software that performs a dedicated task. CA Unified Infrastructure Management has two types of probes:
Monitoring probes gather availability and performance data. Some probes gather data from the computer on which they reside; remote probes monitor devices external to themselves, such as network switches and routers.
Service probes (also called infrastructure or utility probes) provide product utility functions.
Probes can be easily configured for your specific monitoring requirements. For example, you can configure a probe to run at specific times (a timed probe) or continuously (a daemon probe). Each probe maintains its own configuration file.
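To make the configuration-file remark concrete: a probe's configuration file is a plain-text file of key = value pairs grouped into tagged sections. The fragment below is an illustrative sketch only; the key names (loglevel, logfile, interval) and values are typical examples, not taken from any specific probe's documentation, so consult the guide for your probe before editing.

```
<setup>
   loglevel = 1
   logfile = example_probe.log
   interval = 300
</setup>
```

In a sketch like this, an interval-style key would drive how often the probe runs its checks, while a daemon probe runs continuously and reads its thresholds from similar sections. Where possible, change these values through the probe's configuration GUI or Raw Configure rather than by editing the file by hand.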
The user documentation for core CA Unified Infrastructure Management components, such as the UIM database or UMP, is available at the CA Unified Infrastructure Management DocOps space.
Supported Platforms
See the Compatibility Support Matrix for the latest information about supported platforms, and the Support Matrix for Probes for additional information about a specific probe.
How To Articles
View common probe use cases.
Release Notes
View a summary of the changes for a probe release.
Unified Dashboards
View information about the Unified Dashboards in UMP.
This article is a quick reference guide to help you identify the monitoring capabilities of a particular probe. Be aware that some probe versions are supported only in Admin Console (AC) or Infrastructure Manager (IM). To view older versions of a probe guide, see the IM Probe Archive or the AC Probe Archive.
| Probe ID | Name | Group | AC Articles | IM Articles |
|---|---|---|---|---|
| ace | | System | AC | IM |
| ad_response | | Application | AC | IM |
| ad_server | | Application | AC | IM |
| adevl | | Application | AC | IM |
| adogtw | | Gateway | AC | IM |
| alarm_enrichment | Alarm Enrichment | Infrastructure | | IM |
| apache | | Application | AC | IM |
| apmgtw | | Gateway | AC | IM |
| applogic_mon | | Application | | IM Archive |
| applogic_ws | | Application | | IM Archive |
| arcserv_d2d | ARCserv D2D | Application | | IM Archive |
| arcserv_rha | ARCserv RHA | Application | | IM Archive |
| assetmgmt | Asset Management | Infrastructure | | IM |
| audit | | Service | | IM |
| automated_deployment_engine | | System | AC | |
| aws | | Application | AC | IM Archive |
| azure | | Application | AC | IM Archive |
| baseline_engine | Baseline Engine | SLM | | IM |
| billing | Billing | Service | | IM |
| capman_da | | System | AC | |
| casdgtw | CA ServiceDesk Gateway | Gateway | AC | IM Archive |
| cassandra_monitor | Cassandra Monitoring | Application | AC | |
| ccm_monitor | | Application | | IM |
| cdm | | System | AC | IM |
| celerra | | Storage | AC | IM |
| cisco_monitor | | Network | | IM |
| cisco_nxos | | Network | | IM Archive |
| cisco_qos | | Network | | IM |
| cisco_ucm | | Application | AC | IM |
| cisco_ucs | | Application | AC | IM |
| cisco_unity | | Network | | IM Archive |
| clariion | Clariion Monitoring | Storage | AC | IM |
| cloudstack | Cloudstack | Application | | IM Archive |
| cluster | | System | AC | IM |
| cm_data_import | Data Import | Service | AC | |
| cmdbgtw | | Gateway | | IM Archive |
| controller | Controller | Infrastructure | AC | IM |
| cuegtw | | Gateway | AC | IM Archive |
| dap | | Service | | IM Archive |
| dashboard_engine | Dashboard Engine | Service | | IM Archive |
| dashboard_server | Dashboard Server | Infrastructure | | IM Archive |
| data_engine | Data Engine | SLM | AC | IM |
| db2 | | Database | AC | IM |
| dcim | | System | | IM |
| dhcp_response | | Network | | IM Archive |
| dirscan | | System | AC | IM |
| discovery_agent | Discovery Agent | Service | AC | |
| discovery_server | Discovery Server | Service | AC | |
| diskstat | | System | AC | IM |
| distsrv | Distribution Server | Infrastructure | AC | IM |
| dns_response | | Network | AC | IM |
| e2e_appmon | | Application | AC | IM |
| easerver | EAServer | Application | | IM Archive |
| ecometer | CA ecoMeter | Application | | IM |
| emailgtw | Email Gateway | Gateway | AC | IM |
| email_response | | Application | AC | IM |
| ems | | System | AC | |
| ews_response | | Application | AC | IM |
| exchange_monitor | | Application | AC | IM |
| exchange_monitor_backend | | Application | | IM |
| exchange_monitor_reports | | Installation | | IM |
| fetchmsg | | System | AC | IM |
| file_adapter | File Adapter | Adapter | AC | IM |
| flow | Flow Analysis | Network | | IM Archive |
| fsmounts | | System | AC | IM |
| google_app_engine | | Application | | IM Archive |
| google_apps | | Application | AC | IM Archive |
group_server
Group Server
Infrastructure
IM Archive
ha
High Availability
System
AC
IM
hadoop_monitor
Hapdoop Monitoring
Application
AC
hdb
Service
IM
health_index
Health Index
SLM
AC
history
System
AC
IM
hitachi
Hitachi Monitoring
Storage
AC
IM
hp_3par
Storage
AC
hpovsdgtw
HP OpenView SD Gateway
Gateway
IM Archive
hpsmgtw
Gateway
AC
IM Archive
httpd
Web Application
Infrastructure
IM Archive
hub
Hub
Infrastructure
AC
IM
hyperv
Application
AC
IM Archive
ibm-ds
Storage
AC
IM
ibm_ds_next
Storage
AC
ibm_svc
Storage
AC
IM
ibmvm
Application
AC
IM
ica_response
Application
AC
IM
ica_server
Terminal Server
Application
IM Archive
icmp
Network
AC
iis
Application
AC
IM
informix
Database
AC Archive
IM Archive
interface_traffic
Network
IM
iostat
System
IM Archive
jboss
Jboss Monitoring
Application
AC Archive
IM Archive
jdbc_response
Database
AC
IM
jdbcgtw
Gateway
AC
IM
jmx
Application
IM
jobqs
System
AC
IM
jobs
System
AC
IM
jobsched
System
AC
IM
journal
System
AC
IM
jvm_monitor
Application
AC
IM
ldap_response
Network
AC Archive
IM Archive
logmon
Log Monitoring
System
AC
IM
lync_monitor
Application
AC
IM
maintenance_mode
Maintenance Mode
Service
AC
mongodb_monitor
MongoDB Monitoring
Database
AC
monitoring_services
Monitoring Services
Service
AC
mpse
Service
AC
mysql
Database
AC Archive
IM Archive
nas
Alarm Server
Infrastructure
IM
net_connect
Network
AC
IM
netapp
Storage
IM
net_traffic
Network
IM
netware
System
IM Archive
nexec
Service
AC Archive
IM Archive
nfa_inventory
NFA Inventory
Gateway
AC
nis_server
NIS Server
Infrastructure
IM Archive
notes_response
Application
AC
IM
notes_server
Application
AC
IM
nq_services
NQ Services
Service
AC
nsdgtw
Gateway
AC
IM
ntevl
System
AC
IM
ntp_response
Network
AC
IM
ntperf
System
AC
IM
ntperf64
System
AC
IM
ntservices
System
AC
IM
ocs_monitor
Application
AC Archive
IM Archive
oracle
Database
AC
IM
outqs
System
AC
IM
packageeditor
Package Editor
Infrastructure
IM
perfmon
System
AC
IM
policy_engine
Policy Engine
Service
AC
power
Power Monitoring
Application
IM Archive
ppm
Service
IM
prediction_engine
Prediction Engine
SLM
AC
printers
Printer Monitoring
System
AC
IM Archive
processes
Process Monitoring
System
AC
IM
proxy
Proxy
System
IM Archive
pvs
PVS
Application
IM Archive
qos_processor
QoS Processor
SLM
IM
rackspace
Rackspace
Application
IM Archive
reboot
Reboot Computer
System
IM
remedy_response
Application
IM Archive
remedygtw
Remedy Gateway
Gateway
IM Archive
report_engine
Report Engine
SLM
IM Archive
rhev
Application
IM Archive
rsp
System
AC
IM
saa_monitor
Network
IM Archive
salesforce
Salesforce Monitoring
Application
AC
IM Archive
sdgtw
Gateway
AC
service_host
Service Host
Service
AC
sharepoint
Application
AC
IM
sla_engine
SLM
IM
smsgtw
Gateway
AC
IM
sngtw
ServiceNow Gateway
Gateway
AC
IM Archive
snmpcollector
Network
AC
snmpget
Network
IM
snmpgtw
SNMP Gateway
Gateway
IM
snmptd
Gateway
IM
snmptoolkit
Network
IM
spooler
Spooler
Infrastructure
AC
IM
sql_response
Database
AC
IM
sqlserver
Database
AC
IM
sybase
Sybase Monitoring
Database
AC
IM
sybase_rs
Database
AC
IM
sysloggtw
Gateway
AC
IM
sysstat
System
AC
IM
tcp_proxy
Service
AC
IM Archive
temperature
Temperature
System
IM Archive
threshold_migrator
Threshold Migrator
Service
AC
tnggtw
Gateway
IM Archive
tomcat
Application
AC Archive
IM Archive
topology_agent
Service
IM
udm_manager
UDM Manager
Service
AC
ugs
Service
IM
url_response
Application
AC
IM
usage_metering
Usage Metering
Service
IM
variable_server
Variable Server
Infrastructure
IM Archive
vcloud
Application
IM Archive
vmax
Storage
IM Archive
vmware
VMware Monitoring
Application
AC
IM
vplex
Application
AC
wasp
Service
AC
IM
webgtw
Web Gateway
Gateway
AC
weblogic
Weblogic Monitoring
Application
AC
IM
webservicemon
Application
AC
IM
webservices_rest
Service
IM
webservices_soap
Service
IM
websphere
WebSphere Monitoring
Application
AC
IM
websphere_mq
WebSphere MQ Monitoring
Application
AC
wins_response
Network
IM
xenapp
Application
IM
xendesktop
Application
IM
xenserver
Application
IM
xmlparser
XML Parser
Adapter
IM
zdataservice
Data Service
Application
zones
Application
IM
zops
Zops Monitoring
Application
IM
zstorage
zStorage Monitoring
Application
IM
zvm
Application
IM
The following diagram shows the location of the navigation pane and details pane in the Admin Console probe configuration interface.
The left navigation pane contains a hierarchical representation of the probe inventory which can include monitoring targets and configurable
elements.
The right details pane usually contains information that is based on your selection in the navigation pane.
Hide unmatched branches: Removes all nodes that do not match the search criteria from the navigation pane. Clear this option to make
the unmatched nodes reappear in the navigation pane.
Scope filter to currently selected node: Finds nodes that match the search criteria underneath the currently selected node. For
example, use the search field to locate all nodes with the Publish Data check box selected on their profiles. Then select this option
and enter additional search criteria to further refine monitoring targets and configuration elements listed in the navigation pane within
the already located nodes.
Show only nodes with configuration (for templates only): Highlights the nodes in the navigation pane that have monitors configured
and included in a template.
The interface also uses the following icons (Name - Description):
Help
Open/Closed folders
Options - Click to view a list of actions you can perform related to the selected object. The appearance and location of this icon can
vary depending on your version of CA Unified Infrastructure Management.
Profile or Resource
Discovery status - Indicates that discovery of subcomponents is complete, discovery of subcomponents is in progress, an unknown error
occurred, or discovery of subcomponents failed.
How To Articles
These are common how-to articles for probe tasks that are typically completed by CA UIM administrators.
Configuring Baselines
Publishing QoS Data
Supported Threshold Types
Running the threshold_migrator Probe
Configuring Alarm Thresholds
Variable Substitution Supported with ppm 3.20
Configuring Dynamic Alarm Thresholds
For Hubs Running ppm v3.11 and later
For Hubs Running ppm v3.0
For Hubs Running ppm v2.38
Configuring Static Alarm Thresholds
For Hubs Running ppm v3.11 and later
For Hubs Running ppm v3.0 and later
For Hubs Running ppm v2.38 and later
Configuring Time Over Threshold
For Hubs Running ppm v3.0 and later
For Hubs Running ppm v2.38
Configuring Time To Threshold
For Hubs Running ppm v3.0 and later
For Hubs Running ppm v2.38
Create Baselines for Probes Without Using the Web-based GUI
Hubs Running ppm v3.11 and later
Hubs Running ppm v2.38 and v3.0
Create Thresholds for Probes Without Using the Web-based GUI
Hubs Running ppm v3.11 and later
Hubs Running ppm v2.38 and v3.0
Configuring Baselines
The baseline_engine probe calculates and provides a baseline for all monitored QoS metrics in its UIM domain using the following
process:
1. The baseline_engine probe listens for messages with the subject QoS on the message bus.
2. When the Compute Baseline check box is selected for a QoS metric in a monitoring probe GUI, the baseline_engine samples the QoS
data up to 30 times during each interval. This sampling rate provides a statistically accurate baseline while minimizing system resource
use.
3. At the top of each hour, baseline_engine calculates a baseline data point for each QoS monitor that has Compute Baseline enabled.
4. baseline_engine sends the data points to the qos_processor probe, which processes the data and writes it to the UIM database.
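The sampling-and-aggregation step above can be sketched as follows. This is an illustrative Python sketch, not the actual baseline_engine implementation; the function name, the 30-sample cap, and the use of a simple mean are assumptions for the example.

```python
from statistics import mean

def hourly_baseline_point(samples, max_samples=30):
    """Compute one baseline data point from the QoS samples
    collected during the hour (illustrative sketch only).

    baseline_engine samples each metric up to 30 times per
    interval; here we simply cap the sample list and average it.
    """
    if not samples:
        return None  # no QoS messages arrived this hour
    return mean(samples[:max_samples])
```

At the top of each hour, one such data point per QoS monitor would be forwarded to qos_processor for storage in the UIM database.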
Publish Data - publishes QoS data from monitoring probes to the UIM message bus.
Compute Baseline - allows the probe, or the baseline_engine and/or prediction_engine (on behalf of the probe), to compute hourly
baselines based on QoS data.
Note: If a probe does not support static alarms, you might see an Alarm Thresholds configuration section at the top of the page.
Dynamic - A dynamic threshold is calculated as a variance from the computed baseline, with no averaging. The variance can use one of the
following algorithms:
Scalar - A set value past the calculated baseline.
Percent - A set percent past the baseline.
Standard Deviation - A set standard deviation past the baseline.
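A minimal sketch of how the three variance algorithms might translate a baseline into a threshold (the function name and signature are assumptions; the actual baseline_engine logic is not shown in this document):

```python
def dynamic_threshold(baseline, algorithm, amount, stdev=0.0):
    """Return the alarm threshold implied by a baseline and a variance.

    algorithm: 'scalar'  - a set value past the baseline
               'percent' - a set percent past the baseline
               'stddev'  - a set number of standard deviations past it
    """
    if algorithm == "scalar":
        return baseline + amount
    if algorithm == "percent":
        return baseline * (1 + amount / 100.0)
    if algorithm == "stddev":
        return baseline + amount * stdev
    raise ValueError("unknown algorithm: %s" % algorithm)
```

For example, with a baseline of 100 and a 50 percent variance, the dynamic threshold would sit at 150.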
Note: Before you use the threshold_migrator probe, the latest version of the probe must be deployed to each target device.
For more information about using the threshold_migrator probe and the probes that can be migrated, go to the threshold_migrator (Threshold
Migrator) article.
Note: If you enter ${ and select a variable in a text field where the variables are displayed but not applicable (for example, the
Subsystem field), the system ignores the variable.
Important! In order to create dynamic alarm thresholds, you must have the baseline_engine probe version 2.0 or later installed on the
hub robot and configured.
To set a Dynamic alarm, determine the version of the ppm probe running on the hub robot and complete one of the following procedures:
Hub robots running ppm v3.11 and later
Hub robots running ppm v3.0
Hub robots running ppm v2.38
3. Select the Compute Baseline check box.
Note: If you cannot select the Compute Baseline check box, verify that the baseline_engine probe is deployed and activated.
4. Click the Dynamic Alarm check box. The Dynamic Alarm options become available.
5. Choose an Algorithm to use:
Scalar - Each threshold is a specific value from the computed baseline.
Percent - Each threshold is a specific percentage of the computed baseline.
Standard Deviation - Each threshold is a measure of the variation from the computed baseline. A large standard deviation indicates
that the data points are far from the computed baseline. A small standard deviation indicates that they are clustered closely around the
computed baseline.
6. Choose an operator for the threshold:
> An alarm occurs when the metric is greater than the set threshold.
>= An alarm occurs when the metric is greater than or equal to the set threshold.
< An alarm occurs when the metric is less than the set threshold.
<= An alarm occurs when the metric is less than or equal to the set threshold.
= An alarm occurs when the metric is equal to the set threshold.
!= An alarm occurs when the metric is not equal to the set threshold.
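The six comparison operators in step 6 map directly onto standard comparisons. A minimal sketch of the evaluation, assuming illustrative names that are not part of the product:

```python
import operator

# Map each documented threshold operator to its comparison.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
}

def breaches_threshold(value, op, threshold):
    """Return True when the metric value violates the threshold."""
    return OPERATORS[op](value, threshold)
```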
7. Set the threshold for each alarm state.
8. (Optional) Enter a different (overriding) Subsystem ID using the Subsystem (override) field. This is only required if the Subsystem ID
shown in the Subsystem (default) field is not correct for your configuration.
This option is available with baseline_engine 2.1 or later.
9. (Optional) Create a custom alarm message for your environment:
a. In the Custom Alarm Message field, enter or select variables (see variable substitution for details) to include in a custom message.
The available variables are:
${baseline} - The baseline calculated for a QoS metric if the Compute Baseline option is selected for the metric. Baselines are not
calculated for static messages, so this value will always be zero for static alarms.
${level} - The numerical severity level of the alarm. Valid values are: 1 (critical), 2 (major), 3 (minor), 4 (warning), or 5 (information)
${operator} - The operator (>, >=, <, <=, ==, or !=) for the alarm severity level.
${qos_name} - The name of the QoS metric.
${source} - The source of the QoS metric that generated an alarm.
${target} - The target of the QoS metric that generated an alarm.
${threshold} - Specifies the threshold upon which an alarm is generated.
${value} - Specifies the value contained in the generated QoS message.
b. In the Custom Alarm Clear Message field, enter the message displayed when the alarm and the source of the alarm is returned to a
normal state.
The available variables include:
${qos_name} - The name of the QoS metric.
${value} - Specifies the value contained in the generated QoS message.
${source} - The source of the QoS metric that generated an alarm.
10. Save your settings.
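The ${variable} syntax used in custom alarm messages happens to match the placeholder style of Python's string.Template, which makes a compact sketch of the substitution possible. The render function below is illustrative only, not product code:

```python
from string import Template

def render_alarm_message(message, context):
    """Expand ${...} variables in a custom alarm message.

    safe_substitute leaves unknown variables in place rather than
    raising an error, mirroring the documented behavior of
    ignoring variables that do not apply.
    """
    return Template(message).safe_substitute(context)

example = render_alarm_message(
    "${qos_name} is at ${value}",
    {"qos_name": "QOS_CPU_USAGE", "value": 97},
)
```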
Note: If you cannot select the Compute Baseline check box, verify that the baseline_engine probe is deployed and activated.
4. Click the Dynamic Alarm check box. The Dynamic Alarm options become available.
5. Choose an Algorithm to use:
Scalar - Each threshold is a specific value from the computed baseline.
Percent - Each threshold is a specific percentage of the computed baseline.
Standard Deviation - Each threshold is a measure of the variation from the computed baseline. A large standard deviation indicates
that the data points are far from the computed baseline. A small standard deviation indicates that they are clustered closely around the
computed baseline.
6. Choose an operator for the threshold:
Greater than (>) - An alarm occurs when the metric increases past the set threshold.
Less than (<) - An alarm occurs when the metric falls below the set threshold.
7. Set the threshold for each alarm state.
8. Save your settings.
If you are using baseline_engine 2.1 or later, you can also change the Subsystem ID using the Subsystem (override) field. This is only required
if the Subsystem ID shown in the Subsystem (default) field is not correct for your configuration.
Important! In order to create dynamic alarm thresholds, you must have the baseline_engine probe version 2.0 or later installed on the
hub robot and configured.
To set a Static alarm, determine the version of the ppm probe running on the hub robot and complete one of the following procedures:
Hub robots running ppm v3.11 and later
Hub robots running ppm v3.0
Hub robots running ppm v2.38
Note: If you used the threshold_migrator probe to add the static threshold fields for a probe, the operator and configured
threshold are carried over.
3. Click the Publish Data, Publish Alarms, and Compute Baseline (for a dynamic alarm) check boxes.
4. Click either the Dynamic Alarm or Static Alarm check box.
5. Configure the dynamic or static alarm settings. See dynamic alarms or static alarms for details.
6. For either a dynamic or static alarm, select the Enable Dynamic Time Over Threshold or Enable Static Time Over Threshold check
box.
7. Enter values for the following fields:
Time Over Threshold <TOT> - The length of time a metric must remain over threshold before an alarm is sent.
Sliding Time Window <TW> - The length of time in the sliding window in which metrics are monitored for threshold violations.
Time Units for <TOT> and <TW> - The unit of measurement used by the Time Over Threshold and Time Window parameters.
Limited to minutes, hours, or days.
Automatically Clear Alarm - Enables the Auto-clear functionality.
Clear Delay Time - The length of time used in the Auto-clear timer. If no alarms are sent in the set time period, the alarm is
automatically cleared.
Time Units for <TC> - The unit of measurement used by the Auto-clear timer. Limited to minutes, hours, or days.
8. Save your changes.
The following changes will take effect immediately:
New Time Over Threshold rules.
Changes to the Clear Delay Time parameter.
Changes to the Time Over Threshold active state.
The following changes will take effect at the next received alarm:
Changes to the Time Over Threshold parameter.
Changes to the Sliding Time Window parameter.
5. Select the Time To Threshold Alarm check box to configure a Time To Threshold alarm.
Note: You must select the Publish Data check box before you can configure the Time To Threshold settings.
To set up baselines for probes without using the web-based configuration, use the following command:
queue: Indicates to send configurations over the BASELINE_CONFIG queue instead of using callbacks.
Note: If you specify one operator, you must specify all operators.
Note: The alarm threshold values are generated in the format 50.0 (for 50%). To generate an alarm, you must specify at least
one level alarm threshold value.
subsysId: The subsystem ID of the QoS for which the thresholds are being defined. Only one subsystem ID can be specified using the
subsysId option.
threshID (Optional): Unique ID which distinguishes between multiple thresholds of the same threshType and id (metric ID).
delete: Remove the threshold identified by the id (metric ID), threshType, and threshID.
customAlarmMessage: A custom alarm message generated as the alarm message when a threshold is breached. Variables include:
${baseline} - The baseline calculated for a QoS metric if the Compute Baseline and Dynamic Alarm options are selected for the metric.
Baselines are not calculated for static messages, so this value will always be zero for static alarms.
${level} - The numerical severity level of the alarm. Valid values are: 1 (critical), 2 (major), 3 (minor), 4 (warning), or 5 (information).
${operator} - The operator (>, >=, <, <=, ==, or !=) for the alarm severity level.
${qos_name} - The name of the QoS metric.
${source} - The source of the QoS metric that generated an alarm.
${target} - The target of the QoS metric that generated an alarm.
${threshold} - Specifies the threshold upon which an alarm is generated.
${value} - Specifies the value contained in the generated QoS metric.
EXAMPLE: -customAlarmMessage ${qos_name} is at ${value}
customClearAlarmMessage: A custom alarm message generated when the alarm and the source of the alarm are returned to a normal
state. Variables include:
More Information:
Configuring Time Over Threshold
Configuring Time To Threshold
prediction_engine probe article
Automated Usage Metering and Billing is an automated billing process for customers who have:
Licenses based on quantity and minimum commitment (Min-Commit) subscriptions, or
Non-subscription contracts that require periodic auditing for renewals
In the past, customers installed a CA subscription file into their UIM monitoring environment, deployed the usage_metering probe across the
environment, and manually ran the billing probe to collect the information generated by usage_metering. This generated a monthly Billing Report
that customers emailed to CA Sales. Along the way, configuration issues were common, and customers often had to manually collect the storage
capacity information and enter it into the report.
The new Automated Usage Metering and Billing process automates these workflows and eliminates the manual intervention required to upload a
report to CA. To deliver reports, users can either "opt-in" to enable the automatic transfer of billing reports to CA via a secured web gateway
channel, or manually email the reports, which are always stored locally.
This article provides:
An overview of the automated billing process.
Set-up instructions:
1. Meet the prerequisites
2. Deploy webgtw and ppm
3. Configure webgtw
4. Deploy or upgrade usage_metering (if necessary)
5. Deploy or upgrade billing
Instructions on using this process with earlier versions of Nimsoft Monitor.
Prerequisites
Before you configure this process, you must:
Ensure your environment has been upgraded to NMS 7.5 or later, or to UIM Server 8.0.
Download the following probes from the web archive to your local archive:
billing 8.00 or later
This package includes the current and prior versions of the subscription files.
webgtw 8.00 or later
ppm 2.39 or later
usage_metering 8.00 or later
Although billing 8.00 is compatible with usage_metering 2.11, usage_metering 8.00 has device
extraction enhancements and is recommended for upgrade.
Obtain your Customer_ID from CA (a 4- to 8-digit number). You need this number to configure the webgtw probe.
Your Customer_ID was emailed to you with your introduction to this new process. If you have not received yours, please send a
request to nimsoftinquiries@ca.com.
If either webgtw or ppm does not automatically start, activate the probe manually.
Deployment is complete.
Configure Webgtw
Follow these steps.
1. Log in to Admin Console and open the webgtw probe configuration GUI.
2. On the Contact Information node:
Instance ID is your system-generated GUID (read-only).
Customer ID (provided in your initial email from CA), Company name and Contact email address are optional. If you enter the
information here, it is automatically included in your reports.
Terms of Use must be accepted (select Actions > View Terms of Use Agreement to display the terms).
Accepting the terms is the "opt-in" action that enables automatic report transfer. If Terms of Use is not checked, no HTTP
I/O is allowed, and you will need to manually email your reports to CA.
The usage_metering probe scans your UIM environment to identify the devices you are monitoring. The results are retrieved by the billing probe,
which must reside on the same robot as usage_metering.
If usage_metering is already deployed in your UIM environment, you do not need to upgrade. Older versions of the UM probe are compatible with
the new process and will provide consistent data for the billing probe. However, CA recommends that you upgrade all instances of the probe to
version 8.0 or later to take advantage of performance improvements.
If you need to:
Upgrade usage_metering, deactivate the billing and usage_metering probes before you upgrade.
Deploy usage_metering, consider the following:
An instance of usage_metering must reside on the same robot as the billing probe. This is called the primary instance.
The probe must perform DNS resolution to uniquely identify a probe's device references. In many cases, the DNS needed to correctly
resolve a device reference is not available to the primary instance. To address this, deploy a secondary instance to a hub that can
access a DNS.
Once usage_metering is deployed, no further configuration is required.
For more information on usage_metering, refer to usage_metering on the CA Unified Infrastructure Management Probes site.
Maintenance
As they become available, CA recommends you download and deploy new versions of these probes. This gives you the latest subscription files
and ensures that you benefit from improvements made to the probes.
The production robot now has a unique Instance ID and an encrypted password for webgtw. Accepting the terms of use completes the opt-in.
Optionally, you may want to adjust robot-to-hub communication. By default, the parent hub requests messages from a passive robot
every 15 seconds and allows up to 1000 messages per interval.
The robot is now in passive mode, and its parent hub is configured to request messages from it. Before you deploy marketplace packages, you
must specify the marketplace user.
Ensure that the OS user that you use as the marketplace user has file access permissions to the marketplace probe directory. You want
to verify that the marketplace user does NOT have file access to other non-marketplace probe directories, for security reasons. The
marketplace probe is designed to operate as a sandbox with access permissions only to its own files.
Notes:
The password is only required on a Windows robot. The password is visible when you enter it. The username and password
are encrypted in the configuration file.
The Windows user must have "log on as batch job" permissions.
6. Click Save.
The robot configuration for marketplace probe deployment is complete. If you want to adjust the hub-to-robot communication (optional), you can
modify communication settings for passive robots.
Prerequisites
Time Over Threshold Workflow
Alarm Suppression During Time Over Threshold
Alarm Clear Conditions Using Time Over Threshold
Alarm Severity Changes During Time Over Threshold
Supported Threshold Types
Additional Time Over Threshold Scenarios
Best Practices for Time Over Threshold
Configure Time Over Threshold
Troubleshooting Time Over Threshold
I see Errors Regarding alarm_enrichment
The Time Over Threshold Configuration Parameters are Unavailable
Prerequisites
To use Time Over Threshold, you must have the following probe versions installed at each hub level where Time Over Threshold functionality is
desired:
alarm_enrichment 4.40 or later
baseline_engine 2.34 or later
nas 4.40 or later
Probe Provisioning Manager (PPM) 2.38 or later
prediction_engine 1.01 or later
Time Over Threshold (TOT) is an event processing rule that allows you to reduce the number of alarms that are generated when threshold
violation events occur. You can use Time Over Threshold to filter out data spikes and monitor problematic metrics over a set period. Instead of
sending an alarm immediately after a threshold violation has occurred, Time Over Threshold:
Monitors the events that occur during a user-defined sliding time window.
Tracks the length of time that the metric is at each alarm severity.
Raises an alarm if the cumulative time the metric is in violation during the sliding window reaches the set Time Over Threshold.
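The sliding-window accounting described above can be sketched as a simplified per-minute model. This is illustrative only, not the nas probe's implementation:

```python
def tot_alarm_due(violation_minutes, window, time_over_threshold):
    """Decide whether a Time Over Threshold alarm should be raised.

    violation_minutes: one bool per elapsed minute, True when the
    metric was in violation during that minute. Only the last
    `window` minutes are considered, and the violation time does
    not have to be consecutive within the window.
    """
    recent = violation_minutes[-window:]
    return sum(recent) >= time_over_threshold
```

With a 30-minute sliding window and a 10-minute Time Over Threshold, ten scattered minutes of violation would trigger an alarm while nine would not.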
Example: Time Over Threshold in a Consecutive Block
This example uses the following settings:
The Time Over Threshold does not have to occur consecutively within a sliding time window. All of the time in a sliding window is counted toward
the Time Over Threshold.
Example: Time Over Threshold in a Nonconsecutive Block
This example uses the following settings:
Sliding Window: 30 minutes.
Time Over Threshold: 10 minutes.
Auto-Clear: Not set.
Alarm Severities Set: Clear, Information, Warning, Minor, and Major alarm thresholds are set in the probe GUI.
1. The baseline_engine probe evaluates QoS metrics from probes against static and dynamic threshold definitions.
2. The baseline_engine probe generates threshold violation messages when thresholds are crossed.
3. The nas probe implements the Time Over Threshold event processing rule to filter out data spikes. This event processing produces a
more accurate reflection of threshold violation behavior.
Note: Auto-clear times are retained when the alarm_enrichment probe is not active. If the alarm_enrichment probe stops and is then
reactivated, any running Auto-clear timers are restarted with either:
The time of the original Auto-clear, if it is still in the future.
One minute, if the original Auto-clear time is in the past.
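The restart rule in the note above can be sketched as follows (timestamps in seconds; the function name is illustrative):

```python
ONE_MINUTE = 60

def restarted_autoclear_time(original_fire_time, now):
    """Return when a retained Auto-clear timer fires after the
    alarm_enrichment probe is reactivated.

    Keep the original fire time if it is still in the future;
    otherwise fire one minute from now.
    """
    if original_fire_time > now:
        return original_fire_time
    return now + ONE_MINUTE
```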
In this example:
1. Time 20 - A Time Over Threshold alarm is raised after ten minutes of Time Over Threshold event time is accumulated. The alarm
severity is set to 1, because the first Time Over Threshold rule condition that matches is 'event severity is 1 or greater'.
2. Time 25 - The severity is elevated to 2 because the Time Over Threshold rule condition 'event severity is 2 or greater' is now true.
3. Time 30 - The severity is elevated to 3 because the Time Over Threshold rule condition 'event severity is 3 or greater' is now true.
Note: Time Over Threshold only evaluates on alarm severity levels that are set in the probe configuration GUI.
In this example:
1. Time 30 - A Time Over Threshold alarm is raised after ten minutes of Time Over Threshold event time is accumulated. The Time Over
Threshold alarm severity is set to 3, because the first Time Over Threshold rule condition that matches is 'event severity is 3 or greater'.
Example: Time Over Threshold With Multiple Severities
This example uses the following settings:
Sliding Window: 8 minutes.
Time Over Threshold: 4 minutes.
Auto-Clear: 4 minutes.
Alarm Severities: Clear, Information, Warning, Minor, and Major alarm thresholds are set in the probe GUI.
Alarm Suppression: On.
In this example:
1. Time 8 - A Time Over Threshold alarm is raised after four minutes of Time Over Threshold event time is accumulated. The alarm severity
is set to 1, because the first Time Over Threshold rule condition that matches is 'event severity is 1 or greater'.
2. Time 10 - The severity is elevated to 2 because the TOT rule condition 'event severity is 2 or greater' is now true.
3. Time 16 - The severity is elevated to 3 because the TOT rule condition 'event severity is 3 or greater' is now true.
4. Time 21 - The alarm severity decreases to 2 because there are no longer 4 minutes or more of severity 3 or greater within the 8-minute
sliding window, but there are 4 minutes or more of severity 2 or greater.
5. Time 25 - The alarm severity decreases to 1 because there are no longer 4 minutes or more of severity 2 or greater within the 8-minute
sliding window, but there are 4 minutes or more of severity 1 or greater.
6. Time 30 - The alarm is cleared because no new violations occur for four minutes and the auto-clear condition is met.
In this example:
1. Time 8 - Three minutes of time to first byte of 100 ms or greater is observed in the sliding window and an alarm of severity 2 is sent.
2. Time 14 - Three minutes of time to first byte of 300 ms or greater is observed. The alarm increases to severity 3.
3. Time 20 - Three minutes of time to first byte of 700 ms or greater is observed. The alarm increases to severity 4.
4. Time 25 - Three minutes of time to first byte of 1000 ms or greater occurs. The alarm increases to severity 5.
Example: CDM Probe Metric Disk Usage
This example uses the following settings:
Sliding Window: 45 minutes.
Time Over Threshold: 5 minutes.
Auto-Clear: Not set.
Alarm Severities: The Critical alarm threshold is set to 80% in the probe GUI.
In this example:
1. The Time Over Threshold condition occurs for only four minutes, so no alarm is sent.
Example: CDM Probe Metric Disk Usage (Modified to Send a Time Over Threshold Alarm)
This example uses the following settings:
Sliding Window: 15 minutes.
Time Over Threshold: 5 minutes.
Auto-Clear: 5 minutes.
Alarm Severities: The Critical alarm threshold is set to 80% in the probe GUI.
1. Time 15 - Five minutes of disk usage at 80% or greater is observed in the sliding window and an alarm of severity 5 is sent.
2. Time 21 - The alarm is cleared after five minutes of time below the set severity level.
Time Over Threshold and Sliding Time Window values that are too large for your system can result in the suppression of alarms that you need to be aware of.
Symptoms:
I do not see the Time Over Threshold configuration parameters in the Admin Console GUI of my probe.
I do see the Dynamic Threshold configuration parameters.
I have received no additional error messages or alarms.
Solution:
Verify that the correct versions of nas, ppm, and prediction_engine are installed and activated at the Hub level.
More Information:
Configuring Alarm Thresholds
Prerequisites
To use Time To Threshold, you must have the following probe versions installed at the secondary hub level:
baseline_engine 2.34 or later
Probe Provisioning Manager (PPM) 2.38 or later
prediction_engine 1.01 or later
Time To Threshold is an event violation rule that sends an alarm if a QoS metric is predicted to reach a set value within a user-defined time
period. Setting a Time To Threshold alarm for any of the QoS-enabled probes allows the prediction_engine probe to gather trending information
used to calculate when a particular event might occur. The Time To Threshold settings are configured using the Admin Console.
At any time, the condition that triggers the Time To Threshold alarm can change (for example, files were deleted to free up space) or the Time To
Threshold alarm settings can be reconfigured.
When you configure the Time To Threshold settings on a QoS-enabled probe, the following actions occur:
data_engine stores raw QoS metric data from the QoS-enabled probes
prediction_engine computes trend information for QoS metrics that have Time To Threshold configured
baseline_engine probe evaluates QoS trend metrics from probes against static threshold definitions
baseline_engine probe generates threshold violation messages when the Time To Threshold trend timeframe and prediction thresholds
are crossed
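As a rough illustration of the kind of calculation involved, the following sketch fits a least-squares line to recent QoS samples and estimates when the trend crosses a threshold. This is an illustrative model only; the actual prediction_engine algorithm is not described in this document:

```python
def minutes_to_threshold(history, threshold):
    """Estimate how many minutes until a linear trend fitted to
    `history` (a list of (minute, value) pairs) reaches `threshold`.
    Returns None if the trend is flat or moving away from it."""
    n = len(history)
    mean_t = sum(t for t, _ in history) / n
    mean_v = sum(v for _, v in history) / n
    # Ordinary least-squares slope and intercept.
    num = sum((t - mean_t) * (v - mean_v) for t, v in history)
    den = sum((t - mean_t) ** 2 for t, _ in history)
    if den == 0:
        return None
    slope = num / den
    if slope <= 0:
        return None  # not trending toward the threshold
    intercept = mean_v - slope * mean_t
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - history[-1][0])

# Disk usage growing by roughly 1% every 10 minutes, threshold at 90%:
samples = [(0, 70.0), (10, 71.0), (20, 72.0)]
print(round(minutes_to_threshold(samples, 90.0)))  # 180
```

If the predicted crossing falls within the user-defined time period, a Time To Threshold violation would be raised; if the trend later flattens (for example, files were deleted to free up space), no crossing is predicted.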
Release Notes
Release Notes are organized alphabetically by probe name. Release notes contain:
Changes that are made within a probe release
Software and hardware requirements
Installation and upgrade information
Contents
Revision History
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Revision History
This section describes the history of the revisions for the ad_response probe.
Version
Description
State
Date
1.61
Fixed Defects:
GA
July 2015
GA
November
2013
The probe did not change the information level in the log file even after changing the log level from 0. Salesforce case: 00145920
The probe did not use the selected authentication type while creating the LDAP connection. Salesforce cases: 00144532, 00144708
1.60
1.53
June 2013
1.52
June 2012
1.51
December
2011
1.50
June 2011
December
2010
1.30
Updated the latest Nimsoft Dot Net API with SSL support.
June 2010
1.21
March
2010
1.20
February
2010
1.11
If the controller is configured to use the robot name for QoS, the probe sends the configured name in the QoS source. Otherwise, the probe sends the host name in the QoS source.
September
2009
1.10
June 2009
Added fix to send source name / host name in lower case with alarm.
Updated bmake and package file for win64 support.
1.06
Updated the package file and added code to change the description of the probe.
Updated code to calculate object age if write mode is enabled.
Updated code to validate the provided domain.
Updated code to validate text inputs (connection) for blank data.
Updated code to change the test connection message.
Updated code to disable the write mode if GC is selected.
Updated code to modify the QoS SampleTime to UTC.
September
2008
1.05
Updated the package file and added default security permission to write.
July 2008
Updated the source code to use new Nimsoft Dot Net API.
May 2008
November
2006
Fixed an exception when a new objects found threshold was added to a new search profile.
Fixed a configurator problem that occurred when the configurator was started on machines without a running controller.
Fixed test connection error message.
Replaced NimbusAPI.dll with a newer version.
Installation Considerations
The installation considerations for the probe are:
The robot where the probe is installed must be on the same domain as the AD server.
The probe requires remote user access with administrator rights so that you can query for any other user.
Upgrade Considerations
Upgrading from pre-GA Active Directory probe:
The pre-GA version of the Active Directory probe was simply called "active_directory". As of the version 1.11 GA release, this probe is
renamed "ad_response". If you have the older version of the probe installed, perform the following steps to upgrade correctly.
1. Deactivate the "active_directory" probe.
2. Deploy the "ad_response" probe from local archive.
Note: New license keys are required because the probe name is changed.
Active Directory is the directory service included with Windows servers to manage the identities and relationships in network
environments.
The active directory server (ad_server) probe monitors the selected counters on Active Directory (AD). These counters measure the availability
and response time of the active directory server and perform health checks to prevent outage and degradation conditions.
Contents
Revision History
Supported Locales
Threshold Migration Configuration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrades and Migrations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the ad_server probe.
Version
Description
State
Date
1.80
What's New:
GA
June 2015
GA
April 2015
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Added support for 32 bit. Salesforce case 00141918
Fixed Defects:
The alarms did not change if a profile created in the IM GUI was edited in the Admin Console GUI. Salesforce case 00155151
Metrics were not configured properly in the Admin Console GUI. Salesforce case 00155145
The probe did not test Health Monitors. The server returned a referral to the probe. Salesforce case 00165194
1.71
Fixed Defects:
The probe was automatically selecting existing counters for new profiles and also generating QoS for these counters. Salesforce case 00158381
New Feature:
Added an option in the IM GUI to enable or disable alarms.
1.70
New Features:
June 2014
Added localization support for Simplified Chinese, Japanese, Korean, Spanish, German, and B-Portuguese languages
from the IM and Web GUIs. For localization support through the Web GUI, the probe must run with NMS 7.6 or later and
PPM 2.34 or later.
Added support for health monitoring through VB GUI also.
1.61
What's New:
Added support for monitoring Active Directory Replication Subsystems.
Added support for monitoring Active Directory Lost and Found Objects.
Added support for monitoring Trust Relationships between AD Servers.
Added back the support for Windows Server 2012 R2.
March
2014
1.60
1.50
What's New:
Implemented a new feature through the web GUI for monitoring active directory server health from CA Unified Infrastructure
Management 7.5 onwards.
Fixed an issue where upgrading to version 1.50 failed to start the probe. Software Requirements section is updated with
correct version of .Net Framework, which is required to run the probe.
March
2014
December
2013
June 2012
1.42
December
2011
1.41
November
2011
1.30
June 2010
1.20
April 2010
If the controller is configured to use the robot name for QoS, the probe sends the configured name in the QoS source. Otherwise, the
probe sends the host name in the QoS source.
Added restart callback.
Added fix for clear alarms.
September
2009
August
2009
Added fix to check the counters of the process profile that are available in the config file. Also fixed alarm and QoS issues.
Added fix to send source name / host name in lower case with alarm or QoS. Updated to the latest Nimsoft Dot Net API
version 1.6, which supports non-standard ports. Migrated ad_server projects to Visual Studio 2008.
June 2009
Added fix to show the local time of the robot on which the probe is deployed.
Added fix for the event user id if its value is NULL.
Modified code to show event log counters if only the log type is selected.
Added fix to send alarm and QoS even if the event id is not specified.
Added fix to save the event id in the event profile and the include subdirectories option in the filesystem profile properly.
Fixed issue when event log values are not correct.
Set default log level to 0.
Added fix to relogin when the sid is expired.
Added fix for probe not running or load configuration error.
Added fix to disable the 'Apply' button on opening the configurator.
Added fix not to save the config file twice if the 'Apply' button is clicked.
Added fix to read the loglevel from the config file and set it on the probe.
Added fix to load / save the configurator on a tunneling setup properly.
Reverted the Nimsoft Dot Net API from version 1.5 to 1.1.
1.02
May 2008
Initial version.
March 2008
Supported Locales
The ad_server probe now supports the following locales:
Simplified Chinese
German
Italian
Japanese
Korean
Portuguese
Spanish
Note: .Net Framework 4.5 is required if the robot and CA Unified Infrastructure Management are installed on the same
machine.
The ad_server probe requires the following software environment to migrate with the threshold_migrator probe:
CA Unified Infrastructure Management 8.3 or later
CA Unified Infrastructure Management robot 7.5 or later (recommended)
Java JRE version 7 or later
Probe Provisioning Manager (PPM) probe version 3.21 or later
baseline_engine (Baseline Engine) version 2.60 or later
Installation Considerations
The ad_server probe is deployed with a default configuration that includes preconfigured profiles to be monitored. These might require some
adjustments, such as the path for the DSA file. The default WMI counters are configured for a Windows 2003 AD Server and might not work on
Windows 2000 servers.
You can set Events, Files, Filesystems, Performance counters, Health Monitors, Processes, Services and WMI counters as required and suitable
for monitoring the AD server.
Note: You can configure only two thresholds in probe version 1.71 or later. However, if more than two thresholds were configured in a
previous version, they remain available in this version.
B-Portuguese
Chinese (traditional and simplified)
French
German
Italian
Japanese
Korean
Spanish
Contents
Revision History
Installation Notes
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for the adevl probe.
Version
Description
State
Date
2.01
What's New:
GA
June 2015
GA
September
2014
GA
November
2013
GA
December
2012
Added support for factory templates with CA Unified Infrastructure Management 8.3.
Added functionality to allow user to select log name from the drop-down menu and enter log name in log field.
2.00
What's New:
Added the localization support for B-Portuguese, Chinese (simplified and traditional), French, German, Italian,
Japanese, Korean, and Spanish languages from both IM and Admin Console GUI. For localization support through
Admin Console GUI, the probe must run with PPM 2.38 or later version.
Updated the probe IM GUI and Admin Console GUI for specifying the character encoding in different locales.
Note: Do not use the Raw Configure GUI for updating the probe configuration in the non-English locales as it can corrupt the
configuration file.
1.60
1.56
1.54
GA
June 2012
1.53
GA
December
2011
GA
June 2010
GA
June 2010
1.41
1.40
Enhanced the probe to allow generation of variables from the message body and to send alerts on these variables.
Added support to raise an alert only after a particular number of instances of an event within a particular time frame.
1.30
GA
March
2010
1.23
Resolved the problem where only a partial event list was retrieved. The most obvious situation was on computer restart.
GA
March
2010
1.22
1.21
Added a fix in evlWmi library for fetching InsertionStrings column value from WMI if Message value is not available.
Fixed a crash in evlWmi library.
GA
December
2009
GA
November
2009
GA
October
2009
GA
September
2009
GA
September
2007
GA
April 2006
GA
August
2004
Added a fix for replacing recurring hard returns with a single delimiter in description field.
Added a feature in the GUI to enable/disable removal of recurring hard returns.
1.20
Added fix in the probe and GUI for replacing hard returns with a user-defined delimiter in the event description field.
Fixed a Daylight Saving Time issue.
Stopped using regular expression comparisons to detect duplicate events.
1.10
Added support for adding custom event logs to monitor. Custom logs can be added using raw configure by assigning
appropriate key-value pairs in the section of the configuration file.
Added support for using variables like $event_id in suppression key.
Added support for using variables in custom alarm messages.
Added support for using localhost as computer name to get only local machine events.
The probe now supports option to enable/disable propagation of events.
The probe now supports 64 Bit Windows environment.
The probe now supports Windows 2008 operating system.
Updated WMI library for handling custom event logs.
Added a key (wmi_timeout) in the setup section of the configuration file. This key can be used to set the WMI query timeout in
seconds if there is a huge number of events.
Added fix in Windows Vista running service pack version 1 or below to fetch the event indexes using WMI. Vista versions
prior to SP2 had an issue where the probe was unable to fetch the event indexes properly.
Added a fix in the evlWmi library for handling a computer's FQDN. On some Windows platforms, when a machine is in a
domain, the Computer field of event logs shows the computer's FQDN. Earlier, the probe failed when checking the
Computer field in the Exclude tab.
1.06
1.05
1.02
Installation Notes
The adevl probe monitors the event logs for new messages and generates alarm messages according to your setup. You can configure the probe
to trigger each time a new message is added to the event log, or to check the event log for new messages at a fixed interval, which
reduces the system load generated by the probe. Consider the following points while installing the probe:
Restart the probe when the time zone is changed or when you select "Automatically adjust clock for Daylight Saving Time" option on the
system where the probe is deployed.
The Windows event log watcher probe version 3.0x uses WMI to retrieve the event logs. Accessing Windows event logs using WMI may
severely affect the performance of a Windows 2000 system. If the probe is deployed on a Windows 2000 system, the probe raises an
alarm and stops execution.
Known Issues
The adevl probe has the following limitations:
With all CA Unified Infrastructure Management Versions:
The probe does not support monitoring of forwarded events.
Localization is not supported on Windows ia64 platform.
Do not use the same profile name for the ntevl and adevl probes when they are deployed on the same robot.
Use either IM GUI or AC GUI of the probe to avoid any unexpected issues that can occur during probe configuration.
With CA Unified Infrastructure Management 8.0 onwards:
The probe GUI can stop responding when the Maximum Events to Fetch field value is more than 1000. If the probe GUI has
already stopped responding, follow these steps:
1. Open the Raw Configure GUI.
2. Update value of the fetch_number key (under setup section) to 1000 or less.
3. Restart the probe.
With CA Unified Infrastructure Management 7.6 or earlier:
The Raw Configure GUI of the probe is not supported for non-English locales because it can corrupt the entire probe configuration file.
The probe GUI can stop responding when the Maximum Events to Fetch field value is more than 1000. If the probe GUI has
already stopped responding, follow these steps:
1. Open the IM probe GUI.
2. Update value of this field to 1000 or less (under the Properties tab).
3. Restart the probe.
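The Raw Configure fix above can also be scripted. Assuming the probe's configuration file uses the usual Nimsoft 'key = value' layout inside a <setup> section (an assumption; verify against your deployed adevl.cfg before applying), a sketch:

```python
import re

def cap_fetch_number(cfg_text, limit=1000):
    """Clamp any fetch_number key in a Nimsoft-style .cfg file to `limit`."""
    def clamp(match):
        value = int(match.group(2))
        # Keep the original 'fetch_number = ' prefix; replace only the number.
        return match.group(1) + str(min(value, limit))
    return re.sub(r"(fetch_number\s*=\s*)(\d+)", clamp, cfg_text)

cfg = "<setup>\n   fetch_number = 5000\n</setup>"
print(cap_fetch_number(cfg))  # the value is clamped to 1000
```

Values already at or below the limit pass through unchanged; restart the probe after editing the file, as in the manual steps.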
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
This section describes the history of the revisions for the adogtw probe.
Version
Description
State
Date
2.71
Fixed a defect in which the alarm target was not set properly.
GA
April 2013
Fixed an issue in which the correct date/time format was not inserted in subscriptions as configured through the adogtw
UI.
2.70
GA
December 2012
2.60
GA
June 2010
2.50
2.48
Configurator fix: alarm level in profile correctly mapped to color and alarm level name.
Added a subsystem drop-down list for alarm profiles.
Fix: added code to expand the column id in alarms in QoS profiles.
Fix: problem when an alarm text was too long and got clipped by adogtw, and the last character in the clipped text was a '.
GA
October 2007
2.47
Fix: the GUI could cause a runtime error if you entered an integer value as the name for a new profile or connection.
Added: hotkeys for common commands.
Added: a drop-down for alarm severity instead of a text field in the alarm profile dialog.
Fixed: data of type Numeric was sent as integer, not floating point.
GA
September
2006
2.44
GA
February 2005
September
2009
Installation Considerations
On Windows XP and Windows 2003 64-bit platforms, the 64-bit ODBC drivers are not available by default. Download the drivers from
http://www.microsoft.com/en-us/download.
After installing these drivers, the probe may still generate the following error logs:
COM Error [0x800a0ea9] Unknown error 0x800A0EA9 - [ADODB.Connection] Provider is not specified and there is no designated default
provider.
COM Error [0x800a0e7a] Unknown error 0x800A0E7A - [ADODB.Connection] Provider cannot be found. It may not be properly installed.
The workaround is to create the database connection using OLEDB instead of ODBC.
Upgrade Considerations
NIS (TNT2) notes
When upgrading from a GA to a TNT2 version, or downgrading from TNT2 to GA, remove the probe-specific file adogtw_alarm.txt
from the working directory to ensure a smooth upgrade or downgrade.
Known Issues
Revision History
Requirements
Environment
Java Requirement
Hardware Requirements
Contents of aggregate_alarm-1.0.1 Zip File
Download and Deploy
Known Issues
Revision History
This section describes the history of the revisions for the aggregate_alarm probe.
Version
Description
1.01
State
Date
Controlled Release
August 2015
Requirements
Environment
For the aggregate_alarm probe to function properly, verify that the following probes are deployed and running on the primary hub. These probes
are automatically deployed to the primary hub when you run the CA UIM Server Installer v8.31.
Admin Console 8.31
Alarm Server (nas) v4.73 and alarm_enrichment v4.73
Probe Provisioning Manager (ppm) v3.22
CA Unified Management Portal (UMP) v8.31 is optionally required to allow the drop-down lists in the GUI to be populated with data from the UIM
database.
Java Requirement
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM v8.0 and later)
Hardware Requirements
The aggregate_alarm probe should be installed on systems with the following minimum resources:
Memory: 512 MB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Known Issues
None.
The Apache HTTP Server Monitoring (apache) probe monitors all the Apache-based HTTP servers of an organization to detect any performance
issues. It performs HTTP GET queries to the specified Apache web servers and transforms the query result into alarms and Quality of Service for
Service Level Agreements (SLA).
Contents
Revision History
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.62
Fixed Defects:
GA
November
2015
GA
July 2014
When creating a host profile, the probe crashed when the user entered incorrect server status and server address for HTTP
response. Salesforce case 00170024
1.61
Fixed Defects:
The IM GUI help link was opening the old help document instead of the new online probe documentation. Salesforce case
00135559
1.60
June 2014
1.55
October
2013
1.54
June 2013
1.53
April 2013
March
2013
1.51
January
2011
1.50
1.41
1.30
1.20
1.10
December
2010
November
2010
June 2010
March
2010
September
2009
The error message from the cURL library can now be used in the agent error alarm by using $curl_error.
Fixed a minor GUI bug: when the text field for hostname/IP lost focus, the content of the URL field would update; likewise,
toggling SSL on/off would override the content of the URL field. Now the content of the URL field is filled in only if it is
empty, and only the http/https part of whatever is in the URL field is updated when toggling SSL.
1.02
October
2008
September
2007
1.00
July 2007
Revision History
This section describes the revision history of the apmgtw probe.
Versions
Description
State
Date
2.0
What's New:
GA
September
2015
Beta
August
2015
Added support for configuration of the probe using Admin Console. The probe is now configurable only through
Admin Console.
Bug Fixes
2.0
What's New:
Added support for configuration of the probe using Admin Console. The probe is now configurable only through Admin
Console.
1.1
What's New:
Corrected probe GUI and documentation by removing mention of alarms. The apmgtw probe only sends QoS data from
probes to CA APM. The probe does not send alarms.
1.0
Beta
November
2014
Prerequisites
Before you configure the probe, verify that you have completed these prerequisites:
CA APM is installed
CA APM Workstation is installed
Software Requirements
Note: The probe is supported on the following operating systems:
Windows 2008 Server
Windows 2008 Server R2
Red Hat Enterprise Linux
CentOS
The probe supports CA APM Enterprise Manager versions 9.6, 9.7, and 10.0.
The probe is compatible with CA Unified Infrastructure Management versions 8.1, 8.2, 8.3, and 8.3.5.
QoS message names might appear truncated on the Admin Console Configuration page. It is recommended that you use CA Unified
Infrastructure Management 8.3.5 to view the complete names.
Deployment Considerations
The apmgtw probe can only be deployed on your CA Unified Infrastructure Management primary hub.
Before you use the Admin Console Configuration page, ensure that the PPM probe is configured to use the Web UI for the apmgtw probe.
Follow these steps:
a. Access the Admin Console.
b. Select Probes, PPM, Raw Configure.
c. From the Root folder, navigate to the adapter list folder.
d. Verify whether the apmgtw section exists with the apmgtw key and value 2.00.
If the apmgtw section does not exist, use the Add Section and Add Key buttons and specify the value 2.00.
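If you add the section as described, the resulting raw configuration fragment would look roughly like this (a sketch; the exact section nesting in ppm.cfg may differ in your deployment):

```
<adapter_list>
   <apmgtw>
      apmgtw = 2.00
   </apmgtw>
</adapter_list>
```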
Upgrade Considerations
When you upgrade or redeploy an existing version of the probe, the configuration files are migrated to the new installation upon activation of the
Admin Console APMGTW Configuration page. Deactivate and activate the probe for the configuration to take effect.
Troubleshooting
This section describes troubleshooting procedures for the CA APM Gateway (apmgtw) probe.
Problem:
I select a probe and its QoS, but the QoS is not shown in CA APM Web View.
The QoS message may be too long. Check whether the corresponding QoS format message in the qos.properties file is more than 150
characters. If the length of the QoS message exceeds 150 characters, the apmgtw probe cannot send that QoS to CA APM.
Solution:
Navigate to the qos.properties file: Nimsoft > Probes > Gateway > apmgtw > qos.properties file.
In the metrics section of the qos.properties file, edit the description of the QoS message so that it is fewer than 150 characters.
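A quick way to find offending entries is to scan the file for values longer than 150 characters. A sketch, assuming qos.properties follows the standard Java 'key = value' properties layout:

```python
MAX_QOS_LEN = 150  # apmgtw cannot forward QoS messages longer than this

def find_overlong_qos(properties_text, limit=MAX_QOS_LEN):
    """Return the keys in a properties file whose values exceed `limit` characters."""
    overlong = []
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without a key = value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if len(value.strip()) > limit:
            overlong.append(key.strip())
    return overlong

sample = "short.metric = CPU usage\nlong.metric = " + "x" * 151
print(find_overlong_qos(sample))  # ['long.metric']
```

Each key the scan reports is a QoS description to shorten in the metrics section of qos.properties.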
Revision History
Probe-Specific Software Requirements
Revision History
Version
Description
Date
State
1.22
Mar 2013
GA
1.21
Minor changes
Sep 2012
GA
1.20
Minor changes
Mar 2012
GA
1.19
Aug 2011
GA
1.18
Changes in db library
Mar 2011
GA
1.17
Fixed: failed to handle strings with special characters (escape ' characters for Oracle db)
Nov 17 2010
GA
1.16
Nov 4 2010
GA
1.15
Sep 2010
GA
1.14
July 2010
GA
1.13
Fixed Oracle and MySQL support in GUI for when "other database" is used
Jun 28 2010
GA
1.12
Jun 11 2010
GA
1.11
Feb 2010
GA
1.10
Jan 2010
GA
1.03
Data administration parameters in the Setup dialog are now correctly read
Aug 2009
GA
1.02
May 2009
GA
1.01
Jan 2009
GA
1.00
Initial version.
Dec 2008
GA
Contents
Revision History
Revision History
Version
Description
State
Date
8.31
GA
August
2015
8.20
What's New:
GA
March
2015
GA
December
2014
GA
September
2014
GA
June 2014
Support for request.cfg deployment with robot v7.70. Upon startup, the controller looks for request.cfg, a user-created text
file that enables automatic deployment of the specified probes to the robot. Previously, these requests could only be
facilitated by distsrv, which still handles them by default. To have a v7.70 robot direct the requests to ADE, use the Raw
Config utility to add the deploy_addr parameter to robot.cfg, and specify the UIM address of the ADE probe:
deploy_addr=/<domain>/<hub>/<robot>/automated_deployment_engine
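For example, after the Raw Config change, robot.cfg would contain an entry along these lines (the domain, hub, and robot names here are placeholders, and the section placement is an assumption; verify against your own robot.cfg):

```
<controller>
   deploy_addr = /MyDomain/PrimaryHub/hub-robot/automated_deployment_engine
</controller>
```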
Build number validation for archive import (negative, non-numeric, 0), which maintains consistency in the archive by
preventing ADE from importing packages with bad build numbers.
Package included in UIM Server 8.2 distribution.
Fixed Defects:
Improved reliability with large deployments.
Improved performance with remote robot deployment (deploying a robot whose specified parent hub is NOT the hub
where ADE is deployed).
8.10
Remote robot deployment reliability has been improved for slow networks.
ADE probe package dependency resolution has been fixed when using EQ as the dependency type.
Solaris native installer init script now returns correct error code when being called.
ADE restarts properly when called from Infrastructure Manager and Admin Console.
Package included in UIM Server 8.1 distribution.
2.00
ADE 2.0 is a Java-based redesign of distsrv with scalability, flexibility, and maintainability in mind.
Admin Console now uses ADE for package deployment, which is two to five times faster than with distsrv. (Infrastructure
Manager continues to use distsrv).
ADE uses an archive cache to do quick package lookups and file extractions. Files that have been extracted are
maintained in a cache to speed up package deployment. This cache is cleared at startup.
ADE Archive stays in sync with the file system, and the archive-sync solution is much more scalable than distsrv
package forwarding. Processing is distributed, rather than going through a single master distsrv.
Package included in UIM Server 8.0 distribution.
1.31
Note: The probe from version 2.01 and later is configured only through the Admin Console GUI.
Amazon Web Service (AWS): The AWS provides a decentralized IT infrastructure to multiple organizations. You can create an account
on the AWS cloud and can use its services as per your IT infrastructure requirements. The various capabilities of AWS include storage,
web-scale computing, database access, and messaging.
The probe provides monitoring of the following AWS services:
Health: The probe monitors the overall health status of the AWS services for all geographical locations. Alarms are generated based on
the status of all the AWS services.
Amazon Simple Storage Service (S3): This AWS service provides an interface for storing and fetching data at any time instance. The
probe generates QoS data based on the time consumed in storing and retrieving files.
Amazon Elastic Compute Cloud (EC2): This AWS service provides a flexible web-scale computing interface. The probe generates QoS
data and alarms that are based on the performance of various EC2 instances.
Amazon Elastic Block Storage (EBS): This AWS service provides a scalable storage volume facility for the EC2 instances. The probe
generates QoS data and alarms that are based on the operations that are performed on the storage volumes.
Amazon Relational Database Service (RDS): This AWS service manages relational databases that are stored in a cloud network.
AWS-RDS handles many database administration tasks and lets you perform other operations like setting up and scaling the database.
The probe generates QoS data and alarms that are based on the system metrics and database operations.
Amazon ElastiCache: This AWS service provides the AWS instances with an option of storing temporary data in a scalable cache
memory, and thus, increasing the processing speed. The probe generates QoS data based on the time consumed in accessing the cache
service and other parameters like amount of data stored and time taken to fetch the data.
AWS Custom Metrics: AWS provides some default metrics for all its services. You can also create and configure your own
customized metrics and store them in AWS CloudWatch for viewing or monitoring purposes. These metrics, which AWS
does not generate, are called custom metrics. The probe lets you configure the custom metrics for QoS generation.
AWS Simple Queue Service (SQS): This AWS service lets you transmit data to other services using message queues. The probe lets
you configure the message queue properties for QoS generation.
AWS Simple Notification Service (SNS): This AWS service lets you manage the notification messages that a publisher sends and a
subscriber receives through a communication channel. The probe monitors the communication channel and generates QoS data based
on the status of the notifications.
AWS Elastic Load Balancing (ELB): This AWS service lets you route the traffic that comes from various applications across multiple
available EC2 instances. The AWS Monitoring probe monitors the ELB layer at group level and generates QoS data based on the status
of ELB layer.
AWS AutoScaling: This AWS service lets you accumulate different EC2 instances in a group. You can create an autoscaling group
according to the usage of the EC2 instances in various applications. The probe monitors the instance status at the group level.
Important! Amazon charges the AWS account which the probe uses to monitor the AWS services. You must consider this fact while
configuring the probe for monitoring various AWS services.
Contents
Revision History
Prerequisites
Probe Specific Software Requirements
Upgrade Considerations
Alarm Threshold Requirements
Fixed Defects
Known Issues
Revision History
This section describes the history of the revisions for the aws probe.
Version
Description
State
Date
4.02
Fixed Defects:
GA
September
2015
The probe displayed a profile in Failure state despite a successful credential verification. Salesforce cases 00169402,
70000878.
Updated the documentation to describe that Namespace and Dimension are required in the scripts to display the custom
metrics on the probe GUI. Salesforce case 00167924
Note: For more information, see the Custom Metrics section in the aws AC GUI Reference article.
4.01
Fixed an issue where NULL QoS values were generated for the S3 service.
GA
June 2015
4.0
What's New:
GA
May 2015
GA
December
2014
Added the ability to create monitoring configuration templates. The templates allow you to apply consistent monitoring
configurations across multiple profiles using filters. The monitoring configurations apply across multiple instances of a
service and multiple profiles of the aws probe.
Note: Template Editor is available from probe version 4.0 or later on UIM 8.3 and above. Health and Custom Metrics
are not supported on Template Editor.
Factory templates for all monitors are available for Unified Dashboard.
Note: Factory Templates do not support Simple Storage Service (S3).
Added the ability to create AutoScale Groups on USM groups.
3.51
Removed the ReplicaLag monitor for the RDS service from the GUI from version 3.51 onwards for all databases except
MySQL.
Note: When upgrading the probe from versions 3.00 through 3.50 to version 3.51, delete the ReplicaLag key from the Raw
Configure option to stop unwanted alarms for databases other than MySQL.
3.50
GA
December
2014
3.01
Changed default value of Statistics field from Maximum to Average for EBS, RDS, ElastiCache services, and Custom
Metrics monitoring.
GA
September
2014
3.00
GA
September
2014
GA
June 2014
GA
June 2013
Beta
December
2009
2.01
1.10
Added Regional support for gathering metrics for QoS and Alarm.
Added configuration capabilities for Message Alarms for 7 Cloud watch Metrics.
Re-structuring of QoS Names.
Added option to configure the Log level.
Added the t1.micro instance option for Deployment sampler.
1.00
Prerequisites
This section contains the prerequisites for the aws probe.
An AWS user account with valid credentials, such as an Access Key and Secret Access Key.
EC2 administrative rights to allow the probe to access AWS resources. When you assign administrator access to a user, the
following policy is assigned by default:
{
"Version": <version_number>,
"Statement": [
{
"Effect": "Allow",
"Action": "*",
"Resource": "*"
}
]
}
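As an illustration, the policy document above can be parsed and checked programmatically. The following is a minimal Python sketch, not part of the probe; the Version string is a sample value standing in for the placeholder above, and the check simply confirms that some statement allows all actions on all resources:

```python
import json

# Illustrative policy mirroring the default administrator-access policy
# shown above; the Version string here is a sample value.
policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": "*", "Resource": "*" }
  ]
}
""")

def grants_full_access(doc):
    """Return True if any statement allows all actions on all resources."""
    return any(
        stmt.get("Effect") == "Allow"
        and stmt.get("Action") == "*"
        and stmt.get("Resource") == "*"
        for stmt in doc.get("Statement", [])
    )

print(grants_full_access(policy))  # prints True
```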
Upgrade Considerations
This section lists the upgrade considerations for the aws probe.
The aws probe version 2.0x and later is available through Admin Console GUI only and not through the Infrastructure Manager (IM) GUI.
Upgrade from previous versions to version 2.0x and later is not supported.
QoS names of the following AWS metrics have changed in version 2.0x:
Old QoS Name              New QoS Name
QOS_FileWriteTime         QOS_AWS_FILE_WRITE_TIME
QOS_FileReadTime          QOS_AWS_FILE_READ_TIME
QOS_DiskReadOps           QOS_AWS_DISK_READ_OPS
QOS_DiskReadBytes         QOS_AWS_DISK_READ_BYTES
QOS_DiskWriteOps          QOS_AWS_DISK_WRITE_OPS
QOS_CPUUtilization        QOS_AWS_CPU_UTILIZATION
QOS_NetworkIn             QOS_AWS_NETWORK_IN
QOS_NetworkOut            QOS_AWS_NETWORK_OUT
QOS_DiskWriteBytes        QOS_AWS_DISK_WRITE_BYTES
Note: If you have upgraded from NMS 7.6 to CA UIM 8.0, you do not have to perform the following procedure.
Fixed Defects
This section contains the fixed defects for the aws probe.
Added a note in the S3 node that the file for which you want to generate the QoS data must be present in the AWS base folder.
Added a note in the Overview topic that from version 2.01 onward the probe is configured only through the Admin Console GUI; the
Infrastructure Manager GUI for the probe is not available.
Enhanced the description of the Profile Name field in the Add New Profile section in aws node.
Known Issues
When you save the configuration, the probe restarts and data collection for AWS services starts again. This rediscovery of services slows
the GUI processing. You must reload the GUI after the probe restarts.
When migrating to version 4.0 or later of the probe, you must manually clear the AutoScale alarms in USM. The new alarms generated by
the probe are displayed in the created AutoScale group in USM.
Revision History
Prerequisites
Upgrade Considerations
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.10
What's New:
GA
June
2015
GA
Dec
2014
Beta
Dec
2014
GA
Aug
2010
On upgrading the probe from version 2.01 to 2.10 or later, only one Azure Subscription can be associated with a profile.
Ability to create monitoring configuration templates. The templates allow you to apply consistent monitoring configurations
across multiple profiles using filters.
Factory templates for monitors are available on the Azure Dashboard.
Support for dynamic variables using CA UIM 8.3 or later.
Fixed Defect:
Azure service health data was not populated on the probe GUI. Salesforce case 160988
2.01
2.00
Fixed an issue where the password was appearing as plain text in the probe logs.
First release of the probe for Admin Console GUI.
The probe is now available only through the web-based GUI and not through the Infrastructure Manager (IM) GUI.
Upgrade from previous versions to version 2.0x and later is not supported.
Added support for monitoring the health and performance of Microsoft Azure infrastructure and services including virtual
machines (VMs), websites and storage.
Note: The probe is supported on CA UIM 7.6 and later only.
1.01
1.00
Prerequisites
This section contains the prerequisites for the azure probe.
Jul
2010
Upgrade Considerations
This section lists the upgrade considerations for the azure probe.
The azure probe version 2.0x and later is available through Admin Console GUI only and not through the Infrastructure Manager (IM)
GUI.
Upgrade from previous versions to version 2.0x and later is not supported.
To view the metrics that are available in the azure probe version 2.0x and later on the USM portal, perform any one of the
following actions:
Upgrade your NMS version to NMS 7.6 or CA UIM 8.0 or later.
Install the ci_defn_pack version 1.01 probe. You are required to restart the nis_server when you deploy the ci_defn_pack.
Important! You can install the ci_defn_pack probe from https://support.nimsoft.com
Note: The QoS metrics of probe version 1.0 are no longer supported by probe version 2.0x and later.
The 2.10 or later versions of the probe allow you to create configuration templates. The templates are applicable only to the specific
instance of the probe on the robot. Both new and existing profiles can be configured using templates.
Revision History
Requirements
Hardware Requirements
Software Requirements
Probe Dependencies
Supported Platforms
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
Version
Description
State
Date
2.6
What's New:
The baseline_engine v2.6, prediction_engine v1.31, and ppm v3.22 probes are included in the CA UIM Server v8.31
installer and are installed on the primary hub during installation.
GA
August
2015
GA
March
2015
GA
December
2014
GA
September
2014
GA
June 2014
baseline_engine supports the use of variables in the Custom Alarm Message and Custom Alarm Clear Message fields.
These variables are probe-specific.
2.5
What's New:
When you install ppm v3.11, baseline_engine v2.5, prediction_engine v1.2 are also installed on the hub robot.
baseline_engine supports the use of Custom Alarm Message and Custom Alarm Clear Messages.
A user cannot select the Compute Baseline check box if the baseline_engine probe is not installed or not running on
the hub robot. Help text for the Compute Baseline check box indicates whether baseline_engine is not installed or not
running.
2.4
What's New:
Updated the hub queue names to address character limitation.
A user must select the Compute Baseline check box on a probe's GUI before a baseline can be generated.
baseline_engine v2.4, prediction_engine v1.01, and ppm v3.0 should be installed on hub robots to allow users to
configure dynamic, static (if applicable), and Time To Threshold alarm and threshold settings for monitoring probes. (nas
and alarm_enrichment must be deployed to the primary hub to allow users to configure Time Over Threshold alarm and
threshold settings.)
2.34
What's New:
baseline_engine v2.34, prediction_engine v1.0, and ppm v2.38 should be installed on hub robots to allow users to
configure dynamic, static (if applicable), and Predictive Alarm settings for monitoring probes. (nas and alarm_enrichment
must be deployed to the primary hub to allow users to configure Time Over Threshold alarm and threshold settings.)
You can configure the baseline retention period (retentionPeriod) to be in the range of 3 to 12 weeks.
The command used to create baselines and thresholds for probes was modified.
The Setup Folder has two new configurable key-values: projection and predictiveAlarmSubject.
2.2
2.1
Dynamic threshold alarms no longer have different subsystem IDs than static threshold alarms.
GA
April 2014
2.0
GA
March
2013
Requirements
Hardware Requirements
By default, the baseline_engine probe stores four weeks of historical QoS metric and baseline data on the local disk of the system that hosts it.
The following memory and disk allocations are recommended:
Number of Metrics to Baseline    Memory Allocation    Disk Allocation
Up to 10,000                     1 GB                 100 MB
10,000 to 50,000                 1.5 GB               200 MB
50,000 to 100,000                2 GB                 400 MB
Software Requirements
The baseline_engine probe requires the following software environment:
CA Unified Infrastructure Management (UIM) 8.0 or later
ppm v2.38 or later
Robot version 5.23 or later
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM 8.0 and later)
Probe Dependencies
The following table shows the versions of the prediction_engine, ppm, nas, and alarm_enrichment probes that you should be running with the
different versions of baseline_engine. If the versions of these probes are mismatched (for example, if you deploy baseline_engine v2.6 with an earlier
version of prediction_engine and ppm on a hub robot), the system cannot properly produce baselines for probes.
baseline_engine    prediction_engine    ppm     qos_processor    nas     CA UIM
2.6                1.31                 3.22    1.25             4.73    8.31
2.5                1.2                  3.11    1.24             4.67    8.2
2.4                1.1                  3.0     1.23             4.6     8.1
2.34               1.01                 2.38    1.23             4.4     8.0
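As an illustration, the compatibility rows above can be encoded as a simple lookup so a deployment can be sanity-checked before rollout. This is a sketch, not a CA UIM tool; the version strings come from the table above:

```python
# Compatible probe versions per baseline_engine release (from the table above).
COMPAT = {
    "2.6":  {"prediction_engine": "1.31", "ppm": "3.22", "qos_processor": "1.25"},
    "2.5":  {"prediction_engine": "1.2",  "ppm": "3.11", "qos_processor": "1.24"},
    "2.4":  {"prediction_engine": "1.1",  "ppm": "3.0",  "qos_processor": "1.23"},
    "2.34": {"prediction_engine": "1.01", "ppm": "2.38", "qos_processor": "1.23"},
}

def expected_versions(baseline_version):
    """Return the probe versions expected alongside a baseline_engine release,
    or None for an unknown release."""
    return COMPAT.get(baseline_version)

print(expected_versions("2.6")["ppm"])  # prints 3.22
```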
Supported Platforms
Refer to the Compatibility Support Matrix for the latest information about supported platforms. See also the Support Matrix for Probes for more
specific information about the probe.
Installation Considerations
The baseline_engine probe and qos_processor probe are distributed as part of CA UIM. The deployed versions of baseline_engine and
qos_processor should match the versions included with the version of CA UIM running on the UIM Server. The qos_processor probe saves the
baseline data to the UIM database. See the Probe Dependencies table in the article for more details.
If you install baseline_engine on secondary hubs, see the baseline_engine deployment information for more details about multi-tier
deployment.
The baseline_engine probe is available as part of NMS from version 6.50 and later, or UIM v8.0 and later.
The baseline_engine will not install or operate on versions of NMS earlier than 6.50.
Upgrade Considerations
For baseline_engine v2.2 and earlier, by default the projections key in the Setup folder is set to False. If you upgrade to baseline_engine v2.3 or
later on a hub where a previous version of baseline_engine was running, the projections key-value defaults to False. After the upgrade, if you
decide to change the projection key value to True, change the setting before the baseline_engine calculates any baselines. See the
"baseline_engine Raw Configuration" article for the version of baseline_engine you're running for more details.
Note: If a primary hub and a secondary hub are running different versions of the baseline_engine probe, for consistent behavior you
should set the projections key-value for all baseline_engines to False.
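As a sketch of what this looks like in the probe's raw configuration (the section and key name are as described above; the exact layout may differ by version):

```
<setup>
   projections = False
</setup>
```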
Known Issues
Restart Required After Upgrading to UIM 8.1 From UIM 7.5
Hub Queue Names Character Limit
java_jre1.7 Required
Note: When upgrading to UIM 8.2 from UIM 8.1, you are not required to restart the UIM Server when the upgrade is complete.
java_jre1.7 Required
Problem:
The prediction_engine and baseline_engine probes require the hub on which they are installed to have a Java environment pointing to java_jre1.7. If
there is a mismatch between the version of Java the secondary hub's environment is pointing to and the probe's Java dependency, some of the
Java-based probes (including prediction_engine and baseline_engine) might not start.
In addition to the probe not starting, you might also see an error similar to the following:
Sep 19 17:23:10:955 [2624] Controller: Max. restarts reached for probe 'baseline_engine' (command = <startup java>)"
This error appears in the probe's log file:
baseline_engine.log, located in /Nimsoft/Probes/SLM/baseline_engine
prediction_engine.log, located in /Nimsoft/Probes/SLM/prediction_engine
In some instances, if prediction_engine and baseline_engine are already running on a secondary hub and you deploy another java-based probe
that requires the environment to point to a version of Java earlier than java_jre1.7, prediction_engine and baseline_engine might fail to start after
the deployment. In this case, no errors appear in the log files for prediction_engine and baseline_engine.
Solution:
In either situation, redeploy java_jre1.7 to the secondary hub and then restart the entire robot.
Important! If prediction_engine and baseline_engine were calculating baselines or new metrics and an error condition arises, these
calculations will be inaccurate from the time the error condition began. After prediction_engine and baseline_engine are restarted,
baselines generated by baseline_engine will be accurate after sufficient samples have been collected and the predictive alarm metrics
will begin again at the top of the next hour after restarting the robot or the prediction_engine probe.
The billing probe collects data from the usage_metering probe, analyzes it, maps it to CA Unified Infrastructure Management subscriptions,
calculates the billable items, and creates a billing report with summary and detail information.
Contents
Revision History
Hardware and Software Requirements
Deployment Considerations
Revision History
Version
State
Date
8.20
GA
April 2015
GA
September
2014
GA
June 2013
GA
March
2013
GA
November
2012
This probe version is intended to be installed in tandem with usage_metering v2.11 or v8.0.
8.00
2.11
Helps usage_metering v2.11 to address a SQL Constraint Violation that can occur during report generation if scans do
not complete before the start of the next day.
Fixed a defect that could cause report generation to take too long when a large amount of usage data was collected.
This version is to be used with usage_metering v2.11.
2.10
2.00
Probe collects and analyzes data from usage_metering probe and creates a billing report.
Deployment Considerations
The billing probe is typically deployed to the primary hub. The usage_metering probe must reside on the same robot as the billing probe, and
must be configured as the primary instance of usage_metering. See the topic v8.0 usage_metering Deployment for information on primary and
secondary instance types.
Revision History
Version
Description
State
Date
2.9
Initial version.
GA
June 2015
2.9.2
Updated version.
GA
December 2015
Security Requirements
The capman_da probe requires open TCP communication to the Data Manager. All firewalls between CA UIM and CA CCC should be configured
to allow TCP traffic to flow freely, from the CA UIM primary hub robot and the capman_da probe, to the Data Manager.
By default, the Data Manager uses thrift service port 8082. Verify that the Data Manager thrift service port on your system is set to 8082.
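Reachability of the thrift service port can be sanity-checked with a short script. The following is a minimal Python sketch; the host name in the comment is a placeholder you would replace with your own Data Manager address:

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical Data Manager host; 8082 is the default thrift service port.
# tcp_port_open("datamanager.example.com", 8082)
```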
If the capman_da probe is not installed on the primary hub, it does not have visibility to all QOS_Messages published within the
UIM deployment.
Since the matching_metric_type is set to 5:1, you must enable the VMAX probe monitors, using the QoS mapping names.
The casdgtw probe is a gateway between CA Unified Infrastructure Management and CA Service Desk. The probe works by subscribing
to alarm assignments. If an alarm is assigned to the user specified in the probe's Setup, the alarm is entered as a Service Desk Call Request.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the casdgtw probe.
Version
Description
State
Date
2.42
GA
November
2014
2.41
Fixed an activity-log defect where the probe saved the activity logs even when the option was not selected on the
probe GUI.
GA
December
2013
GA
December
2013
GA
April 2013
GA
December
2012
GA
September
2012
GA
February
2012
GA
December
2011
GA
September
2011
GA
June 2011
2.40
Added support for establishing connection between the probe and CASD Server over HTTPS protocol.
Enhanced probe performance while creating, updating, or closing tickets.
Previously, the probe behaved erratically as the number of processed alarms increased significantly, resulting in a very large cfg
file. Commonly reported issues included: inability to create incidents, incidents created without displaying the
incident ID, the cfg file not updating properly, duplicate tickets created for a single alarm, and acknowledging an
alarm not updating the ticket status.
Provided backward compatibility for already created incidents when upgrading the probe.
2.34
2.33
2.32
Added the capability to assign alarm-generated tickets to a specific group and reassign them to another group when the alarm
severity changes.
Modified the probe to provide the ability to add CA Service Desk login activity during the creation, update, and closure of
incidents generated from alarms.
Fixed the problem of old CA Service Desk incident statuses not being removed from the closed ticket status list.
Made the Incident ID custom field case-insensitive.
Fixed issues where CASD users could not view tickets even when they had the same Incident Area as
the ticket.
2.31
Added support to configure and use a separate set of field mappings that can be used independently during create,
update, and closure of incidents.
Added support to automatically resolve text to associate UUID values.
2.30
Updated configuration section for addition of Owning System and Ticket Status fields.
Updated configuration section for Editing or Closing an Incident to reflect Owning System configured.
Added sections to map a custom value instead of a single Alarm Field.
Added support to select individual mapped field to sync Incident with updates to Alarm in NMS.
Added support to sync all mapped Incident fields with updates to Alarm in NMS.
2.22
2.21
2.11
GA
March
2011
2.10
GA
February
2011
2.01
GA
December
2010
GA
February
2006
SDgtw was renamed to casdgtw to avoid a naming conflict with other providers' Service Desk products.
Added a timeout for SD function calls to the config.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
General Use Considerations
Revision History
Version
Description
State
Date
1.0
Initial version
GA
March 2015
Installation Considerations
1. Install the package into your local archive.
2. Drop the package from your local archive onto the targeted robot.
3. Use the Admin Console to access the probe configuration GUI or the raw configure options.
Upgrade Considerations
None
Revision History
Feature List
Requirements
Software Requirements
Hardware Requirements
Supported Platforms
Considerations
Installation Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.53
Beta
Apr 2011
1.52
GA
Dec 2010
1.40
Sep 2010
1.30
Jan 2010
1.26
Rebuilt with the new pt library (increased timeout to 300 sec on remote list_services).
Nov 2008
1.25
Mar 2008
1.24
Fixed potential problem with SNMPQueryFree not being called (could leave open UDP ports).
Feb 2008
1.23
Jan 11 2008
1.22
Fixed program failure (after upgrade from 1.0x or 1.1x to 1.20 or 1.21).
Jan 10 2008
1.21
Dec 2007
1.10
Nov 2007
1.09
Sep 2007
1.08
Improved UI support for environmental checkpoints. Fixed unit settings for 'CallManager Memory' object.
Fixed problems with QoS and USER OBJECTS.
Added ping timeout override possibilities where this may be a problem.
Fixed issues with UI update where profile-name and host settings differed.
Dec 2006
1.06
Fixed problems with SNMP detection. Fixed issues with CallManager 4.1.x performance objects.
Mar 2006
1.04
Initial version
Dec 2005
Feature List
This probe monitors a set of checkpoints on defined agents running the CCM software. The probe includes a set of predefined checkpoints
available on most hosts running the CCM software. These checkpoints are grouped in different classes, and you can choose to hide checkpoints
that are not available on the hosts.
The probe also includes a GUI, which can be used to:
Configure the general properties of the probe.
Define the hosts to be monitored. Group folders can be created to place the hosts in logical groups.
Activate the checkpoints to be monitored and set the monitoring conditions for these checkpoints.
Create alarm messages.
Monitor the different checkpoints. The measured values will be presented as a graph.
Requirements
This section contains the requirements for the ccm_monitor probe.
Software Requirements
Cisco CallManager 3.x or 4.x. Nimsoft probes: perfmon 1.11 or later, ntservices 2.21 or later.
Hardware Requirements
None
Supported Platforms
Please refer to the:
Compatibility Support Matrix for the latest information on supported platforms.
Support Matrix for Probes for additional information on the probe.
Considerations
This section contains the considerations for the ccm_monitor probe.
Installation Considerations
The ccm_monitor package requires the correct versions of perfmon and ntservices in the local archive before the probe installation. Please
ensure that you have the correct versions in your local archive. The distribution server automatically installs the required packages during the
installation of ccm_monitor.
Warning messages from perfmon in the CCM event-log
To resolve this problem, run the following commands at a command prompt in the %SystemRoot%\System32 folder to unload and reload the IIS
performance dynamic-link libraries (DLLs). After you run these commands, the warning messages are not logged:
unlodctr w3svc
unlodctr msftpsvc
unlodctr asp
unlodctr inetinfo
lodctr w3ctrs.ini
lodctr ftpctrs.ini
lodctr axperf.ini
lodctr infoctrs.ini
After you run these commands, you must restart your computer for the changes to take effect. The problem is described in the following
knowledge base article: Q267831
How to disable perfmon AllObjectMode:
If the probe does not receive values of the performance counters Cisco CallManager.CallManagerHeartBeat and Cisco TFTP.HeartBeat during
the first poll interval, the perfmon AllObjectMode will be set automatically.
To disable this functionality, set the perfmon_all key to -1 using Raw Configure. This key is located in the /profiles/ section; if the key does not
exist, create it and set it to -1.
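For reference, the resulting raw-configuration fragment might look like this. This is a sketch assuming the standard probe .cfg section syntax; the exact profile layout may differ on your system:

```
<profiles>
   perfmon_all = -1
</profiles>
```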
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrades and Migrations
Known Issues
Revision History
This section describes the history of the probe updates.
Version
Description
State
Date
5.61
Fixed Defects:
GA
December
2015
Beta
December
2015
The probe consumed high CPU when disks were mounted during run time. Support case number 70004776
On the RHEL platform, cluster disks were displayed as local or network disks in the cdm Infrastructure Manager (IM). Support case number 00160058
Note: You must use cdm version 5.61 with the cluster version 3.33 to view the cluster disks on the cdm Infrastructure
Manager (IM).
5.60
What's New:
Added two new alert metrics, CpuErrorProcesses and CpuWarningProcesses, to define the number of top CPU-consuming
processes in the probe.
Added support to enable the default QoS for Memory Usage in percentage.
Fixed Defect:
Updated the cdm IM GUI Reference to state that the probe uses the mount entries in the /proc/mounts file on Linux to
display the file system type of devices that are remounted to a different location. Salesforce case 246152
5.51
What's New:
GA
September
2015
GA
July 2015
GA
June 2015
GA
May 2015
GA
April 2015
GA
March
2015
Added support to ignore iostat devices from monitoring, using regular expressions, on Linux, Solaris, and AIX platforms.
Salesforce case 00168980
Added support to ignore the devices configured through super package. Salesforce cases 70000803, 00167576
Fixed Defects:
Added two new keys, sliced_cpu_interval and sliced_memory_interval, in the Raw Configuration section of the probe.
These keys are supported on the AIX platform. Salesforce case 00160805
Note: See Options Configured Using Raw Configure section in the v5.5 cdm IM Configuration article for more
information related to these keys.
The probe generated the local file system disk missing alarm message for all types of disks (Network, Local, and
Shared). Salesforce case 00162388
Note: The probe generates the updated message only on a fresh deployment of the probe.
The probe crashed when disk delta QoS metrics were enabled and alarms were configured through a custom profile. Salesforce
case 00169654
The probe calculated the value of Physical Memory Utilization metric incorrectly. Salesforce case 70000322
The probe generated false reboot alarms. Salesforce cases 00161914, 70001349, 00168047
5.50
What's New:
Three new metrics, QOS_LOAD_AVERAGE_1MIN, QOS_LOAD_AVERAGE_5MIN, and QOS_LOAD_AVERAGE_15MIN, have been added on Linux, Solaris, AIX, and HP-UX platforms.
Added a section in the Troubleshooting article to explain that using entitled capacity can lead to CPU usage above 100
percent on AIX platforms. Salesforce case 00164529.
Fixed Defects:
On the Solaris platform, some robots sent reboot alarms on restarting. Salesforce case 00157066.
The probe was unable to generate data related to computer uptime. Salesforce case 00157375.
5.42
What's New:
Upgraded support for factory templates.
5.41
What's New:
Upgraded OpenSSL to version 1.0.0m.
Rolled back the alarm severity level changes made to the cdm version 5.30 to make these levels consistent with version
5.21.
Note: Refer to the Upgrades and Migrations section for more information related to alarm severity level changes.
5.40
What's New:
Added support to subtract buffer/cache memory usage from the used physical memory for HP-UX and Solaris platforms.
Fixed Defect:
The probe did not save the monitor configurations for a clustered disk when configured through the Admin Console GUI.
Salesforce case 00156473
5.31
5.30
Beta
March
2015
GA
February
2015
Beta
February
2015
GA
January
2015
GA
December
2014
GA
October
2014
Added a Filesystem Type Filter field to specify regular expression and automatically select matching file system(s) to be
monitored.
Added clustered environment monitoring to the Admin Console GUI of the probe.
Added an Enable Space Monitoring checkbox to allow the probe to monitor network disk usage metrics.
Fixed Defects:
Disk space alarms in custom profiles were generated with zero values when disks were not available or not mounted. Salesforce case 00155117
The probe generated messages with different severity levels for the same alarm in Windows and Linux environments. Salesforce case 00154404
The probe generated the File system not available alarm in Linux and Solaris environments even if the file system was
ignored. Salesforce case 00152438
5.21
A new key, QOS_DISK_TOTAL_SIZE, has been added in fixed_defaults under disk in the Raw Configuration section of the
probe. The key is supported on Windows, Linux, and AIX platforms. Salesforce case 00155861
Fixed defects:
The probe displayed the debug log even though the log level was set to 0. Salesforce case 00143837
The probe crashed on AIX platform when deployed for the first time. Salesforce case 00156478
5.20
Fixed an issue where the probe was publishing some host names with non-ASCII characters. Salesforce cases: 00150465,
00153872
Two new metrics, QOS_IOSTAT_KRS and QOS_IOSTAT_KWS, have been added on the Linux platform, and four new metrics, QOS_IOSTAT_RS, QOS_IOSTAT_WS, QOS_IOSTAT_KRS, and QOS_IOSTAT_KWS, have been added on the AIX platform.
Note: These new metrics are configurable only through the Admin Console GUI.
5.11
Fixed Defects:
The probe was generating false boot alarms. Salesforce cases: 00148483, 00149480, 00137376, 00150365, 00146355,
00148673, 00148121, 00148737
Warning messages were shown in logs. Salesforce cases: 00151892, 00150149
Incorrect alarm messages were shown in logs for CPU data. Salesforce case 00151892
5.10
5.02
4.91
Fixed Defects:
July 2014
Probe was suppressing all alarms for different iostat metrics. Now, a different suppression key is used for different iostat
alarms of a given device. Salesforce case 00139484
Note: You must manage the already suppressed alarms manually.
Probe was not generating QoS for any iostat metric when the Set QoS Source to robot name instead of computer
hostname option is selected in the controller probe. Salesforce case 00137858
Note: PPM version 2.35 or later is required for these fixes to work as the iostat feature is configurable only through
Admin Console GUI.
4.90
June 2014
4.81
Fixed an issue where QoS definitions were generated even if the respective QoS messages were inactive.
March
2014
4.80
March
2014
Device iostat monitoring functionality for Linux, Solaris, and AIX platforms through Admin Console GUI from NMS 7.5
onwards.
Support for monitoring the CIFS (shared Windows disk mounted on Linux) and GFS (clustered environment disk) file
systems.
4.78
4.77
Fixed an issue where alarms generated through a CPU custom profile and the cpu_total profile had the same metric
ID despite having different suppression keys.
Fixed a defect so that the password is stored in encrypted format when mapping a new shared disk. Previously, the probe
stored the password in clear text format. To encrypt the passwords of existing shared disks, delete them and then map them
again.
February
2014
January
2014
Fixed a defect of an incorrect subsystem ID when the probe is deployed in a Linux environment. Previously, the probe used a
subsystem ID of the 3.3.xx series by default, which is reserved for the nas probe. It now uses a subsystem ID of the 1.1.xx
series by default.
4.76
October
2013
4.75
October
2013
4.74
Fixed a defect by removing extra logs that were being written by the probe.
Updated default configuration of the probe.
4.73
Fixed an issue of sending a false alarm when a cluster disk is out of scope.
September
2013
July 2013
Fixed an issue where the edit alarm message showed a 0% threshold for the memory alert.
Fixed a defect causing incorrect default values for the low and high thresholds of 'Disk usage change and
thresholds'.
4.72
Fixed an issue where, when editing cdm disk usage values, the percentage value jumped to MB.
May 2013
Fixed a defect causing the probe to use 100% CPU when hot-adding a CPU in a Linux VM.
4.71
April 2013
Added functionality for calculating CPU-related statistics considering LPARs on AIX.
June 2012
4.55
March
2012
August
2011
4.54
Fixed internationalization defects. Changed share authentication test order to 'user/password', 'impersonation', 'implicit'.
Fixed percent / inode conversion integer overflow situation on disk profile configuration.
June 2011
January
2011
December
2010
September
2010
4.40
September
2010
Added support for separate polling intervals for alarms and QoS.
Added support to configure target for Total CPU QoS.
Added support to send QoS source as short name (For Windows) or full name (For Linux).
Added support to ignore filesystems by giving regular expression.
Added a user interface to configure default values for discovered disks.
Added code to remove white space from all sections.
Added fix for memory leak.
4.30
4.22
The active state of the disk missing alarm is now read from the default disk settings.
June
2010
May 2010
Added support for sending alarms only after the required samples are received.
4.21
The 'ignore_filesystem' and 'ignore_device' keys are now also implemented for Windows systems.
Fixed the issue where custom disk profile uses the percent/MB setting from the global profile.
4.20
March
2010
February
2010
4.11
Fixed number of samples for disk monitoring not being read properly.
September
2009
4.10
September
2009
4.05
September
2009
4.03
Solaris: Fixed an error situation that could occur if a parse error happened in the first sample collected.
June
2009
May 2009
March
2009
3.81
December
2008
3.80
Added connection_status callback function to support improved share status information. Implemented Ctrl+S in
configuration tool to save window setup.
Renamed Processor Queue Length to System Load for UNIX/Linux. Note that the same Quality of Service table
(Processor Queue Length) is still used.
Modified Processor Queue Length calculation for Windows - the Queue length is now divided by the number of
processors.
Enabled decimal point use for System Load alarm threshold and Quality of Service messages.
Changed usage display to convert to appropriate unit in disk table.
Added "rebooted" alarm option and alarm message.
Added the following alarm variables:
(for all) robotname, hostname.
(for disk) size_mb, size_gb, free_mb, free_gb, free_pc, limit_mb, limit_gb, limit_pc.
(for inodes) total_num, free_num, free_pc, limit_num, limit_pc.
Fixed disk_history problem with hidden disks.
Added option for sending Quality of Service message on network disk availability.
Added QoS for network disk availability.
Added option to monitor disk usage change.
Added log size option
October
2008
3.72
September
2008
3.54
May 2008
3.53
April 2008
3.52
3.51
Modified the logic that determines when a clear alarm should be sent for NFS-mounted file systems.
When a new disk is detected, the probe does a full restart to correctly initialize the disk. UNIX: Added monitoring of inodes on filesystems. Note: Linux systems with ReiserFS filesystems may show 0 inodes (the same result should be visible with the command 'df -i').
3.43
March
2008
January
2008
October
2007
Modified the QoS table name for memory paging when paging is calculated in pages per second. In previous versions, QOS_MEMORY_PAGING was used for memory paging regardless of whether the calculations were done in kilobytes per second or pages per second. Now QOS_MEMORY_PAGING_PGPS is used when pages per second is specified. Note that for users already using this option, old data may need to be moved to the new table, and SLAs and SLOs may need to be modified.
3.42
May 2007
AIX: Fix bug which caused physical memory information in the get_info callback to be incorrect.
February
2007
3.31
AIX: Added support for the flag 'mem_buffer_used' in setup. If set to 'yes', the used memory will include the file cache and be consistent with the data from the 'vmstat' utility. The default is 'no', for compatibility with other platforms, which do not report file cache as used memory since it is still available to programs.
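A minimal sketch of the corresponding cdm.cfg fragment, assuming the flag lives in the probe's setup section as the note above describes (the comment lines are illustrative, not part of the shipped file):

```
<setup>
   # AIX only: 'yes' counts the file cache as used memory (matches vmstat);
   # 'no' (the default) keeps behavior consistent with other platforms
   mem_buffer_used = no
</setup>
```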
January
2007
3.25
July 2006
July 2006
Fixed problem with reporting physical memory over 4GB on Windows 2000/XP/2003.
Fix on Solaris, Tru64 and AIX: Allocated a bigger buffer for CPU monitoring data. The maximum number of CPUs increased from 32 to 128.
3.23
June
2006
3.22
3.21
Tru64: Fix physical memory detection on systems with over 4GB RAM
June
2006
May 2006
3.12
Fixed bug where a specific configuration on multi-cpu systems would cause a segmentation fault.
February
2006
April 2005
Installation Considerations
For AIX 5.x users
The memory gathering routines use libperfstat, which must be installed. It is found in the bos.perf.perfstat and bos.perf.libperfstat file-sets. To verify that you have the correct file-sets installed, you can run:
The system displays output similar to the following, depending on the installed versions:
bos.perf.libperfstat   5.1.0.35   COMMITTED
bos.perf.perfstat      5.1.0.35   COMMITTED
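The file-set check above can be sketched as a small shell script. The `lslpp -l` invocation is the standard AIX command for listing installed file-sets; the hard-coded sample output below is an assumption used so the parsing logic runs on any system:

```shell
# Verify that the bos.perf file-sets are installed and COMMITTED.
# On a real AIX host, replace the sample text with live output:
#   sample_output=$(lslpp -l bos.perf.perfstat bos.perf.libperfstat)
sample_output='bos.perf.libperfstat  5.1.0.35  COMMITTED
bos.perf.perfstat     5.1.0.35  COMMITTED'

missing=0
for fileset in bos.perf.libperfstat bos.perf.perfstat; do
  # Each expected line starts with the file-set name and carries its state.
  if ! printf '%s\n' "$sample_output" | grep -q "^$fileset .*COMMITTED"; then
    echo "file-set $fileset is missing or not COMMITTED"
    missing=1
  fi
done
[ "$missing" -eq 0 ] && echo "perfstat file-sets OK"
```

If either file-set is absent or not in the COMMITTED state, the script reports it instead of printing the OK line.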
The cdm probe versions prior to 5.30 had different severity levels for the same alarm on different platforms. Changes were made in version 5.30 to make the severity level the same for an alarm across platforms. These changes have been rolled back in cdm version 5.41, which has the same alarm severity configuration as the versions prior to 5.30. The following table summarizes the changes in alarm severity levels across cdm versions.
Alarm Name        Platform   Prior to 5.30   5.30 to 5.40   5.41
CpuWarning        Windows    Minor           Warning        Minor
PagefileWarning   Windows    Minor           Warning        Minor
PagingWarning     Windows    Minor           Warning        Minor
PhysicalWarning   Windows    Minor           Warning        Minor
SwapWarning       Windows    Minor           Information    Minor
DiskWarning       Windows    Minor           Warning        Minor
InodeWarning      Windows    Minor           Warning        Minor
BootAlarm         Windows    Warning         Information    Warning
InternalAlarm     Linux      Major           Minor          Major
Note: Upgrading to version 5.41 will have no impact on your existing cdm probe configuration irrespective of the probe version running
in your environment.
If you are running cdm version 5.30, 5.31, or 5.40, your severity levels already changed when you upgraded to one of these versions, and you want to roll back to the severity levels of the versions prior to 5.30, contact CA Support for a customized probe package.
Known Issues
The 32-bit versions of this probe are unable to monitor terabyte (TB) sized disks.
On the Windows platform, the first interval data is not retrieved for the Top CPU consuming processes when the probe starts executing. From the second interval onward, the probe displays the alarm correctly.
When running this probe in a clustered environment, do not set the flag /disk/fixed_default/active to yes, since this causes problems with the disks that appear and disappear with the resource groups. This flag is unavailable through the GUI and can only be reached through the Raw Configure method or by directly modifying the cdm.cfg file.
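As a sketch, the corresponding fragment of cdm.cfg would look like the following; the section nesting mirrors the flag path /disk/fixed_default/active, and the comment lines are illustrative:

```
<disk>
   <fixed_default>
      # leave 'no' in clustered environments so disks that move with
      # resource groups are not auto-activated on every node
      active = no
   </fixed_default>
</disk>
```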
The probe returns only the first eight characters of the system host name when deployed on HP-UX 11i v1 or earlier. For example, if your
system host name is hpuxsys123, then the probe returns hpuxsys1. The probe uses this trimmed host name as the QoS target.
Version 4.8x: The UMP GUI displays the consolidated list of the iostat QoS metrics for all the monitored devices. Each QoS name
contains the device name for locating the device-specific QoS.
Version 4.0x: Changed behavior when running in a cluster together with cluster probe version 2.2x. The probe will receive information
about cluster disk resources from the cluster probe and create monitoring profiles for these based on the 'fixed_default' settings. These
profiles are automatically registered with the cluster probe to ensure continuous monitoring on cluster group failover. The cluster group is used as the Alarm and Quality of Service source instead of the cluster node.
Note: When upgrading to a newer version of the cdm probe, the old monitoring profiles for the cluster disks are overwritten with the new
ones.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Migration Considerations
Preconfiguration Requirements for Migration
NAS Subsystem ID Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
2.01
2.00
Description
Removed support for templates using Template Editor interface.
What's New:
State
Date
Beta
September
2015
Beta
September
2015
GA
April 2015
GA
October
2014
GA
July 2014
What's New:
Changed the heap size to: -Xms512m -Xmx1024m. Salesforce case 00156337
1.64
What's New:
Provided Source Override option to allow users to provide their own QoS source instead of using the default source.
Added support for VNX and VNX2.
Certified the probe for:
VNX Models: VNX5300 and VNX5700
VNX2 models: VNX5400 and VNX8000
1.63
Fixed Defects:
Fixed a defect where the QoS Name field was shown as blank.
Fixed a defect where the probe was sending a clear alarm on every interval.
Fixed a defect where a configured monitor was not assigned to the new file system. Salesforce case 00115057
1.62
Fixed an issue where the probe was not able to generate QoS data for Disk Groups.
April 2014
1.61
September
2013
1.60
September
2013
September
2011
1.40
Added "Total Count" metrics for storage capacity and number of disks, etc.
August
2011
1.15
May 2011
1.02
Bug fixes
March
2011
1.00
Initial release
February
2011
0.99
January
2011
Added fix to allow QOS measurements to be correctly captured and passed along to the Service Level Management
console.
January
2011
Changed the source of the QOS measurements being captured, from the Control Station IP address to the Resource
Name.
0.97
January
2011
0.96
January
2011
0.95
December
2010
CA Unified Infrastructure Management Server 7.5 to 7.6 or CA Unified Infrastructure Management 8.0 or later
Robot 7.05 or later (recommended)
Celerra Control Station Linux release 3.0 (NAS 6.0.36)
Java JRE 6 or later
SSH connectivity to the Celerra management node
Probe Provisioning Manager (PPM) probe version 2.38 or later (for Admin Console GUI only)
Migration Considerations
The migration considerations for the probe from the 1.65 to a later version are listed as follows:
Migration is only supported for profiles created in the Infrastructure Manager GUI of the probe.
Downgrade is not supported from version 2.00 to an earlier version of the probe.
Alarms generated in previous versions of the probe are not suppressed after migration to version 2.00 or later.
The Infrastructure Manager GUI of the probe is no longer available.
Templates and configuration done through templates in version 1.65 or earlier are not supported in version 2.00 or later.
Any custom QoS will automatically be reset to the default QoS for the monitor. You also cannot create custom QoS or alarm messages
for the probe.
update s_qos_data
set ci_metric_id = null
where probe = '?'
The query nulls out the ci_metric_id column in the s_qos_data table for the target probe entries.
3. Activate the data_engine probe.
As the probe publishes QoS messages, data_engine updates the null records with the correct keys.
4. Download and deploy the following packages on the robot with the probe:
ci_defn_pack 1.12 or later
wasp_language_pack 8.40 or later
mps_language_pack 8.33 or later
5. Restart the following probes:
baseline_engine
nis_server
wasp
service_host
6. Delete all the files in the util directory in the windows local temp directory.
Repeat this step for each instance of UIM accessing the robot.
7. (Optional) Delete the old alarms in the USM portal to stop viewing duplicate alarms.
8. (Optional) Set up new subsystem IDs in the nas probe, if you are using CA UIM 8.31 or earlier.
For more information, see NAS Subsystem ID Requirements.
9. (Optional) Deploy ppm version 3.23 or later to view the new subsystem IDs in the Admin Console.
You can skip step 9 if you skip step 8.
Note: You must deploy ppm version 3.23 or later to view the new subsystem IDs in the Admin Console.
If you are using celerra 2.00 with CA UIM 8.31 or earlier, you must add the following subsystem IDs manually using the NAS Raw Configuration
menu:
Key Name      Value
2.14.1.1      Control Station
2.14.1.2      Data Mover
2.14.1.2.1    Block Map
2.14.1.2.2    File System
2.14.1.2.3    System Stats
2.14.1.3      Network ICMP
2.14.1.4      Network IP
2.14.1.5      Network TCP
2.14.1.6      Network UDP
2.14.1.7      Storage System
2.14.1.7.1    Disk Group
2.14.1.7.2    Spindle
2.14.1.7.3    Storage Processor
2.14.1.8      Storage Volume
2.14.1.8.1    Volume Disk
2.14.1.8.2    Volume Group
2.14.1.8.3    Volume Meta
2.14.1.8.4    Volume Slice
2.14.1.8.5    Volume Stripe
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the icon next to the NAS probe, select Raw Configure.
2. Click on the Subsystems folder.
3. Click on the New Key Menu item.
4. Enter the Key Name in the Add key window, click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
The Subsystem IDs are updated to the NAS probe.
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right click on the NAS probe, select Raw Configure.
2. Click on the Subsystems folder.
3. Click on the New Key... button.
4. Enter the Key Name and Value
5. Click OK.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
The Subsystem IDs are updated to the NAS probe.
Note: Ensure that you enter the key names as-is, including the period (.) at the end, for correct mapping.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
3.37
What's New:
GA
January
2016
The unit for Cisco Buffer Misses monitor is changed to count in the probe and USM. Support case number 00115932
Updated Known Issues and Workarounds for an issue where the probe was unable to apply the default variables to new
profiles. Support case number 246752
3.36
Fixed a defect where the cisco_monitor probe displayed the host name of the device in the IP address field of the USM. This issue happens when a monitoring profile is configured using the host name instead of the IP address. Now, the probe automatically resolves the IP address for the given host name and displays it in the USM.
January
2014
3.35
September
2013
3.34
Fixed: cisco_monitor probe shows incorrect status color (yellow) when it should be green.
June 2013
3.33
April 2013
3.32
3.31
Fixed an issue where pressing OK when Items in Array was 0 gave a warning in the GUI.
Fixed an issue where an Average value used to appear for Fan State and Temperature State.
January
2013
August
2012
Unable to configure an interval other than the 4 listed values via Bulk Configuration.
3.30
3.22
3.21
August
2012
June 2012
March
2012
3.20
March
2012
3.11
December
2011
3.04
October
2011
March
2011
March
2011
3.01
March
2011
3.00
January
2011
January
2011
2.91
October
2010
October
2010
2.80
June 2010
2.72
February
2010
2.70
December
2009
2.60
Updated get_oids to fix inconsistent GUI behavior when using ping sweep.
Fixed an issue where the GUI hangs when a host is not responding.
Added code to change the format of the wordpad file to be similar to the interface_traffic or net_connect probe.
Added a fix to delete selected profile(s) from the right pane.
Added a fix to only update the OID (checkpoint) that is being modified.
Added a configurable callback timeout in the GUI.
December
2009
2.54
2.53
July 2009
April 2009
2.05
July 2008
2.04
Changed QoS definition used by variable "Memory Percent Free" (more information found further down).
May 2008
2.01
Fixed issue regarding dashboards (and Probe Utility) not working correctly when communicating with probe.
March
2006
GA
December
2005
The cisco_qos probe performs SNMP GET queries to Cisco SNMP devices supporting the Cisco class-based QoS MIB, transforming the query
result into alarms and/or Quality of Service for SLA purposes. Users can configure the profile to their requirements in order to integrate the device
seamlessly into the Nimsoft monitoring solution. The probe supports SNMPv1, SNMPv2c and SNMPv3.
Contents
Revision History
Probe Specific Software Requirements
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the cisco_qos probe.
Version
Description
State
Date
1.23
Fixed Defects:
GA
February
2015
1.21
1.20
June
2012
GA
January
2011
GA
December
2010
1.10
GA
June
2010
1.10
GA
March
2010
1.08
GA
January
2010
1.07
Added QoS and Alarm Identification settings (Host Address or Profile Name)
GA
July 2009
1.06
1.05
Fixed discovery and configuration of service policies in cases where the same service policy is applied in both directions
(input and output) on the same interface.
November
2010
April 22
2009
GA
April 17
2009
GA
April 2
2009
Initial version.
1. Open the MIBs of the target SNMP agent in any MIB Browser tool.
2. Only those MIBs of the target SNMP agent that match the MIBs supported by CA UIM are displayed.
Probe v1.0 or later requires:
CA Unified Infrastructure Management Server 7.6 or CA Unified Infrastructure Management 8.0 or later
Robot 7.6 or later (recommended)
Java JRE 6 or later
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
List of Supported Products
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
This section describes the history of the revisions for the cisco_ucm probe.
Version
Description
State
Date
1.84
Fixed Defects:
GA
January
2016
GA
February
2015
Updated the Known Issues section with limitation on backend database support to insert CAR details in NIS SLM
database. Support case number 246464
The thresholds were not updated on any of the monitors when modified templates were reapplied. Support case number
246048
For more information, see the Upgrade Considerations.
The probe displayed a yellow triangle and did not send alarms for missing checkpoints. Support case number 246088
For more information, see the Upgrade Considerations.
1.83
Fixed Defects:
Fixed a defect in which the probe was not able to overwrite the default value for the Logfile Size field. Salesforce case
00144671
Fixed a defect in which the default authentication message in profiles was incorrectly displayed as MsgError instead of
MsgAuthError. Salesforce case 00144674
Fixed a defect in which the profile status was always displayed green irrespective of the session state. Salesforce case
00144690
Fixed a defect in which the starting and stopping states in alarm messages were displayed as running. Salesforce case
00147705
1.82
Fixed a defect where the probe Infrastructure Manager (IM) GUI displayed AXL support and the document mentioned
AXL as a prerequisite. Salesforce case 00138332
GA
July 2014
1.81
Added an option for configuring the log file size through the IM probe GUI. Salesforce case 00131408
GA
July 2014
GA
June 2014
GA
January
2014
GA
November
2012
1.80
1.71
Fixed a defect where the probe did not resolve the server host name and therefore generated an error that the server was not responding.
Fixed a defect where the Cisco RIS Data Collector service status was not shown on the probe GUI.
1.70
1.64
GA
June 2011
1.63
GA
May 2011
GA
February
2011
GA
December
2010
Added new feature to take input of the data engine address from the user.
1.61
1.60
1.50
Added a fix to check for the existence of a service on the host node(s) when deploying services on those node(s).
GA
December
2010
GA
June 2010
1.41
Added fix in GUI to properly save the monitor key while using templates.
GA
June 2010
1.40
GA
June 2010
1.25
Added a fix in the GUI for proper saving of the monitor key while using templates.
GA
April 2010
1.30
GA
March
2010
1.24
Added code to create and deploy a template with wild card functionality based on instance.
GA
March
2010
GA
January
2010
1.22
Added fix to remove host node name from the monitoring object key.
GA
January
2010
GA
November
2009
1.20
GA
September
2009
1.11
GA
June 2009
1.10
Fixed potential program failure (on concurrent calls to GetServiceStatus and CollectSessionData)
GA
April 2009
1.09
GA
March
2008
GA
December
2007
1.05
GA
November
2007
GA
September
2007
Installation Considerations
The probe has the following installation considerations:
Install the Cisco Unified Communication Manager (UCM) Monitoring probe on the same system as the FTP server.
Specify valid user credentials with administrative privileges to the Cisco UCM server.
The following services are needed by the probe:
Cisco AXL Web Service: This service is required only for Cisco Unified Communication Manager version 6.x to 8.x.
SOAP Real-Time Service APIs
SOAP Performance Monitoring APIs
Cisco Unified Communication applications other than Cisco UCM can also support the AXL Serviceability interface. For more information, refer to the official Cisco support and documentation.
Upgrade Considerations
The probe upgrade from an earlier version to 1.84 or later has the following considerations:
The probe does not update the thresholds when modified templates are reapplied before upgrading the probe.
After upgrading the probe, reapply the required templates.
The probe displays a yellow triangle and does not send alarms for missing checkpoints.
Set the failureCounterValue key in the Setup section of Raw Configuration to 1.
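Sketched as a Raw Configuration fragment of the probe's configuration file; the Setup section name follows the instruction above, and the comment is illustrative:

```
<setup>
   # send alarms for missing checkpoints after a single failed collection
   failureCounterValue = 1
</setup>
```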
Known Issues
The probe has the following known issues:
Reporting functionality
Deploy the probe on a Windows Robot to operate Custom CAR Analysis Reporting.
Only Microsoft SQL Server is supported as the backend database to insert CAR details to the SLM NIS Database. For more information,
see the Set Up General Configuration sections of cisco_ucm AC Configuration or cisco_ucm IM Configuration articles, as applicable.
Revision History
Requirements
Hardware Requirements
Software Requirements
Considerations
Installation Considerations
Upgrading Considerations
Upgrading to v2.30
Migration Instructions
Known Issues and Workarounds
Error with IPv6 Address
Delay in QoS Data Publication
Cannot Rename Resource Profiles in Admin Console
Subsystem ID Alarm Message Displays a Subsystem ID of 2.5.3 Instead of 2.5.3.1.
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.33
What's New:
GA
August
2015
GA
March
2015
Corrected an issue where available checkpoint metrics for Cisco UCS pools were not visible in the GUI. Salesforce case
00150525
2.15
Corrected an issue where device names with an "&" character were not displaying correctly.
GA
June
2014
GA
Dec
2013
GA
Jun
2013
Beta
Sep
2011
Corrected an issue where fault events that are less than 24 hours old were not received as alarms.
Corrected an issue where triggered alarms were not detected.
Corrected an issue with the display of fault alerts.
2.14
Added the ability to set the number of retries for connecting to the Cisco UCS Manager server
Added functionality to handle API responses from the UCS Manager server which use special characters that are not escaped
properly.
2.01
1.40
1.30
Jun
2011
1.22
Feb
2011
Fixed template deployment (avoided creating an extra monitor with dn = *). Fixed various minor GUI issues.
Fixed refresh of inventory (GUI) when creating new profiles.
Changed logical grouping of ports on Fabric Interconnects.
Added option for importing UCS Threshold Policies into probe template.
Added "Aggregated Bandwidth" for the various Port types on Fabric interconnects modules.
Added metric ID for Server Blade "Association" metric.
Decreased number of queries to the UCS Manager when creating automonitors.
Fixed display of automonitors when using static override.
Added options for setting source of Alarms/QoS (Host Address or Profile Name).
Added metrics for UCS Service Profiles.
Added metrics for MAC Pools, Server Pools, UUID Pools, WWNN/WWPN Pools.
Fixed metric IDs for operState and adminState of class etherPIo.
Fixed refresh of inventory tree (GUI).
Added metric "Association" for UCS Server Blades (indicates if a service profile has been associated with the blade).
Added optional alarm message variables for affected object and cause, available for UCS fault translation into CA Unified
Infrastructure Management alarms.
1.14
Jan
2011
1.13
Dec
2010
Fixed automonitoring issue when no alarm threshold is set and user activates monitoring.
1.12
Fixed display of Equipment Tree when number of Chassis objects exceeds 10.
Nov 9
2010
1.10
Nov 2
2010
1.11
Nov 2
2010
1.02
Sep
30
2010
1.01
1.05
Sep
20
2010
Sep 9
2010
Requirements
This section contains the requirements for the cisco_ucs probe.
Hardware Requirements
None
Software Requirements
Cisco UCS Manager: The probe is compatible with UCSM 1.4, 2.0, and 2.1.
The probe requires the following software environment:
CA Unified Infrastructure Management Server 5.1.1 or later
CA Unified Infrastructure Management robot version 5.32 or later
Java Virtual Machine version 1.6 or later (deployed as part of the probe package)
Infrastructure Manager v4.02 or later
.NET v3.5 on the hardware running the Infrastructure Manager application
Cisco Unified Computing System Manager (Cisco UCSM) v1.4 and later
Considerations
This section contains the considerations for the cisco_ucs probe.
Installation Considerations
The cisco_ucs probe is capable of monitoring the state of VMware ESX hypervisors and VMs installed on the UCS blade servers. This requires that secure communication is set up between VMware vCenter and Cisco UCS Manager (using the vCenter extension files). For more information, see the Cisco documentation at http://www.cisco.com.
Upgrading Considerations
Upgrading to v2.30
Version 2.30 is the first version of the probe to include support for applying monitoring with templates in Admin Console. Upgrading and then applying monitoring with templates in Admin Console requires that all previous configuration be deleted. Because of this, we recommend that you delete probe versions earlier than 2.30 and deploy a new v2.30 probe.
If you want to configure the probe using only Infrastructure Manager, you can upgrade from an earlier version to v2.30 without deleting any
previous configuration. However, not all features of v2.30 and later are supported in Infrastructure Manager.
Migration Instructions
Migration from versions of the cisco_ucs probe earlier than 2.01 to versions 2.01-2.10 is possible. The migration of configuration is limited to:
Migrating the templates (for Infrastructure Manager only)
Migrating the message definitions (for Infrastructure Manager only)
After migration to versions 2.01-2.10, resources configured in the previous version of the probe must be reconfigured (refer to the online help topic Create a New Resource). After the resources are configured, the migrated templates can be used to set up the monitoring of QoS metrics and alarms.
Follow these steps to migrate from versions before 2.01 to the 2.01-2.10 version of the probe:
1. Download the cisco_ucs_migration probe from the internet archive to your local archive.
2. Download the cisco_ucs 2.01 or later probe from the internet archive to your local archive.
3. Drag and drop the 2.01 or later cisco_ucs probe to the robot where the previous cisco_ucs probe resides.
As part of this process the previous cisco_ucs.cfg file will be backed up as cisco_ucs.cfg.pre.2.00.
4. The probe is now ready to be configured with Resources and the templates from the previous probe will be available.
Known limitations/issues related to the version 2.01 or later migration process:
Resources will not be migrated.
Localized message tokens will not be available.
After reconfiguration of resources and monitors completes, any thresholds that are broken will trigger new alarms. It is a best practice to
remove any alarm messages from the previous cisco_ucs probe in the alarm console after the migration process.
Fabric Interconnect - Local Storage Monitors are not available.
Group configuration option does not exist in version 2.01.
Contents
Revision History
EMC Clariion Supported Versions
Probe Specific Software Requirements
Upgrade Considerations
Installation Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.11
Fixed Defect:
GA
January
2016
GA
June 2015
Beta
March
2015
Some metrics use multiple collection intervals to calculate metric value. The probe generated false alarms for these
metrics on each restart. Support Case Number 246156
2.10
What's New:
The probe version 2.0 and later is available only through the Admin Console GUI and not through the Infrastructure
Manager (IM) GUI.
The probe now includes the standard static alarm threshold parameters.
Note: On upgrading the probe from a previous version to 2.10, all the probe specific alarm configurations in the probe
monitors are automatically replaced by Static Alarms. The probe does not support rollback of these alarm configurations. The
following features are not yet supported in version 2.10:
Custom QoS creation and migration
Automonitors
Templates
For more information, refer to the Upgrade Considerations section below.
Added the following new metrics:
Six LUN metrics: Read IOPs, Write IOPs, Read Bandwidth, Write Bandwidth, Bandwidth, and LUN Utilization.
One Mirror View metric: State Number.
Four SP metrics: Blocks Read, Blocks Written, Total Reads, and Total Writes.
Fixed Defect:
Updated Navisphere CLI version in Software Requirements. Salesforce case 00166569
2.00
1.65
Fixed Issues:
March
2015
Corrected the metric values of the LUN metric LUN Service Time and the Storage Processor metrics Utilization, Total IOPs, AverageBusyQueueLength, Service Time, and Response Time. Salesforce cases: 00148680, 00155222
Disabled the following LUN metrics: LUN Response Time and LUN Queue Length.
Added one new LUN metric: Utilization.
The probe was unable to monitor LCC state on VNX 5600. Salesforce case 00149979
The heap size for the clariion probe has been changed to: -Xms512m -Xmx1024m. Salesforce case: 00152761
1.64
Fixed a defect where the Source Override feature was not working properly. Salesforce case 00099959
Added support for VNX and VNX2.
October
2014
Fixed a defect where the Infrastructure Manager (IM) GUI was not able to fetch and display the profile information when the
probe was opened from a remote machine.
July 2014
1.61
Fixed the defect of not displaying the disk metrics for a LUN. The metrics were available for the device itself but not for LUNs that are built from a pool. Now, the probe displays all disk metrics for LUNs also.
January
2014
Fixed a defect where the Test button passed even when the NaviSecCli command failed. Now, the probe fails the subsequent commands and the Test button when the NaviSecCli command fails.
Fixed a defect where an updated message severity was not reflected when the Clariion system was not responding. Now, the probe shows the updated message details when details are updated in the message pool, irrespective of the Clariion status.
1.60
1.60
Fixed Disk State for hot spare drives; now reports "Hot Spare Ready". Upgrading the probe to the 1.6 version overwrites the
existing (incorrect) Disk State configuration.
Added default templates.
Added simulation mode using recorded data.
December
2012
November
2012
1.30
1.20
September
2011
August
2011
May 2011
March
2011
December
2010
1.04
December
2010
1.03
December
2010
November
2010
October
2010
Upgrade Considerations
The upgrade considerations for the probe from versions 2.0 and later are listed as follows:
You must delete all the files in the util directory of the windows local temp directory.
This process must be repeated for each instance of UIM that accesses the robot. The process clears the cache and avoids opening the
Infrastructure Manager GUI.
The following features are not supported:
Custom QoS creation and migration
Automonitors
Templates
Installation Considerations
Navisphere CLI must be installed on the same system as the clariion probe.
Note: A cluster comprises one or more resource groups and cluster nodes. A resource group is shared between the cluster nodes. Only one cluster node has control of a resource group at any given time. A resource group is a collection of shared resources such as disk drives, IP addresses, host names, and applications.
The cluster probe enables failover support for a set of CA UIM probes in clustered environments. The probe configurations that are saved in a
resource group remain the same, even when that resource group moves to another cluster node. The cluster probe sends alarms when a
resource group or cluster node changes state.
The probe version 3.30 and later provides Logical Volume Manager (LVM) support for HP-UX platform. If you upgrade to cluster version 3.30,
cdm LVM disk profiles are created in cluster.
Contents
Revision History
Supported Probes
Set up Cluster Monitoring in cdm
Upgrade Considerations
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
3.33
Fixed Defects:
GA
December
2015
On the RHEL platform, the cluster disks were displayed as local or network disks in the cdm Infrastructure Manager (IM). Support case number 00160058
Note: You must use cdm version 5.61 with cluster version 3.33 to view the cluster disks on the cdm Infrastructure
Manager (IM).
When node service was stopped, cluster probe marked resources offline and kept sending _restart command to other
probes. Support case number 70002275
3.32
October
2015
Fixed Defects:
On Linux platform, the cluster disks were displayed as local disks on the cdm Infrastructure Manager (IM). Salesforce
case 00135028
GA
The probe displayed incorrect status of clustered disks on the IM. Salesforce cases 00169389, 00170460
Updated the Supported Probes section in Release Notes to describe the cluster configuration that is supported by
sqlserver, oracle, and ntperf. Salesforce case 00169432
3.31
What's New:
GA
June 2015
Beta
May 2015
What's New:
Added Logical Volume Manager (LVM) support for HP-UX platform
Added Log Size field
Provided an option to select the protocol key as TCP or UDP for raw configuration. By default, the protocol key is TCP.
Salesforce case 00160858
3.20
What's New:
March
2015
Added support for configuring the probe through the Admin Console (web-based) GUI.
September
2014
3.12
September
2013
3.11
June 2013
Fixed an issue where the probe was unable to define any monitoring profile.
Fixed an issue where the probe did not clear alarms after failover to the next node.
3.10
December
2012
September
2011
3.00
August
2011
2.72
Fixed group synchronization when not using node IPs in cluster.cfg (applied fix from v2.66 into 2.7x release).
March
2011
2.72
Fixed the group synchronization when not using node IPs in cluster.cfg (applied fix from v2.66 into v2.7x release).
January
2011
2.66
Fixed the group synchronization when not using node IPs in cluster.cfg
January
2011
2.71
January
2011
2.65
Fixed a potential program failure on SOLARIS (logging of NULL pointer terminates probe on SOLARIS).
January
2011
2.70
December
2010
2.64
Fixed a potential program failure on SOLARIS (no node IP in cfg causing failure).
September
2010
2.63
September
2010
2.62
August
2010
Fixed an issue of a wrong IP address in NAT environments in the GUI and the probe.
Fixed an issue of cluster group devices being listed twice in CDM.
Fixed an issue of cluster drives being reported as local in CDM on non-owner nodes.
Added validation when adding shared sections or subsections in the GUI.
Removed white spaces from cluster names at the time of discovery.
Version 2.62 was withdrawn because of potential communication errors when configuring.
2.61
2.60
June 2010
April 2010
Added support for Resources Failed and Resources Not Probed in hastatus.
2.52
Fixed an issue where a drive was reported as Disk3Partition1 when the device is down on the cluster.
March
2010
2.51
Fixed the CDM mount point handling issue in the Microsoft cluster plugin dll.
March
2010
2.50
Added support for merging configurations when configuration is done across different cluster nodes.
Added support for configuring shared resources individually and in bulk.
2.30
2.21
March
2010
March
2010
July 2009
2.04
June 2008
Fixed association of the same profile to multiple Service Groups (this is not allowed).
2.03
April 2008
September
2007
1.61
June 2007
June 2007
September
2006
Fixed an issue with resource groups not having their states set correctly when alarms were turned off.
April 2006
Fixed issues relating to synchronization between cluster probes, especially when adding new resources
Fixed security issue when synchronizing probe configuration between nodes
1.22
Cosmetic GUI changes: added Refresh to the menu and fixed the text for clear alarms.
December
2005
Supported Probes
The following table lists the supported probes and their corresponding supported cluster environments:
Important! Use the cluster probe with other probes only when you configure the cluster probe on a robot in a clustered environment.
Supported Probe
Supported Cluster Environment
cdm
The probe supports only disk profile monitoring on cluster version 2.20 and later.
Active/Passive
N+I node cluster
Note: For more information, see the Set up Cluster Monitoring in cdm section.
dirscan
2-node Active/Passive
N+I node clusters if the profile names are
unique
exchange_monitor
The probe supports only Microsoft Cluster Server (MSCS) monitoring on cluster version 1.61
and later.
logmon
Active/Active
Active/Passive
N+I node cluster
2-node Active/Passive
N+I node clusters if the profile names are
unique
ntperf
2-node Active/Passive
N+I node clusters if the profile names are
unique
ntservices
2-node Active/Passive
N+I node clusters if the profile names are
unique
oracle
2-node Active/Passive
N+I node clusters if the profile names are
unique
processes
2-node Active/Passive
N+I node clusters if the profile names are
unique
sqlserver
2-node Active/Passive
N+I node clusters if the profile names are
unique
Upgrade Considerations
This section lists the upgrade considerations for the cluster probe.
Any cluster disk profiles that are created in probe version 3.31 or earlier are not supported on probe version 3.32 or later. Configure a
cluster disk in the cdm probe and then create a cluster disk profile in the cluster probe.
Restart the cdm probe to see the cluster disks.
Any cdm disk that is converted from local to cluster and is enabled for monitoring is deactivated.
Note: To run the probe version 3.32 on Admin Console, you require:
CA UIM 8.31 or later
ppm probe version 3.23 or later
Note: Restart the nis_server when you deploy the ci_defn_pack probe.
Probe Provisioning Manager (PPM) probe version 2.38 or later (required for Admin Console)
Java JRE version 6 or later (required for Admin Console)
Known Issues
The cluster probe has the following limitation in Admin Console:
Cluster profiles for cdm, logmon, and processes probes can only be created on the node where the cluster probe is started.
Revision History
Version
Description
State
Date
7.80
GA
What's New:
Support for OpenSSL
When using TLS 1.1 or 1.2 cipher suites, include an alternative fallback to
Fixed a defect that prevented the controller from acquiring a new license for a probe from a distsrv on a remote hub. A new
license is needed when a probe with an expired license is restarted. Now, the remote controller is automatically
logged in to the hub and acquires a new SID when the SID expires.
The controller starts java probes on AIX.
On AIX systems, the controller failed to start java probes. The AIX system call to start java probes now allows them
to start. The process name now appears on AIX systems as
full_command_to_start_java_p
June
2015
7.70
Important: SSL communication mode options are more meaningful with the release of controller v7.70. The controller
creates the robot.pem certificate file during startup. The file enables encrypted communication over the OpenSSL
transport layer. The treatment of the robot.pem certificate file has changed. For details, see Impact of Hub SSL Mode
When Upgrading Nontunneled Hubs in the hub Release Notes. Changes in treatment impact communication for:
Hubs set to mode 0 (unencrypted)
Hubs set to mode 2 (SSL only)
Hub managed components
GA
March
2015
GA
December
2014
GA
November
2014
GA
June
2014
GA
March
2014
What's New:
User tags are propagated by the hub and the controller. Hub and controller alarms and messages now contain user
tags. Previously, the hub and controller read the user tags os_user1 and os_user2 from the file robot.cfg. Now, the hub
reads user tags from the file hub.cfg. The General Configuration section in the Admin Console hub configuration UI
allows users to specify os_user1 and os_user2. On a hub system, the hub spooler for the hub probe adds user tags to
probe alarms and messages. On a robot system, the robot spooler adds the user tags.
Note: The os_user_include option, which enabled the hub to read user tags from robot.cfg, was removed from the hub
v7.70. Hubs and robots at v7.70 do not read user tags from robot.cfg. If the hub robot had defined user tags, they will
remain in robot.cfg. To add user tags to hub probe alarms and messages, specify the user tags in the logmon
configuration.
Fixed issues that occurred when the robot mode was changed from passive to active.
7.63
OpenSSL 1.0.0m
Removed a potential crash condition during a shutdown
Improvements to port free checks under various circumstances
Package included in CA UIM 8.1 distribution
7.62
The get_info callback includes MAC address information for AIX, Solaris, and HP-UX (in addition to Linux and
Windows).
Resolved core dump issue on a controller shutdown
Fixed a defect that caused the robot to unregister from the hub on shutdown.
Resolved a socket leak in the
New native robot installers and ADE support for AIX and zLinux
Robot 7.60 for zLinux
Robot first probe port defaults to 48000
Package included in NMS 7.60 distribution
7.05
7.10
Added support for Microsoft Windows 2012 Server and Microsoft Windows 8.
GA
December
2013
Note: The support integration of Cloud Monitor (Watchmouse) in UMP 2.6 and the cuegtw probe is for the English locale only.
Contents
Revision History
Prerequisites and System Requirements
Preconfiguration Requirements
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the cuegtw probe.
Version
Description
State
Date
1.06
Fixed Defects:
GA
April 2014
GA
February
2014
Fixed a defect in which different MetIds were generated for the Alert, Reminder, and Clear alarms for the same
profile.
1.04
Fixed Defects:
The cuegtw probe stopped sending alerts after a certain time; the user had to restart the probe frequently to keep
receiving alerts.
The cuegtw probe used a random port number instead of the port number configured in the port settings of
the controller probe.
1.03
Fixed a defect in which the Setup tab Help button pointed to an incorrect link.
1.02
Fixed time zone issues where the probe did not display messages when the API and the probe were in different time zones.
Implemented a clear alarm when the API sends the OK message for a profile.
Fixed a truncated message issue where some alarms were truncated because of the alarm message length.
Implemented cursor support in the probe.
Fixed an alarm sequence issue: alarms were not displayed in the proper sequence in an earlier version.
GA
June 2012
1.01
This version resolves a critical proxy issue where users use automatic script files for configuration or HTTP calls are redirected through a
proxy server.
GA
March 2012
1.00
Beta
December
2011
January
2013
Note: The support integration of CA Unified Infrastructure Management Cloud User Experience Monitor (CUE) in UMP 2.6 and the cuegtw
probe is for the English locale only.
Preconfiguration Requirements
The Cloud Monitoring Gateway probe requires the CA Unified Infrastructure Management Cloud Monitor API user credentials to fetch RSS feeds.
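As a minimal sketch of the credentialed RSS fetch described above, the following shows an HTTP request carrying Basic authentication; the endpoint URL, user name, and password are hypothetical placeholders, not actual Cloud Monitor API values.

```python
# Sketch of an authenticated RSS-feed request, the kind of credentialed
# call the Cloud Monitoring Gateway probe performs.
# NOTE: the URL, user, and password below are placeholder assumptions.
import base64
import urllib.request


def build_feed_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Usage (network call left commented out; the endpoint is not real):
req = build_feed_request("https://api.example.com/rss/alerts", "apiuser", "secret")
# feed_xml = urllib.request.urlopen(req).read()
```

The probe itself manages these credentials through its configuration; this sketch only illustrates the authentication mechanism.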
app_pool_hit_ratio
app_avg_direct_read_time
app_avg_direct_write_time
app_cat_cache_hit_rto
app_pkg_cache_hit_rto
app_locklist_util
bp_pool_hit_ratio
bp_pool_avg_async_read_time
bp_pool_avg_async_write_time
bp_pool_sync_write_time
bp_pool_avg_write_time
bp_avg_direct_read_time
bp_avg_direct_write_time
bp_pool_sync_reads
bp_pool_sync_writes
bp_pool_sync_idx_writes
bp_pool_sync_idx_reads
ts_usable_pages_pct
ts_used_pages_pct
ts_free_pages_pct
ts_max_used_pages_pct
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Authorization and Environment Variable Requirements
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
4.09
Fixed Defects:
GA
October
2015
GA
October
2014
The probe was unable to generate an alarm when the current state of a monitored tablespace was not "NORMAL". Salesforce case 00150521.
The probe was unable to restart after a new connection was created. Salesforce case 00154620.
The size of the niscache was growing to around 1 GB in a day. Salesforce cases 00165738, 00148339.
4.08
Fixed Defects:
Fixed the core dump issue on AIX that occurred each time the probe was deactivated. Salesforce case 00132849
4.07
Fixed Defects:
Fixed a defect in which the Status Tab did not show the correct color code of the profile.
April 2014
4.06
Added support for TNT2 compliance where Device Id and Metric Id are generated correctly.
March
2014
Added support for keeping the probe alive when the client software is missing.
Fixed a defect where the probe alarm message value was missing in the UMP alarm console.
4.05
Fixed Defects:
GA
June 2013
Fixed an issue where the "Request Failed" and "Warning" messages were displayed on the probe GUI.
Fixed incorrect values on the AIX platform for SQL queries having columns with INTEGER data types.
4.04
Fixed a spelling error ("Time form" corrected to "Time from") in the Checkpoint GUI.
GA
January
2013
4.03
GA
November
2012
GA
September
2012
4.01
Added an "All Database Status" checkbox in the GUI Setup tab, which gives customers the choice to select the default or all
database statuses.
Added a "Clear Alarm on Restart" checkbox, which allows users to control the clearing of alarms at probe startup.
Replaced the database password, previously shown in cleartext in logs, with "********".
Verified probe compatibility with DB2 10.1 on AIX 6.1 and AIX 7.1.
Fixed an issue where a few of the checkpoints were giving false positive and false negative value alarms.
August
2012
4.00
The connection configuration now supports ODBC DSN as well as manual configurations of connectivity parameters.
July 2011
The probe does not support connection and SQL timeout alarms.
Added support for custom checkpoints.
3.41
March
2011
3.40
December
2010
Fixed a semaphore leak issue in the probe. The issue occurred when the probe tried to attach to a DB2 instance using
improper credentials.
October
2010
3.33
Added support for alarms and QoS on all active databases in the i_check_dbalive checkpoint. Earlier, the probe reported only
on the default database configured in the profile's connection. Existing QoS series for the i_check_dbalive checkpoint will no
longer work; instead, a new series will be created.
October
2010
Fixed an issue where the probe stops reporting checkpoint values and throws DB2 error SQL1092N.
3.32
Fixed an issue related to sending an alarm when the DB2 server is down.
Fixed an issue related to releasing the DB2 database context resource after its use.
September
2010
Fixed an issue in the db_status checkpoint related to incorrect status reporting.
3.31
September
2010
3.30
July 2010
July 2010
3.20
Added a new checkpoint, i_pct_active_connections, for monitoring the percentage of active connections out of the total available
connections.
March
2010
3.13
Fixed an issue where the probe reported an incorrect value for the db_since_last_backup checkpoint.
September
2009
3.12
Fixed probe crash when instance node selection button is clicked on the connection configuration page.
July 2009
Fixed UI validation issues when selecting instance node on the connection configuration page.
Added missing message variables for checkpoints.
Fixed a crash in the probe stop callback.
Fixed an issue where the probe was wrongly attaching to a dummy DB2 instance and throwing an SQL1096N error.
Optimized GUI performance when loading config file
Fixed minor GUI issues
Removed 64 Bit porting warnings
Object Viewer introduced
New checkpoints 'db_log_util_rto', 'db_since_last_backup'
3.04
3.02
3.01
December
2008
August
2008
July 2008
Note: Restart the Nimbus Watcher service to initialize the DB2 environment variables after installing the DB2 client on the Windows
platform.
Installation Considerations
Increase the DB2 instance configuration parameter MON_HEAP_SZ by at least 128. This change also requires you to restart the DB2
server.
Follow these steps to access the DB2 server:
Set DB2INSTANCE for the default local instance.
Catalog the remote or additional local instances in the DB2 node directory.
Catalog the monitored databases in the database directory.
The probe supports DB2 client library versions greater than or equal to the target DB2 server version. For example, to monitor DB2
server version 9.5, you must use the DB2 client libraries of version 9.5 or higher on the machine where the probe is deployed.
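The setup steps above map onto DB2 command line processor (CLP) commands along the following lines; the instance, node, host, port, and database names are placeholders that must be adapted to your environment, and this is an illustrative sketch rather than the product's documented procedure:

```
# Hedged sketch of the DB2 CLP steps (all names and the port are placeholders):
db2 update dbm cfg using MON_HEAP_SZ 256      # increase the monitor heap, then restart DB2
export DB2INSTANCE=db2inst1                   # set the default local instance
db2 catalog tcpip node mynode remote dbhost.example.com server 50000
db2 catalog database sample as sampalias at node mynode
db2 list node directory                       # verify the catalog entries
```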
Known Issues
The db2 probe has the following known issue:
db2 Limit on AIX
On AIX platforms, DB2 supports a maximum of ten shared memory segments per process when using a local connection. Since each profile requires
two memory segments, a maximum of five profiles can run at a given time. When this limit is exceeded, DB2 returns an SQL1224N error
message. The workaround proposed by IBM is to catalog the instance and the related databases as remote connections, as no such
limitations are imposed for this connection type.
Important! This workaround changes the way you interact with your DB2 server if you are using a local session.
Revision History
Threshold Configuration Migration
Changes After Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
This section describes the history of the revisions for the dirscan probe.
Version
Description
State
Date
3.14
What's New:
GA
September
2015
GA
July 2015
GA
March
2015
GA
October
2014
What's New:
Added the sendresponsetime key in Raw Configure > Setup to send the QoS value in seconds instead of bytes/second for
QOS_DIR_RESPONSE_TIME. Refer to Upgrade Considerations for more information.
Fixed Defects:
The probe added a '\' at the end of the directory path and could not identify the directory. Salesforce case: 00160487.
The probe did not save pattern configurations for file monitoring. Salesforce cases: 00167616, 00165924, 00164986,
00163884.
3.12
What's New:
The probe can be migrated to standard static thresholds using the threshold_migrator probe. The Admin Console GUI
will be configurable using the standard static threshold block and the Infrastructure Manager GUI will no longer be
available.
Added support for factory templates.
3.11
Fixed a defect where the probe was unable to generate the Used Space Delta alarm.
3.10
June 2014
3.09
Fixed a scheduler defect by allowing the user to clear specific hours of one particular day of the week. Earlier, the probe could
only select or clear the specific hours for an entire week.
April 2014
3.08
June 2013
3.07
Fixed an issue where a directory search with % parameters was performed without a directory-exists check.
May 2013
3.06
April 2013
3.05
May 2012
March
2011
Fixed a NULL QoS value for the age of a file when there is no file in the directory.
3.03
March
2011
January
2011
2.91
June 2010
2.90
March
2010
2.81
Fixed a problem that would sometimes cause the probe to crash when running multiple profiles.
March
2010
2.80
September
2009
2.73
Changed the behavior for the integrity profiles: If the MD5 signature for a file is not set, it will be generated and stored in the
configuration file during the probe startup or restart process.
July 2009
2.72
June 2009
Fixed clear for file size, file age and file response alarms.
Fixed crash situation on stop and restart.
Disabled 'Fetch' button in configuration tool for settings not supported by the test command.
Fixed token for 'DirAge' alarm message.
Cleanup of code for mapping/unmapping shares.
Added an option for a Quality of Service message on directory existence.
Reorganized configuration tool, separating directory scan and file integrity profiles.
Fixed directory scan problem on shares introduced in version 2.43.
Fixed file and directory variables for clear messages so that these are consistent with other alarm situations.
Added scan type for file size: largest, smallest or individual file.
Added scan type for response time: longest, shortest or individual.
Modified browse/double-click behaviour.
Improved error handling on memory allocation situations.
Added support for Windows on Itanium 2 systems (IA64).
2.53
December
2008
2.52
Problem with generated directory names containing the '/' character as directory separator on Windows is resolved.
Configuration tool fixed to disallow empty profile name.
September
2008
October
2007
This version turns off profile monitoring if directory does not exist.
Clear messages are now sent for file read response.
Added cluster probe support.
Clear messages are sent on response time within limit and also when the file no longer exists.
Modified logic in the directory browser so that the '[..]' entry will not appear at the share level.
Fixed handling of more than one directory specified and of directory recursion.
Added new alarm message used for alerting when unable to open a file for obtaining a response time.
2.42
May 2007
2.29
2.27
Age constraint alarm can now be sent either for each breaching file, or oldest/newest if only a single alarm is desired.
Alarm messages fix.
Added extra log information at loglevel 2 during file scanning.
2.26
Removed the clear flag for the messages section in the cfx file, which would clear out custom messages.
The flag was added in v2.22 because of a change in the message format.
If upgrading from a version prior to 2.22, see the Installation Considerations below for information on what to change in the cfx file to
upgrade the messages correctly.
November
2006
August
2006
July 2006
2.25
Windows: Fixed a problem with the username/password not being set correctly when testing a connection.
July 2006
Age constraint alarm is sent for each file breaching the threshold (not just oldest/newest). QoS data is still for the
oldest/newest however.
Solaris, AIX and Linux: Added utility (stat64) for reading file sizes over 2GB.
GUI: user and password information for shared drives would not show up when editing an existing profile.
2.22
April 2006
Added support for file integrity checks using an MD5 digest. Added support for using date/time primitives (strftime parameters)
when specifying directory names. Added support for customizing alarms and "Clear" messages.
November
2005
2.20
Changed QoS definition for file age to use unit "s". Added support for file size alarms on large files on Windows.
November
2004
Important! The IM GUI of the probe is not available if the probe is migrated using threshold_migrator. Opening the IM GUI then
displays a message that the probe can only be configured using the Admin Console and redirects you to the Release Notes of the
probe.
Memory: 2-4 GB of RAM. The OOB configuration of the probe requires 256 MB of RAM.
CPU: 3 GHz dual-core processor, 32-bit or 64-bit.
Installation Considerations
Version 2.22 implemented a new layout of the message pool. If upgrading from a version prior to 2.22, you need to modify the dirscan.cfx file in
the package.
Follow these steps:
1. Get the latest version of the probe from the Internet Archive
2. Double-click the dirscan package in the Archive node of the Nimsoft Infrastructure Manager
3. Select the tab for your platform and click on dirscan.cfx
4. Right-click and select Edit file.
5. Locate the <messages> section and add the 'clear' keyword after the closing bracket. Click OK
6. Repeat for each platform you intend to distribute the updated probe to
7. Distribute the update to all systems running v2.22 or older
Note: All custom messages will be cleared and default messages in the new format will be inserted.
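Based on step 5 above, the edited section of dirscan.cfx would look roughly like the following illustrative fragment; the message entries themselves are elided and unchanged:

```
<messages> clear
   ... existing default message definitions ...
</messages>
```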
The description of the QoS for file age, QOS_DIR_AGE was previously defined as Age of oldest/newest matching file. In version 2.22, this has
been changed to Age of matching file.
Upgrade Considerations
The probe has the following upgrade considerations:
Versions 3.13 and later of the probe have a sendresponsetime key in Raw Configure > Setup with a default value of '0'. You
can set this key to '1' to send the QoS value in seconds instead of bytes/second for QOS_DIR_RESPONSE_TIME.
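As a sketch, the key would appear in the probe's raw configuration along these lines, assuming the standard UIM .cfg section syntax; only the key itself is taken from the text above:

```
<setup>
   sendresponsetime = 1
</setup>
```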
Known Issues
The dirscan probe has the following limitations:
The log file incorrectly displays a logon failed message when the probe fails to authenticate the specified user credentials. In this
scenario, the probe applies the user credentials used for probe deployment.
The dirscan probe has the following limitations in the Admin Console GUI:
Schedules are not supported.
Existing profile(s) cannot be copied to create new profiles.
Recalculation of checksum using Recalculate checksum is not available for pattern based files.
Non-Windows remote directories cannot be monitored (applicable to both the Admin Console and Infrastructure Manager GUIs).
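The checksum-based integrity monitoring that dirscan provides (an MD5 signature stored per file and recalculated on demand) can be sketched as follows; the helper names are illustrative, not probe internals:

```python
# Minimal sketch of the per-file MD5 digest that dirscan-style integrity
# profiles compare against a stored signature. Function names are hypothetical.
import hashlib


def md5_digest(data: bytes) -> str:
    """Return the hex MD5 digest used as a file integrity signature."""
    return hashlib.md5(data).hexdigest()


def integrity_changed(data: bytes, stored_signature: str) -> bool:
    # A mismatch between the current digest and the stored one flags a change.
    return md5_digest(data) != stored_signature


print(md5_digest(b"hello"))  # 5d41402abc4b2a76b9719d911017c592
```

In the probe, the stored signature is generated at startup or restart when it is missing; this sketch only shows the comparison step.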
Description
State
Date
8.20
GA
March
2015
Discovery Server has been enhanced to include callbacks that allow you to cleanly delete devices from both the UIM
and UDM tables in the UIM database. With the introduction of UDM, manual deletion of devices using SQL commands
was prohibited, as it can cause the UDM tables to be out of sync with the rest of the UIM database. Using the callbacks
introduced in this release provides you with a mechanism to delete devices without causing syncing issues.
Discovery Server has been enhanced to work in conjunction with the vmware probe to provide partial graph publishing.
To enable this feature, additional configuration in the vmware probe is necessary.
Fixed: Discovery Agent v8.2 contains fixes which improve the collection of ICMP ping discovery results.
Discovery supports subnets with a CIDR notation number of /16 or larger only. Entering range scopes larger than
65,536 addresses when defining a discovery range may result in one of the two following behaviors:
When using the snmpcollector probe to query for a list of devices, defining a range greater than /16 will generate an
exception with an error message indicating that the query is not supported.
When using the Discovery Wizard to define one or more subnets using the same CIDR notation, defining a range
greater than /16, or entering multiple /16 ranges will generate an out of memory exception error.
WMI and SSH discovery provide more detailed host system information than SNMP. We recommend that you use WMI
or SSH discovery in addition to SNMP.
A bug in MySQL 5.5 causes a slow restart of discovery_server. Upgrading MySQL to version 5.6.13 or later resolves the
issue.
If you are unable to find a device in USM by IP address, it may be that the device has multiple IPs, and discovery
identified it by a different address.
As part of the 7.0 discovery server and discovery agent enhancements, a device with multiple IP addresses is now shown
as a single device in USM, not as multiple distinct devices per IP address. If you cannot locate a device in USM by IP
address, try searching for it by name.
The NIS Manager link in Infrastructure Manager does not work because it has been removed from the product. Use the
Discovery Wizard, available in the USM portlet to configure discovery components.
SSH password authentication is disabled with OpenSUSE 12.x.
Discovery Agent uses password authentication to connect to a target device over SSH. Discovery Agent cannot
communicate with a device where SSH is configured for other authentication methods, such as keyboard-interactive.
Discovery Agent also does not support public key authentication or challenge-response authentication.
For details, see the CA Unified Infrastructure Management - 8.2 Release Notes.
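The /16 restriction described above (at most 65,536 addresses per range) can be checked before defining a discovery scope; a small sketch using Python's standard ipaddress module, where the validator function is illustrative rather than part of the product:

```python
# Sketch: validate that a discovery range is no larger than a /16
# (65,536 addresses), per the limitation described above.
import ipaddress


def range_within_limit(cidr: str, max_prefix: int = 16) -> bool:
    """Return True if the subnet contains no more addresses than a /16."""
    network = ipaddress.ip_network(cidr, strict=False)
    # A larger prefix length means a smaller subnet.
    return network.prefixlen >= max_prefix


print(ipaddress.ip_network("10.0.0.0/16").num_addresses)  # 65536
print(range_within_limit("10.0.0.0/24"))  # True: smaller than a /16
print(range_within_limit("10.0.0.0/8"))   # False: 16,777,216 addresses
```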
8.10
GA
December
2014
8.00
GA
September
2014
GA
June 2014
GA
March
2014
GA
December
2013
GA
September
2013
USM is enhanced to display more detailed agent status during the discovery phase.
Discovery Wizard and Discovery Agent support seed device discovery. When entering Ranges in the wizard, if you
specify information about one or more seed devices, discovery agent can:
Automatically discover the local subnets that it collects from the seed devices.
Find other devices connected to the seed devices to accelerate the discovery of known devices in the network. See
Define Scopes for more information about seed device discovery.
Enabled integration with CA Capacity Management by providing more detailed processor attributes for host systems
when the discovery agent is configured for WMI, SSH, and/or SNMP discovery. Note that WMI and SSH discovery will
provide more detailed processor information than SNMP. The new host system attributes published by the discovery
agent are:
ProcessorDescription
NumberOfPhysicalProcessors
NumberOfProcessorCores
NumberOfLogicalProcessors
WMI and SSH discovery provide more detailed host system information than SNMP.
Unable to find a device in USM by IP address. See the workaround information in the notes for version 8.20 above.
A bug in MySQL 5.5 causes a slow restart of discovery server.
NIS Manager Link in Infrastructure Manager does not work because it has been removed from the product. Use the
Discovery Wizard, available in the USM portlet, to configure discovery components.
SSH password authentication is disabled with OpenSUSE 12.x.
Fixed: Reintroduced population of CM_SNMP_SYSTEM and CM_SNMP_INTERFACE tables with discovery agent
results in order to restore pre-7.0 Service Oriented Configuration (SOC) functionality and enable ACE to automatically
configure the interface_traffic and cisco_monitor probes.
Fixed: Discovery Server and Relationship Service did not properly handle a Discovery Agent and Topology Agent moving
from their primary hub to their secondary hub. Now, a single Discovery Agent instance is maintained in the database with its
configuration data.
Fixed: Discovery Server failed to get discovery information for a robot managing probes that did not have the group field
set. This information is now obtained.
Fixed: An alarm was issued if an attach queue was not created for the probe_discovery subject, even if a post queue
was configured. Now, an alarm is issued only if neither an attach queue nor a post queue is configured for
probe_discovery.
Fixed: Discovery Server correlation has been adjusted to account for Cisco ASA routers. New exclusions were added to
the MAC address correlation configuration to account for Cisco ASA internal data interfaces.
Fixed: Discovery Server failed to resubscribe to a hub due to malformed nametoip requests.
For details, see the CA Unified Infrastructure Management - 8.0 Release Notes.
7.60
7.50
7.10
7.00
6.50
GA
March
2013
GA
December
2012
GA
September
2012
GA
June 2012
GA
March
2012
GA
December
2011
3.41
3.40
3.31
3.30
3.29
November
2011
First plugin-based version of Discovery Agent to support probe based discovery and better enable service oriented
configuration in Unified Service Manager
2.39
GA
August
2011
2.20
GA
October 13, 2010
2.19
GA
October 7, 2010
2.00
GA
May 2010
Revision History
This table lists the known issues, fixed defects, and revision history for the discovery_server probe.
Version
Description
State
Date
8.31
GA
August
2015
Added AWS probe support for auto-scaling groups. More information about auto-scaling groups is available at Create
and Manage Groups in USM and http://aws.amazon.com/autoscaling/.
USM has been enhanced to allow users to set an alias for devices and network interfaces. To support this, the
set_user_properties probe command was updated to set the UserAlias property.
Improved support for nfa_inventory probe. For more information, refer to the nfa_inventory probe Release Notes on
the CA UIM Probes space.
8.20
GA
March
2015
GA
March
2015
Discovery Server has been enhanced to include callbacks that allow you to cleanly delete devices from both the UIM
and UDM tables in the UIM database. With the introduction of UDM, manual deletion of devices using SQL statements
was prohibited since it can cause the UDM tables to be out of sync with the rest of the UIM database. Using the
callbacks introduced in this release provides you with a mechanism to delete devices without causing syncing issues.
Discovery Server has been enhanced to work in conjunction with the vmware probe to provide partial graph publishing.
To enable this feature, additional configuration in the vmware probe is necessary.
Discovery supports subnets with a CIDR notation number of /16 or larger only. Entering range scopes larger than
65,536 addresses when defining a discovery range may result in one of the two following behaviors:
When using the snmpcollector probe to query for a list of devices, defining a range greater than /16 will generate an
exception with an error message indicating that the query is not supported.
When using the Discovery Wizard to define one or more subnets using the same CIDR notation, defining a range
greater than /16, or entering multiple /16 ranges will generate an out of memory exception error.
WMI and SSH discovery provide more detailed host system information than SNMP. We recommend that you use WMI
or SSH discovery in addition to SNMP.
A bug in MySQL 5.5 causes a slow restart of discovery_server. Upgrading MySQL to version 5.6.13 or later resolves the
issue.
If you are unable to find a device in USM by IP address, it may be that the device has multiple IP addresses, and
discovery identifies the device by a different address.
As part of 7.0 discovery server and discovery agent enhancements, a device with multiple IP addresses is now shown
as a single device in USM, not as multiple distinct devices per IP address. If you cannot locate a device in USM by IP
address, try searching for it by name.
The NIS Manager link in Infrastructure Manager does not work because it has been removed from the product. Use the
Discovery Wizard available in the USM portlet to configure discovery components.
SSH password authentication is disabled with OpenSUSE 12.x.
Discovery Agent uses password authentication to connect to a target device over SSH. Discovery Agent cannot
communicate with a device where SSH is configured for other authentication methods, such as keyboard-interactive.
Discovery Agent also does not support public key authentication or challenge-response authentication.
Fixed: Multiple cases where devices are not showing up as duplicate devices in USM because they are not being
correlated.
Fixed: An issue in which discovery_server is not applying enriched origins to NIS cache devices. Salesforce case
00153429.
Fixed: An issue in which discovery_server fails to re-subscribe to a hub after the hub connection is dropped. As a result
of this error, changes to the hub and/or its robots are not persisted in the database.
Fixed: An issue in which SNMP devices from the discovery_agent are not fully persisted to the database if a field (e.g.,
system description) is greater than 255 bytes. As a result of this error, the device is not populated in the
CM_SNMP_SYSTEM table. Furthermore, the device is not persisted in UDM, which prevents it from being visible in
USM.
Fixed: An issue in which the Discovery Server probe does not restart. This error is experienced regularly in
Infrastructure Manager, and only intermittently in Admin Console.
Fixed: An issue in which NIS cache devices with an invalid IP address are not persisted, including their associated
configuration items and metrics.
Fixed: An issue in which configuration items and metrics are not persisted in the database.
Fixed: Multiple cases where elements are being persisted into the UIM database, but not to UDM.
See the CA Unified Infrastructure Management - 8.2 Release Notes for more information.
8.12
8.10
GA
December
2014
Device correlation in Discovery Server and Discovery Agent is improved to avoid false matches and incorrectly merging
non-matching devices into one device.
Discovery publishes details of each network interface.
UDM Manager is coupled with Discovery Server to integrate UDM as a key part of inventory management:
Discovery Server writes inventory to both UDM manager and NIS/TNT2 tables.
Discovery Server uses log4j logging to capture UDM log information.
Logsize is no longer specified in discovery_server.cfg. Logsize is set in the log4j.xml file.
Some data resides only in UDM (for example, USM relies on UDM as the sole source for some interface data views).
Improvements to the discovery of AIX and HP-UX systems make it possible to deploy robots to these systems using
USM.
Improvements to device correlation enable Discovery Server and Discovery Agent to avoid false matches and the
merging of non-matching devices into one device.
Enhancements to interface origins. By default, an interface inherits its origin from its associated device. If the interface
origin is enriched through the QoS processor, that origin can now be associated with the interface in UDM.
Discovery supports subnets with a CIDR notation number of /16 or larger only. Entering range scopes larger than
65,536 addresses when defining a discovery range may create an out of memory condition.
WMI and SSH discovery provide more detailed host system information than SNMP. We recommend that you use WMI
or SSH discovery in addition to SNMP.
A bug in MySQL 5.5 causes a slow restart of discovery_server. Upgrading MySQL to version 5.6.13 or later resolves the
issue.
If you are unable to find a device in USM by IP address, it may be that the device has multiple IP addresses, and
discovery identified it by a different address.
As part of 7.0 discovery server and discovery agent enhancements, a device with multiple IP addresses is now shown
as a single device in USM, not as multiple distinct devices per IP address. If you cannot locate a device in USM by IP
address, try searching for it by name.
The NIS Manager link in Infrastructure Manager does not work because it has been removed from the product. Use the
Discovery Wizard available in the USM portlet to configure discovery components.
SSH password authentication is disabled with OpenSUSE 12.x.
Discovery Agent uses password authentication to connect to a target device over SSH. Discovery Agent cannot
communicate with a device where SSH is configured for other authentication methods, such as keyboard-interactive.
Discovery Agent also does not support public key authentication or challenge-response authentication.
Fixed: Device correlation in Discovery Server and Discovery Agent is improved to avoid false matches and incorrectly
merging non-matching devices into one device.
See the CA Unified Infrastructure Management - 8.1 Release Notes for more information.
8.00
September
2014
7.60
June 2014
Expanded SNMP device characterization for AIX operating system information, Cisco Unified Communications devices,
and HP Tipping Point devices.
Added excluded_ip_addresses configuration parameter to IP address correlation. The parameter value supports a
comma-separated list of any combination of individual IP addresses, IP address ranges, and IP subnets in CIDR
notation (for example, 1.2.3.4, 10.1.2.3-99, 172.1.2/24).
Added excluded_fqdns and excluded_domains configuration parameters to FQDN (fully qualified domain name)
correlation.
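As a sketch of how such a mixed exclusion list can be interpreted, the following Python fragment parses a comma-separated combination of the three notations and tests a candidate address against it. This is illustrative only, not the probe's actual implementation; it assumes ranges span only the last octet and subnets are written in fully dotted CIDR form:

```python
import ipaddress

def parse_exclusions(value: str):
    """Parse a comma-separated mix of addresses, last-octet ranges, and CIDR subnets."""
    entries = []
    for item in (p.strip() for p in value.split(",")):
        if "/" in item:                       # CIDR subnet, e.g. 172.16.1.0/24
            entries.append(ipaddress.ip_network(item, strict=False))
        elif "-" in item:                     # range, e.g. 10.1.2.3-99 (last octet span)
            base, end = item.rsplit("-", 1)
            start = ipaddress.ip_address(base)
            stop = ipaddress.ip_address(base.rsplit(".", 1)[0] + "." + end)
            entries.append((start, stop))
        else:                                 # single address
            entries.append(ipaddress.ip_address(item))
    return entries

def is_excluded(addr: str, entries) -> bool:
    """True if addr matches any exclusion entry."""
    ip = ipaddress.ip_address(addr)
    for e in entries:
        if isinstance(e, tuple):
            if e[0] <= ip <= e[1]:
                return True
        elif isinstance(e, (ipaddress.IPv4Network, ipaddress.IPv6Network)):
            if ip in e:
                return True
        elif ip == e:
            return True
    return False
```

For example, parse_exclusions("1.2.3.4, 10.1.2.3-99, 172.16.1.0/24") excludes 10.1.2.50 but not 10.1.2.200.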
Fixed: Discovery server includes a new version of the SDK to provide better performance on systems with multiple
network interfaces running on different subnets.
Fixed: In some cases the IP address was used as the computer name instead of the name provided by a probe.
Fixed: When a robot running a discovery agent moved from one hub to another, an extra discovery agent record was
created. Now the same discovery agent record (including the agent's discovery configuration) is maintained after the
robot moves to another hub.
7.50
March
2014
December
2013
Added correct operating system name and version detection for Microsoft Windows Server 2012 R2.
Eliminated support for Microsoft Windows Server 2003 (all forms).
Reintroduced LUA script functionality to enable user customization of name, dedicated setting and OS info.
Expanded SNMP device characterization to better detect OS type, OS version and device role (the dedicated setting).
Fixed: Discovery Server failed on MySQL during startup when there were duplicate entries in DS_ROBOT_XREF.
Fixed: discovery_server 7.0 did not respond appropriately to origins with spaces/commas.
7.00
September
2013
New probe_discovery queue for collecting a richer set of element data from probes.
Enhanced device correlation.
Probe GUI discontinued.
Changed handling of expired systems to only delete systems with no QOS data.
Fixed: Discovery did not identify multiple IPs of network devices (e.g. routers or switches) as one device.
Fixed: Devices displayed multiple times in USM.
Fixed: Some devices listed two origins when the names match.
Fixed: Duplicate Origins on remotely monitored systems.
Fixed: CM_Computer_System: nimbus_type changed to 0 when robot moved to another hub.
Fixed: discovery_server failed to insert records as origin field size is exceeded.
6.60
June 2013
6.50
March
2013
Accepts imported device information from the cm_data_import probe and saves the information to the NIS database.
Enhancements to support Discovery Wizard (in USM).
3.50
December
2012
3.41
September
2012
To better support the Automatic Deployment Engine (ADE) and UMP, enhanced discovery to get and save processor
architecture (32/64-bit) and Linux distribution information.
Updated to use new version of Nimsoft SDK to better handle running on a system with multiple IP addresses.
3.40
June 2012
Enabled up to 200 discovery agents to be controlled by the discovery server without requiring an increase in the number
of threads or amount of memory used.
Provided a default level of Service Discovery by adding pre-defined service definitions for well-known network services
such as HTTP, FTP, email, etc.
Enabled the discovery server to run on a robot other than the primary hub.
Reduced memory usage to support up to 5000 robots using the default maximum Java heap size of 1GB.
Added further improvements to reduce the number of sockets and database connections to help overall Nimsoft Server
performance.
Fixed database transaction deadlocks seen with Microsoft SQL Server; fixed other SQL error cases.
Fixed SOC defect.
3.29
March
2012
Only information from the top-level entity is saved (e.g. the router or switch). Information about lower level entities is not
saved (e.g. power supply).
Improved origin handling to better support custom origins in multi-tenancy environments.
Reduced number of threads and database connections to help overall Nimsoft Server performance.
Re-established regular interval for collecting data from hubs and robots to ensure that full updated data is collected on a
timely basis.
Improved accuracy and consistency of data written to the database.
Improved performance of database updates.
Fixed: Devices for some customers no longer appeared in USM/Dynamic Dashboard; discovery now handles origin changes.
Fixed: Systems are missing from dynamic views due to many CM_DEVICE entries with null cs_id.
Fixed: discovery_server maps robots with same IPs incorrectly in data centers that have robots with different names, but
the same IP address.
Fixed: The discovery agent could wait indefinitely for configuration from discovery_server.
3.28
December
2011
Increased default maximum Java heap size to 1GB (from 256MB) to support up to 2000 robots.
3.25
November
2011
First plugin-based version of discovery_server to support probe based discovery and better enable service oriented
configuration in Unified Service Manager.
2.79
August
2011
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.07
Fixed Defects:
GA
January
2016
GA
September
2015
The probe was unable to expand the $resource_name, $asp_number, and $disk_unit_number variables in the alarm
message. Support case number 00245283
1.06
What's New:
Added support for IBM iSeries version V7R2.
1.05
Fixed configuration tool profile save problem caused by a problem with the 'Disk usage in %' QoS setting.
GA
August
2012
1.04
GA
January
2012
1.00
Initial version
GA
June 2011
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
5.30
GA
Sept
2013
5.30
Defect fixes
June
2013
5.22
For Windows: Fixed problems when the probe encountered 'problem' packages.
July
2012
Configuration tool: changed save operation to save all settings in one request to the controller.
5.21
New libraries
Dec
2011
5.20
Sep
2011
Added handling of SID expired situation on distribution over tunnels (while using remote distsrv).
Support for IPv6. (Future use.)
5.14
July
2011
5.13
Fixed intermittent problem on Linux platforms introduced in beta version 5.12 where distsrv could stop responding after some time.
Jun
28
2011
5.12
Jun
22
2011
Changed field type of the server field in the forwarding profile dialog to allow unlisted values.
Cleaned up 'attempt_number' handling to avoid looping, and added a specific error message for situations where the
distribution result cannot be obtained.
Fixed package sort by date in configuration tool.
Fixed dependency checking on forwarded distribution.
Added a small delay between processing forwarding profiles.
Separated license read/write locks.
Added option to enable/disable forwarding.
5.11
Jan
27
2011
5.10
Jan
19
2011
5.04
Dec
2010
5.03
Parameter size limit for tasks increased. Also added limit checking.
Nov
2010
5.02
Oct
2010
5.00
Jul
2010
4.95
Dependencies in remote distributions did not behave well. Problems were found and fixed:
Jun
2010
4.91
Enhanced support for multiple Network Interface Cards (NICs) on the computer. Nimsoft communication is now supported
on NICs other than the default Robot interface.
Mar
2010
4.90
Updated Libraries
Dec
2009
4.80
Oct
2009
Changed callbacks from secondary distsrv back to primary to ensure asynchronous behavior.
Added caching of IP addresses on robot name lookup.
Added option to use local archive on remote distributions.
Store information on completed distributions in embedded database.
Added option to trigger forwarding immediately on a detected change.
Fixed situation where package was repeatedly forwarded.
'All versions' option for package forwarding added.
Fixed potential crash situation on 'archive_list'.
Improved error message on dependency which is not satisfied.
Rewritten package dependency checking on distribution. Multi-level dependencies are handled with new (3.53) robots.
4.74
Aug
2009
4.72
Support for no_shutdown and temp_directory when creating a probe from an existing probe.
Dec
2008
4.71
Sep
2008
4.18
May
2008
4.17
Fixed problem with deleting package without version, where the package with the highest version would be deleted.
Feb
2008
4.16
Changed version recognition logic to allow mapping of empty version string to package without version. Modified
'archive_create_from_probe' and 'archive_config_from_probe' to write to the correct package version file.
Nov
2007
4.00
Added support for multiple package versions. UNIX port to Linux and AIX
Feb
2006
Installation Considerations
When distributing distsrv, it cannot detect its own completion, so you will see an 'Installation Aborted' message even if there was no error. You
can verify that the new version of the distribution server is running by starting its configuration window and looking at the version on the status bar.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.68
Fixed Defects:
GA
July 2014
Added an option for configuring the log file size through IM probe GUI. Salesforce case 00131070
1.67
Made the DNS query type (A or Any) configurable when the user has configured type A through the GUI.
April 2014
1.66
June 2013
1.65
February 2013
1.64
March 2012
1.63
January 2012
1.62
December 2011
1.60
March 2011
Updated the probe to use the c-ares DNS library v1.6.0. The probe no longer uses the 'host' utility to perform DNS queries.
September 2010
Added support for performing reverse lookup along with forward lookup from a single profile.
Added support for TXT DNS query type.
Added support for specifying DNS port.
Added support for CHAOS DNS query class.
Prohibited the use of white space in GUI configurator and fixed some minor GUI issues.
1.41
June 2010
1.40
May 2010
1.31
May 2010
1.30
September 2009
August 2008
June 2008
August 2006
1.21
1.22
June 2006
March 2006
The probe is now using the 'host' utility instead of 'nslookup'. This utility is distributed with the probe.
Fixed problem with "Non-English" operating system.
The probe may now run on various UNIX systems as well.
Installation Considerations
A 'host' utility is distributed with the package. This utility is based on the bind 9.6 package.
The Tru64 (OSF1_4 alpha) package contains a host utility based on bind 9.3. The 'host' utility is included only on 32-bit Windows platforms.
Note: When upgrading from versions prior to 1.23, alarm messages must be cleared manually before upgrading; the probe will
malfunction if these messages are not cleared. On upgrade from the GA to the TNT2 version, or downgrade from the TNT2 to the GA version,
the probe-specific file alarms.txt should be removed from the working directory to ensure a smooth upgrade or downgrade.
Contents
Revision History
Probe Features
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
User Access Prerequisites
Google Chrome Support
Upgrade and Migration Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.24
Fixed Defects:
GA
November
2015
GA
July 2015
GA
March
2015
GA
November
2014
When viewing data on UMP, the probe did not override the default alarm source. Salesforce case 00165820
Updated the Software Requirements in the probe Release Notes to describe the ppm probe dependency. Salesforce
case 70003306
2.23
What's New:
Added support for executing web-based scripts on Google Chrome.
Note: This feature is supported only on NimRecorder 5.2 and later. Refer to the Google Chrome Support section for more
information.
2.22
What's New:
Added support for Static Threshold for custom metrics.
Added support for factory templates.
Fixed Defects:
The probe did not show up on the robot even though it was successfully distributed. This issue occurred on a Windows
7, 32-bit operating system. Salesforce cases 00152965, 00159264
2.21
2.20
Added NimRecorder 5.0, which is compatible with both 32-bit and 64-bit environments and removed Wintask64. The
scripts that are recorded with Wintask64 are compatible with NimRecorder 5.0 also.
March
2014
Added a feature to allow selecting multiple files during a package building session.
Beta
June 2013
Added a feature to remember user preferences for the last used directory in the package builder.
Added default support for the packaging screen.
Added a validation to select at least one script file to create package.
Wintask64 added to support 64-bit applications.
2.00
1.93
March
2013
November
2012
1.92
Fixed source override for script generated messages when a source is specified in the profile.
May 2012
June 2011
1.90
December
2010
1.82
September
2010
1.81
September
2010
1.80
June 2010
1.71
Resolved problem with multiple desktops (RDP connections) in e2e_starter. Turned off legal notices before automatic
login and reset on probe termination. Fixed crash situation on callback using missing data.
January
2010
Fixed problem with starting script before login was completed on Windows 2008.
1.71
1.63
1.62
January
2010
October
2008
August
2008
February
2007
1.60
January
2007
1.52
December
2006
1.51
October
2006
June 2006
May 2006
Clean up in nimmacro.dll.
1.00
Initial version. Functionality as for the replaced wintask probe. Subsystem set to 1.2.3.7.
Probe Features
The e2e_appmon probe provides the following features:
Alarms: You can specify threshold values to generate the alarms when these values are breached.
QoS: You can select the QoS option to generate the QoS messages on total run time for profiles.
February
2006
Scripts: The e2e_appmon probe runs scripts at specified intervals to monitor the availability and the response time of the target
applications. With the NimRecorder (shipped with e2e_appmon_dev probe), you can make your own scripts. Use the e2e_appmon API to
include checkpoints in your scripts. The scripts must be compiled on the machines where they are run.
Important! Do not run other applications or tasks on the monitoring computer, as it disrupts the probe monitoring and
measurement process.
Note: The e2e_appmon probe has limitations when using Optical Character Recognition (OCR) in scripts. Instead of
OCR, use bitmap synchronization (synchronization on an image) to match text logos.
Sample Script: The probe is deployed with a sample script. Activate the probe and run the sample script. Compile the script on the
target computer before executing it. If you execute the script before compiling, an error message displays. For example, the error
message is: Error at line 348: Impossible to load the module.
A script is compiled based on each operating system configuration. Hence, the compiled script can run only on similarly configured
computers.
Probe Editions: The e2e_appmon probe is available in the following two editions:
The runtime edition of the probe enables you to run precompiled scripts.
The developer edition of the probe, the e2e_appmon_dev, lets you create scripts and include checkpoints in those scripts. The
e2e_appmon_dev probe measures the intermediate time of each process along with the total runtime of the script. Some help files to aid in
developing scripts are also included.
The NimRecorder: You can use NimRecorder, which is included in the probe package, to create your own scripts. The NimRecorder can
be launched from Start > All Programs > Nimsoft Monitoring > E2E Scripting. The NimRecorder has the following menu options:
Compile *
Help *
Open Script *
Run Script
Script Editor *
Spy *
Uninstall NimRecorder
Note: The options marked with an asterisk (*) are available only in the developer edition of the probe or after installing the
NimRecorder manually.
The e2e_appmon probe is now available with NimRecorder 5.2 to create and execute scripts for 64-bit platforms. The NimRecorder 5.2
also supports web-based scripting for Internet Explorer, Google Chrome, and Mozilla Firefox. Refer to the NimRecorder 5.2 help for
supported platforms, browser versions, and other relevant information. You can access the NimRecorder help from the Start > All
Programs > Nimsoft Monitoring > E2E Scripting > Help location. This help file is also available from the Help menu available in the
NimRecorder application.
Note: Google Chrome support is available only with NimRecorder 5.2 and later.
Probe Provisioning Manager (PPM) probe version 3.20 or later (required for Admin Console)
Java JRE version 6 or later (required for Admin Console)
Installation Considerations
The following points must be considered while installing the probe:
When migrating from the earlier wintask probe, disable the wintask probe.
On the first startup of e2e_appmon, the probe reads and takes the wintask probe configuration.
Note: The QoS object names are changed. The instances of the probe running on a computer with the earlier wintask probe now
generate QoS on a different object. The probe instances that are installed on a clean robot will have the new QoS object names. You
can change this option in the Variables tab of the probe.
The probe is initially inactive. Specify the user credentials with which the probe connects to the system. Once you provide the user
credentials and activate the probe, the current session exits, and the probe initiates a login as the specified user.
Important! Ensure that no screensaver is active as it interferes with the running of the scripts.
Notes:
Verify that the environment is properly set up during the first run. For instance, the 'internet connect' wizard may need to be
run.
No instances of Internet Explorer should be open during the installation.
Note: The Google Chrome support is available only with NimRecorder 5.2 and later.
Uninstall any previous version of the NimRecorder and ensure that the bin directory under the Nimsoft/e2e_scripting directory does not
exist anymore. Delete the bin directory, if it exists, and then install the latest version of the probe. Back up all existing scripts before
uninstalling the NimRecorder.
If you are upgrading the e2e_appmon_dev probe from 2.0x to 2.2x and the probe does not upgrade the NimRecorder to
NimRecorder 5.2, navigate to <Nimsoft Installation drive>/Nimsoft/probes/Application/e2e_appmon/install and double-click
nimrecorder52.msi to install it manually.
Note: If you are upgrading e2e_appmon from the earlier wintask probe, refer to the e2e_appmon probe 2.0 documentation or earlier.
Revision History
Probe Specific Software Requirements
Probe Specific Hardware Requirements
Alarm and Quality of Service Messages
Installation Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.44
Provided the option (via Raw Configuration) to disable GSSAPI Authentication with Exchange Server.
GA
August 2013
1.43
GA
February 2013
1.42
GA
January 2013
1.41
GA
November 2012
1.40
December 2010
1.30
September 2010
1.21
June 2010
1.20
May 2010
1.16
September 2009
1.15
December 2006
November 2006
March 2005
1.08
Note: For SOC functionality, NM Server 5.6 or later and UMP 2.5.2 or later are required.
Installation Considerations
When creating a profile to receive mail from Microsoft Exchange, in some cases the probe is unable to log on if the Exchange user has an alias
defined.
based on predefined criteria for alarms, such as severity, origin, and time. For Windows robots, you can use the SMTP server or the Exchange
server to send emails. For robots on other operating systems, you can only use SMTP to send emails.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Preconfiguration Requirements
Installation Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.74
Fixed Defects:
GA
January
2016
GA
September
2015
GA
December
2014
The probe crashed when you manually executed the script. Support case number 70005789
2.73
Fixed Defects:
The probe now includes the entire alarm string in the email messages. Salesforce case 00160416
Updated the preconfiguration requirements in the Release Notes about installing SMTP Server certificate for using TLS
functionality. Salesforce case 00154509
2.72
Fixed Defects:
Fixed a defect where the probe did not support more than 1024 characters in the Email field in the Profiles section for
the Exchange server and the SMTP server. Salesforce case 00145450
Fixed a defect where the probe was crashing frequently. Salesforce case 00148576
2.70
2.60
2.41
2.40
March
2013
December
2012
December
2010
September
2010
2.32
July 2010
2.31
September
2009
2.30
August
2009
2.21
Fix: In html mode a newline had gotten into the email body, causing some mail servers (seen on qmail) to return an
error.
January
2009
Fix: The AO-timestamp (which is only for internal use in the NAS) was erroneously a part of the text template, and has
been removed. The HTML template did not have this field and is not modified.
2.20
Fixed problem with subject being reset to default subject for some recipients in a group.
Added option to group recipients so that one email is sent with multiple recipients in the TO: field instead of a separate
email to each recipient.
January
2008
Content type will be set to text/html if a tag is found. If the use_html flag is set then this is added automatically.
Added new variables which display the time at the alarm sender's time (calculates timezone offsets and generates a
time string).
Note: Use of the new variables requires that Robot 2.68, Hub 4.20, and NAS 3.03 have been installed across the
infrastructure, as the required fields are not present in older component versions.
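A minimal sketch of the idea behind these variables: given a UTC timestamp and the sender's timezone offset (assumed here to be carried in minutes), the time string is rendered in the sender's local time. The function name and offset unit are hypothetical, not the probe's actual API:

```python
from datetime import datetime, timedelta, timezone

def sender_local_time(utc_ts: float, offset_minutes: int) -> str:
    """Render a UTC epoch timestamp in the alarm sender's local time."""
    tz = timezone(timedelta(minutes=offset_minutes))
    return datetime.fromtimestamp(utc_ts, tz).strftime("%Y-%m-%d %H:%M:%S")

# 2021-01-01 00:00:00 UTC rendered at UTC+05:30
print(sender_local_time(1609459200, 330))  # 2021-01-01 05:30:00
```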
Fixed problem where sending to multiple recipients could cause a crash due to a memory error.
Retry with authentication if the mail server returns error '554 Relay access denied'.
2.10
Fix: Retry connection when connection to Hub drops. Send alarm when there are connection issues.
Fix: Bug in error handling could cause a loss of connection to the Hub.
2.08
Windows: Retry connection to server on connection errors. This is an amplification of the error handling for error code
10048 in the previous release.
October
2007
September
2006
Add option to ignore TLS for servers that announce the capability but don't actually support it. Windows: Retry
connection to server if error code 10048 is returned. This can happen on servers with a large number of sockets in use.
Fix: Strip newlines from subject before sending email, otherwise the mail header gets corrupted. This can cause html
mail to appear as plain text.
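The fix amounts to flattening the subject before it enters the message header, since a bare CR/LF terminates the header line. A one-line sketch (not the probe's actual code):

```python
def sanitize_subject(subject: str) -> str:
    """Collapse CR/LF and runs of whitespace so the SMTP Subject header stays on one line."""
    return " ".join(subject.split())

print(sanitize_subject("Disk full\r\non host db01"))  # Disk full on host db01
```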
2.04
Fix: $ signs that did not match any known variable name were treated as variables and removed from
the message. Now only known variables (see the list in the Variable Substitution section below) are blanked, leaving all other
strings containing a $ in the expanded message.
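The corrected behavior can be sketched with a regular expression that substitutes only recognized variables and passes every other $ token through unchanged. The variable names and values here are hypothetical:

```python
import re

KNOWN = {"hostname": "db01", "severity": "major"}  # hypothetical variable values

def expand(template: str) -> str:
    """Replace only known $variables; unknown $tokens pass through unchanged."""
    def sub(m):
        name = m.group(1)
        return KNOWN[name] if name in KNOWN else m.group(0)
    return re.sub(r"\$(\w+)", sub, template)

print(expand("Alarm on $hostname costs $100 ($severity)"))
# Alarm on db01 costs $100 (major)
```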
March
2006
2.03
GUI setup dialog has 'Use HTML format' checkbox added which sets the 'use_html' parameter in the config.
February
2006
2.02
Fix: SMTP Server test now verifies connection even if no authentication credentials are specified. Used to verify
connection to open SMTP servers.
December
2005
Flag use_html can be set to 'no' in RawConfigure to get cleartext mail. A cleartext template (template.txt is included)
should be set if use_html=no.
2.01
Added $count variable, which is $suppcount + 1, to match alarm list. Template.html uses $count in place of $suppcount
to avoid confusion.
December
2005
Added new variable $hostname_strip which strips out the domain part of the hostname. The template.html file uses this new
variable by default. Fixed a bug where a dot (.) as the 76th character on a line would be removed. This caused IP addresses
to be changed in some cases.
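The effect of $hostname_strip amounts to taking everything before the first dot of the fully qualified name; as a one-line illustrative sketch:

```python
def hostname_strip(fqdn: str) -> str:
    """Return the host portion of an FQDN, dropping the domain part."""
    return fqdn.split(".", 1)[0]

print(hostname_strip("web01.example.com"))  # web01
```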
March
2005
Note: The 64-bit version of Outlook must be installed on 64-bit machines. User Account Control (UAC) must be set to the lowest level if you
are using Outlook 2010. This setting applies to Windows Vista and later versions.
Preconfiguration Requirements
This section contains the preconfiguration requirements for the probe.
The nas (Alarm Server) probe must also be configured to send alarms to the email gateway. For more information on the nas
configuration, see The Auto-Operator Tab article under the nas (Alarm Server) documentation.
The SMTP certificate must be installed on the host machine where the probe is deployed. This is required to use Transport Layer
Security (TLS).
The Linux robot where the probe is deployed must have OpenSSL certificate installed to use the TLS functionality to connect to the
SMTP server.
To configure the probe using IPv6 format, ensure that the following prerequisites are met:
The CA UIM system can run the secondary hub in an IPv4 or IPv6 environment, but the primary hub can only run in an IPv4 environment.
To configure the probe on a remote hub that runs the probe in an IPv6 environment, follow these steps:
Create a queue of type attach with subject *.
Create a queue of type get to fetch data from the IPv4 hub by calling the queue that is created on the IPv4 hub.
To configure the probe on a remote hub that runs the probe in an IPv4 and IPv6 environment, follow these steps:
Create a queue of type get to fetch data from the IPv6 hub by calling the queue that is created on the IPv6 hub.
Create a queue of type post to push all the data to the CA UIM IPv4 primary hub.
Create a queue of type post to push all data with Subject/Queue as email to the CA UIM secondary IPv6 hub.
Installation Considerations
To configure the probe with the Exchange server, an Outlook client with an existing MAPI profile must exist on the same system where the
probe is deployed. The MAPI profile must be a user in the Exchange server that the probe uses to send emails.
Version 2.20 of this probe allows you to expand time strings offset to the sender's local time if the sender is in a different timezone. This
requires Robot 2.68, Hub 4.20, and NAS 3.03, since the offset was unavailable prior to those versions.
Version 2.00 of this probe allows you to use the email field for a user in the CA UIM User Administration instead of the user profile. If you
want to override the default template and email subject, you will need to create a profile.
Version 1.60 of this probe allows you to send emails to defined users when an alarm is assigned to them. If you are using a queue on the
Hub, the Subject must be changed from "EMAIL" to "EMAIL,alarm_assign". If you are not using a queue, you can check the appropriate
box in the configuration utility and the probe will subscribe to the alarm_assign messages as well as the EMAIL messages.
Note: The profile name must be an exact match with the CA UIM user name for correct operation of email on assignment.
Version 1.41 of this probe requires version 2.03 of the Perl RunTime or SDK to run properly. Run the following Perl script to determine the
installed version:
#!perl
use Nimbus::API;
print "$Nimbus::API::VERSION\n";
Some SMTP servers validate the From: address. If this is the case with your SMTP server, ensure that the specified "Sender email
address" is valid.
Note: The probe only supports basic and NTLM (NT LAN Manager) authentication.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.03
What's New:
GA
July 2015
1.11
What's New:
GA
June 2014
Beta
June 2014
What's New:
Added support for monitoring Exchange Server 2010 SP2.
1.05
What's New:
March 2014
1.04
Fixed Defects:
January 2014
The Bounce Back emails were being sent back to wrong recipients. The probe was sending the Bounce Back emails to the
first email ID from the address list.
The default Roundtrip error message was displaying variable in place of value of the variable.
1.02
January 2013
1.01
Fixed Defects:
July 2012
Initial Release
June 2011
Installation Considerations
The ews_response probe requires access to a user account on the Microsoft Exchange Server to monitor connection and send test emails.
Revision History
Monitoring Support
Threshold Configuration Migration
Changes After Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Upgrade Considerations
Installation Considerations
Preconfiguration Requirements
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the exchange_monitor probe.
Version
Description
State
Date
5.20
What's New:
GA
September 2015
GA
June 2015
GA
October 2014
The probe can be migrated to standard static thresholds using the threshold_migrator probe. The Admin Console GUI is
configurable using the standard static threshold block and the Infrastructure Manager GUI is no longer available.
Removed support for Exchange Server 2003.
Removed support for exchange_monitor backend probe.
Updated units for some counters.
Note: For more details on the units and counters, see the Upgrade Considerations section.
Fixed Defect:
Various metrics or Server Checkpoints were not being reflected in the probe configuration. Salesforce case 00152627
5.14
Fixed Defects:
Mailbox growth size and Mailbox growth alarms were displaying in NAS and the Alarm View on USM, but were not
visible on the device on USM. Salesforce case 00159622
The Admin Console GUI was displaying numeric values instead of logical symbols for operators. Salesforce case 00154187
The probe was not detecting the Exchange server when only the Transport role was installed on the server. Salesforce case 00156643
Incorrect QoS values were generated for Messages Completed Delivery Per Second. Salesforce case 00158587
Note: The QoS values will only be generated on fresh deployment of the probe.
5.13
Fixed Defect:
Added new keys that allow the user to configure the Microsoft Exchange Server path for both 2010 and 2013 versions. Salesforce cases: 00131700 and 00141867
5.12
Fixed Defect:
Added an option for configuring the log file size through the IM probe GUI. Salesforce case 00131514
July 2014
5.11
Added new thresholds for the Exchange Server 2010 as per Microsoft recommendations.
Implemented new counters of Exchange Delivery Health Monitor for exchange server 2013.
September 2013
Implemented new counters of Exchange Transport Health Monitor for exchange server 2013.
Added support for monitoring remote systems over Internet Protocol version 6 (IPv6).
5.01
Fixed a defect where a few counters were active on Exchange 2010/2013 in an upgrade scenario even if they were inactive in the previous version.
March 2013
Fixed a defect where ntevl profiles for Exchange 2010/2013 were displayed in the Status tab even for Exchange 2007 servers.
"Mailbox Searches Per Sec- Store" and "Directory Access LDAP Searches Per Sec-2013" alarm_required has been
changed to yes.
For "Logical Disk Percentage Free Space" counter default alarm changed to PerformanceValueBelowLimit.
5.0
January 2013
Changed the severity of the Perfmon 'value not found' alarms from Critical to user-configurable, with a default of Information.
4.01
4.0
August 2012
August 2012
Added functionality to consider dynamic instances of perfmon counters. This fixes false 'perfmon not found' alerts.
3.51
Fixed an issue where log level changes were not effective after reloading the probe's configuration changes.
Fixed an issue where activating a counter in the Profile tab was not reflected in the Status tab.
March 2012
Fixed an issue where the probe restarted every minute when qos=no.
Fixed an issue where initially only the Status tab was highlighted and the other tabs were disabled for around five minutes. This occurred as the counter list grew and the probe checked the presence of every counter instance.
Fixed an issue where incorrect clear alarm messages were sent when no threshold was set.
Fixed an issue where incorrect alarm messages were sent when a counter threshold was the same as the current value.
Fixed an issue where incorrect alarm messages were sent when a counter threshold compare type was changed in the configuration.
Activated the counter User Count (Information Store) - 2003,2010.
Fixed an issue where incorrect clear alarms for counters were sent with wrong messages.
Fixed an issue where a few Exchange 2007 counters were activated by default when the probe was deployed on an Exchange 2010 setup.
Fixed an issue where the probe restarted when login credentials were entered (File Monitoring) and when the directory path was left empty while creating a file monitoring profile.
Fixed an issue where alarms/QoS were generated for a deactivated profile (File Monitoring).
Fixed an issue where messages were generated with wrong logic in Exchange 2007 counters and some of the Exchange 2010 counters.
Fixed an issue where alarms were cleared for the wrong file (File Monitoring).
3.50
Support added for SOC. Added support for monitoring new performance counters on the Exchange 2010 server for Mailbox, Client Access, and Common Counters. Added support for monitoring file/folder resources on shared cluster drives. Ported the probe to a native 64-bit environment. Fixed an issue where the probe was not able to register with the controller when activated; the issue occurred when the reports cfg file grew to several MB.
December 2011
3.42
September 2011
3.41
August 2011
June 2010
Fixed an issue in the pt library and the probe related to raising alarms on Exchange events.
3.21
December 2009
3.20
Added code for packing origin and server roles when sending data to exchange backend probe.
Added code in GUI to disable 'Gather report information' checkbox and 'Force update of information' button if Mailbox
role is not enabled on the exchange server.
3.17
Fix: Due to changes in the underlying LDAP library, some memory variables were improperly initialized and could result in no LDAP data being gathered. This fix applies to the data collector for the Exchange Server 2003.
December 2009
June 2009
April 2009
Added a fix for the error 'Input string was not in a correct format'. This resolves the issue of not retrieving all the
mailboxes available on the exchange server 2007.
January 2009
Added a fix which converts deleted_message_size_extended attribute from bytes to kilobytes for exchange server 2007.
Added a fix for not writing 'size' attribute of the mailbox with size greater than 1 GB for exchange server.
In rare situations, the probe could encounter a distribution group with its Exchange legacy key missing. This was unexpected and could cause a crash of the data collector on Exchange Server 2003.
3.13
3.12
Removed the dependency on .NET 3.5. The new exchange_report data collector should now work with .NET 2.0.
The dependencies on the files 'Microsoft.Exchange.Data.tlc' and 'Microsoft.Exchange.Management.dll' have been removed. The previous dependency caused problems on Exchange Servers 2007 with SP1 installed.
August 2008
July 2008
Bugfix: We have discovered a bug in the WMI asynchronous invocation method. This has been fixed.
3.11
Bug: A flag was not being properly reset, which could cause unwanted results. The key in /register/run_check controls whether the probe should enable/disable profiles based on the server it is being run on. This mode is only intended to be run the first time you deploy the probe to a new Exchange server. The flag was not properly set to off. The result was that profiles could be turned back on, even if you had turned them off through the exchange_monitor GUI.
June 2008
Bug: The Exchange Server 2003 profile "VM Total 16MB Free Blocks" had a typo in it: cmd_type should have been cmp_type. The result was that this profile would use the wrong compare. The default threshold would be > 3, when it should have been < 3.
3.10
New: Implemented data collector for exchange server 2007 servers, to use with existing exchange_monitor_backend
and exchange_monitor_reports.
Updated GUI to reflect the new data collector. GUI will query exchange_monitor probe, which will respond with server
type (2003 or 2007). The code bases for exchange_monitor 2003 and the exchange_monitor 2007 have been merged
together. This probe should run on both 2003 and 2007 servers.
The exchange_monitor probe will have 2 different data collectors shipped with it in the same package. One works with 2003 servers, using WMI, WebDAV, and LDAP as in versions prior to 3.xx.
The new data collector is a .NET component which uses Exchange PowerShell cmdlets and LDAP to report similar
information as the old report collector.
Please note when setting up reports:
Exchange Server 2007 reports: Exchange Server 2007 with mailbox server role installed.
Exchange Server 2003 reports: Exchange Server 2003 backend servers.
Message tracking will need to be enabled.
Optimized: probe engine will ignore profiles which are not marked to be used for the server type.
Profiles shipped with the probe are all marked with server type. This means, some profiles are designed for Exchange
Server 2003, some profiles are designed for Exchange Server 2007 specific roles. Profiles can be run on more than 1
server type and more than 1 server role.
If the probe is running on a 2003 server, it will ignore 2007-only profiles even if they are marked active = yes. If it is
running on 2007 servers, it will ignore 2003-only profiles even if they are marked active = yes.
2003 server specific feature:
New feature: If you enable exchange_monitor_reports data collection, it will by default report the size of the EDB
exchange server database files (all versions prior to 3.10). You can now change it to use SLV/STM file, or use the sum
of both file sizes. This can be done from the GUI, on the reports tab.
June 2008
3.02
May 2008
Please note that this Release Note covers both the 2.69 and 3.02 versions of the exchange_monitor probe:
2.69 supports Exchange Server 2003 only
3.02 supports Exchange Server 2007 only
Please make sure that you, dependent on which version of the Exchange Server you want to monitor, download and
install the correct version of the probe.
Increased the capacity of counters used to calculate traffic summary reports for a day. The previous counters were too small at some sites, which could lead to negative values. See the exchange_monitor_backend 1.11 release note for more details.
We have added a flag which controls the WMI invocation method. In all previous versions, the synchronous method was used. It is still the default, but the value in /setup/wmi_method can be set to "async" to change the method invocation to asynchronous. We believe this will fix a rare WMI timeout problem. This key is only visible in the raw configurator. If you click the Test button from the GUI, the value stored in the cfg will be used, as it is impossible to tune it from the standard GUI.
Bugfix: The routine that was used to determine if it was time to run a complete report check had a flaw in it. If a mailbox report generation had been started just before midnight local time to the probe, and the report finished collecting data just after midnight, it could cause the timestamps on the other fields (mailbox activity, traffic summary, public folders, etc.) to be updated incorrectly. This would cause the exchange_monitor probe NOT to collect any new data for another 24 hours for these reports. If you were unlucky, this could last more than 24 hours, and then suddenly the probe would recover by itself. An attempted fix for this fault is included in the current version.
The exchange_monitor probe will now only read new data from a report collection, instead of reading the entire file content again (e.g. traffic summary is only read once every 24 hours, but previously this information was being re-read every 10 minutes by the exchange_monitor probe). This will result in fewer CPU cycles used.
Added an extra parameter to the callback "get_report", called "user_friendly_date", which accepts the string "yes" and makes the datetimes returned by this callback be formatted in a user-friendly format, like in the exchange_monitor_backend GUI. This is only to help troubleshooting.
Bugfix: Attempted to fix a memory allocation/deallocation mismatch problem, which could cause a program fault on some systems.
Optimized: Attempted to filter out other Exchange servers during the exchange_monitor GUI WMI Test button query, to reduce the execution time of this test.
April 2008
2.67
Bugfix: When doing a base search for a delegate, who had a distinguished name which contained a comma (" , ") in the
name, this comma was not encoded into the request properly. This resulted in an invalid syntax LDAP error and also
caused the probe collector to crash.
January 2008
Bugfix: The report collector now utilizes LDAP paged searching when querying Active Directory. This has been done to support the default AD LDAP policy of a maximum of 1,000 records returned from an LDAP query. We page 100 records per query. Your AD server(s) cannot have a sizelimit lower than 100 records.
Bugfix: The report collector now utilizes ranged attribute retrieval, to overcome the default AD LDAP policy of 1,000 attribute values (Windows 2000 domain controller) and 1,500 (Windows 2003 domain controller).
Bugfix: Report collector does not natively support Unicode. The SQL database supports Unicode. AD uses Unicode. Our
LDAP library returns UTF-8 encoding. There was a problem converting from UTF-8 to internal ANSI encoding.
Bugfix: The exchange monitor probe can now query multiple Active Directory domains for groups and user objects that belong to a given Exchange server. This information needs to be entered through the GUI. One server must be chosen as what we call "default", which means that is the server we will be using when querying the configurationNamingContext AD partition. This partition should be replicated equally to all servers, but we allow you to choose the "default" one to query for configurationNamingContext.
Bugfix: Fixed a bug where users who were orphaned mailboxes, would appear on the "password never set" report.
Feature: In version 2.65 we introduced the optimization of re-using LDAP connections between searches during the same data collection. We added a checkbox to enable/disable this feature. Disabling it means a new connection will be used for each LDAP query. We added this checkbox because we are unsure of the LDAP timeout policies that exist out there (how long one can idle before the connection is shut down by the server).
Optimized: The exchange monitor probe and exchange monitor backend probe can now communicate smarter with
each other, which may reduce the amount of network bandwidth used between the two.
Rewrote the code logic that parses group membership and delegate information. Because of the current limitations of
the database model we chose, there are some limits/issues. See below.
The data collector will still look up delegates. Delegating to public folders or groups will not work. Delegating to Exchange users that reside (have mailboxes) on other Exchange servers in your organization should appear on the delegate report, but you will not find any actual mailbox data (such as size, deleted items, etc.). To get this additional information, you will need to set up an exchange monitor probe for those servers.
2.65
If a WMI impersonation error occurred, the test could still take up to "test_timeout" seconds before completing.
Optimized: Upgraded LDAP library to CLDAP from Novell for more security.
Optimized: Re-using LDAP connection between queries.
You can now change LDAP ports, specify the LDAP search timeout, and turn SSL encryption on or off for LDAP connections.
When querying base DN of your domain for max password age and domain name, search scope has been changed to
LDAP_SCOPE_BASE.
Fix: The field introduced in 2.62 to optionally specify the Conf.DN was not cleared if you switched between 2 virtual servers in a cluster on the same server.
Fix: Corrected loglevel on some logging information.
When reporting is turned on, the probe will try to query Active Directory for the attribute
"configurationNamingContext", in order to detect correct configuration tree DN.
GUI: Added a field to manually override the configuration tree DN, in case our query fails.
When the GUI checkbox named 'Filter stores and databases for other servers' was checked, the probe would supply an
incorrect filter when the exchange monitor probe was running outside a cluster. This could cause the
exchange_monitor_reports engine to filter data incorrectly.
The GUI would try to perform duplicate checks to see if the probe was active, in order to decide if the callback "isCluster" should be run. One of these checks would cause the GUI to crash if you were running Nimbus server 3.21. The duplicate check has been removed.
October 2007
2.60
Enhancement: We have implemented/enhanced the exchange monitor probe for better cluster support. You will also require cluster probe >= 1.60 if you wish to monitor an Exchange server running in a Microsoft Cluster. These two probes will now communicate with each other, allowing the exchange monitor probe to "know" when it is in a cluster and thus use the virtual Exchange server's IP and name when generating both alarms and QoS messages. You will then get continued QoS data series when a virtual Exchange server changes between nodes, and you will be able to view the same dashboard, no matter where the virtual Exchange server may be running.
April 2007
Probe should now be able to monitor exchange in any of the common cluster setups that we know of. For more
information, see cluster probe.
Fix: Report collector should now handle mailbox users that have delegates on other exchange servers within the same
domain (not only users on the same server).
Fix: We added default message templates that you can edit via the GUI for when message tracking is disabled, or when
we were unable to determine if message tracking is enabled or disabled. Message tracking logs are used to compute
the top 20 web reports.
Feature: Implemented code to support WebDAV requests over HTTPS. A checkbox is available in GUI to turn on/off
SSL, and it is also possible to change the default SSL port if your server requires another port for HTTPS.
Improved code logic when communicating with Active Directory over LDAP. We are now finding out where public folder
stores are located on IIS, and whether public folder instances (reported via WMI) are replicated locally or not. This
should give more accurate reports on public folders. Report collector now properly filters information and should no
longer report other exchange servers users and/or stores as if they belong to the server we are currently generating a
report for. Even when 2 virtual exchange servers are running on the same node in a cluster, this information is now
correct.
Fix: We have revised the QoS definitions for this probe. Some of the QoS data series generated from this probe use the same counters as, for example, the CDM probe, but the two probes reported different QoS definition names. We have revised the QoS definitions in this probe, and they should now be consistent with the QoS definitions coming from other probes.
2.53
Optimized: GUI: Each of the 3 test buttons (for testing HTTP(WebDav), LDAP, and WMI) is now executed in a separate thread on the probe, which means the probe will not hang and will still respond to the controller, and it will also continue to perform its other normal duties (if any), like checking services and processes.
December 2006
The GUI will make asynchronous calls to the probe, meaning it will not freeze while waiting for TEST output. You still have to wait for the test to complete, but if you do not want to wait, you can close the GUI. The GUI timeout value has been increased from 60 seconds to a default of 300 seconds (5 minutes). If you need a higher timeout value for any reason, it can be edited in the raw configure; the key "test_timeout" in the /setup section will be respected by the GUI when asking the probe to run a TEST.
Fix: The GUI can't turn on "gather report information" unless it has successfully run the 3 test buttons without failures.
You can however configure and use the test buttons even though "gather report information" is not turned on. This is
done to prevent the probe from starting report information before all the credentials are set up.
Fix: After running a full report generation, the exchange_monitor probe will not run the generation again until its next scheduled run, even if one or more of the reports have missing data. You can however force the report to run again by clicking "Force update of report information" in the GUI, restarting the probe, or using the probe utility and calling the command "reset_report". This has been done to prevent the probe report generator from eating system resources if for any reason it cannot get all the data it needs to make a full report.
Fix: More output has been placed in the exchange_monitor report engine program.
Fix: A bug that caused the database whitespace report calculations to be incorrect.
Fix: The checkbox to enable/disable monitoring of mailbox growth is now properly read back from the config file. There
was a bug that caused this checkbox to be turned on each time you loaded the GUI.
Fix: The input text fields for typing in LDAP, WMI, and HTTP(WebDav) information have been slightly enlarged.
Fix: The test button for testing HTTP (WebDav) incorrectly had a header saying it was a WMI test.
2.52
Added clear alarms for the internal alarms occurring when exchange_monitor is not able to contact one of its supporting
probes.
September 2006
Added cluster support. This requires the cluster probe version >= 1.50. (See the cluster probe for more details).
July 2006
2.01
June 2006
2.00
June 2006
Monitoring Support
February 2006
From version 5.1 and later, the probe supports monitoring of:
Exchange Server 2013 SP1 on Windows Server 2012 and Windows Server 2012 R2.
Exchange Server 2013 SP3 on Windows Server 2012.
The probe also supports the following versions of Microsoft Exchange:
MS-Exchange Server 2003
MS-Exchange Server 2007 (Mailbox, ClientAccess, Hub- and Edge Transport roles)
MS-Exchange Server 2010 (Mailbox, ClientAccess, Hub- and Edge Transport roles)
MS-Exchange Server 2013 (Mailbox, ClientAccess roles).
Notes:
The Threshold Migration takes 3 minutes 30 seconds for the default probe configuration.
The IM GUI of the probe is not available if the probe is migrated using threshold_migrator. Opening the IM GUI then displays a message that the probe can only be configured using the Admin Console and redirects you to the Release Notes of the probe.
Note: You must restart the nis_server and service_host probe after you deploy these probes.
The exchange_monitor probe uses the following probes to provide data to check Exchange Server health. The following probes and component requirements must be installed on the robot where the exchange_monitor probe is installed.
perfmon probe 1.18 or later
The perfmon probe acts as a data repository. The probe fetches and holds performance object values from one or more computers,
requested by other probes on your system.
ntservices probe 2.32 or later
The ntservices probe monitors the list of installed services, detecting which of them are running.
ntevl probe 2.15 or later
The ntevl probe monitors the event logs for new messages.
processes probe 2.53 or later
The processes probe monitors the operating system for certain processes, defined by a set of profiles in the probe configuration file.
Note: These probes are automatically installed on the robot from the archive; otherwise, the exchange_monitor probe displays an error asking you to download and install the dependent probes.
Upgrade Considerations
From version 5.20, the units of the following counters have been updated.
Failures Due to Maximum Local Loop Count
Message Bytes Received per second
Message Bytes Sent per second
Inbound: LocalDeliveryCallsPerSecond
Inbound: MessageDeliveryAttemptsPerSecond
Inbound: Recipients Delivered Per Second
Hub Selection: Resolver Failures
Installation Considerations
Ensure that you review the following installation considerations:
Install the probe on each Exchange Server to be monitored.
The exchange_monitor probe generates QoS table names dynamically and the names may have more than 64 characters. This character length can create issues when inserting data into the SLM database.
When the SLM database is created by a Nimsoft Server earlier than 3.35, the first column, name, in the S_QOS_DEFINITION table of the SLM database has a size of 64 bytes. The size must be 255 bytes. QoS definitions with a length greater than 64 characters will be discarded by the latest data_engine probe or may cause earlier versions of data_engine to repetitively fail. This issue can be resolved by changing the column size to 255 characters. You can change the size using the following query in a database tool.
You can also run a query on the database that is available from CA UIM Technical Support.
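A minimal sketch of such a query, assuming a Microsoft SQL Server SLM database (the exact statement, including nullability and your vendor's ALTER COLUMN syntax, should be verified against your own database before running):

```sql
-- Widen the name column so dynamically generated QoS definitions
-- longer than 64 characters are no longer discarded by data_engine.
ALTER TABLE S_QOS_DEFINITION ALTER COLUMN name VARCHAR(255);
```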
Report information is gathered as follows:
2003: Using WMI, LDAP, and HTTP requests.
2007, 2010 and 2013 mailbox: Using LDAP, and Exchange PowerShell cmdlets.
The following identification methods are required for Mailbox Growth, Access Protocols and exchange_backend probe.
Exchange cmd-let identification
HTTP identification
LDAP identification
Note: LDAP is also required to login to the Active Directory server to enable the Collect Reports section.
WMI identification
The data collector for the Exchange mailbox server role requires Microsoft .NET Framework v2.0 and Microsoft PowerShell v1.0.
Important! CA has ended the support for exchange_backend probe.
Preconfiguration Requirements
The probe accesses the Microsoft Exchange Server installation path to collect values of certain monitoring parameters. Configure the Microsoft
Exchange Server installation path for the probe to successfully collect values of the parameters being monitored from the server. If the server path
is not configured correctly, the probe will not be able to send any QoS data for DAG monitoring.
Follow these steps:
1. Open the Raw Configure GUI of the probe and navigate to the dag section.
2. Define the Microsoft Exchange Server installation directory path:
For 2010 - exchange_2010_path key value.
For 2013 - exchange_2013_path key value.
Important! The installation path must end with a forward slash (/).
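After these steps, the dag section of the raw configuration can be sketched as follows. Only the key names come from the text above; the installation paths shown are illustrative assumptions, and note the trailing forward slash:

```
<dag>
   exchange_2010_path = C:/Program Files/Microsoft/Exchange Server/V14/
   exchange_2013_path = C:/Program Files/Microsoft/Exchange Server/V15/
</dag>
```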
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.51
What's New:
GA
September 2015
GA
May 2013
GA
January 2012
1.43
1.42
January 2012
1.41
January 2011
1.30
December 2010
March 2010
1.25
Changed the message read algorithm to allow repeated reads. This is needed to keep track of unanswered messages when there are more new messages than the configured 'messages to read' parameter.
November 2009
1.23
August 2009
July 2009
Stores away the suppression keys of sent alarms to be able to clear them after a full restart of the computer.
Added support for additional message queues.
November 2007
January 2006
Added the 'messages_to_read' internal parameter to be able to handle situations where the number of messages in QSYSOPR is too large to be viewed normally.
Saves the key of the messages in the .last file to enable the probe to only read the newest messages.
Support added for the '@' character in comparison strings. Note that this requires robot version 2.41 also.
1.07
Enabled profile ordering and message answering fields when probe version is unknown.
Added option to limit messages handled by severity.
Added hidden options to set maximum length for message text (max_message_length) and maximum length of help text
(max_help_length).
April 2005
Considerations
Known Issues
Revision History
The file_adapter probe imports QoS data from the files that are defined in the profiles (for example, files that are generated by third-party
applications). When a matching file is found, it is parsed and one QoS message is produced per row.
The QoS message is formatted according to the profile defined for that file. Once a file has been processed, it is moved to a specified directory
and the probe is ready for a new data file.
Considerations
This section contains the considerations for the file_adapter probe.
Known Issues
The file browse function in the configuration tool does not work for UNC paths when user authentication is required.
Input values are accepted when starting with a digit even if other characters are present in the field. The value is determined from the digits up to the first non-digit character.
Revision History
This section describes the history of the revisions for this probe.
Date
Description
State
Version
Dec 2010
GA
1.40
Jun 2010
1.30
Oct 2008
1.21
Sep 2008
1.20
Jul 2007
1.10
Oct 2005
1.02
Revision History
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.24
Fixed Defect:
GA
March 2015
The probe was generating alarms for ZFS file systems in Solaris for the incorrect storage location etc/vfstab. Salesforce case 00142827
1.23
GA
November 2014
1.22
GA
January 2013
1.21
GA
August 2012
1.20
GA
March 2011
1.11
GA
June 2010
1.10
GA
March 2010
1.02
Check if the (v)fstab file has been modified and rescan if it has been.
GA
March 2010
1.01
GA
June 2008
Google Apps is a set of applications that Google provides and maintains, such as Google Drive and Google Mail. You can own a Google Apps
domain where you can create many user accounts. A Google Apps domain comprises a set of end-user accounts, features, applications
available to the end users, per-user pricing, and resource quotas. Google Apps currently offers three types of domains or editions: free,
educational, and premier.
For more information, refer to http://www.google.com/apps/intl/en/business/index.html
The google_apps probe gathers status information about the general Google applications from http://www.google.com/appsstatus and raises
alarms when any application service is unavailable. The status information is represented through the following codes:
0: App Normal
1: App Information Available
2: App Service Disruption
3: App Service Outage
4: App Status Unknown
The google_apps probe also monitors the Google Apps domain reports. The probe generates QoS data that describes metrics such as the daily performance of the domain-specific services and the operations that are performed on each user account.
Important! The google_apps probe is now available through Admin Console GUI only and not through the Infrastructure Manager GUI.
Upgrade from previous version to version 2.00 is not supported.
The google_apps probe collects data on the Google-published application status, available from http://www.google.com/appsstatus.
The probe is also capable of measuring and alarming on aspects of a specific domain. Google provides a set of domain reports from which the
probe gathers metrics. The probe is also capable of performing various end-user operations, like creating a document, and measuring the latency
of the operations.
Contents
Revision History
Probe Specific Software Requirements
Upgrade Considerations
Revision History
This section describes the history of the revisions for the google_apps probe.
Version
Description
State
Date
2.00
What's New:
GA
January 2015
1.01
GA
July 2010
1.00
Initial release.
Beta
March 2010
Upgrade Considerations
This section contains the considerations for the google_apps probe.
The google_apps probe version 2.00 and later is available through Admin Console GUI only and not through the Infrastructure Manager
(IM) GUI.
Upgrade from previous versions to version 2.00 is not supported.
You must deploy Probe Provisioning Manager (PPM) probe version 2.38 or later to configure the google_apps probe version 2.0x.
To view the new metrics that are introduced in the google_apps probe version 2.0 on the USM portal, you can perform any one of the
following actions:
Upgrade CA Unified Infrastructure Management 7.6 (or earlier) to CA Unified Infrastructure Management 8.0
Install the ci_defn_pack version 1.00 probe. You must restart the nis_server when you deploy the ci_defn_pack.
Important! You can install the ci_defn_pack probe from https://support.nimsoft.com
The Hadoop Monitoring probe handles all the common monitoring and data collection tasks (collecting QoS measurements and topology
information) on a Hadoop cluster through a namenode, and gathers reports about all nodes in the cluster. The probe collects and stores data and
information from the monitored system at customizable intervals. The probe generates alarms when the specified thresholds are breached. You
can view the metrics, alarms, and reports in CA Unified Infrastructure Management.
Contents
Revision History
Version
Description
State
Date
1.0
Initial version.
GA
December 2014
Installation Considerations
1. Install the package into your local archive.
2. To ensure a successful installation of the probe package (drag and drop), a java.exe (version not critical) must exist in the
PATH.
3. Drop the package from your local archive onto the targeted robot.
4. Double-click the probe for initial configuration. At first-time probe configuration, initiated by double-clicking the probe in Nimsoft
Infrastructure Manager, the installation wizard launches automatically. The wizard prompts you for the path to the correct
version of the IBM JVM and other environment files required by the probe (see the probe documentation).
Upgrade Considerations
None.
Revision History
Hardware Requirements
Software Requirements
Considerations
Installation Considerations
Upgrade Considerations
General Use Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
1.45
Description
Added support for a tunnel between the primary and secondary hub (the ha probe resides on the secondary hub).
State
Date
GA
June 2014
GA
March
2014
Added support for a wait interval before the probe begins failback after re-establishing communication with the primary hub.
Added Admin Console GUI.
1.41
Fixed startup sequence so it checks if a state change is required in the initial run.
GA
March
2011
1.40
GA
December
2010
1.30
GA
September
2010
1.25
Probe now caches IP and port of remote address to avoid repeated lookups on the "static" data.
GA
September
2009
GA
June 2008
1.23
1.20
Changed how queues are activated/deactivated to avoid potential problems with Hub restarting in the middle of the operation.
GA
April 2008
Added option to take a probe down when failing over with the new section "probes_down".
Fixed minor memory leak when restarting the probe.
Added configuration tool.
Changed name of section from "queues" to "queue_up".
Added section "queue_down" for queues which need to be deactivated when failover occurs. This is useful where the
secondary hub has a post queue (for example, for QoS data) to the primary hub. To avoid duplicate entries, this queue has to be
deactivated. It is reactivated after the primary hub comes back online.
Port to Linux, Solaris 8 (sparc) and AIX 5. No functional changes.
Changed control mechanism to active heartbeat checks. Queue is no longer required.
Initial Release.
Hardware Requirements
This probe has no additional hardware requirements.
Software Requirements
When installing on a 64-bit Linux platform, these 32-bit libraries are required:
Debian/Ubuntu -- ia32-libs
Redhat/CentOS -- glibc-2.12
Considerations
Installation Considerations
The probe must be installed on the standby Hub.
The probe is not activated after distribution. It must be configured, then activated manually.
If your NAS does not have the subsystem ID 1.2.3.8 defined, add it to the subsystems list in the nas or change the messages configurations to
use the string "HA" in place of the subsystem ID.
Upgrade Considerations
When updating to version 1.20 the old "queues" section is renamed to "queues_up".
To take advantage of the spooler address change, the configuration must be saved from the configuration tool after probe update.
reset_nas_ao - This allows you to specify whether to activate or deactivate the nas AutoOperator on the failover system. Specify 'yes' or 'no'.
The default is 'yes'.
In the probes_up section, you can specify a list of probes that are to be activated on the local Hub when a failover occurs. When the remote_hub
comes back online these probes are deactivated again. The keys are of the form probe_0, probe_1 and so on while the values are the names of
probes to be started/stopped.
In the queues_up section, you should specify the queues which are to be started during a failover. The same queue definitions must be set on
both the primary and secondary Hubs. The keys are of the form queue_0, queue_1 and so on while the values are the names of queues to be
started/stopped.
In the queues_down section, you should specify the queues which are to be stopped during a failover. The keys are of the form queue_0,
queue_1 and so on while the values are the names of queues to be started/stopped.
In the Messages section, you can change the alarm messages and their severities that are sent when a problem occurs. The severities are
numeric values from 0 (clear) through 5 (critical).
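As a sketch, the probes_up, queues_up, and queues_down sections described above could appear in the ha probe configuration file as follows. The section and key names follow the descriptions above; the probe and queue names shown are hypothetical examples, and the reset_nas_ao key is the documented default:

```
reset_nas_ao = yes
<probes_up>
   probe_0 = nas
   probe_1 = emailgtw
</probes_up>
<queues_up>
   queue_0 = attach_alarms
   queue_1 = attach_qos
</queues_up>
<queues_down>
   queue_0 = post_qos_to_primary
</queues_down>
```

Remember that the queue definitions named in queues_up must exist on both the primary and secondary hubs.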
Take the following steps to ensure that the failover process occurs correctly.
Revision History
Version
Date
State
7.80
June 2015
GA
7.70
March 2015
GA
7.63
December 2014
GA
7.62
November 2014
GA
7.60
June 2014
GA
7.05
March 2014
GA
7.10
December 2013
GA
Revision History
Requirements
Hardware Requirements
Software Requirements
Health Score Requirement
Installation Considerations
Known Issues
Unified Service Manager Graph Could Show Time Intervals Missing Health Scores (Fixed in health_index v1.11)
Revision History
This section describes the history of the revisions for the health_index probe.
Version
Description
State
Date
1.11
Fixed defects:
Policy data is properly transmitted to the health_index probe over a secure http connection when UMP is
configured to use https.
GA
August
2015
Controlled
Release
June 2015
GA
March
2015
Minor fixes to the log file and the configuration request timeout.
1.10
1.0
Initial release.
Requirements
Hardware Requirements
The health_index probe should be installed on systems with the following minimum resources:
Memory: 512 MB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Software Requirements
health_index v1.11
Robot version 5.7
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM v8.0 and later)
Admin Console v8.31
Alarm Server (nas) v4.73 and alarm_enrichment v4.73
Policy Engine (policy_engine) v8.2
Probe Provisioning Manager (ppm) v3.22
Verify CA Unified Management Portal v8.31 is installed and running on a hub or robot
health_index v1.0
Robot version 5.23 or later
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM v8.0 and later)
Admin Console v8.2
Alarm Server (nas) v4.6 and alarm_enrichment v4.6
Policy Engine (policy_engine) v8.2
Probe Provisioning Manager (ppm) v3.11
Verify CA Unified Management Portal v8.2 is installed and running on a hub or robot
The health index feature is enabled for all policies by default. For each policy, create a policy filter (on the Filters tab) that includes, as targets
of the policy, all the devices for which you want to generate a health score. See the Enable Health Index article for more information.
Installation Considerations
The health_index probe is installed on the primary hub by CA UIM Server installer v8.2 or later.
Known Issues
Unified Service Manager Graph Could Show Time Intervals Missing Health Scores (Fixed in health_index v1.11)
From time to time, in environments with a large number of robots (over 40), health_index may reach an internal timeout condition
when retrieving configuration information from the UIM database. If this happens, health scores are not calculated for that calculation interval. If
gaps are noticed in the health_index data, one potential workaround is to increase the number of times health_index executes a calculation cycle.
To increase the frequency of calculation cycles, you could set the Policy Refresh Interval to 15 minutes and the Calculation Interval to 30 minutes.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for the history probe.
Version
Description
State
Date
1.06
What's New:
GA
September
2015
1.05
Fixed a defect in which the profile did not save the Description and Use Message option on saving the probe
configuration.
GA
May 2014
1.04
GA
January 2012
1.02
Fixed the issue in configuration tool where profile parameters were not saved correctly.
GA
December 2010
GA
December 2010
Beta
November 2010
1.01
1.00
Known Issues
The known issues for the probe are:
Messages that remain in the queue for a long duration are not detected by the probe until after the next flush; hence, the probe triggers
alarms long after the event was originally sent to QHST.
Revision History
Preconfiguration Requirements
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Enable Performance Metrics
Upgrades Considerations
Known Issues and Workarounds
Large Config File May Cause Startup Problems
Revision History
This section describes the history of the revisions for the hitachi probe.
Version
Description
State
Date
2.05
Fixed Defects:
GA
October
2015
GA
May 2015
The probe generated incorrect values for the IO counters. Salesforce Case: 00141985
The Arrays did not expand if hub/domain or robot name contains "(" OR ")". Salesforce Case: 00155090
Updated the Add a Monitor section in the IM Configuration article to state that you can configure the templates using
drag-and-drop from the right pane only. Salesforce Case: 0070000610
2.04
What's New:
Added support for Hitachi Virtual Storage Platform (VSP) systems.
2.03
Added QoS for Used Managed Space and Percent Used Capacity.
October
2014
2.01
What's New:
Added functionality to configure the probe using the Admin Console and SNAP UI.
Limitations:
Removed the option of managing the QoS list by the user.
The user cannot drag-and-drop an entire template; the user has to drag-and-drop each individual monitor of the template.
December
2013
1.11
September
2013
1.10
Added support for AMS and HUS. Fixed LUN statistics correlation.
August
2013
1.03
December
2012
1.00
November
2012
Preconfiguration Requirements
The hitachi probe requires:
Hitachi Storage System USP-V, USP-VM, AMS, VSP, or HUS series.
SMI-S Provider
The SMI-S Provider service is supplied with the Device Manager software from Hitachi with the Hitachi Command Suite 7.
Hitachi Storage Navigator user account privileges for VSP
The user account must belong to one of the following built-in user groups:
Storage Administrator (View & Modify) User Group - Users have full read and write permissions to access the SMI-S function from the
management software.
Storage Administrator (View Only) User Group - Users have read- only permissions to access the SMI-S function from the
management software.
If the probe is deployed in a non-English locale, change the Startup key value to -Xms512m -Xmx1024m -Duser.language=en
-Duser.country=US.
Update the Startup key value through Raw Configure. The updated key value removes the issue of adding an extra digit while reporting
disk size and percentage.
You must enable statistics reporting on the Hitachi array to receive data for performance metrics from the Hitachi system. Statistics reporting
affects performance metrics for array, LUN, and port. If statistics reporting is not enabled for the Hitachi array, the probe does not show the
performance metrics in the GUI. Monitoring storage arrays with a large number of logical volumes and disks from a single probe can affect the
performance of the probe. In such cases, increase virtual memory appropriately.
Some of the performance metrics are as follows:
Statistic Time
Total IOs
Kbytes Transferred
Read IOs
Read Hit IOs
Write IOs
Write Hit IOs
Upgrades Considerations
Consider the following points when upgrading the probe to the 2.0 series:
The probe migrates only Resources and Templates.
The probe does not migrate Auto-monitors and static monitors. Therefore, the probe generates alarms and QoS only for resource
availability before configuring the checkpoints.
The Backup file contains the following information:
The old and new QoS definitions.
The template and template key.
Merged values of Setup and Startup tags.
Other keys which are already present in the old CFG file.
The probe includes the standard static alarm threshold parameters using CA Unified Infrastructure Management 8.2 or later.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Preconfiguration Requirements
Enable and Disable the CIM Server for HP 3PAR
NAS Subsystem ID Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.11
What's New:
GA
September
2015
June 2015
Preconfiguration Requirements
The probe has the following preconfiguration requirements:
The CIM server is required as a data collector for interfacing with the HP 3PAR storage system using the SMI-S provider. You must start
the CIM server on your system to enable communication between the probe and the HP 3PAR storage system using the SMI-S provider.
In UIM 8.2, the NAS probe must be updated with the correct subsystem IDs.
For more information, see the applicable section.
To enable the CIM Server via the CLI, use the startcim command:
# startcim
To disable the CIM Server via the CLI, use the stopcim command:
# stopcim
To display the overall CIM Server status, use the showcim command:
# showcim
The following subsystem ID values must be defined:
2.10.2.1. - Resource
2.10.2.2. - Physical Disk
2.10.2.3. - Logical Disk
2.10.2.4. - CPG
2.10.2.5. - Virtual Volume
2.10.2.6. - Port
2.10.2.7. - Controller Node
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the icon next to the NAS probe and select Raw Configure.
2. Click on the Subsystems folder.
3. Click on the New Key Menu item.
4. Enter the Key Name in the Add key window, click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
The Subsystem IDs are updated to the NAS probe.
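After the keys are applied, the subsystems section of the nas configuration could look like the following sketch. It assumes that the key name is the subsystem ID and the value is the component name, per the table above; verify the orientation against your nas raw configuration before applying:

```
<subsystems>
   2.10.2.1. = Resource
   2.10.2.2. = Physical Disk
   2.10.2.3. = Logical Disk
   2.10.2.4. = CPG
   2.10.2.5. = Virtual Volume
   2.10.2.6. = Port
   2.10.2.7. = Controller Node
</subsystems>
```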
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right-click the NAS probe and select Raw Configure.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Preconfiguration Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.01
GA
September 2013
1.00
Initial Release.
Beta
September 2013
Preconfiguration Requirements
The preconfiguration requirements for the HP Service Manager Gateway probe are:
The user credentials to access the HPSM application for working with incidents.
A CA Unified Infrastructure Management user to whom alarms are assigned, which triggers a new incident in the HPSM application.
Note: We recommend that you create a separate user in Infrastructure Manager for assigning the alarms. Refer to the Infrastructure
Manager guide for creating a user.
To install or upgrade a hub, see Installing on the CA Unified Infrastructure Management wiki space.
Contents
Revision History
Best Practices
Known Issues
Impact of Hub SSL Mode When Upgrading Nontunneled Hubs
SSL Communication Mode
Changes to SSL Communication in Controller v7.70
Communication Issues Between v7.70 and v7.63 (and Earlier)
Revision History
Version
Description
State
Date
7.80
GA
What's New:
Added support for OpenSSL
When using TLS 1.1 or 1.2 cipher suites, include an alternative fallback to
hub.cfg file
_tag_2.
If this option is set to 1, the behavior of hub v7.70 applies.
Note: User tag changes do not affect robot alarms that come directly from the robot. Hub-generated robot alarms, which occur when a robot or probes start or stop, are affected.
Removed ability to set SIDs with pu command. In past releases, the -S option of the pu command could be used
to set an explicit session identification (SID). This capability has been removed to prevent a security bypass through
SID injection.
Output character limit extended in pu executable. In the pu executable before v7.80, field output from callbacks
was limited to around 35 characters. A long output string might become unusable. To resolve the issue, the output
limit is extended to 300 characters.
June 2015
7.71
Hub v7.71 fixes an issue in hub v7.70 with assigning ports for tunnel client connections. Before v7.70, the tunnel client
connections would consistently use the 48xxx port range (based on the controller default first_probe_port setting of
48000). An issue in hub v7.70 caused the tunnel client connections to use a system assigned port number. System
assigned port numbers do not reliably fall in the 48xxx range. This caused issues with firewalls where the tunnel ports
were explicitly allowed and expected to be in the specific range.
GA
March 2015
With hub v7.71, the default port range for tunnel client connections again falls in the 48xxx range. As in previous
versions, the specific ports for tunnel connections can be overridden by enabling Ignore Controller First Probe Port (wh
ich enables the hub to use its own setting) and by specifying the desired port setting tunnel/ignore_first_probe_port = 1
and tunnel/first_tunnel_port = portnumber in the hub configuration file, hub.cfg:
In the Admin Console hub configuration UI, navigate to Advanced, Tunnel Settings. Under Tunnel Advanced
Settings, enable Ignore Controller First Probe Port. Specify the desired First Tunnel Port.
In the Infrastructure Manager hub configuration UI, navigate to Tunnels. Enable Ignore first probe port setting
from controller, and specify the desired First Tunnel Port.
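As a sketch, the two settings named above would appear in the tunnel section of hub.cfg as follows; 48100 is an example port number, not a required value:

```
<tunnel>
   ignore_first_probe_port = 1
   first_tunnel_port = 48100
</tunnel>
```

Setting these keys lets the hub use its own port range for tunnel client connections instead of the controller's first_probe_port setting.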
7.70
Important: We recommend that you connect hubs with tunnels (see Best Practices for Hub-to-Hub Communication).
Hubs require tunnel connections. Without tunnels, hubs set to mode 0 (no encryption) cannot communicate with hubs set
to mode 2 (SSL encryption). See Impact of Hub SSL Mode When Upgrading Nontunneled Hubs.
New Features:
December
2014
Note: CA UIM administrators can override the default value by defining the origin in the robot
configuration file robot.cfg. In multi-tenant environments, an admin can specify the origin for
Note: The os_user_include option, which enabled the hub to read user tags from robot.cfg, has been removed. Now, the hub does not read user tags from robot.cfg. If the hub
robot has defined user tags, they remain in robot.cfg after the upgrade, but the tags are ignored.
To add user tags to hub probe alarms and messages, specify the user tags in hub.cfg.
User tags are propagated by the hub and controller. User tags are now propagated in alarms and messages. On a
hub system, the hub spooler adds these values to probe-generated alarms and messages. On a robot system, the
robot spooler adds the tags.
Hub v7.70 can be configured to send an alarm for dropped messages. Probe messages use the subject for
routing in the message bus. If a subject is not configured in a hub attach queue or a post queue, the spooler drops
the message, and hub v7.70 can send an alarm when this happens. This behavior is disabled by default. To enable it, specify the following
parameter in hub.cfg:
subjects_no_destination_alarm_interval=seconds
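As a sketch, the parameter above might be set with a concrete interval as follows; the 300-second value and the hub section placement are illustrative assumptions:

```
<hub>
   subjects_no_destination_alarm_interval = 300
</hub>
```

With this setting, the hub would raise a dropped-message alarm at most once every 300 seconds.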
In mode=0, the controller does not create a robot.pem file at startup, and the robot does not accept any
SSL connections.
Proxy_mode routes all callbacks to the probes through the controller port. Robots that are configured for proxy_mode return to the designated parent hub after failover.
7.63
August 2014
tunnel_get_info callback.
June 2014
Improved socket management between two hubs that are connected by a tunnel
Long-running callbacks over a tunnel connection cause fewer communication errors
March
2014
Tunnel and queue connect and disconnect alarms are retroactively reset to the Information level matching their
actual meaning and impact.
The hub detects when the total resources in use threaten tunnel or hub viability. The hub reacts by either resetting
the tunnel or restarting the process, without data loss.
The origin for probes local to a hub can be set independently from the origin of the hub.
Enhanced LDAP and user level security and improved support of LDAP environments with large numbers of groups
Improved tunnel stability
Fixed defects:
A core dump on Solaris when hubup response contents are malformed
A significant number of sockets are temporarily left in the
with many child robots
Duplicate tunnel connections between the same client and server permanently disconnect the tunnels due to the
exhaustion of threads
7.11
January
2014
Best Practices
While most hubs perform well with little interaction from the administrator, you can modify various configuration settings for better
performance.
Hub-to-hub communication
Use tunnels to keep the communication connectivity intact between hubs.
Tunnels
Caching the SSL sessions can significantly speed up the server to client connection time.
To get tunnel alarms, increase the alarm level that is sent if a connection is lost or cannot be made.
Queues
Increasing the Bulk Size of the queue allows the hub to transfer multiple messages in one packet. Increase the Bulk Size when:
The size of a get or a post queue never shrinks to zero
Too many messages are queued
Known Issues
The ppm probe provides functionality for the Admin Console probe configuration UIs. The ppm probe does not run on AIX hubs. To
configure robots and probes on AIX hubs, use the Raw Configure utility in Admin Console, or use Infrastructure Manager.
If the communication with a robot fails in Linux, review your network configuration:
A valid entry for the local system must exist in the /etc/hosts file for a robot, hub, server, or UMP system.
The entry for the local system must be a fully qualified host name and IP address.
If only the loopback address is defined, for example,
address.
SSL communication is enabled through the UIM_HOME/robot/robot.pem certificate file. The controller creates the robot.pem file during startup. The robot.pem file contains the key to decode encrypted CA UIM messages.
Changes to SSL Communication in Controller v7.70
SSL communication modes are more meaningful in controller v7.70 because of changes to the treatment of the robot.pem certificates.
Note: The following information about controller v7.63 also applies to previous versions.
Controller v7.63 always creates robot.pem and always acknowledges receipt of encrypted communication, regardless of the
parent hub mode.
The first SSL encrypted request from a v7.63 controller in mode 1 to a v7.63 hub in mode 0 succeeds. The hub uses the r
When hubs that are upgraded to v7.70 communicate with earlier versions, and hubs are set to the same mode:
Hubs set to mode 0 communicate unencrypted
Hubs set to mode 1 use SSL encryption
Hubs set to mode 2 also use SSL encryption
The following diagram illustrates communication when all hubs are at or below v7.63, and when all hubs are v7.70. In the diagram:
Blue lines represent SSL communication
Red lines represent unencrypted communication
Solid lines indicate successful communication
Dashed lines are unacknowledged
Arrow direction indicates the initiator and receiver relationship.
A v7.63 hub in mode 0 cannot initiate communication with a mode 2 hub. Two-way communication is enabled once the relationship
is established.
Corresponding hypervisor system (Windows 2008/2012/2012 R2 Server + Hyper-V / Windows 2008 Server Core + Hyper-V)
Virtual Machines configured on the Host OS.
Note: Version 3.0 or later of the probe is available only through the web-based GUI. The Infrastructure Manager (IM) GUI is only
available for version 2.2 or earlier.
The probe allows you to define alarms and their corresponding threshold values, and compares the actual data at customizable intervals using the
generated QoS messages. The probe then generates alarms when the corresponding threshold values are breached.
The probe monitors the following entities on the host:
CPU
Memory
Disk
Network
Resource Pool
The probe also monitors the following entities of each Virtual Machine (VM) on the host:
CPU
Memory
Disk
Network
The 3.10 and later versions of the probe allow you to create configuration templates. The templates are applicable to only the specific instance of
the probe on the robot. Only existing profiles can be configured using templates.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Migration Considerations
Known Issues and Limitations
Revision History
This section describes the history of the revisions for the hyperv probe.
Version
Description
State
Date
3.10
What's New:
Beta
May 2015
Added the ability to create monitoring configuration templates. The templates allow you to apply consistent monitoring
configurations across multiple profiles using filters.
Added factory template for monitors available on the Hyper-V Unified Dashboard.
Added support for dynamic variables using CA Unified Infrastructure Management 8.31 or later.
Note: The following features are not yet supported in version 3.10:
Localization
Monitoring NT Events and Services
Custom QoS creation and migration
Groups
3.00
Beta
March
2015
GA
September
2014
The probe is now available only through the web-based GUI and not through the Infrastructure Manager (IM) GUI.
The following features are not supported in version 3.00:
Localization
Monitoring NT Events and Services
Custom QoS creation and migration
Groups
Templates from the Infrastructure Manager GUI.
The static standard threshold block is available for the probe.
Note: Probe is supported from CA Unified Infrastructure Management 8.0 and later only.
2.20
Fixed an issue where GUI response time was very high with a large number of monitors.
Fixed a defect where the Resource Pool name appeared blank. The value of Resource Pool is now similar to the value
of the caption field.
Fixed a defect where the probe generated a null value for the instances in network counters of the Legacy Network adapter
in Windows 2012 R2. The value is now similar to the value of instances in network counters of the Virtual Network adapter.
Added support for alarm generation of Events.
2.11
2.10
June 2014
June 2014
Added support for the following locale: Simplified Chinese, Japanese, Korean, Spanish, French, German, and
Portuguese.
2.00
September
2013
1.12
April 2010
1.11
September
2009
1.10
June 2009
Migration Considerations
The migration considerations for the probe from version 2.20 to a later version are listed below:
Downgrade to version 2.20 or earlier from any later version of the probe is not supported.
You must delete all the files in the util directory in the Windows local temp directory.
You must perform this process for each instance of UIM accessing the robot.
The following features are not supported in version 3.0 or later:
Localization
Monitoring NT Events and Services
Custom QoS creation and migration
Groups
Templates created in version 2.2 or earlier are not supported in version 3.0 or later. However, you can create new templates from the Template Creator interface in hyperv version 3.10 and later using the Admin Console.
The following static text alarms have been discontinued:
Host Name
Resource Pool Name
Resource Pool Status
Physical Network Adapter Name
Virtual Machine Name
The CPU Load Percentage QoS has been discontinued.
Alarms from previous versions of the probe must be manually cleared.
ibm_ds_next (IBM Disk Storage System 8000 Series Monitoring) Release Notes
The IBM Disk Storage System 8000 Series Monitoring (ibm_ds_next) probe monitors IBM DS8xxx storage systems and stores the monitoring
information at specified intervals. You can specify thresholds and define alarms that are generated when the specified thresholds are breached.
The probe uses an SMI-S interface to communicate with the IBM DS storage system. Install the SMI-S provider on the storage system to enable
communication with the probe. The SMI-S is a standard for managing heterogeneous storage systems in a SAN (Storage Area Network).
The probe can monitor the following components:
Arrays
Disks
Pools
Ports
Ranks
Volumes
Contents
Revision History
Prerequisites
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.0
What's New:
Beta
December
2015
Prerequisites
The prerequisites for the ibm_ds_next probe are as follows:
A user account with Monitor role in the IBM DS8000 Storage System
The probe uses the SMI-S provider and Command Line Interpreter (CLI) to retrieve status, configuration, and statistics data of the IBM SVC storage
server. The probe supports user authentication through SSH only, so you require a password or a valid SSH key file to access the CLI.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Enable Performance Metrics
Known Issues and Workarounds
Help Button in IM Redirects to Legacy Document
Revision History
This section describes the history of the probe updates.
Version
Description
State
Date
1.05
Fixed Defects:
CR
October
2015
The probe generated incorrect alarms for Mdisks. Salesforce cases 00151468, 00156360
The probe connected to the wrong port and was unable to retrieve the resource data. Salesforce case 00164563.
The probe was unable to limit the log file to the specified size. Salesforce case 00169660
Updated the overview section in the Release Notes and the ibm_svc (IBM SVC Monitoring) article to state that the
probe supports user authentication only through SSH. Salesforce case 70002124
1.04
Fixed an issue where the IO, Latency and Throughput QoS metrics for MDisks and VDisk were displayed as 0. Salesforce
case 00134834
GA
July 2014
1.03
Added option to select IBM SVC or IBM V7000 storage system for monitoring in the Probe Configuration section.
GA
June
2014
1.02
Beta
May 2014
1.01
Initial Release
December
2013
Important! Performance may be affected when monitoring storage arrays with a large number of logical volumes and disks from a
single probe. In such cases, try increasing virtual memory appropriately.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Upgrade Considerations
General Use Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.07
What's New:
GA
December 2015
GA
September 2014
GA
July 2014
GA
January 2014
Added QoS for Used Managed Space and Percent Used Capacity for array.
Fixed a defect in which the probe was displaying the following errors when tested with correct information of the
IBM_DS resource:
The hostname or IP address is wrong or illegal
The port number is wrong
The handler class is wrong
Salesforce case 00130456
2.03
Fixed a defect in which the probe was running only in Demo Mode.
2.01
December 2013
Added support for IBM DS5xxx. Renamed the probe to ibm-ds. Many usability improvements and defect fixes.
May 2013
1.01
December 2011
1.00
November 21
2011
1.00
Initial release.
November 10
2011
Upgrade Considerations
Consider the following points when upgrading the probe to version 2.0 and later:
1. The probe migrates only Resources and Templates.
2. The probe does not migrate Auto-monitors and Static monitors. Therefore, the probe generates alarms and QoS only for resource
availability before configuring the checkpoints.
3. The Backup File contains the following information:
The old and new QoS definitions.
Templates and their keys.
Merged values of Setup and Startup tags.
All ibm_ds.cfg keys.
In 2.05 and earlier versions of the probe, the Use SSL field is available, but not functional.
If the ibm-ds.cfg file exceeds 2 MB, startup problems may occur. Each array contains its own monitor configuration section, so if
each array has multiple monitors, the configuration file can grow quickly. CA recommends maintaining only active monitors on an
array to stop the configuration file from growing unnecessarily, as inactive monitors use up configuration space. The default auto
configuration template contains the most popular active monitors and provides a good starting configuration.
The probe does not support IPv6 internet addresses. You can use IPv4 internet addresses wherever required. Some operating systems
(such as Windows 2008 R2) might be configured to resolve hostnames to IPv6. To avoid this issue, provide the IPv4 internet address
instead of a hostname.
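Since the probe requires IPv4, one workaround is to resolve the hostname to an IPv4 literal yourself and give the probe that address. A minimal sketch (the helper name is illustrative, not part of the probe):

```python
import socket

def ipv4_address(hostname: str) -> str:
    """Resolve a hostname to its first IPv4 address, forcing the AF_INET
    family so that hosts configured to prefer IPv6 still yield an IPv4
    literal suitable for the probe."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    return infos[0][4][0]  # sockaddr tuple -> address string

print(ipv4_address("localhost"))
```

Supplying the returned literal in the probe profile avoids the OS-level IPv6 resolution described above.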
Revision History
Probe Specific Environment Requirements
Probe Specific Software Requirements
Probe Specific Hardware Requirements
Upgrading Considerations
Upgrading to v2.30
Known Issues and Workarounds
Unknown Status for VMs Displays in USM
Error with IPv6 Address
Delay in QoS Data Publication
Probe is Unresponsive
Network Interfaces Do Not Appear to be Detected by the Probe
Revision History
This table describes the history of probe updates.
Version
Description
State
Date
2.33
What's New:
GA
August
2015
Limited
Availability
July 2014
Improved structural and topological architecture of the probe to improve performance and to more accurately reflect
PowerVM environments.
Added the ability to apply monitoring with templates in Admin Console.
Added documentation about how to format IPv6 addresses when used for a Uniform Resource Identifier (URI); use
the Java convention of enclosing an IPv6 address in square brackets.
For more information, see the v2.3 ibmvm AC Configuration and v2.3 IM Configuration guides.
Changed the following metrics to now be grouped with the HOST_DISK resource rather than with a separate
resource HOST_DISK_UTILIZATION:
Disk Bandwidth Used
Disk Data Transfer Rate
If you have custom reports and dashboards using these metrics, you will need to update their resource from
HOST_DISK_UTILIZATION to HOST_DISK.
For more information, see the ibmvm Metrics.
Fixed Defects:
Fixed a defect in which self-monitoring alarms for data collection problems were sent with the wrong subsystem ID,
and auto-monitors for LPARs were created against inactive LPARs, resulting in alarms being sent due to
unavailability of data. Salesforce case 00161243
Fixed an issue in which delta values were incorrectly calculated.
Note: Monitors of the enumerator type only support current values and cannot calculate delta values.
2.1
Added additional QoS metrics. For more information, see ibmvm Metrics
2.00
GA
March
2014
1.26
GA
August
2012
1.23
Beta
June 2011
1.21
April 20
2010
1.20
April 1
2010
1.12
Fixed CPU Pool metrics to reflect the average value during the last interval, instead of the average value since system
start.
September
2009
1.10
Commercial Release
Added a Physical Processors Consumed checkpoint to indicate how many physical processors an LPAR has used
Fixed the percent entitlement used checkpoint to use the difference between last two samples
Fixed the percent entitlement used checkpoint to gather cycle times for all LPARs
Applied checkpoint filtering based on the host type and version. This prevents the GUI from showing checkpoints that are
not supported for a specific host and version.
Currently, the filtering applies to the following checkpoints for HMC version 7:
1. Current LPARs Supported
2. Managed System to Firmware LPAR ratio
3. Maximum LPARs Supported post next restart
4. Managed System to Firmware LPAR ratio post next restart
5. Uptime.
June 2009
Note: Version 2.30 of the ibmvm probe supports up to Power7+ with Hardware Management Console (HMC) version 7.9.0.
Advanced accounting must be enabled on Shared Ethernet Adapters (SEA) before the IBM Virtualization Monitoring probe can report any
statistics. To enable advanced accounting on the SEA, enter the following command:
The HMC server must be enabled for SSH access, and the SSH user must have the HMC Operator level of access. This allows you to use the
viosvrcmd command on the HMC server to gather data from the Virtual I/O Server (VIOS) for many of the checkpoints.
The Processor Entitlement Consumed and Physical Processors Consumed checkpoints require the lslparutil command. Run the chlparutil
-r config -s 300 command on the HMC server.
Note: Older HMC versions might only support -s 3600 to collect utilization metrics every hour. The data will remain the same
for the duration of the hour, until a new sample is recorded by the HMC server.
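As a sketch, the chlparutil call described above can be issued over SSH from a script. The host and user names here are placeholders, and the helper only builds the command line so you can review it before running it with subprocess.run:

```python
def chlparutil_command(hmc_host: str, ssh_user: str, seconds: int = 300) -> list:
    """Build the ssh invocation that sets the HMC utilization-data sample
    rate; pass seconds=3600 for older HMC versions that only support
    hourly collection."""
    return ["ssh", f"{ssh_user}@{hmc_host}", f"chlparutil -r config -s {seconds}"]

# Example with placeholder host/user names:
print(" ".join(chlparutil_command("hmc.example.com", "hscroot")))
```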
Upgrading Considerations
Upgrading to v2.30
Version 2.30 is the first version of the probe to include support for applying monitoring with templates in Admin Console. Upgrading and then
applying monitoring with templates in Admin Console requires that all previous configuration be deleted. Because of this, we recommend that you
delete probe versions earlier than 2.30 and deploy a new v2.30 probe.
If you want to configure the probe using only Infrastructure Manager, you can upgrade from an earlier version to v2.30 without deleting any
previous configuration. However, not all features of v2.30 and later are supported in Infrastructure Manager.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that
includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
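The bracket rule above can be sketched as a small helper (the function name is illustrative, not part of the probe):

```python
def bracket_ipv6(host: str) -> str:
    """Wrap a bare IPv6 literal in square brackets, per the Java URI
    convention; hostnames and IPv4 literals pass through unchanged."""
    if ":" in host and not host.startswith("["):
        return "[" + host + "]"
    return host

print(bracket_ipv6("f0d0::10"))   # IPv6 literal gets brackets
print(bracket_ipv6("10.64.2.1"))  # IPv4 literal passes through
```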
Probe is Unresponsive
Symptom:
The IBM Virtualization Monitoring probe may be unresponsive under heavy load, during scheduled re-discovery, or immediately after applying a
monitoring template to multiple large devices.
Workaround:
If the probe is unresponsive, wait several minutes and try again.
Network Interfaces Do Not Appear to be Detected by the Probe
Workaround:
This is not an issue. A node for network interfaces will only appear if the device can be monitored.
Connect time
Session time
Login time
Logout time
Startup published application time
Run macro script time
ICA ping
Total profile time
Note: The upgrade from the probe version 3.01 to version 3.02 is seamless. The macro functionality also works fine on upgrade.
Contents
Revision History
Prerequisites
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions of this probe.
Version
Description
State
Date
3.03
Fixed Defect:
GA
December
2015
GA
December
2015
GA
December
2014
Beta
December
2014
The Session QoS generated a null value as the probe initiated multiple Logoff events. The probe also created multiple
QoS with null values for Macro Failure and Macro Timeout events. Salesforce case 245373
3.02
Fixed Defects:
The probe did not connect to the Citrix server in some instances. This issue occurred when the Published Application option was enabled from the Connection section in the General tab for a profile. Salesforce case 00155845
3.01
What's New:
Automated the process of registering the 'nimbus_ica.dll' library on the system.
3.00
What's New:
Added support for TNT2 compliance, where Device Id and Metric Id are generated.
Added support for Citrix Receiver 4.1
Added support for Windows 7 and Windows Server 2012 R2
2.52
Fixed an issue in which the probe could not find the specified file while running the executable.
GA
October
2012
2.51
Added support for macro recording and a positive value for the Macro QoS.
GA
June 2012
2.50
Support for Citrix online plugin 12.3 for desktop and web clients. Added a feature to support ICA Client 12.3. All existing features
should work with this client except macro recording.
June 2012
2.42
Added a fix for the blue screen problem. Improved logging to indicate whether unwanted processes are killed. Fixed the
screenshot capture functionality. Improved logging about calls to the Logoff method, including the success and failure of the
Logoff method.
GA
February
2012
GA
October
2011
Fixed an issue in which the Logoff value showed more than 20000 after specifying a logoff delay greater than zero.
Fixed a run-time error (while browsing the ICA file and macro file).
2.41
2.40
Added a fix for the issue in which the ica_response probe did not send clear messages.
October
2011
GA
October
2009
GA
October
2008
GA
August
2008
GA
April 2007
Added fix to send logon failure alarm if the user credentials are wrong.
Added fix to logoff ICA client properly on probe restart and stop.
Added alarms for timeout and connection disconnected.
2.27
Added the QoS Total profile time, which measures the time taken for the total operations for the profile. It gives NULL if:
Connect fails
Logon fails
Macro fails
Published application fails
ICA file fails
Logoff fails
Session exit fails
2.19
Added ICA client user to run the ica_response_poll executable in a verified ica client environment. More information in
the ica_response nimbus documentation.
Set fixed resolution to 640x480 for macro recording and playback.
Fixed issue with macro recording and playback.
Added unique SessionGroupID for each profile. Used LogoffSessions with SessionGroupID to terminate hanging
sessions on the server.
Corrected bug in IcaResponse, in state WaitLogOff, see use of m_waitUntil.
The time when screen dumps are taken before a session timeout can be set by manually adding
start_capture_before_session_timeout = n, where n is the number of seconds before a session timeout.
Added a function for a screen dump of the client area when a session times out. The screen dump can be enabled by manually adding
the save_screenshot_on_timeout = true key in the profile section.
Known issues:
Minimum screen resolution for the macro recorder is 1280 x 1024 pixels.
Screen resolution does not work on Windows 2000 (the window does not resize properly and the user must use scrollbars); verified
OK on XP.
2.16
Fixed long delay when ICA Macro file browser is launched. This method is dependent on a new version of the controller,
with support for drive listing. With an old controller, the delay will be unchanged.
Fixed problem with address settings when use ICA file is selected.
Known issues:
Minimum screen resolution for the macro recorder is 1280 x 1024 pixels.
Screen resolution does not work on Windows 2000 (the window does not resize properly and the user must use scrollbars); verified
OK on XP.
2.14
GA
Added messages for logon response warning and connect response warning.
March
2007
Replaced server/farm address radio buttons with browser address and browser protocol fields.
December
2006
Note: Minimum screen resolution for the macro recorder is 1280 x 1024 pixels.
2.10
GA
November
2006
GA
September
2006
GA
July 2006
2.07
2.05
Default QoS configuration turned off in default profile and 'Template profile'.
GA
June 2006
GA
December
2004
1.75
Prerequisites
The Citrix Client Response Monitoring probe requires one of the following Citrix clients:
For Desktop Client:
Citrix Receiver 14.1.X or later, or
CitrixOnlinePluginFull file version 12.3 installed on the client system.
For Web Client:
CitrixOnlinePluginWeb file version 12.3 installed on the client system.
Notes:
If the IM and Robot are installed on two different systems, then the Citrix client software must be installed on both systems.
The same version of Citrix Receiver or Citrix Plugin must be present in both places: where the probe is installed and where
the probe GUI is opened.
Known Issues
The known issues of the probe are:
An issue with Citrix Receiver sometimes causes the server connect process to hang. The probe handles this situation by
automatically killing the Receiver processes in the background. As a result, an error message pop-up for Citrix Receiver might appear
on the machine where the probe is installed. Dismiss the pop-up by clicking OK.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Adjust icmp Memory
Installation Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the icmp probe.
Version
Description
State
Date
1.2
What's New:
GA
December
2014
GA
May 2014
GA
May 2014
Added the ability to create monitoring configuration templates. These templates allow you to apply consistent monitoring
configurations across multiple devices by using filters.
Added discovery filters to control the list of devices that are retrieved from the discovery server which governs the
available monitoring targets that appear under the Profiles node.
1.1
What's New:
Added the Apply Filter functionality to view a particular resource from existing resources.
Added the Static Alarm Thresholds option to let the user set static thresholds for particular monitor.
1.0
Note: It is recommended, but not required, that the CA Unified Infrastructure Management server and Unified Management Portal (UMP) are the same
version.
Installation Considerations
The icmp probe is downloaded and installed just like other CA Unified Infrastructure Management probes by downloading the probe from the
Internet Archive into your local Archive.
After you download the probe to your local Archive, you only need to install (drag from the Archive to the Robot) the icmp probe.
Note: On Linux/Unix systems the /etc/hosts file should contain an entry with the FQDN for the installation system.
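As a quick sanity check for the note above, you can verify that the system resolves its own hostname to a fully qualified name (a sketch; the helper name is illustrative):

```python
import socket

def resolves_to_fqdn(name: str) -> bool:
    """Return True if the system resolves name to a fully qualified domain
    name (one containing a dot), as expected when /etc/hosts carries an
    FQDN entry for the installation system."""
    return "." in socket.getfqdn(name)

print(resolves_to_fqdn(socket.gethostname()))
```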
Supports a DLL add-on for server-side monitoring of individual requested resources, and displays running and statistical data.
Can display hanging and problematic IIS scripts.
Delivers a variety of system-related and IIS-related performance data.
Custom OS monitoring objects can be set.
Can deliver server side status data when installed to the IIS server machine.
Can monitor application pools.
Contents
Revision History
Threshold Configuration Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Supported Platforms
Installation Notes
Installation Notes for IISRequest.dll
Installing ISAPI filter on a 64-bit OS
Activating the ASP Counters on IIS 7 or above
Installing Metabase Compatibility Component on IIS 7 or above
Installation Considerations
Notes on IISRequest.dll
ISAPI Filter on a 64-bit OS
Activating ASP Counters on IIS 7 or Above
Metabase Compatibility Support on IIS 7 or Above
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.71
What's New:
GA
June
2015
GA
Jun
2014
The probe can be migrated to standard static thresholds using the threshold_migrator probe.
Added a new device ID key (useRobotDevIDForLocalhost). Set this key to Yes to migrate the probe to standard
static thresholds. This option is available only through the Raw Configuration GUI for monitoring local IIS servers. The value of this
key is No by default.
Note: Refer to the Set Device ID Key Using Raw Configure section in the v1.7 iis AC Configuration article for the impact of
setting this key value.
1.70
What's New:
Added support for monitoring application pools. Perfmon 1.53 or later is required for this functionality.
Note: The Application Pool monitoring feature is applicable for IIS 7.0 or later.
1.60
Nov
2013
1.55
Jun
2013
1.54
Implemented probe defaults, which work only when the probe is deployed on the IIS server machine.
Updated copyright information.
Feb
2013
1.53
Fixed a run-time GUI crash that occurred when unchecking monitoring of a checkpoint.
Jan
2013
1.52
Dec
2011
1.51
Added support for "IIS Status Value" when the host has multiple IPs.
Aug
2011
1.50
Dec
2010
1.40
Oct
25
2010
1.32
Oct
19
2010
1.31
Oct
14
2010
1.30
Jun
2010
1.30
Mar
2010
1.24
Fixed Server data (requests) not shown after adding ISAPI Filter.
Fixed potential probe crashing issues.
Fixed initialization of curl library.
Fixed initialization of variables
Sep
30
2009
1.22
Sep 4
2008
1.21
QoS and alarm source for 'localhost' profiles changed to the actual host name.
Monitoring interval defaults to the profile interval instead of the global probe interval.
Changed the QoS series name for disk and memory usage because of inconsistencies with QoS data supplied by the 'cdm' probe.
The server-side DLL add-on is now available for 64-bit versions of Windows. IIS 6 no longer needs to run in IIS 5.0 isolation mode.
The DLL is now available for 32-bit and 64-bit versions of Windows Vista with IIS 7.
Jan
2008
1.12
Added the keyword 'localhost' for the hostname, which causes performance collection without authentication.
Added possibility of setting IIS filter port number in cfg file.
Added support for all custom performance objects
Changed key separator from . to ,
Feature: Http server authentication added.
Fix: Removed ptPerfmonInstances call.
QoS definition fix (IIS)
Jul
2007
1.00
Initial version
Oct
2006
Notes:
If you want to migrate the probe to standard static thresholds for local IIS servers, using the threshold_migrator probe, you
must set the device ID key, useRobotDevIDForLocalhost, to yes in the probe Raw Configuration.
The changes in the probe after migration are:
The Infrastructure Manager (IM) GUI of the probe will not be available and the probe will only be configured using Admin
Console (AC).
Probe specific alarm configurations in the probe monitors will be replaced by Static alarm, Time To Threshold, and Time
Over Threshold configurations.
The variable syntax will change from $<variableName> to ${<variableName>}.
The alarms will be sent by the baseline_engine probe.
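As an illustration of the syntax change noted above (treating variableName as a placeholder; the helper is hypothetical, not part of the migrator):

```python
import re

def migrate_variable_syntax(message: str) -> str:
    """Rewrite $name variable references to the post-migration ${name}
    form, leaving references that are already braced untouched."""
    return re.sub(r"\$(?!\{)([A-Za-z_]\w*)", r"${\1}", message)

print(migrate_variable_syntax("Load $value on $host"))
```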
Supported Platforms
Please refer to the:
Compatibility Support Matrix for the latest information on supported platforms.
Support Matrix for Probes for additional information on the probe.
Installation Notes
Consider the following points when installing the IIS Server Monitoring probe to monitor the IIS server:
Provide admin access to the probe in Admin Console by using the Settings > Probe Security option.
Alternatively, use the Security > Probe Administration option in the Infrastructure Manager.
Restart the Performance Collector probe after installing or upgrading the IIS Server Monitoring probe.
Use 'localhost' as the hostname for monitoring the local IIS server to skip performance authentication.
Ensure that the remote registry is enabled on the IIS server for remote monitoring.
Installation Considerations
1. Install the package into your local archive.
2. Drop the package from your local archive onto the targeted robot.
3. Add the IIS probe in Infrastructure Manager under Security > Probe Administration with admin access and the * mask.
4. A restart of the perfmon probe is needed if this was an IIS probe upgrade and perfmon still has handles running.
5. If there are authentication problems, and the IIS probe is running locally on the IIS server computer, you can use 'localhost' as the
profile's hostname to skip performance authentication.
6. Double-click the probe for initial configuration.
7. For remote monitoring, make sure that the remote registry is enabled on the IIS server.
Notes on IISRequest.dll
IISRequest.dll is the ISAPI filter add-on for the iis probe.
The iis probe request add-on supports several versions of the IIS server and operating systems. After probe installation, a readme.txt file can
typically be found in the C:\Program Files\Nimsoft\Applications\iis directory. See the readme.txt file for supported versions and how to add
this functionality.
The probe also supports Quality of Service (QoS) messages for the Service Level Agreement (SLA) family.
Note: Probes that support SNMP on Linux (interface_traffic, snmptd, and snmpget) use an SNMP library. This library can cause newer
Linux systems to issue the following message in the Linux console log:
The SNMP library supports older versions of glibc which require the flag for sockets to work correctly. The network portion of the glibc
library sends this message. The message shows that an unsupported flag is being sent to the setsockopt function. The library ignores
this flag so you can also ignore it.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Upgrade Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for interface_traffic probe.
Version
Description
State
Date
5.45
Fixed Defects:
GA
October
2015
GA
July 2015
GA
April 2015
The probe generated false alarms for the operational state of interface. Salesforce cases 70000904, 70000825
The probe used the default values instead of the specified timeout and retry values in a profile. Salesforce case
00168659
The probe incorrectly calculated the current percentage of interface errors when monitoring errors and discarded
packets. A key, incerrdistototal, has been added in Raw Configuration to calculate this value correctly. Salesforce case
00159962
Note: For more information about this key, see the interface_traffic Troubleshooting article.
5.44
Fixed Defects:
Incorrect utilization was displayed for some interfaces. Salesforce case 00152619
Probe crashed when editing virtual interfaces. Salesforce case: 00162089
Incorrect alarm displayed for IndexShift in USM. Salesforce case: 00161338
5.43
What's New:
Added support for pure IPv6.
Fixed Defects:
Fixed an issue in which the probe did not save multiple changes to an existing interface. Salesforce case 00157651
Improved the formula to remove negative values of QoS/alarm. Salesforce case 00145339
Fixed an issue in which the probe was displaying the default name in the alarm message and not the updated one. Salesforce case 00154506
5.42
Fixed Defects:
GA
January
2015
GA
August
2014
GA
June 2014
GA
September
2013
GA
July 2013
GA
March
2013
GA
March
2013
GA
December
2012
GA
December
2012
GA
September
2012
GA
September
2012
Fixed an issue in which, when the probe was configured to use the host address as the QoS identifier instead of profile,
the metrics were generated using the profile name as the QoS identifier. Salesforce case 00148492
Incorrect percentage was calculated for the alarm messages. Salesforce case 00145415
Enhanced the logsize upper limit to 2 GB. Salesforce case 00150857
5.41
Added a functionality to display the interface description in the Name column and interface alias in the Description column on
the probe GUI. The feature also gets the interface description as QoS target in SLM database and USM view (like version
5.32) along with SNMP_v3 support. Salesforce case 00129782
Note: Refer to the Considerations section to understand the upgrade scenarios.
5.40
Added a functionality in the probe to avoid the issue of getting different entries in SLM and different graphs for the same
interface. This issue was identified while upgrading from version 5.32 to version 5.33. Salesforce case 00122371
Note: On upgrading the probe from version 5.33 to version 5.40, for backward compatibility with version 5.33 behavior which
uses ifname value instead of ifdescription value on SLM, you must uncheck the Use IfDescription for CI Name check box. By
default, this checkbox is enabled.
Fixed a defect where the clear alarm token is changed from #network.interface_traffic.in_octets_fail to
#network.interface_traffic.in_octets_ok. (Salesforce case 00125910)
5.33
5.32
Addressed an issue in which network devices show up twice for cisco_monitor and interface_traffic.
Fixed a major issue in which the alert for the lower threshold value in the Traffic tab got disabled for all interfaces while upgrading.
Fixed run-time overflow error "6". Fixed minor issues in SOC configuration.
5.31
5.30
5.26
5.25
Fixed an issue where $ifDescr and $ifName values are not correct in Dr. Nimbus.
Changed "Using Both values and % of the maximum speed" to "Using Bytes/second and % of maximum speed".
Fixed an issue in which unchecking Include Alarm/QoS settings still saved the settings.
5.24
Community strings are not encrypted on selecting Encrypt community string in the Setup > Advanced tab.
The Save inactive interface definitions check box has no effect.
Set Default from Interface does not save for Default General.
5.23
Fixed the problem of the Send alarm option remaining highlighted (after unchecking Enable Monitoring). The newly added profile now
has all the default settings.
Fixed UMP configuration issues.
5.22
GA
August
2012
5.21
Added functionality to ignore alarms when interface op state is down/not present/lower layer down.
GA
July 2012
5.20
GA
June 2012
GA
March 27
2012
GA
March 16
2012
GA
December
2011
GA
October
2011
GA
April 2011
Added a callback, get_profile_status, which accepts regular expressions in the profile name and returns information about all
matched active profiles and their interfaces.
Added support to AIX for 64 bit.
Fixed "Temporarily out of resources" issue in get_system_info callback.
Added functionality to ignore operational state alarms when admin state of an interface is not as expected.
Removed auto cold start of the probe after one week on Unix-like operating systems.
GUI fix: QoS Settings for Interface Traffic are not saved when Publish QoS is not active.
GUI fix: Cannot clear Low or High Threshold values on traffic tab.
5.11
GUI fix: Added a functionality to remove the inactive interfaces from the list.
The number of interfaces now shows the actual number of interfaces displayed in the list for a particular host.
The help button will now display online help instead of CHM.
5.10
GUI fix: Added a functionality to save Alarm / QoS settings for inactive interfaces.
Added interfaces multi-edit functionality on right click.
Added the functionality to apply the default settings when they are saved using Set Default button in the interface
definition dialog box.
5.01
5.00
4.95
4.94
GA
March
2011
4.93
GA
January
2011
4.92
Fixed get_samples callback (The issue was that GUI was not able to fetch samples from probe on 64-bit UNIX Robots).
GA
January
2011
GA
December
2010
4.91
Added support for handling of extreme values in Error and Discarded Packets section.
Fixed minor bugs in the probe GUI.
4.90
Added new feature for handling of extreme values in Traffic and Processed Packets section. Added fix to make "does
not exist in the MIB" alarm message configurable.
GA
December
2010
GA
November
2010
GA
June 2010
GA
January
2010
GA
January
2010
GA
December
2009
GA
July 2009
GA
June 2009
GA
April 2009
GA
April 2009
GA
November
2008
GA
May 2008
Added support to specify the traffic limit required to trigger "No Traffic" alarm.
Added support to monitor (Alarm and QoS) Error packets and Discarded packets as percentage (%) of total processed
packets.
Added fix to allow one threshold value (both thresholds not mandatory now) in Traffic section.
Added support for a callback to get the total number of active and inactive interfaces.
Added support for all known interface operational status (total 7 as per IF-MIB).
Added code to remove white space from all sections.
Added fix in alarm threshold field to show the correct value.
Added fix to avoid reloading of host profiles in the tree on expanding the group.
Added fix to uncheck 'alarm when no traffic is detected' checkbox on unchecking 'Enable Monitoring' checkbox of
'Traffic' tab.
Added fix to set interface speed after every interval of timer.
Added support for SNMP V2 and V3 credential details to perform bulk discovery.
Added option in bulk configuration window to remap interfaces after an index shift.
Added code to allow decimal numbers in threshold fields of 'Traffic' tab.
Added a checkbox 'Send QoS in Kbps' in Setup tab.
Added button in toolbar to execute interface status callback.
4.80
4.70
4.64
4.62
4.60
4.52
4.51
4.50
4.40
4.36
4.35
4.34
Fixed 'Rediscover Interfaces' and 'Query Agent' for SNMPv3 AuthPriv agents.
GA
April 2008
GA
March
2008
GA
October
2007
GA
September
2007
GA
September
2007
GA
March
2006
GA
May 2005
Fixed fetching of operstate on interfaces that are inactive (in the probe config) for SNMPv3 AuthPriv agents.
Fixed the 'Monitor' window for SNMPv3 AuthPriv agents. Added logging of thread IDs.
4.32
4.25
4.23
4.22
Fix: Do not save alarm and QoS settings of inactive interfaces (when inactive interfaces are set to be saved). Fixed fetching of
operstate on inactive interfaces when using SNMPv3.
Fixed fetching of High Performance counters when using SNMPv1 (bug introduced in version 4.20)
Added support for SNMPv2c and SNMPv3.
Added retries setting for SNMP Get requests.
Added timeout setting for SNMP Get requests.
Fixed community string issue when longer than 20 characters.
Improved Message Pool Manager to handle SubsystemIds.
Added possibility to send Traffic QoS messages as % of max interface speed.
Added possibility to send Packet QoS messages as total packets counted.
Added possibility to set default interface settings per interface type.
Added possibility to override outbound and inbound speeds.
Added possibility to hide community strings from GUI.
Added NoTraffic on 'Any interface' alarm option.
Rediscover/Merge has been enhanced with UI feedback when interfaces have different names than configured.
Fixed samplemax on Traffic QoS messages when no traffic was detected on interface.
Fixed bulk-configurator.
Fixed alarm not being sent when configured to alarm in UP state.
Fixed dashboard filter settings.
Dashboard discovery template enhanced to use 'friendly' names of devices and interfaces.
4.12
4.02
Added NULL QoS values for traffic metrics when interface was reported 'down'.
Added support for more operational states.
Fixed problem when configuring probe through a NimBUS tunnel.
Upgrade Considerations
This section contains the considerations for the interface_traffic probe.
Starting with version 5.43, the probe supports IPv6. The probe must be deployed in a pure IPv6 environment to monitor IPv6 interfaces.
The Setup window in version 5.41 or later of the probe has the Use IfDescription for CI Name and Use Alias checkboxes. Consider the following
scenarios for these checkboxes:
Note: If you have upgraded from a previous version, the settings of the previous version remain intact until you rediscover the interfaces.
After you rediscover the interfaces, the default settings of version 5.41 or later are applied.
Neither checkbox selected: The probe GUI displays the interface name in the Name column and the interface description in the Description
column. The interface name is visible as the QoS target in the USM view and SLM database. The current version of the probe functions like
version 5.33 of the probe.
Only Use IfDescription for CI Name selected: The probe GUI displays the interface name in the Name column and the interface description in
the Description column. However, the interface description is visible as the QoS target in the USM view and SLM database. This is the
default configuration in version 5.41 or later.
Only Use Alias selected: The probe GUI displays the interface description in the Name column and the interface alias in the Description
column (like version 5.32). The interface name is visible as the QoS target in the USM view and SLM database (like version 5.33), along
with SNMPv3 support. Hence, it is not recommended to select only the Use Alias option.
Both checkboxes selected: The probe GUI displays the interface description in the Name column and the interface alias in the Description
column. The interface description is visible as the QoS target in the USM view and SLM database (like version 5.32), along with SNMPv3
support.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Revision History
This section describes the history of the revisions for the jdbc_response probe.
Version
Description
State
Date
1.23
Fixed Defects:
GA
December
2015
GA
April 2015
Probe created new connections with the spooler without closing them. Support case number 246871
Probe was unable to create a connection to the database. Support case number 246114
Note: If you still face any issues with the connection, see Unable to Create JDBC Connection .
Response time QoS was in increasing order. To address this issue, a new key, exclude_connection_time is
introduced in the probe configuration file. Use Raw Configuration to set the value of this key to yes. Salesforce case
246886
1.22
Fixed a defect in which the probe did not generate any QoS data on the Linux systems even though the alarms were
triggered. Salesforce case 00158270
1.21
Fixed a defect in which the connection error alarm did not clear after the connection is successfully established. Salesforce
case 00150260
GA
January
2015
1.20
Upgraded the probe to support IBM DB2 and IBM Informix databases.
GA
September
2014
Provided Source Override option to allow users to provide their own QoS source instead of using the default source. Salesforce case 00141979
1.15
GA
January
2014
1.14
Fixed Defects:
GA
December
2013
GA
June 2012
Fixed SOC issues for Password.
Added clear alarm for connection error.
1.11
Fixed Defects:
March
2012
Provided a fix to ensure that each profile runs at a specified time interval.
Provided a fix to ensure that Alarms and QoS are generated as per Jdbc Response time.
Provided a fix for Clear alarms to be generated on selection of Clear Severity for Jdbc Response Time , Jdbc Row
Count and Jdbc Value.
Provided a fix to remove the default Sample Connection and Sample Profile.
Provided a fix to generate Sample Rate in QoS messages.
Provided a fix to change the QoS target in the database to the profile name and connection name.
1.10
Fixed Defects:
GA
February
2012
Provided a fix to ensure that each profile runs at a specified time interval.
Provided a fix to ensure that Alarms and QoS are generated as per Jdbc Response time.
Provided a fix for Clear alarms to be generated on selection of Clear Severity for Jdbc Response Time, Jdbc Row Count
and Jdbc Value.
Provided a fix to remove default Sample Connection and Sample Profile.
1.05
GA
December
2011
1.02
GA
December
2009
1.01
Initial release.
GA
September
2009
Beta
September
2009
Please note that the QoS names have been changed since the Beta version as follows:
The QoS definitions have been changed from QOS_SQL_RESPONSE, QOS_SQL_ROWS, and QOS_SQL_VALUE to
QOS_JDBC_RESPONSE, QOS_JDBC_ROWS, and QOS_JDBC_VALUE
1.00
Probe Provisioning Manager (PPM) version 2.38 or later (required for Admin Console)
Java JRE 6 or later (required for Admin Console)
Installation Considerations
The jdbc_response probe, by default, installs the JDBC drivers for Microsoft SQL Server and Oracle databases. To monitor other databases, the
appropriate driver, as a JAR file, must be downloaded and stored on a system where the probe can access it.
To make connections and run SQL queries, a user name and password are required for the databases being monitored.
Download the drivers for the following databases and store them at the default location of the driver files. The default location of the JDBC driver JAR
file is the [CA UIM Installation Directory]\probes\database\jdbc_response\lib folder.
IBM DB2: Download the latest version of the IBM DB2 JDBC driver JAR file from http://www-01.ibm.com/software/data/db2/linux-unix-windows/downloads.html
IBM Informix: Search for and download the latest version of the IBM Informix JDBC driver JAR file from http://www-01.ibm.com/software/
Note: You must register on the IBM website before you can download the IBM DB2 or IBM Informix driver JAR files.
MySQL: Download the latest version of MySQL JDBC driver JAR file from http://www.mysql.com/products/connector/
PostgreSQL: Download the latest version of PostgreSQL JDBC driver JAR file from http://jdbc.postgresql.org/
Important! Restart the probe after storing the driver files in the probe installation directory.
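As context for why the driver JAR and the restart matter: the probe's JVM can only connect to a database whose JDBC driver is registered on its classpath. The following minimal sketch (not part of the probe; the class name is invented for illustration) lists the drivers the running JVM can see, which is roughly the check that fails when a required JAR is missing:

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.Enumeration;

public class DriverCheck {
    public static void main(String[] args) {
        // Enumerate the JDBC drivers registered with this JVM. A driver JAR
        // dropped into the probe's lib folder is only picked up after the
        // probe (and hence its JVM) is restarted.
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        if (!drivers.hasMoreElements()) {
            System.out.println("no JDBC drivers registered");
        }
        while (drivers.hasMoreElements()) {
            System.out.println("registered driver: "
                    + drivers.nextElement().getClass().getName());
        }
    }
}
```

If the driver for a configured connection does not appear in such a listing, the probe cannot create the connection, which is the symptom described in Unable to Create JDBC Connection.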
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.04
Fixed Defect:
GA
April 2015
GA
July 2014
Fixed an issue in which the probe was not expanding the column names with string-numerical characters in alarm
messages. Salesforce case 00152463
1.03
Fixed Defect:
Fixed a defect in which the source was not overridden by Source of Sender.
1.02
Fixed Defects:
GA
May 2014
GA
January
2014
Beta
September
2010
The QoS definitions were getting generated after every elapsed time interval.
QoS and alarms were getting generated with incorrect DevID and MetID.
A runtime error was thrown when a new connection profile was created with the same name as an existing
connection profile.
1.01
Fixed Defect:
Fixed a defect in the query alarm message. The alarm messages were truncated after using a variable.
1.00
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.14
What's New:
GA
September 2015
Fixed an issue in which the same Metric Id/Metric Type Id was generated for different QoS.
GA
June 2014
1.12
GA
January 2011
GA
December 2010
1.11
1.01
GA
November 2010
GA
April 2009
GA
November 2007
1.04
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the jobs probe.
Version
Description
State
Date
1.37
What's New:
GA
September 2015
GA
June 2011
Added quality of service message for IO request rate; added support for SOC.
Revision History
Preconfiguration Requirements
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
Version
Description
State
Date
1.25
What's New:
GA
September
2015
GA
September
2014
What's New:
Added support for Static Alarm Thresholds.
Added support for configuring the probe through the Admin Console (web-based) GUI.
Fixed Defects:
Fixed a defect where DevID and MetID are not matching for alarm and QoS.
1.23
GA
August
2012
1.21
GA
January
2011
1.20
Modified behavior for collection of QoS data: a QoS message is sent only once for a completed schedule run
instead of on each probe interval.
December
2010
1.17
GA
Updated documentation
Added resize of alarm message list
Various updates and fixes found during initial probe testing
New base library used
Initial version.
Preconfiguration Requirements
The preconfiguration requirement of the iSeries Job Schedule Monitoring probe for enabling Static Threshold Alarms is as follows:
Set the "standard_static_threshold" key to "true" in Raw Configure.
November
2007
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the probe.
Version
Description
State
Date
1.08
What's New:
GA
September
2015
Added support for configuring the probe through the Admin Console (web-based) GUI.
GA
September
2014
1.06
GA
January
2012
1.05
Added support for raw_journal_code and raw_entry_type flags in the profile and test callback to allow matching against
encoded versions of these variables.
March
2011
Added an Advanced option in the profile dialog of the configuration tool to allow the raw journal code and entry type field values.
Fixed a GUI problem where the profile dialog indicated changes even when there were none, introduced in version 1.04.
Added a 'Fetch' button and an 'Immediate fetch' option to determine if messages should be fetched immediately on journal
or time restriction selection. This setting is saved together with window size and default journal/time restriction.
1.04
March
2011
Added additional fields in the message list for the encoded Journal code and Entry type fields
Made available the encoded Journal code and Entry type fields as 'JC' and 'ET' variables
Reduced the Entry type drop down list in the profile dialog to contain only entry types relevant for the selected Journal
code
Added tool tip to the Journal code and Entry type fields in the profile dialog to show the encoded field value.
1.03
February
2011
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Prerequisites
Verify Java
Verify Java on Unix System
Verify Java on Windows System
Known Issues
Upgrade Considerations
Revision History
This section describes the history of the revisions for the jvm_monitor probe.
Version
Description
State
Date
1.47
Fixed Defects:
GA
December
2015
GA
April 2014
GA
March
2014
QoS definitions were not configured when Auto Configuration was used to configure multiple profiles simultaneously. Support case number 00162768
At alternate intervals, the probe generated the clear alarm for Key Not Found instead of the alarm that was configured
for threshold breach. Support case numbers 246676, 245746
The probe did not rotate the logs after the maximum log size was reached. Support case numbers 256320, 00149443
The probe displayed exponential notation for monitored values instead of the actual values. Support case number 245793
Updated a Known Issue where the probe does not display baseline and dynamic thresholds in monitors in CA UIM 8.31
or earlier. Support case number 246759
Updated value of max heap size in document. Support case number 00161358
Note: For more information about modifying the heap size, see jvm_monitor Troubleshooting.
Updated the jvm_monitor IM GUI Reference and metrics documents with information on the measuring units of
monitors. Salesforce case 00148759
1.46
Fixed Defect:
Fixed a defect in which no counter values were being displayed by live monitors.
1.45
What's New:
Added support for JRE 1.7x.
Fixed Defect:
Fixed an issue of communication error while fetching mbeans (minExceptions,host=localhost,path=/examples,
name3,host=localhost,path=/examples) of the Environment, Context node and their corresponding monitors.
1.44
Fixed a defect in which the probe did not release the configuration file lock when the Configuration file locking option is selected in the
controller probe. The probe locked the configuration file permanently when accessed for the first time, and the robot had to be
restarted to release the lock. Now, the probe releases the lock when the user closes the probe GUI.
1.43
October
2013
GA
January
2013
Fixed: Error "Key Already Exists" for different monitors when added in template.
1.42
GA
June 2012
1.41
GA
March
2012
GA
February
2012
GA
December
2011
Fixed an issue with the auto monitor functionality while monitoring from a wildcard search: monitors that were not
activated were also visible in auto monitors, and the monitor that was activated was unchecked in auto monitors.
Fixed an unresponsiveness issue of the probe when updating the version.
Added support for updating the cfg file by updating the JVM_Monitor probe version.
Added Resource Name facility while defining new Resource.
1.40
1.30
1.20
GA
January
2011
1.10
GA
June 2010
1.02
Fix for password encryption and tree node name collisions which caused missing tree nodes and data.
GA
December
2009
1.01
Initial version.
GA
September
2009
Installation Prerequisites
The probe is distributed using drag and drop. To ensure a successful distribution of the probe you must do the following:
A Java Virtual Machine (JVM) of version 6 or later must be installed on the system running the jvm_monitor probe.
The JVM path must contain the java executable.
For more information, see Verify Java.
The JVM environment must be set up on hosts that you want the probe to monitor. For more information, see the Set Up the Hosts secti
on in jvm_monitor Preconfiguration.
Verify Java
You must verify that Java Runtime Environment (version 6 or later) exists in the specified path on the system where the probe is installed. Check
the probe log file if the probe does not start. The following line appears in the log file if java is not installed or is not included in the path:
For more information about updating the Java path in the probe, see jvm_monitor Troubleshooting.
Note: If the command does not work, you must manually add java.exe to the path. This is done in the Control Panel >
System > Advanced > Environment Variables. Run the java -version command again.
2. Open the Service Controller from the CA UIM Program Group in the Start menu.
3. Click Force Stop and then Start to enable CA UIM to include java in the path.
4. Open the Controller probe GUI in the Infrastructure Manager and click the Robot environment button in the Status tab.
5. Double-click Path in the list and verify that CA UIM has java included in the path.
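Beyond checking the path, the minimum version requirement can also be verified programmatically. The sketch below is a hypothetical helper (not shipped with the probe) that reads the JRE's specification version and checks it against the probe's minimum of version 6; it accepts both the legacy "1.x" scheme and the plain-number scheme of newer JREs:

```java
public class JavaCheck {
    // Returns true if the running JRE satisfies the probe's minimum
    // requirement (Java 6, i.e. specification version "1.6" or higher).
    public static boolean meetsMinimum() {
        String spec = System.getProperty("java.specification.version");
        String[] parts = spec.split("\\.");
        int major = Integer.parseInt(parts[0]);
        if (major == 1 && parts.length > 1) {
            major = Integer.parseInt(parts[1]);  // "1.6" -> 6
        }
        return major >= 6;
    }

    public static void main(String[] args) {
        System.out.println("java.specification.version = "
                + System.getProperty("java.specification.version"));
        System.out.println("meets minimum (6+): " + meetsMinimum());
    }
}
```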
Known Issues
The 1.47 and earlier versions of the probe have the following known issues:
The Find functionality is not available in the Infrastructure Manager GUI. Find allows you to directly navigate to the position of a monitor
in the tree.
Baseline and dynamic thresholds for monitors in Admin Console GUI are only available for CA UIM 8.35 or later. In earlier CA UIM
versions, the probe displays an undefined field, which must be ignored.
Upgrade Considerations
The probe has the following upgrade considerations:
Older versions of the probe cannot be upgraded to version 1.3x.
You can upgrade a probe from version 1.20 to 1.40 as the configuration values are compatible.
The following default monitors of probe version 1.20 have been removed in probe versions 1.30 and later:
ClassLoading.LoadedClassCount
Memory.HeapMemoryUsage.used
OperatingSystem.CPU Usage
Threading.ThreadCount
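The removed default monitors correspond to standard JMX platform MBean attributes. As a rough illustration of the kind of data the probe reads, the following sketch queries two of them (Memory.HeapMemoryUsage.used and Threading.ThreadCount) from the local platform MBeanServer. A remote jvm_monitor target would use a JMX connector instead of the in-process server, and the class name here is invented:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JvmMetrics {
    // Reads Memory.HeapMemoryUsage.used from the java.lang:type=Memory MBean.
    // HeapMemoryUsage is a CompositeData with init/used/committed/max keys.
    public static long heapUsed() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        CompositeData heap = (CompositeData) server.getAttribute(
                new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
        return (Long) heap.get("used");
    }

    // Reads Threading.ThreadCount from the java.lang:type=Threading MBean.
    public static int threadCount() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        return (Integer) server.getAttribute(
                new ObjectName("java.lang:type=Threading"), "ThreadCount");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Memory.HeapMemoryUsage.used = " + heapUsed());
        System.out.println("Threading.ThreadCount = " + threadCount());
    }
}
```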
Revision History
Supported Locales
Threshold Configuration Migration
Preconfiguration Requirements
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for the logmon probe.
Version
Description
State
Date
3.55
Fixed Defects:
GA
January
2016
GA
September
2015
GA
August
2015
GA
June 2015
The probe did not correctly convert variable characters to defined file encoding in URL mode. Support case number
00245330
The probe did not support white spaces in paths for batch files. Batch files include commands that can be monitored. Support case numbers 00270650, 00246042
3.54
What's New:
Added support for IBM iSeries version V7R2.
Fixed Defects:
The probe was unable to read UTF-16LE files. Salesforce case 00169492
The probe was crashing with exit code functionality. Salesforce case 00169505
The probe was unable to retrieve complete command output in the command mode. Salesforce case 00169863
Updated document regarding localization support. Salesforce case 70002007
Updated the document regarding alarms in the url mode. Salesforce case 00163388
3.53
Fixed Defect:
The probe restarted when a variable was added to the sub-system id. Salesforce cases 00170003, 00169384, 00167502,
00168440, 00168601, 00166738, 00168537
3.52
What's New:
Upgraded support for factory templates.
Removed localization support on AIX platform.
Fixed Defects:
Fixed an issue in which the probe was unable to detect file encoding when File encoding was selected from the GUI. Salesforce case 00167536
3.50
What's New:
June 2015
What's New:
March
2015
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Added support for factory templates
3.48
Fixed Defects:
December
2014
Entries for the QoS variable having multiple targets were getting overlapped on the USM. Salesforce case 00149257
No Exit code alarm was generated on Windows OS in case command was not found. Salesforce case 00150798
View option in the GUI did not show updated file content. Salesforce case 00147856
The probe stopped working when the number of characters in the match expression was greater than 1020. Salesforce
case 00145499
No alarms were generated when the threshold applied on a watcher variable was breached. Salesforce case 00137155
The probe did not identify the UTF-16 log files. Salesforce case: 00139268
3.47
3.45
Added a timeout option for the Command mode profiles to kill the command process and all its child processes after a
defined time limit.
What's New:
November
2014
October
2014
What's New:
September
2014
Added localization support for the Simplified Chinese, Japanese, Korean, Spanish, German, French, Italian, and
Brazilian Portuguese languages from the VB and Admin Console GUIs. For localization support through the Admin Console GUI, the probe
must run with NMS 7.6 or later and PPM 2.34 or later.
Added the support for zLinux environment.
Updated the probe VB GUI and Web GUI for configuring the format interval and for specifying the character encoding in
different locales.
Note: Do not use the Raw Configure GUI for updating the probe configuration in the non-English locales because it can
corrupt the entire probe configuration file.
June 2014
3.32
Enhanced the probe to make file missing/open alerts user-configurable, with clear alarms on probe restart.
February
2014
3.31
Fixed the probe functionality issue when both the abort on match and the match on every run options are selected
together.
December
2013
3.30
Enhanced the Format Rule feature to make it functional across check intervals. The number of intervals is
user-configurable.
December
2013
Implemented a new alarm when the log file is missing or not readable.
Fixed Defects:
Fixed an issue so that the alarm subject is not overwritten.
3.27
September
2013
3.26
June 2013
3.25
March
2013
Fixed an issue where Text profiles returned "0" instead of the matching string as before.
3.24
February
2013
3.23
February
2013
3.22
December
2012
Alarm display of Japanese characters in the IM alarm sub console and UMP alarm sub console.
Regular expressions in Japanese.
View File shows Japanese characters correctly.
Open file with Japanese characters in the file name.
Fixed a defect when the probe contains more than one watcher and format rules.
3.21
August
2012
3.20
August
2012
June 2012
3.12
March
2012
GUI fix: The Test profile screen now opens even if the watcher contains a numeric name.
The help button now displays online help instead of CHM.
3.11
March
2012
3.03
January
2011
3.02
December
2010
Added a fix to read a new file from the beginning the first time when "Updates" mode is selected and the "Match on every run"
option is enabled. For example, when files are monitored based on time/day using %m, %M, etc.
October
2010
3.00
September
2010
September
2010
2.91
June 2010
2.90
May 2010
2.85
Fixed the suppression key problem that was introduced in version 2.82.
May 2010
2.84
March
2010
Fixed problem with large suppression keys in max alarms alarm situation.
Enabled Source override for max alarm situation.
February
2010
Added a feature to test a profile or individual watchers within a profile for regular expression.
Added a fix to set proper timeout.
December
2009
2.72
Fixed issue in underlying library where the probe would fail to find the correct file location when the file had been both
truncated and appended to.
September
2009
2.71
September
2009
April 2009
2.54
Fix problem assigning variable from regex which contains just one character.
Fix problem expanding date primitives in path. Fix problem with last line in a multi-line format rule being skipped.
Bring 2.5x into line with changes made in the 2.4x series.
Fix potential problem with parsing of variables from a matched line.
Added support for 64-bit Windows (x64).
Note: For version 2.54 and higher of this probe, NimBUS Robot version 3.00 (or higher) is a prerequisite. You are advised to
carefully read the document "Upgrading the NimBUS Robot" before installing/upgrading.
December
2008
September
2008
2.41
July 2008
Fixed a timing problem which could cause a line in the log file to be skipped by the next scan if it was written during the
current scan of the file.
May 2008
December
2007
Fixed issue with file offset being stored incorrectly when probe is stopped/restarted.
UNIX: Fixed incorrect time display in logfile and potential heap corruption issues due to a non-threadsafe system call.
Increased size of buffers used to store profile and watcher names. Fixed memory leak when alarm message was over
1024 characters.
Fixes segmentation violation due to failed compilation of a regex. Log an error message when a regex fails to compile.
Added support for editing archived configurations.
Enhanced configuration tool resize.
Fixed problem with $FILENAME expansion.
Added support for referring environment variables.
2.19
Added an advanced option "sendclear" to watchers. If set, it sends a clear alarm if the current watcher is as expected and
the watcher has a suppression key set. Note that this requires the suppression key to be unique enough that it will not clear
an alarm unexpectedly. Using a variable that is unique for each alarm situation in the suppression key is advised.
October
2007
2.18
Apply changes to log level and log size after restart of the probe.
September
2007
Store last run for profiles in logmon.dta so expansion of LASTRUN() is correct even after a restart.
Fix problem with Format rules not triggering.
August
2007
May 2007
2.02
Fixed problem where alarm flag would be reset to 'yes' on every restart.
February
2007
Fixed potential hang situation where a thread would fail to release a lock.
Added support for URL authentication when windows authentication is used with a proxy configuration.
Probe has been re-written as a multi-threaded daemon.
Profiles are checked in a thread, allowing for higher throughput and configurable intervals for each profile.
A new mode 'url' is available, which fetches a web page through the url_response probe and performs checks on the
page.
Variables in a watcher can be read from positions in the regular expression in addition to the fixed character or column
specifications used prior to this release.
Variables can have a threshold set, where an alarm is sent only if the variable is outside the expected value.
Variables can generate Quality of Service (QoS) messages, either with their values (if numerical) or with the result of the
check against an expected value (both numerical and strings).
A watcher is no longer bound to sending either Alarm, QoS or user defined messages. One or more types of message
can be selected for each watcher.
Added ability to run a command when a watcher matches.
1.67
Fixed problem with using time formatting together with wildcards in filenames.
Extended timeout for getting 'queue' data.
December
2006
Fixed crash caused by long log messages. Now long log messages will be cut after 1024 characters.
The introduction of wildcards caused the two modes 'queue' and 'command' to stop functioning. The wildcard
check is now only performed if the mode is set to scanning files; for the 'queue' and 'command' modes, the probe
works as it did before wildcards were introduced. 'command' was also not showing up in the dropdown list. This is fixed.
1.63
September
2006
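The watcher mechanics described in the revision history above (a variable read from a position in the regular expression and checked against an expected value before alarming) can be sketched roughly as follows. The pattern, sample log line, and threshold are invented for illustration and are not part of the probe:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WatcherSketch {
    // A watcher-style check: the variable is captured from a position in the
    // regular expression (group 1) and compared against a threshold.
    static final Pattern RESPONSE = Pattern.compile("response_time=(\\d+)ms");

    public static boolean breaches(String line, int thresholdMs) {
        Matcher m = RESPONSE.matcher(line);
        if (!m.find()) {
            return false;              // no match: nothing to alarm on
        }
        int value = Integer.parseInt(m.group(1));
        return value > thresholdMs;    // variable outside the expected value
    }
}
```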
Supported Locales
From logmon version 3.42 and later, the probe supports the following encoding files for various locales:
Encoding - Name
UTF-8 - Unicode (UTF-8)
UTF-16BE - UnicodeBigUnmarked
UTF-16LE - UnicodeLittleUnmarked
Shift_JIS - Japanese (Shift-JIS)
ISO-2022-JP - Japanese (JIS)
ISO-2022-CN - Chinese (ISO)
ISO-2022-KR - Korean (ISO)
GB18030
GB2312
Big5
EUC-JP - Japanese (EUC)
EUC-KR - Korean (EUC)
ISO-8859-1
ISO-8859-2
windows-1250
windows-1252
Important! Do not use the Raw Configuration GUI when the probe is deployed in a non-English locale.
Preconfiguration Requirements
The probe has the following preconfiguration requirements:
The probe requires at least one of the following components for monitoring:
ASCII-based log files
URL of the web page
Command Outputs
Messages in the CA UIM Hub queues
The url_response probe for monitoring web page content.
Note: The AIX platform does not support the url_response mode; hence, it must be disabled for both Admin Console and Infrastructure
Manager.
Known Issues
The probe does not support monitoring log queues with multiple encodings. If the sysloggtw probe monitors syslog devices with different
encodings, the probe might not be able to identify the characters from the SYSLOG.IN queue.
The probe does not support Byte Order Mark (BOM) in the monitored files. If BOM is present in the files, the first character might be a '?'.
The probe only generates alarms for a file if the probe continuously receives pattern matches from the monitored file, as in
Queue mode or from a continuously running command.
Commands for some system calls (fgets) fail, resulting in a wrong exit code from the probe.
Signals 13 and 15 result in exit codes 141 and 143. The signals are used to kill a process.
The logmon Probe Provisioning UI does not allow users to view the file for URL mode.
The probe does not support URL mode, Command mode and Run Command on Match on IBM iSeries platform.
The probe has the following limitations when deployed in a non-English locale:
While monitoring an ASCII-based file using the ASCII characters for matching, you cannot use Japanese characters in the alarm
message text. The probe cannot identify Japanese characters in alarm messages in such cases.
The probe garbles the file name when clicking the Tail/View File button, while monitoring a Japanese log file.
The probe GUI shows garbled text on clicking the Tail/View File button to view the log file text.
The Raw Configure GUI of the probe is not supported for updating the probe configuration because it can corrupt the entire probe
configuration file.
The localization is supported only on Windows 32-Bit, Windows 64-Bit, and Linux 64-Bit.
The probe does not support queue with multiple encodings.
Revision History
Threshold Configuration Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Upgrade Considerations
Installation Considerations
Installation Notes
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.20
What's New:
GA
June 2015
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Added support for IPv6.
Fixed Defects:
The probe displayed the hostname instead of IP address as the source in alarms. Salesforce case 00159505
2.1
GA
March 2014
2.0
GA
Sept 2013
1.0
Initial release
Dec 2012
Notes:
During migration, the process cannot be stopped and the probe must not be configured. The probe will restart once the process
is complete.
The changes in the probe after migration are:
The Infrastructure Manager GUI of the probe will not be available and the probe will only be configured on Admin Console.
Probe specific alarm configurations in the probe monitors will be replaced by Static Alarm and Time Over Threshold
configurations.
The alarms will be sent by the baseline_engine probe.
User defined variables will not be available
The lync_monitor probe requires the following software environment to migrate with threshold_migrator probe:
CA Unified Infrastructure Management 8.3 or later
Robot 7.5 or later (recommended)
Java JRE version 7 or later
Probe Provisioning Manager (PPM) probe version 3.21 or later
baseline_engine (Baseline Engine) version 2.60 or later
Upgrade Considerations
Upgrade to version 2.20 of the probe has the following consideration:
The probe supports the upgrade of more than two thresholds for each profile. However, because the probe supports only two thresholds for
each profile, new thresholds cannot be configured for an upgraded profile. You must either delete thresholds until the number is less than
two, or create a new profile.
Installation Considerations
To monitor the Lync server, the CA Unified Infrastructure Management robot (version 3.02 or above) should be installed on the server.
1. Install 'lync_monitor' package into your local archive.
2. Drop the package from your local archive to the targeted robot.
3. Double-click the probe in the Nimbus Manager.
4. Configurator opens with a default set of profiles in an inactive state.
5. Activate the profiles to start monitoring.
Note: For each activated 'Eventlog' profile, the probe scans Windows events on the Lync server from the last scanned event entry to
match the given criteria. If the system generates a large number of events between intervals, scanning them
may affect the performance of the system.
Installation Notes
The probe dynamically generates Quality of Service table names. Some of these table names might contain more than 64 characters, which can
create problems when inserting data into the SLM database. When a CA Unified Infrastructure Management version earlier than 3.35 creates the SLM
database, the name column in the S_QOS_DEFINITION table of the SLM database is 64 bytes. You must update the size of the name column
to 255 bytes. The latest data_engine discards QoS definitions that are greater than 64 characters. If you do not update the size of the name
column, earlier versions of the data_engine might fail.
Update the size of the name column to 255 bytes manually by using a database tool:
Design the S_QOS_DEFINITION table and change the size of the name
column to 'varchar(255)'.
Update the size of the name column to 255 bytes by running the following query on the database (Note that this query will not work on SQL
Server 2000):
-- Added code to change field width of the name column of the S_QOS_DEFINITION table
-- A constraint needs to be dropped first to be allowed to do so
declare @const varchar(500),@sql varchar(500)
-- Remove temp table, if exists.
IF Exists(SELECT * FROM tempdb.dbo.sysobjects WHERE [ID] = OBJECT_ID('tempdb.dbo.##Objects')) DROP TABLE ##Objects
-- Creating new temp table.
create table ##Objects (iname varchar(255),is_pk int)
-- Inserts indexes that are primary key on the table.
insert into ##Objects SELECT [name],is_primary_key FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[S_QOS_DEFINITION]')
-- Adding values to variables.
select @const = iname from ##Objects where is_pk = 1
set @SQL = 'ALTER TABLE [dbo].[S_QOS_DEFINITION] DROP CONSTRAINT [' + @const + ']'
-- Dropping the PK constraint.
EXEC (@SQL)
-- Cleaning up temp table.
drop table ##Objects
-- Altering table, modifying name, and adding new PK.
ALTER TABLE [S_QOS_DEFINITION] ALTER COLUMN [name] [varchar](255) not null
ALTER TABLE [dbo].[S_QOS_DEFINITION] ADD PRIMARY KEY CLUSTERED (
[name] ASC
) ON [PRIMARY]
Requirements
Hardware Requirements
No minimum hardware requirements.
Software Requirements
The maintenance_mode probe requires the following minimum software environment:
CA Unified Infrastructure Management Server 7.5 or later.
Unified Management Portal version 7.5 or later.
nas probe version 4.32 or later.
Installation Considerations
The maintenance_mode probe is installed as part of a CA UIM installation.
Known Issues
No known issues.
Fixed Defects
No fixed defects in this release.
Revision History
Version
Description
State
Date
1.10
GA
September 2014
1.10
GA
June 2014
1.0
Initial release.
March 2014
availability, and scalable services. Although MongoDB supports a "standalone" or single-instance operation, usually production MongoDB
deployments are distributed.
The mongodb_monitor (MongoDB Monitoring) probe constantly monitors the internal performance and resource usage throughout a node in a
MongoDB cluster. The probe should be installed on each node in the MongoDB cluster for a comprehensive monitoring experience. The probe
uses operating system commands and MongoDB API calls that are supported by MongoDB. The information is presented to the cluster
administrator as metrics, alarms, and reports. You can select and schedule an extensive range of checkpoints to meet the needs of your specific
monitoring requirements.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
General Use Considerations
Revision History
Version
Description
State
Date
v1.0
Initial version
GA
June 2015
Installation Considerations
1. Install the package into your local archive.
2. Drop the package from your local archive onto the targeted robot.
3. Use Admin Console to access the probe configuration GUI or raw configure options.
Upgrade Considerations
None
Description
Revised nas version to match the CA UIM version.
State
Date
GA
January
2015
4.75
Made additional performance improvements for viewing USM alarm information in large-scale environments.
Controlled
Release
Sep
2015
4.73
GA
Aug
2015
Controlled
Release
Jul
2015
GA
Mar
2015
4.72
Fixed an issue in which alarm_enrichment rules based on empty probe ID values no longer worked in nas 4.67. Salesforce case 00161140
Fixed an issue in which the On Interval AO setting would not respect alarm count filters. Salesforce case 00162735
Custom tags are now truncated at 255 bytes. This fixes an issue in which custom tags exceeding 255 bytes were not inserted into the UIM database. Salesforce case 00162186
4.67
4.60
Fixed an issue in which long alerts (exceeding four thousand characters) from the ntevl probe could cause an error with the NiS bridge. Salesforce case 00142103
GA
Dec
2014
GA
Sept
2014
GA
Jun
2014
GA
Mar
2014
GA
Jan
2013
GA
Nov
2012
GA
Jul
2012
Fixed an issue in which MySQL UIM database passwords longer than nine characters would not work with MySQL 5 authentication. Salesforce case 00142835
Fixed an issue in which some Lua scripts could cause memory leaks in nas versions 4.36 and 4.40. Salesforce case 00149127
Changed the nas probe configuration GUI to reflect updated Trigger Message Counters; Greater than now appears as Greater than or equal. Salesforce case 00144345
4.40
4.36
Fixed a Lua script issue that could cause a nas startup failure after a segmentation fault.
Fixed an issue in which nas would not send email when the On_Interval setting was selected.
Fixed an issue in which AO would not act on overdue_age profiles correctly.
Fixed an issue in which the alarm_enrichment probe would repeatedly retry messages if the required alarm fields were
not provided.
4.32
Fixed "temporarily out of resource" errors during callbacks to the nas probe.
Added support for maintenance mode.
4.20
4.10
4.01
Fixed an issue in which alarms were not stored in the replication database when message suppression is turned off.
Fixed an I18N issue, required for UMP.
Fixed the pre-population query in the alarm_enrichment cache.
4.00
GA
Jun
2012
3.75
GA
Mar
2012
3.74
GA
Feb
2012
3.73
GA
Jan
2012
GA
Oct 31
2011
GA
Oct 13
2011
GA
Jun 24
2011
3.72
3.71
3.70
Defect fixes
IPv6 support added
Fixed column width of schedules in AO profiles (now resizeable).
3.63
GA
Jun 23
2011
3.62
Defect fixes
GA
Jun
2011
3.61
GA
Mar
2011
GA
Feb 4
2011
GA
Feb 3
2011
3.60
3.54
3.53
Defect fixes
GA
Jan
2011
3.52
GA
Nov
2010
GA
Sep
2010
GA
Jun
2010
GA
May
2010
3.51
Fixed a problem with long message texts not being inserted/updated by the NiS bridge.
Added support for internationalized/tokenized alarms.
Added support for an Oracle NiS database.
3.44
3.42
3.41
Defect fixes
GA
Mar
2010
3.40
GA
Jan
2010
3.31
GA
Nov
2009
3.28
GA
Aug
2009
3.27
GA
Jul
2009
3.26
Defect fixes
GA
May 29
2009
GA
May 26
2009
GA
Mar
2009
GA
Feb
2009
3.25
Defect fixes
Added support to alter the NAS subscribers 'subject'
Added support for 'raw configuration' of cross-domain replication
Added support for the 'state' method in pre-processing scripts.
3.24
3.23/3.18
3.22
Defect fixes
GA
Nov
2008
3.16
GA
Sep
2008
3.15
GA
Aug
2008
3.14
GA
Jun
2008
3.12
GA
Apr 24
2008
3.11
GA
Apr 4
2008
3.10
Embedded scripting language for advanced message correlation and auto-operator functions
GA
Feb 15
2008
Enhancements
GA
Feb 4
2007
2.74
Added origin, domain, hub, robot and probe information to network transactions.
GA
Dec
2006
GA
May
2006
GA
Jan
2006
GA
Sep
2005
GA
Mar 31
2005
GA
Mar 4
2005
GA
Dec
2004
GA
Nov
2004
2.72
2.71
2.70
Fixed an issue with the auto-operator and the ability to generate a command after alarm acknowledgement.
Added a calendar feature controlling filters and auto-operator methods
Functionality extensions.
2.68
2.67
Fixed problems with import/export and hubnames containing the label "hub"
Added NimBUS domain, hub and probe as possible matching criteria.
Added new auto-operator action type: post-message
Fixed various issues related to the auto-operator clearing alarms
Added possibility to expand variables in SMS phone field.
2.66
2.65
2.64
Fixed defects
GA
Jun
2004
2.63
Modified the name lookup algorithm to permanently exclude more than 3 consecutive lookup failures
GA
Mar 23
2004
2.62
GA
Jan
2004
GA
Oct
2003
2.61
2.60
GA
Jul
2003
2.51
GA
Feb 17
2003
2.50
GA
Feb 13
2003
2.47
GA
Nov
2002
2.46
GA
Aug
2002
2.45
Added suppression time when escalating an alarm. Fixed problems with get_alarms from Alarm Console
GA
Jun
2002
2.42
GA
May
2002
2.41
GA
Feb 14
2002
2.40
GA
Feb
2002
2.39
GA
Jan 17
2002
2.38
GA
Jan 15
2002
2.37
GA
Dec
2001
2.36
GA
Nov
2001
2.35
GA
Oct
2001
2.34
GA
Jul
2001
2.33
GA
Jun
2001
2.3
GA
May
2001
2.0
GA
Apr
2001
Requirements
This section contains the requirements for the nas probe.
Hardware Requirements
None
Supported Platforms
The nas probe is supported on all UIM hub platforms except AIX, which is unsupported for nas. This includes 32-bit hub platforms.
Note: Refer to the Compatibility Support Matrix for the latest information on supported UIM platforms.
Considerations
This section contains the considerations for the nas probe.
Installation Considerations
The nas probe requires a permanent queue on the hub. If you are upgrading an existing alarm server, the queue is already defined.
Revision History
Threshold Configuration Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for net_connect probe.
Version
Description
State
Date
3.21
Fixed Defects:
GA
December
2015
GA
November
2015
GA
August
2015
The probe did not retrieve the hostname of the specified IP address of the monitored host. Support case number 00267015
The probe did not save the changes in Infrastructure Manager when managing services through Bulk Configuration. Support case number 00270819
Set the default state of the Response Time monitor to ON to display data on the Unified Dashboard automatically.
3.20
What's New:
Added support for monitoring hosts with dynamic IP address.
Fixed Defects:
When viewing data on USM, the net_connect probe DEV files conflicted with the interface_traffic probe DEV files and displayed random device information. If both the hostname and IP address are provided for a monitored host, the net_connect probe used only the hostname to display data. Salesforce cases 00165874, 70007073
You can create a key in the probe to display data on USM, using the host IP address. For more information, see the Known Issues and Workarounds section.
When monitoring only the services of a host, and not the ICMP connectivity of a host, the probe did not generate any QoS data. Salesforce cases 70002909, 70004940
The probe did not resolve the $group variable if it was used in a tree structure. Salesforce case 70002994
The probe was deleting the monitoring profiles created using the drag-and-drop feature. Salesforce case 70002993
3.11
What's New:
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Note: Refer to the Threshold Configuration Migration section for more information.
Fixed Defect:
User was not able to set the default host parameters in the Infrastructure Manager version of the probe. Salesforce case 00169100
3.10
What's New:
GA
July 2015
GA
January
2015
GA
July 2014
Added three new fields (Max Ping Threads, Max Service Threads, and Max PacketLoss Threads) under the Setup Properties > Advanced > Performance Properties section in the IM GUI, and under the net_connect > General Configuration section in the AC GUI, to let the user configure a separate number of threads for the ping, services, and packet loss features.
The probe now sends a single ping for the connectivity check and response time calculation instead of separate pings for each.
Added Log Size field to define the size of the log file.
Fixed Defects:
User was not able to monitor the IPv6 host. Salesforce case 00154639
Multiple alarms were received and the alarm message contained the text 'jitter is above threshold limit' even when no threshold was defined. Salesforce case 00162039
User was not able to change the message for jitter alarms. Users can now edit the existing (default) jitter alarms, but cannot create new jitter (OK and Fail) alarms using Message Pool Manager in the IM GUI. These two default jitter messages are sent by the probe for all profiles. Custom jitter alarms cannot be configured using Host or Service properties (the probe does not support this functionality). Edits to the message text of the default jitter alarms are reflected in the Alarm Console. It is recommended not to delete the default jitter alarms, as they are the only 'jitter OK' and 'jitter Fail' alarms the probe sends, and they cannot be added again from the probe GUI. Salesforce case 00137628
3.05
Fixed Defects:
Probe stopped working with services that did not have Active key. Salesforce case 00141024
After re-applying default settings to a profile, a newly created message did not get saved in the configuration file. Salesforce case 00146089
Runtime error occurred when a service monitor was edited. Salesforce case 00148572
After the value of the Failed intervals field was changed once, it could not be changed again. Salesforce case 00149390
If a service was added using the drag-and-drop function, some default sections were not saved in the configuration file. Salesforce case 00142415
3.04
Fixed a defect where the probe was constantly in error state and not generating a PID. Salesforce case 00119246
Fixed a defect where the QoS Target associated with old profiles was not changing to Profile Name if the probe version was upgraded. Salesforce case 00135038
Fixed a defect where the probe stopped rotating the log file if the log size was set higher than 2 GB. Salesforce case 00128940
3.03
Fixed Defects:
April 2014
For an inaccessible service on a host machine, the probe was not delaying the retry attempt for the time specified in the Delay Between Retries field.
User was not able to change the severity of the alarm 'Failed to execute profiles in scheduled time interval'. This alarm is now configurable through the GUI.
The probe failed to restart when the Monitor ICMP connectivity (ping) check box was unchecked and the probe was
restarted.
3.02
Fixed the issue where the probe was generating faulty "connection failed" alarms and "NULL" QoS for devices without a publicly defined hostname, such as switches. Now, the probe uses the IP address for devices that do not have a publicly defined hostname and generates appropriate alarms and QoS.
January
2014
Fixed an issue where jitter and packet-loss alarms were not clearing.
Fixed an issue where the probe sends 0 as the response time for all pings instead of the actual response time.
Fixed an issue where the net_connect probe sends QOS value 0 for devices that do not ping.
Fixed an issue where the net_connect probe sends duplicate QOS when packet-loss monitoring is enabled.
3.00
2.93
Fixed an issue of CPU usage and clock drift (QOS with negative response time).
November
2013
July 2013
Added entries in CiManager for unit of Packet latency and Packet Jitter.
2.93
May 2013
2.92
Fixed an issue where, in a parent-child relationship, the probe processed the child even when the parent node was down.
April 2013
April 2013
December
2012
Fixed a probe crash on restart and empty callbacks call from probe utility.
July 2012
Fixed an issue to sort profile on basis of IP address. Supported sorting for both IPv4 and IPv6 address.
June 2012
Fixed an issue where change in QoS source for one profile also changes other profiles.
April 2012
2.71
January
2012
December
2011
2.70
October
2011
2.66
October
2011
2.65
Fixed an issue where the probe was reporting incorrect jitter values in alarm and QoS.
Fixed a crash caused by the probe not waiting long enough for profile and service threads to exit gracefully.
September
2011
The probe now supports a new raw configurable attribute named thread_timeout for setting the thread timeout value in seconds.
The probe looks for the new attribute under the setup section of the configuration file. By default, the probe waits 30 seconds for threads to exit gracefully on probe stop/restart.
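A minimal sketch of how this raw-configuration key might look, assuming the standard UIM .cfg section syntax (the 60-second value is illustrative, not a recommendation):

```
<setup>
   thread_timeout = 60
</setup>
```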
Added option for changing QoS source.
Note: Do not change the source unless you have good reasons. The Robot address is the "correct" source for this
probe.
2.63
Added support for configuring packet loss alarms and QoS separately.
Added packet jitter value to the values PDS sent with alarms.
The key name for this value in the alarm is packet_jitter.
Added support for expanding group value for chained host profiles.
Added an alarm when profile execution exceeds regular profile execution interval.
August
2011
2.62
April 2011
Fix: GUI bug when using bulk configurator to configure services. If text field 'Delay between retries' is empty,
configuration of services would silently fail.
Changed default ICMP packet size to 32 bytes, default delay between packets to 30 seconds, default packets to send in
packet loss to 10.
Added alarm on jitter.
Added option in UI to configure service threads.
Updated bulk configuration UI for missing configuration parameters.
2.53
2.60
March
2011
February
2011
Added new check boxes for Latency and Jitter in SOC GUI.
2.52
2.51
January
2011
December
2010
2.41
2.40
October
2010
September
2010
August
2010
August
2010
Added a fix to provide a number of retries, with a delay between retries, before sending an alarm.
Added a fix to properly clear service alarms.
Redesigned the bulk configuration window.
Added an option in the GUI to set default service parameters.
2.34
Added fix to remove error message in logs in ssh service by sending SSH identification string.
June 2010
Added fix to use new API when both IP and hostname are available to avoid name-lookup.
Added fix in probe to send proper hostname in callback.
Added fix in GUI to update hostname and ip address on test.
Added fix in probe to break the loops on restart/stop.
Added fix in probe to initialize the buffers, before checking the response while monitoring the services.
Added fix in probe to change "Bind to network interface" on restart.
Added fix for crash in GUI when profile was renamed and deleted.
Commented code for group expansion.
Added code to add contact info field in bulk configuration.
Added a field for delay between packets in profile form.
Fixed constant QoS value.
2.26
Added a fix to not clear the alarm in every interval before sending the challenge response failure alarm.
February
2010
2.25
February
2010
2.24
Added a fix for intermittent probe restart when running a service request.
Updated callback to test the response using IP address when both hostname and IP is specified.
February
2010
2.23
December
2009
2.15
Fix for not always using correct IP address when hostname is specified in profile. (Fix available in version 2.15 and from
version 2.23, not versions in between).
December
2009
2.22
December
2009
2.21
December
2009
2.12
August
2009
June 2009
Fix: If a connection test fails, the QoS for packet loss is NULL not 100(%).
April 2009
Fixed problem with the challenge response timeout not being read. Note the following issue:
The Timeout field on the service properties dialog requires a value. If empty, the GUI will crash!
2.04
2.03
March
2009
January
2009
January
2009
2.00
May 2008
March
2008
1.80
Fix: GUI could cause a runtime error if you entered an integer value as name for a new profile.
Fix: ping test response times should be correct now
September
2006
Fix: drag-and-drop bug that allowed you to drag a host from the GUI and drop it in another program, making it disappear from the net_connect GUI.
Fix: IP addresses should be easier to edit.
Feature: Bulk configuration tool.
Feature: "Group" variable can be used in alarm messages.
Feature: Ability to generate alert when ping latency threshold exceeds a customizable limit.
Feature: When adding a profile with a name that already exists, program will suggest a new name.
Feature: The new host dialog has improved code to detect and suggest parent group/folder.
Feature: Simple port scan utility.
Added more standard port numbers to probe definition.
Fixed name-resolution problem in UI.
1.70
February
2005
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.43
Fixed Defect:
GA
December
2015
GA
October
2014
Probe was displaying the current value instead of the accumulated value. A key, use_accumulated_value_in_qos, has been added in Raw Configuration to enable you to change the value as required. Support case number 00168839
Note: For more information about this key, refer to Upgrade Considerations.
1.42
What's New:
Added support for Solaris 64-bit SPARC.
Fixed Defect:
Fixed a defect where the probe did not start on the Linux x64 system. Salesforce case 00127020
1.41
Fixed a defect where the maxSample value in the QoS message was not reflected for a 10 Gbps interface.
April 2013
December
2012
1.30
March
2011
July 2008
May 2008
Added: Post-install event that tries to select an adapter based on sniffed traffic and saves it to the CFG file. The probe does not support a blank adapter in the CFG file.
August
2006
March
2004
Installation Considerations
Ensure that the WinPcap library does not exist on the 32-bit Windows 2008 system where the net_traffic probe will be installed.
Upgrade Considerations
Consider the following upgrade scenarios:
Upgrading the probe from version 1.41 or earlier to version 1.43
Set the value for use_accumulated_value_in_qos to yes in the Raw Configuration interface to enable the probe to send the accumulated value in the QoS data. By default, the key value is set to no, so the probe sends the current value in the QoS data.
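For illustration only, assuming the standard UIM .cfg section syntax, enabling accumulated values in the net_traffic configuration would look something like:

```
<setup>
   use_accumulated_value_in_qos = yes
</setup>
```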
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
SNMPv3 Support
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.38
Fixed Defects:
GA
December
2015
GA
October
2015
GA
July 2014
GA
September
2013
GA
June 2013
Updated the metrics document with information on latency value calculations. Support case numbers 00162025,
00165092
The probe did not use case-sensitive discovery of monitoring objects. The same value was displayed for different
objects with the same name. Support case number 244877
Updated the probe to optionally enable case-sensitive discovery of monitoring objects. For more information, see the Configure General Properties and Create Agent Profile sections of the netapp IM Configuration.
The probe displayed incorrect indicators for monitor status in probe interface. Support case number 70007002
1.37
Fixed Defects:
The units for LUN Latency checkpoint did not match on the UMP and SLM database. Salesforce case 00165092
The probe used to restart after seven days. Salesforce case 0070001695
While creating a new custom monitor, the probe was unable to sort OIDs appropriately when the number of OIDs
exceeded nine. Salesforce case 00167324
Incorrect alarms were generated for 'Not between' operator as only one of the two defined threshold values was getting
saved. Salesforce case 00155465
1.36
Fixed Defects:
Fixed an issue in which the Volume Names were sometimes displayed in HEX ASCII code in the Alarms and Logs. Salesforce case 00120380
1.35
Fixed Defects:
Fixed an issue where LUN Size Used and LUN Percent Used were showing up as zero or an exclamation sign for some values.
Need to capture LUN Latency - System Metric from NetApp Probe.
Fixed an issue where the probe was unable to create a Static monitor.
Fixed the failure of Threshold monitoring.
Fixed the failure to apply Auto-configuration.
Fixed a defect in count-based metrics.
Fixed the SNMPv3 authentication issue.
1.34
1.33
Fixed an issue where the probe had problems getting values with NetApp C-mode.
Fixed an issue where logs showed error messages for the calculation of multiple OIDs.
Fixed an issue where the netapp agent reported a value of zero on formulas using more than one OID variable.
March
2013
Fixed an issue where the agent profile did not appear to use the default global timeout/retries settings.
Fixed a defect where netapp displayed an incorrect value for average-based checkpoints.
Fixed a defect where netapp gave incorrect unit labels for the QoS definition "QOS_STORAGE_PERCENT_CAPACITY_USED_VOL".
1.32
1.31
Fixed a defect where a formula-based checkpoint using more than one OID gave wrong values.
Fixed a defect for issues with NetApp monitoring.
Fixed a defect where the log file size was not changing.
January
2013
December
2012
Fixed a defect where the description in NetApp - Total Capacity - aggregates - Monitor Properties was incorrect and should say 'GB'.
Fixed a defect where the formula used to calculate Total Raw Capacity used 1000 instead of 1024.
Merged monitoring frequency latency changes.
1.30
1.24
November
2012
November
2012
1.22
June 2012
Added two new threshold operations ('=a or =b' and '!=a and !=b').
Fixed the scale limit for SNMP queries of more than 300 OID indexes.
Fixed Custom Monitors that were not working properly or correctly documented.
Fixed Online Help not launching properly from the IM GUI.
Improved DEBUG and ERROR logging.
Fixed ONTAP API QoS metrics not showing correctly in the Infrastructure Manager GUI.
Added a netapp.cfg flag to force a graceful probe restart at a designated hour of the day.
Added new monitors: CP Time, LUN Read Latency, LUN Write Latency, and Other Latency monitors to the DTA configuration file.
Note: Version 1.21 was an interim hot fix release which is deprecated by 1.22.
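The semantics of the two threshold operations can be sketched in Python (a hypothetical illustration, not probe code; the function names are invented):

```python
def matches_either(value, a, b):
    # '=a or =b': true when the sample equals either threshold value.
    return value == a or value == b

def matches_neither(value, a, b):
    # '!=a and !=b': true when the sample equals neither threshold value.
    return value != a and value != b

print(matches_either(5, 5, 9))   # True
print(matches_neither(7, 5, 9))  # True
print(matches_neither(5, 5, 9))  # False
```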
1.20
March
2012
Fixed a deadlock that stopped QoS metrics delivery for certain configurations after running for a limited time.
Fixed a memory leak that caused the netapp.exe process to continually consume memory.
Fixed HEX data showing up in the 'target' (aka instance name) field in published QoS metrics in the data engine for LUN volumes.
1.12
December
2011
October
2011
March
2011
Installation Considerations
Consider the following information before you deploy and use the probe:
SNMP v3 requires that your network community strings are at least eight characters long.
ONTAPI requires a user with administrator privileges.
The monitors dependent on ONTAPI are available only when ONTAPI Settings are configured. The requirement also applies to the Auto
Configuration node.
The netapp Basic Monitors template is available as the default template for the 7-mode. The netapp C-mode Basic Monitors template
is available as the default template for the c-mode.
Upgrade Considerations
Consider the following information before you upgrade the probe from version 1.36:
The Default Templates Settings from the previous version are not retained.
The default template is applied when the Default Templates Settings screen is opened and OK is pressed.
SNMPv3 Support
The netapp probe can monitor agents based on the SNMPv3 protocol. Ensure that the following guidelines are met when monitoring SNMPv3 agents.
If your probe instance is monitoring multiple SNMPv3 hosts, ensure that the EngineIDs of all hosts are unique. Duplicate EngineIDs may cause sporadic connection timeouts and failure alarms.
The probe does not allow you to create duplicate profiles for an SNMPv3 agent. Do not use the Raw Configure option, or edit the configuration file directly, to create multiple profiles for the same SNMPv3 agent. Doing so can cause unpredictable results.
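To illustrate why unique EngineIDs matter, here is a hypothetical Python sketch that flags hosts sharing an SNMPv3 EngineID; the host names and EngineID strings are invented, and the probe itself performs no such check:

```python
from collections import defaultdict

def find_duplicate_engine_ids(engine_ids):
    """Group hosts by SNMPv3 EngineID and return any IDs shared by more
    than one host; duplicates can cause sporadic timeouts and alarms."""
    by_id = defaultdict(list)
    for host, engine_id in engine_ids.items():
        by_id[engine_id].append(host)
    return {eid: sorted(hosts) for eid, hosts in by_id.items() if len(hosts) > 1}

# EngineIDs as reported by each monitored device (invented values):
hosts = {
    "filer-a": "80001f888059dc48",
    "filer-b": "80001f888059dc48",  # cloned appliance reusing the same EngineID
    "filer-c": "80001f8880aabbcc",
}
print(find_duplicate_engine_ids(hosts))  # {'80001f888059dc48': ['filer-a', 'filer-b']}
```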
The NFA Inventory probe is the integration point between CA Network Flow Analysis and CA Unified Infrastructure Management (CA UIM). The
integration allows you to view NetFlow/IPFIX/Jflow/Sflow data in context with SNMP data for an interface inside of the Unified Management Portal
(UMP).
The views displayed are:
Stacked Protocol Trend - In
Stacked Protocol Trend - Out
Top Hosts
Top Conversations
The integration enables customers to drill from CA UIM to CA NFA for additional diagnostic detail, and back to CA UIM from within CA NFA.
Multi-tenancy is supported.
interfaceMappingDelay - The time in minutes after an inventory update to perform the mapping of interface groups to origins. The minimum value is 1, the maximum value is 15, and the default value is 5.
interfaceMappingBatchSize - The number of interfaces to request origins for in a batch. The minimum value is 1, the maximum value is 20000, and the default value is 1000.
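A minimal Python sketch of how these bounds and defaults could be validated before use (clamp_setting is an invented helper, not part of the probe):

```python
def clamp_setting(value, lo, hi, default):
    # Fall back to the default when the setting is missing,
    # otherwise clamp it into the documented [lo, hi] range.
    if value is None:
        return default
    return max(lo, min(hi, value))

# interfaceMappingDelay: min 1, max 15, default 5 (minutes)
print(clamp_setting(None, 1, 15, 5))       # 5  (missing -> default)
print(clamp_setting(30, 1, 15, 5))         # 15 (too large -> clamped)
# interfaceMappingBatchSize: min 1, max 20000, default 1000
print(clamp_setting(500, 1, 20000, 1000))  # 500 (in range -> unchanged)
```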
Contents
Revision History
Probe Specific Configuration Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
General Use Considerations
Revision History
Version
Description
1.1
Added support for Single Sign-on (SSO) between USM 8.31 and NFA 9.3.2:
SSO without LDAP or SAML2
State
Date
GA
July 2015
GA
May 2015
Initial version.
Installation Considerations
1. Install the package into your local archive.
2. Drop the package from your local archive onto the targeted robot.
3. Use Admin Console to access the probe configuration user interface.
Upgrade Considerations
None.
The NFA Inventory (nfa_inventory) configuration is available through the Admin Console user interface only and not through the Infrastructure
Manager (IM) user interface.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for the probe.
Version
Description
State
Date
2.32
Fixed Defects:
GA
December
2015
GA
April 2013
Updated Probe Specific Software Requirements section of the Release Notes to include the information about the
type of operating system (32-bit or 64-bit) to be used while deploying the probe. Also, added a Known Issue to mention
the probe version that is compatible with 32-bit servers installed on a 64-bit system. Support case number 00169666
Updated the content for Set up IBM Notes client. Support case number 70000934
2.31
Fixed a defect where a server with address format cn=???/ou=???/o=??? was not getting processed.
2.30
March
2013
2.20
June 2010
2.12
Added fix to load a proper 'mail_response' alarm level value to the messages listview.
January
2010
2.11
Implemented code locking when calling Lotus Notes APIs. The change is implemented to avoid deadlocks when multiple
profiles execute in parallel.
October
2009
Built with a new API and a new compiler. Note that the requirements have changed to Notes Client >= 7.0.2 and Robot >= 3.00.
2.04
February
2008
Threaded version.
Improved handling of error situation.
July 2006
1.18
September
2005
Known Issues
The probe has the following known issue:
You cannot use the probe version 2.31 or earlier to monitor 32-bit servers that are installed on a 64-bit system, either locally or remotely.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.53
Fixed Defects:
GA
November
2015
GA
July 2013
The notes_server64 probe could not be configured. Support case numbers 00166940, 00167461
Updated the probe documentation for Regular Expression support when you configure the probe to monitor the Statistics
variable. Support case number 70000737
1.52
1.51
1.40
March
2013
March
2011
1.30
June
2010
1.20
December
2009
Note that the requirements have changed to Notes Client >= 7.0.2 and Robot >= 3.00.
1.10
March
2005
The nq_services probe makes services internally available to CA Network Flow Analysis and other CA NetQoS products to allow these products
to interrogate CA Unified Infrastructure Management (CA UIM) for data.
Contents
Revision History
Revision History
Version
Description
State
Date
1.0
Initial version.
GA
August 2015
Note: It is recommended that the CA Unified Infrastructure Management Server and CA Unified Management Portal (UMP) be the same version.
Robot version 5.23 or later
Java Virtual Machine (JVM) version 1.7 or later (deployed as part of the probe package)
CA Network Flow Analysis R9.3.2 or later
Installation Considerations
1. Install the package into your local archive.
2. Drop the package from your local archive onto the targeted robot.
Upgrade Considerations
None.
nsa does not require a run-time environment (including UIM), and contains all functionality within a single binary.
The user may encrypt sensitive data (for example, UIM or database login passwords) and may incorporate this into the scripts.
Contents
Revision History
Hardware Requirements
Software Requirements
Installation Considerations
Revision History
This section describes the history of the revisions for the nsa probe.
Version
Description
State
Date
2.06
Fixed an issue in which extra lines of text occurred in the CLI when the probe was active. Salesforce case 00158212
GA
GA
March 3, 2015
2.05
Fixed an issue in which running nimbus.alarm() using an 88-character suppkey crashed nsa. Salesforce case 00092078
Fixed an issue in which SNMP v3 failed to create SNMP objects. Salesforce case 00117304
2.04
Fixed an issue in which the nsa probe segmentation-faulted when probe-based authentication was used on custom probes on SLES 11 64-bit.
GA
September 9,
2014
2.01
Fixed the linking with MySQL Connector for C on Linux. Earlier, the probe had runtime dependencies on MySQL libraries.
GA
December 31,
2010
GA
November 29,
2010
GA
GA
December 30,
2009
GA
September 30,
2009
GA
2.00
1.12
1.11
1.10
1.0
Initial Release.
Hardware Requirements
No additional hardware requirements.
Software Requirements
The nsa probe requires the following software environment:
The MySQL Connector for C 6.0.2 for your platform (http://www.mysql.com/downloads/connector/c/#downloads) if you want to use
MySQL.
Installation Considerations
The nsa package is installed by dropping it onto the target Robots in Infrastructure Manager, and installs to the SDK directory.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for the nsdgtw probe.
Version
Description
State
Date
1.23
Fixed Defects:
GA
December
2015
GA
July 2014
GA
April 2013
GA
August
2012
GA
June
2012
GA
March
2012
The probe did not generate incidents in CA NSD when the alarm message included special characters. Added optional
ability in Admin Console to remove invalid characters from alarms before the probe creates an incident. Support case
numbers 00148396, 00154262
For more information, see the Configure General Properties section in nsdgtw AC Configuration.
The probe did not accept whitespace character in the Requester Organization field. Support case number 00167819
What's New:
Added ability to set up proxy settings for connection to CA NSD through Admin Console.
For more information, see the Set Up CA NSD Connection section in nsdgtw AC Configuration.
1.22
Fixed Defects:
Fixed a defect in which the probe sent unnecessary calls to NSD for CLEAR alarms regardless of whether a ticket was
associated. Salesforce case 00136199
Fixed a defect in the probe locking feature available through the controller probe. Salesforce case 00108629
Fixed a defect in which the probe was not generating a ticket for alarm messages containing special characters. Salesforce
case 00117647
1.21
1.20
Features added:
Users can now configure the "Resolved" ticket status to trigger alarm closure along with the "Closed" ticket status.
Alarm closure can now close the ticket, and users can configure which status ("Resolved" or "Closed") is set on the
ticket when the alarm closes.
Users now have an option to map alarm severity with ticket severity, with ticket priority, or to have no mapping at all.
The ticket priority or severity can also be changed to a configurable value when an alarm is cleared or acknowledged.
Fixed defects:
Incorrect "Time Origin", "Time Arrival", and "Time Received" were shown on the NSD ticket.
Incorrect field mapping for the affected device in Service Desk.
The "Message:" prefix did not appear in the "Symptom Description" field of the NSD incident ticket.
1.15
Fixed defects:
Unable to open "Field Mapping" in the nsdgtw probe after all Standard fields are removed from the Field Mapping list.
Unable to remove a field mapping when it is in edit mode.
1.14
Modified the probe to ensure that the severity for incidents is updated correctly based on the severity mappings
configured for the probe.
Also corrected the logic to save the last incident check timestamp to ensure that all alarms are successfully
cleared/acknowledged and none are skipped or missed during processing.
1.13
Modified the probe to ensure that all the NMS Alarms are cleared out when multiple incident tickets are closed within
service desk.
GA
February
2012
1.10
GA
June
2011
GA
May 10
2011
Added a new Requester Organization field with dollar-variable support for multi-tenancy.
1.06
Removed the auto-assign filter for suppression count that caused recursive incident creation.
Added a custom mapping section in the .cfx file.
1.05
GA
May 2
2011
1.04
GA
April 2011
Installation Considerations
Test the Auto Operator function of Nimsoft Alarm Server (NAS) and the Auto Assignment feature of the probe. CA recommends that you ensure
that rules in both features do not assign an alarm to the same CA UIM user. Erroneous assignment rules can create additional or duplicate
tickets.
Some assignment considerations are as follows:
If you assign an alarm to a Service Desk user, then unassign and reassign it, a duplicate incident is created.
If you have Auto Alarm Assignment turned on in nsdgtw with some filter, no alarms are assigned in NMS. However, an incident is created
in the Service Desk. Now, if you try to assign the same auto alarm in NMS, a duplicate incident is created in Service Desk.
If you already have the NAS auto-operator set for certain conditions, CA recommends that you do not use Auto Alarm Assignment. If both
(NAS auto-operator and auto alarm assignment) catch similar filter criteria for an alarm, duplicate incidents result.
If Offline Management is turned off and a user assigns an alarm to a Service Desk user while the probe is turned off, no incident is
created in Service Desk. For more information on Offline Management, see nsdgtw Advanced Configuration.
Known Issues
The probe has the following known issues:
The Alarm Severity and SLA section saves the mapping details to the configuration file of the probe. The probe uses the mapping details
but does not display the mapping details.
The probe does not generate incidents in CA NSD when the alarm message includes special characters. You can select Discard Invalid
Characters in the General Configuration section of the hostname node in Admin Console. The probe automatically handles invalid
characters when configured using Infrastructure Manager (IM) GUI.
Proxy settings for connection to CA NSD are only available through the Admin Console interface of probe version 1.23 or later.
Note: An event is a significant activity in a system or application that requires user attention. Microsoft Windows logs all such events
and makes them available to the user through the Event Viewer tool. This process helps the user identify and troubleshoot
hardware or software issues in the system.
Contents
Revision History
Supported Locales
System Specifications
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
This section describes the history of revisions for the ntevl probe.
Version
Description
State
Date
4.21
Fixed Defect:
GA
December
2015
Beta
December
2015
GA
October
2015
GA
July 2015
Updated information in the documentation about monitoring security logs for Microsoft Domain Controllers. Support case
number 70006645
What's New:
Added support to exclude System, Application, and Security logs from monitoring.
4.20
Fixed Defect:
ERROR eventlog was not mapped with the correct UIM alarm severity. Support case number 70007434
What's New:
Re-designed the probe to improve scalability.
The new probe design is applicable for Windows Vista and later.
4.12
Fixed Defect:
Alarm variables were not expanding for non-ASCII characters. Salesforce case 00166695
The probe crashed due to incorrect logging in the probe. Salesforce case 00162278
4.11
What's New:
Upgraded support for factory templates.
Added a note in the Admin Console GUI article that describes how the Event Count alarm works only if at least one event is
triggered for the matching profile. Salesforce case 00159016
4.10
Beta
June 2015
4.03
GA
March
2015
4.02
What's New:
Added the Enable Position File Backup Interval check box on the Properties tab. Salesforce case 00145842
Fixed Defects:
Messages appeared in reverse order when the probe was run on Windows Server 2003 Standard x64 Edition R2
(SP2). Salesforce case 00149339
Probe CPU consumption was very high. Salesforce cases: 00146816, 00152031, 00151572, and 00150479
January
2015
4.01
What's New:
September
2014
Added the localization support for B-Portuguese, Chinese (simplified and traditional), French, German, Italian,
Japanese, Korean, and Spanish languages from both IM and Admin Console GUI. For localization support through
Admin Console GUI, the probe must run with PPM 2.38 or later version.
Updated the probe IM GUI and Admin Console GUI for specifying the character encoding in different locales.
Note: Do not use the Raw Configure GUI for updating the probe configuration in the non-English locales because it can
corrupt the probe configuration file.
Fixed Defects:
Fixed the issue of removing quotes, double quotes, and comma from the event message text when generating alarms.
Salesforce cases: 00142620, 00140474
Fixed the issue of the Critical alarm severity not displaying in the drop-down list when the probe is hosted on a Windows
Server 2008 operating system. Salesforce case 00132207
Fixed the defect in the IM probe GUI where the probe was not saving the updated log level in the probe configuration file.
Salesforce case 00140301
Fixed the issue where the probe was not resolving the $severity_str variable value. Salesforce case 00140472
Fixed the defect where the probe was adding extra characters to the date string while displaying event details on the IM probe
GUI and in alarms. Salesforce cases: 00134035, 00133326
3.91
Fixed the memory leak issue where the probe was unable to handle events when the event count was 10 digits or
more. This issue was causing high CPU and memory utilization by the probe.
April 2014
Fixed the defect of the Browse button on the VB GUI of the probe under the Event selection tab. The Browse button
stopped working correctly when the user clicked it to select a batch file.
Fixed the defect of the Description field on the VB GUI of the probe under the Event Properties dialog by allowing the user to
paste text in the field.
3.90
What's New:
December
2013
Fixed a defect related to Japanese event logs, where the event description was not displayed on the probe GUI.
Fixed the suppression key override issue, which was truncating the suppression text after 50 characters and appending
the profile name.
September
2013
July 2013
Fixed an issue where alarms generated from the ntevl probe were getting de-duplicated.
3.83
Fixed an incorrect event type display issue on Japanese text-supported systems.
April 2013
Added functionality to monitor Operational and Admin event logs (introduced from Vista/Windows 2008 onwards).
Added Probe Defaults.
December
2012
Fixed a defect where the probe GUI was slow to respond when there were a large number of events.
Fixed memory leaks.
3.70
September
2012
3.63
3.62
3.61
July 2012
November
2011
October
2011
Fixed a crash that used to occur when the event description was larger than 2 KB.
3.60
June 2011
February
2011
December
2010
Applied a fix to remove extra white space which was appearing after removing newline characters.
2.35
Added a fix for replacing recurring hard returns with a single delimiter in description field.
August
2010
July 2010
June 2010
Enhanced the probe to allow generation of variables from the message body, and also to send alerts on these variables.
May 2010
Added support to raise an alert only after a particular number of instances of an event within a particular time frame.
3.30
March
2010
3.23
Resolved the problem where only a partial event list was fetched. The most obvious situation was on computer restart.
December
2009
3.22
Added a fix in evlWmi library for fetching InsertionStrings column value from WMI if Message value is not available.
November
2009
3.21
November
2009
Added a fix in the probe and GUI for replacing hard returns with a user-defined delimiter in the event description field.
Fixed a Daylight Saving Time issue.
October
2009
September
2009
3.10
Updated the configuration file for event logs; added preconfigured event logs (Application, System, and Security) in the configuration section.
Updated the WMI library for handling custom event logs.
September
2009
Added the wmi_timeout key in the setup section of the configuration file. This key can be used to set the WMI query timeout in
seconds if there is a huge number of events.
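As an illustrative sketch only, a raw configuration entry for this key could look like the following. The standard Nimsoft sectioned key = value layout is assumed, and the 60-second value is an example, not a documented default:

```
<setup>
   wmi_timeout = 60
</setup>
```

With an entry like this, a WMI query working through a large backlog of events would be allowed up to 60 seconds before timing out.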
Fixed an issue with the no-propagation alarm functionality.
Added a fix for Windows Vista running Service Pack 1 or earlier to fetch the event indexes using WMI. Vista versions
prior to SP2 had an issue where the probe was unable to fetch the event indexes properly.
Added a fix in the evlWmi library for handling the computer's FQDN. On some Windows platforms, when a machine is in a
domain, the computer field of event logs shows the computer FQDN. Earlier, the probe failed when checking the
computer field of watchers/excludes.
3.02
Fixed an issue in the Exclude tab where the excluded events were not automatically activated after upgrading the probe
from versions that did not have an option to activate or deactivate such events.
July 2009
Now, after upgrading from a previous version, the probe activates all the excluded events by default.
Added underlying OS version detection; if the OS version is Windows 2000 or earlier, the probe
triggers an alarm and stops execution.
3.01
April 2009
April 2009
December
2008
September
2008
2.15
Modified regular expression comparison code to avoid problems with large string comparisons in exclude profiles.
September
2007
2.14
Opens registry with the minimum necessary access rights to avoid generating security events on Windows 2003 Server.
January
2007
2.13
Added possibility to enter 'localhost' in the computer field of the watch and exclude profiles to only match on events from the
local machine to the probe.
November
2006
2.10
Modified initial configurator sorting of events. Added option to allow starting the configurator without fetching the event list.
January
2006
2.02
Added support for variables in the alarm message, suppression key, and subsystem. The variables are: profile, description,
source, event_id, category, message, log, severity, user, computer, and time_stamp.
Added a Quality of Service message for the number of events found.
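As a hypothetical illustration of these variables, a watcher profile in the configuration file could reference them in its alarm message and suppression key as sketched below. The profile name and the active, message, and suppression key names are assumptions for the sketch; only the variable names come from this release note:

```
<watchers>
   <app_errors>
      active = yes
      message = $source event $event_id on $computer: $description
      suppression = $profile-$event_id
   </app_errors>
</watchers>
```

At alarm time, each $variable would expand to the value taken from the matching event, so repeated occurrences of the same event ID would suppress into a single alarm.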
Supported Locales
From version 4.0 and later, the ntevl probe supports the following non-English locales:
December
2004
B-Portuguese
Chinese (traditional and simplified)
French
German
Italian
Japanese
Korean
Spanish
The probe supports the following system encoding for the different locales:
Encoding
Name
UTF-8
Unicode (UTF-8)
UTF-16BE
UnicodeBigUnmarked
UTF-16LE
UnicodeLittleUnmarked
Shift_JIS
Japanese (Shift-JIS)
ISO-2022-JP
Japanese (JIS)
ISO-2022-CN
Chinese(ISO)
ISO-2022-KR
Korean (ISO)
GB18030
GB2312
Big5
EUC-JP
Japanese (EUC)
EUC-KR
Korean (EUC)
ISO-8859-1
ISO-8859-2
windows-1250
windows-1252
System Specifications
Probe version 4.2 is tested for the following operating system configurations in both event and poll modes.
Note: CA recommends using event mode over poll mode when the event generation rate is over 400 events/sec.
Operating system
RAM
Number of cores
8 GB
2 (3 GHz)
4 GB
1 (3 GHz)
Installation Considerations
The ntevl probe monitors the event logs for new messages and generates alarm messages according to the defined setup. You can configure the
probe to trigger each time a new message is added to the event log. You can also check the event log for new messages at a fixed interval to
reduce the system load. Consider the following points while installing the ntevl probe:
Restart the probe when the time zone is changed or when "Automatically adjust clock for daylight saving changes" is selected or cleared.
The Windows event log watcher probe version 3.0x uses WMI to retrieve the event logs. Accessing Windows event logs using WMI may
severely affect the performance of a Windows 2000 system. If the probe is deployed on a Windows 2000 system, the probe raises an
alarm and stops execution.
Upgrade Considerations
When upgrading the probe from any previous version to 4.00 or later, delete the conf_ntevl.exe file from the temp > util folder.
Important! During upgrade, if the event_record_id in the ntevl pos file differs from the ID of the Event Viewer (Windows application), then the
probe first reads the previously generated events and sends an alert on newly generated events if the -Z option is not provided as an argument.
Known Issues
Note: In case you are using CA UIM 8.0 or later, you can use Raw Configure GUI only through the Admin Console.
If a monitoring profile contains locale-specific characters, then that monitoring profile cannot be viewed in any other locale from the IM
probe GUI. You can use the Admin Console GUI to view the profile in a different locale.
CA recommends using event mode over poll mode when the number of events generated is more than 400 events/sec and the log file size
is larger than 10496 KB.
If you run the probe in a multi-threaded environment, the probe alert order can differ from the event generation order. This happens only
when the generation rate is high for a monitored event.
The IM probe GUI displays an error message and the Admin Console GUI can stop responding when the Maximum Event to Fetch field
value is more than 1000. You can update the field value to 1000 or less to resolve the issue. The actual event count can vary for
different system configurations.
Do not use the same profile name for the ntevl and adevl probes when they are deployed on the same robot.
Use either the IM GUI or the AC GUI of the probe to avoid any unexpected issues that can occur during probe configuration.
The events of Microsoft Windows applications and services can display a different category in the probe GUI as compared to the Event
Viewer tool of Windows. For example, the event with event ID 701 shows the event category as Online Defragmentation in the Event Viewer
tool and Performance in the probe GUI.
You get a Package Editor error when you start the ntevl probe along with any other i18n supported probe for the first time.
The probe GUI can display an error message while viewing event details when the Maximum Event to Fetch field value is more than
1000.
With CA UIM 8.0 or later, follow these steps:
1. Open the Raw Configure GUI.
2. Update the value of the fetch_number key (under the setup section) to 1000 or less.
3. Restart the probe.
With NMS 7.6 or earlier, follow these steps:
1. Open the IM probe GUI.
2. Update the value of the Maximum Event to Fetch field to 1000 or less (under the Setup > Properties tab).
3. Restart the probe.
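For CA UIM 8.0 or later, the Raw Configure change in the steps above amounts to a one-line edit. As a sketch only (other keys in the setup section are omitted, and the standard Nimsoft key = value layout is assumed), the resulting fragment would resemble:

```
<setup>
   fetch_number = 1000
</setup>
```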
The Admin Console version of the probe has the following additional limitations:
The probe does not show the details section of events having XML content.
The ntp_response probe checks the current time from NTP (Network Time Protocol) or SNTP (Simple Network Time Protocol) servers, and also
measures the response time of the query. The probe can also request status information from NTP servers. From version 1.4 onwards, the
ntp_response probe also supports NTPv4.
Alarms can be generated on response time and, in the case of NTP servers, on offset and jitter (in comparison with their time reference). Alarms can
also be generated on error situations.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the ntp_response probe.
Version
Description
State
Date
1.40
What's New:
GA
February
2015
GA
April 2014
GA
April 2013
GA
November
2012
To view the updated alarm subsystem ID of 1.1.3.7, reinstall the probe after removing any previous version of the probe from
the target robot. If you upgrade the probe version, the alarm subsystem ID remains 2.5.
Fixed a defect to send a clear alarm when the response time is less than the threshold.
Fixed a defect where QOS_DEFINITIONS were sent for inactive profiles and when QoS options were not selected.
Deactivated the test profile in the probe defaults.
1.30
1.24
GA
June
2012
1.23
GA
April 2012
1.22
GA
December
2011
GA
March
2011
GA
March
2011
GA
June
2010
1.21
1.20
1.10
1.04
Added sending of initial clear alarm on all profiles on startup and reconfigure.
GA
May 2010
1.0
Fixed a problem that sometimes caused program failure in the configurator when using the 'Test' button.
GA
April 2004
Revision History
Supported Locales
Threshold Configuration Migration
Operator Reversal after Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.02
What's New:
GA
August
2015
Added support to send QoS source as short name (only host name) or fully qualified domain name.
Added support of dynamic variables (object, counter, and instance) in static and dynamic threshold messages. This is
applicable on CA Unified Infrastructure Management 8.31 or later.
Added a note in the IM and AC GUI Reference articles recommending that the user provide a value in the Threshold
Operator and Threshold Value fields, as the probe is unable to send an alarm if these fields are blank. Salesforce case
00167095.
Fixed Defect:
The probe was displaying duplicate instances for a counter of a specific object. Salesforce case 00164171
2.01
What's New:
GA
June 2015
Beta
May 2015
Added localization support for B-Portuguese, Chinese (simplified and traditional), French, German, Italian, Japanese,
Korean, and Spanish languages for Admin Console GUI only.
Upgraded OpenSSL to version 1.0.0m.
Added a note in the Infrastructure Management and Admin Console GUI Reference articles because the probe generated alarms
when the reverse of the defined operator was met. Salesforce case 00153155
2.00
What's New:
Added localization support for B-Portuguese, Chinese (simplified and traditional), French, German, Italian, Japanese,
Korean, and Spanish languages for Admin Console GUI only.
Fixed Defect:
The QoS_WIN_PERF_DELTA value was always calculated as 0, if the Counters and Objects were defined and the Instances
were left blank. Salesforce case 00153243
1.90
What's New:
March
2015
Fixed Defect:
January
2015
The probe was unable to fetch the current value for the Token Request counter of the AD FS object. Salesforce case
00150809
1.88
Fixed Defect:
October
2014
The probe was not calculating the delta value of some counters. Salesforce case 00137816
1.87
Fixed Defect:
April 2014
Fixed the defect in which the clear alarm message text was the same as the corresponding priority alarm message text.
Fixed the defect where the counter value in the clear alarm message text differed from the actual value.
1.86
Fixed Defect:
March
2014
Fixed a defect of an incorrect Metric ID for QOS_WIN_PERF_MAX, which now has the same Metric ID as
QOS_WIN_PERF. This updated Metric ID helps in plotting the UMP graph correctly.
1.85
Fixed Defect:
June 2013
Fixed a defect where the counter and instance values were not displayed on the probe GUI.
1.84
February
2013
1.83
June 2012
1.82
May 2012
January
2011
December
2010
November
2010
1.71
June 2010
1.70
May 2010
Fixed an issue where the probe issued a clear alarm for all the profiles even if no alarm had been raised previously.
1.61
Added cluster_setup section to the default probe configuration file to enable cluster support for ntperf64.
May 2010
Changed the delta calculation to use the actual value instead of the averaged value.
Fixed a problem where the delta alarm needed to be enabled for the delta QoS messages to be sent.
1.60
September
2009
April 2009
1.42
December
2008
1.41
September
2008
Note: For version 1.41 and higher of this probe, NimBUS Robot version 3.00 (or higher) is a prerequisite.
You are advised to carefully read the document "Upgrading the NimBUS Robot" before installing or upgrading.
1.29
Altered the handling of profiles where no object or counter is specified. With the new behaviour, the probe looks for
objects and counters with no name.
September
2007
Changed ntperf to be dedicated for 32-bit objects and ntperf64 for 64-bit counters.
1.27
May 2007
1.23
June 2006
To store Quality of Service information for separate object instances in separate QoS tables, an additional
configuration option was added to the profile: 'When object has instances, add instance name to QoS table name'.
1.12
Supported Locales
The ntperf probe version 2.0 and later supports the following non-English locales:
B-Portuguese
Chinese (traditional and simplified)
French
German
March
2006
January
2005
Italian
Japanese
Korean
Spanish
Note: The non-English locales are supported only on the Admin Console.
Important! The Infrastructure Manager GUI of the probe will not be available if the probe is migrated using threshold_migrator.
Opening the IM GUI displays a message informing you that the probe can only be configured using the Admin Console and redirects
you to the probe Release Notes.
For more information on how to migrate a probe, see the threshold_migrator (Threshold Migrator) probe document.
Notes:
If both the ntperf and ntperf64 probes are deployed on the same robot, it is recommended that you configure only one of them at
a time for successful migration.
The changes in the probe after migration are:
The Infrastructure Manager GUI of the probe will not be available and the probe will only be configured on Admin Console.
Probe specific alarm configurations in the probe monitors will be replaced by Static Alarm and Time Over Threshold
configurations.
The alarms will be sent by the baseline_engine probe.
Any relational operators in the probe configuration will be reversed.
User defined variables will not be available.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for the ntservices probe.
Version
Description
State
Date
3.24
Fixed Defect:
GA
December
2015
GA
October
2015
GA
July 2015
GA
June 2015
The probe crashed when the user customized the configuration file by clearing the <Services> tag. Salesforce cases
70006602, 70006788, 70006179
3.23
Fixed Defect:
The probe did not generate clear alarms in the following situations: Salesforce case 00169873
Specified expected running state was 'not stopped' and retrieved actual state was Started.
Specified expected running state was 'not running' and retrieved actual state was Stopped.
3.22
Fixed Defects:
Fixed a defect where the probe displayed garbled special characters in the Service Name in the French language.
Salesforce case 00164218
3.21
Fixed Defects:
The probe did not revert to the earlier configuration after policy-based templates were deactivated.
The probe did not accept baseline configurations from policy-based templates.
Field descriptions in factory templates displayed numbers instead of actual descriptions.
3.20
June 2015
3.18
March
2015
3.17
Fixed a defect where an application error, Event ID 1000, was generated while upgrading the probe version. Salesforce case
00140528
August
2014
3.16
Fixed a defect where a service with angular brackets in the service name was not being monitored. The angular brackets are
now supported in the service name. Salesforce case 00127106
July 2014
3.15
Fixed an issue where stopping or starting existing or newly created services of automatic type did not
generate the clear alarm even when the expected state matched the actual state. Salesforce cases: 00128528,
00128511, and 00122930
June 2014
Fixed an issue where different Metric IDs were generated for the clear alarm and its corresponding priority alarm.
3.11
March
2013
3.11
Fixed a crash for a service that is active but not present in the system services.
Fixed a defect where the Apply button was not activated when a QoS value changed.
3.10
Added a feature for overriding the default suppression key profile name.
Added support to monitor (send Alarms and QoS) for "Not Responding" service status.
January
2013
December
2012
Fixed a defect where pause and resume functionality was not working.
Added Probe Defaults for the probe.
2.94
2.93
June 2012
May 2011
January
2011
December
2010
June 2010
2.70
March
2010
2.60
Added support for handling 'not stopped' and 'not running' values in the Expected Running State field.
Added support for displaying newly added services and raising alarms when a new service is added or started.
August
2009
Added support for raising an alarm, or raising an alarm and removing the service from the profile, when a service is removed from the
system.
Fixed an issue where the alarm was not cleared when a service is restarted.
Added support for automatically adding services to a profile based on "All", "Running", or "Automatic" criteria.
Added support for using the $robot variable in alarm messages.
Fixed an issue related to the probe not sending a clear alarm when the automatically add new services feature is disabled.
Fixed an issue in probe installation/upgrade. Now the probe maps the service states "start pending", "continue pending",
"pause pending", and "paused" to "running", and the state "stop pending" to "stopped".
2.50
2.43
2.41
April 2009
December
2008
September
2008
Note: For version 2.41 and higher of this probe, NimBUS Robot version 3.00 (or higher) is a prerequisite. You are advised to
carefully read the document "Upgrading the NimBUS Robot" before installing/upgrading.
2.34
Moved privilege handling from a thread to the main process to avoid a potential crash.
Threaded the 'list_services' call to better support its use by other probes for fetching remote computer
data.
Modified the authentication order when fetching remote computer data.
April 2008
2.32
Added the bOkToSendClear flag to avoid sending a clear alarm at restart for services with Action = force state.
Fixed a configuration read issue: global settings were only read on probe restart.
October
2007
Known issue:
When other probes use the ntservices probe to fetch service information from remote services, situations can occur where
the probe takes a long time to serve a request. In this situation, the probe is unavailable for other requests. For this
reason, do not set the list_services timeout too high.
2.30
September
2007
Added field profile_name that can be edited and used in alarm strings.
Added alarm monitoring of service account (log on as) credentials.
Added list of active profiles, service state and credentials callback for dashboard use.
Added configurable timeout to list_services callback.
Added option to ignore number of restart attempts.
2.21
December
2005
2.11
June 2004
Known Issues
The known issues of the probe are:
The setup section of the configuration file contains a delete_profile_section key. If the value of this key is set to yes, all service profiles
are deleted from the probe. The services are then configured again based on the settings in the Default values for added
services section of the probe GUI.
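A minimal sketch of the key's placement, assuming the standard Nimsoft sectioned key = value layout (the placement alongside other setup keys is an assumption for illustration):

```
<setup>
   delete_profile_section = no
</setup>
```

Setting the value to yes deletes all service profiles, which are then recreated from the Default values for added services settings; leave it at no unless you intend to wipe and rebuild the profiles.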
The Admin Console probe version has the following additional limitations:
The ntservices probe UI does not dynamically reload the state of the services. You must reload the configuration after changes are made to
a service through services.msc (Windows OS).
You can set up multiple configurable monitoring profiles to extract vital information about your database servers at specified intervals. Database
administrators can use this information to tune and optimize database performance, and do capacity planning.
The probe monitors local or remote Oracle database instances, periodically scans through monitoring profiles, and applies checks to the
instances. The probe successfully monitors:
32-bit Oracle client on a 32-bit OS
64-bit Oracle client on a 64-bit OS
Note: The oracle probe also provides monitoring support for Real Application Cluster (RAC) and is configurable only through Admin
Console.
With version 4.8 and later, the oracle probe introduced multi-tenancy support. The probe allows you to monitor Oracle 12c container database
(CDB) and pluggable database (PDB). Oracle 12c database introduces multitenant option with new CDB and PDB concepts. A CDB is similar to a
conventional database containing controlfiles, datafiles, data dictionary, redo logs, tempfiles, undo, and so on. A PDB only contains information
specific to its objects.
Contents
Revision History
Monitoring Capabilities
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Prerequisites
Add Environment Variables
Add CDB/PDB Connection Entries
Provide Access Rights for a Database User
Upgrades
Oracle Supported Versions and Clients
Known Issues
Revision History
This section describes the history of the revisions for the oracle probe.
Version
Description
State
Date
4.91
Fixed Defects:
GA
September
2015
GA
June 2015
When creating a profile from Admin Console, all the profile checkpoints are now inactive, by default. Salesforce case
00159509
When opening the probe in a deactivated state, the probe displayed incorrect upgrade message text. Salesforce case
00160547
The probe displayed incorrect status of the activated checkpoints in the Status tab. Salesforce case 00168455
On a non-Windows OS, the probe now displays the Device ID of the monitored Oracle database. Salesforce case
70002729
4.90
What's New:
Added support for AIX 6 and 7 platforms.
4.81
What's New:
Updated the probe Help link.
December
2014
4.80
What's New:
November
2014
Added support for Oracle 12c Container Database (CDB) and Oracle 12c Pluggable Database (PDB).
Added a checkpoint to monitor the number of PDBs in a specific CDB.
Fixed Defects:
Fixed a defect where the Profile Timeout alarms were visible in the alarm console even though the probe connection was
successful. The alarms did not clear even if the Clear Alarm on Restart check box was selected. Salesforce case 00144941
4.71
4.70
Fixed a defect where the Oracle probe GUI flickered on clicking Profiles/Checkpoints in the Status tab. Salesforce case
00143137
Added support for Oracle 12c.
Removed support for the Oracle versions 9i and 10g.
4.61
Fixed Defects:
October
2014
September
2014
July 2014
TNS alarms were visible in the alarm console even after the probe connection was successful. The alarms did not clear
even if the user changed the connection and the Clear Alarm on Restart check box was selected. Salesforce cases:
00122633, 00132174, 00121391
SQL timeout alarms were getting generated irrespective of the checkpoint and had incorrect suppression key.
For the tablespace that has autoext as Y, probe was calculating wrong value of FREESP variable. Due to the wrong
calculation, the messages displayed incorrect values. Salesforce case 00132777
The probe package of 32-bit was getting installed on 64-bit AIX system. (Salesforce Case: 00132673)
Instance name was not coming in the check_dbalive alarm message where the connection was unsuccessful. Salesfor
ce case 00131416
4.60 (March 2014)
New Features:
Implemented the Oracle RAC features, including 20 new checkpoints for RAC.
Note: RAC can be configured only through the Admin Console GUI.
Fixed Defects:
Fixed a defect where the global cache checkpoints were not returning any value for Oracle 11g.
4.56 (March 2014)
New Feature:
Added a callback that determines whether the dependencies are present on the system where the probe is deployed, so that the probe runs successfully.
Note: This feature is only applicable to the Admin Console GUI.
4.55 (January 2014)
Fixed Defects:
Fixed a defect where the probe did not display metrics for all tablespaces; UMP was not displaying QoS as a chart in the metric tree for all Oracle native checkpoints.
Fixed a defect where the QOS_ORACLE_tablespace_alloc_free Pct and QOS_ORACLE_tablespace_free Percent checkpoints generated alarms for each tablespace with almost identical values.
4.52
4.51
February
2013
January
2013
4.50
June 2012
November
2011
4.41
August
2011
June 2011
Custom checkpoints can now be updated when opened through the Status tab.
Fixed the issue of sending null QoS for dict_cachehit_ratio.
Supports Oracle client 11.2 for Linux 32-bit and 64-bit systems.
This version does not support Oracle client 11.2 for Solaris SPARC systems.
Fixed a memory leak issue.
Fixed an issue in the tablespace_free checkpoint where the probe was incorrectly calculating the free tablespace % for temp tablespaces.
Fixed another issue in the tablespace_free checkpoint where the free tablespace calculation was incorrect.
Internationalization corrections.
4.31
September
2010
Fixed other minor GUI issues that are related to a custom checkpoint.
Added field in the custom query tab to select predefined connections.
Added support for the role-based login.
Added support for alarms.cfg.
Added functionality for delta calculation in custom checkpoint.
Added code to prohibit the use of whitespace in configurator.
Added support to confirm if the user has enough rights to run the query.
Removed validation of adding at least one threshold while creating a checkpoint.
Fixed GUI crash defects.
Fixed a defect to return a valid checksum after restart for the custom query.
Fixed a defect where the remaining_extents metric took too long to complete.
Fixed a defect so that the correct binary is deployed on 64-bit Linux.
Fixed defects in v4 framework library.
4.20
June 2010
Fixed an issue where the probe was reporting incorrect values in the GUI.
4.11
4.10
Added a checkpoint active_connection_ratio for monitoring the percentage of active connections to the total available connections.
March
2010
March
2010
4.04
Fixed an issue in the database size checkpoint where total database size was reported incorrectly.
January
2010
4.03
January
2010
December
2009
Probe now sends NULL QoS for all the QoS defined in the QoS list when no data is received.
3.91
3.90
September
2008
July 2008
3.75
March
2008
Note: While the package contains Solaris binaries, Solaris is no longer a supported platform. Installing this probe on Solaris is not recommended.
3.74
February
2008
3.73
January
2008
3.71
December
2007
3.70
3.64
November
2007
November
2007
July 2007
3.60
June 2007
exclude lists
# of samples for alarming
new checkpoint long_queries
auto-update (GUI)
new checkpoint tablespace_size
new checkpoint database_size
new parameter log-size
new parameter "Alarm Source"
buf_cache_hit_ratio query adjusted for Oracle 10.2
3.53
3.52
January
2007
December
2006
datafile_status problem with SYSTEM tablespaces solved (v$datafile_header now used instead of v$datafile)
datafile_status and dbfile_io object threshold saving on UNIX systems problem solved
tablespace_alloc_free - query changed
3.50
3.05
3.04
August
2006
July 2006
June 2006
2.16
2.12
10g tolerance
new checkpoint lock_waits
user_locks and locked_users changed
error in buf_cachehit_ratio_users fixed
check for % in password removed
Added a start parameter -n to change the probe's name.
time_cnt is kept through restart (24-hour cycle).
GUI - new "info" button to display the GUI and executable version numbers.
March 2006
November 2005
March 2005
Monitoring Capabilities
The oracle probe monitors the following information about either local or remote database instances:
Database uptime
Tablespace growth
Database growth
Tablespace status
Index status
Data-file status
Rollback segment status
Fragmented segments
The extents that can't extend
The data dictionary cache-hit ratio
The data buffer cache-hit ratio
The Redo copy latch-hit ratio
The library cache-hit ratio
The sort-hit ratio
PGA resource consumption (monitor memory consumption of Oracle users)
The rollback segment contention
The number of invalid objects
The number of chained rows
The number of users currently logged onto the server
The MTS response time
The number of MTS waits
The enqueue resources
The UGA memory usage
User locks and locked users
Lock waits event time
The user buffer cache-hit ratio
System waits and user waits
Datafile I/O
System statistics
Global cache service utilization for RAC
Global cache fusion ratio for RAC
Global cache lock get time for RAC
Global cache lock conversion timeouts for RAC
Global cache average lock get time for RAC
Global cache corrupt blocks count for RAC
Global cache lost blocks count for RAC
Number of Long running queries
Tablespace size
Database size
Resource utilization %
Dataguard status
Dataguard gap
Dataguard timegap
Tablespace temp free
Active users
Flash recovery area memory free
Active connection ratio
The number of PDBs in a CDB
Prerequisites
Before deploying the oracle probe on your robot, you must perform the following activities:
Add Environment Variables
Add CDB/PDB Connection Entries
Provide Access Rights for a Database User
Setting the ORACLE_HOME path is mandatory for generating the correct Device ID, routing the QoS messages and alarms to the correct device, and viewing data on the Unified Service Manager (USM).
On Windows platform:
Set the ORACLE_HOME path to the directory where the client is installed. For example, C:\oracle\product\11.2.0\client_1.
On UNIX or Linux platform:
1. Set the ORACLE_HOME path to the directory where the client is installed. For example, /home/oracle/app/oracle/product/12.1.0/client_1.
2. Set the ORACLE_SID path to the service identifier name that you configured by using the Oracle client.
3. Set the LD_LIBRARY_PATH path to the Oracle library directory. For example, /home/oracle/app/oracle/product/12.1.0/client_1/lib.
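These steps can be captured in the shell profile of the account that runs the robot. A minimal sketch, assuming the client is installed under /home/oracle/app/oracle/product/12.1.0/client_1 and the service identifier is ORCL (substitute your own path and SID):

```shell
# Example environment setup for the oracle probe; adjust the path and SID
# to match your Oracle client installation.
export ORACLE_HOME=/home/oracle/app/oracle/product/12.1.0/client_1
export ORACLE_SID=ORCL
# Prepend the Oracle library directory, preserving any existing value.
export LD_LIBRARY_PATH="$ORACLE_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Placing these lines in the profile ensures that the probe inherits the variables whenever the robot starts.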
Security-Enhanced Linux (SELinux) is a mandatory access control (MAC) security mechanism implemented in the Linux kernel. SELinux has three basic modes of operation: Enforcing, Permissive, and Disabled. In Permissive mode, SELinux is enabled but does not enforce the security policy; it only warns and logs actions. The Current Enforcing Mode must be set to Permissive on Linux platforms.
Follow these steps for setting the Current Enforcing Mode to Permissive:
1. Open SELinux Management and the SELinux Administration window appears.
2. In the Current Enforcing Mode list, click Permissive.
3. Close the SELinux Administration window.
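The same change can be made from a shell instead of the SELinux Administration window. A sketch, assuming a Red Hat-style system that keeps the SELinux state in /etc/selinux/config (requires root):

```shell
# Show the current SELinux mode (Enforcing, Permissive, or Disabled).
getenforce

# Switch the running system to Permissive mode; takes effect immediately.
setenforce 0

# Persist the Permissive mode across reboots.
sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
```

Run getenforce again afterwards to confirm the mode reads Permissive.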
Add the CDB and PDB connection entries to the tnsnames.ora file of the Oracle client. For example:
CDB1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 10.112.16.128)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = CDB1)
)
)
PDB1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = oracle-cdb)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = PDB1)
)
)
For a CDB connection, prefix c## or C## to the user name and grant required permissions to the user.
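As an illustration only (the user name, password, and privilege list below are assumptions, not the probe's documented requirements; the privileges you need depend on the checkpoints you enable), a common user for a CDB connection could be created from SQL*Plus like this:

```shell
# Hypothetical example: create a common (c##-prefixed) user for the probe and
# grant it session access plus read access to the data dictionary and V$ views.
# Adjust the user name, password, and grants for your environment.
sqlplus / as sysdba <<'SQL'
CREATE USER c##uim_monitor IDENTIFIED BY "ChangeMe#1" CONTAINER=ALL;
GRANT CREATE SESSION TO c##uim_monitor CONTAINER=ALL;
GRANT SELECT_CATALOG_ROLE TO c##uim_monitor CONTAINER=ALL;
SQL
```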
Upgrades
When upgrading the oracle probe from previous versions to version 4.80 or later, the PDB_Count checkpoint is added as a template and is
available in all the profiles. By default, the checkpoint is disabled. Enable the checkpoint to monitor the number of PDBs in a particular CDB. This
checkpoint will work only for those profiles that are linked to a CDB connection. For other profiles, no result is displayed even if the checkpoint is
enabled.
Oracle Version    Client 12.1.0    Client 11.2.0    Client 11.1.0
12.1.0            Yes              Yes              Yes
11.2.0            Yes              Yes              Yes
11.1.0            Yes              Yes              Yes
Known Issues
The known issues of the probe are:
The probe may not connect to the database after you install or upgrade the Oracle client. Restart the probe after installing the client for a
successful connection.
The oracle probe must not be configured on both the Infrastructure Manager (IM) GUI and Admin Console (AC) GUI.
The probe configuration for both the IM GUI and AC GUI is separate. For example, any profile that is created in the IM GUI is not
available on the AC GUI and must be recreated.
While upgrading the oracle probe from version 4.56 and earlier to version 4.60 or later, ensure that the PPM, service_host, and MPSE
are restarted for the RAC functionality to become usable.
Users cannot create custom alarm messages; they must select an existing alarm message.
In Oracle RAC, from Oracle 11.1 and above, the term global cache is replaced by gc in all checkpoints. Therefore, if any custom query
in Oracle 11g or 12c includes the term global cache, then no value is returned. The RAC functionality is supported and configured
through Admin Console GUI only.
On 64-bit Windows platforms, the probe cannot be installed into the default directory, "Program Files (x86)". A bug in the Oracle Client causes connection errors if the application home directory name includes special characters, such as "(" (Oracle Bug 3807408).
The error ORA-12705, together with the log entry "OCIEnvCreate failed with rc = -1", can occur if the environment variable NLS_LANG is set. To resolve this, set the variable to an empty value in the controller environment.
On 64-bit Linux, users may receive a warning message about insufficient access rights when a connection test is performed, even if all the required access rights are provided. The connection can still be used to schedule the profile. Ensure that all the required access rights are provided to the user.
In custom checkpoints, if a query fetches data from a table with more than 32 columns, the probe limits the number of columns to 32.
If a custom QoS is added to an existing monitoring profile, the Unified Management Portal (UMP) creates a separate node, Custom, in
the Metric section. It does not display the user-defined description and unit.
If a custom checkpoint is added to an existing monitoring profile, the Unified Management Portal (UMP) creates a separate node,
Dynamic, in the Metric section. It does not display the user-defined description and unit.
The Admin Console GUI of the probe has the following additional limitations:
On version 4.90 of the probe, custom QoS in a checkpoint do not generate alarms if the default QoS of the checkpoint is deleted. This is
applicable for AIX 6 and AIX 7 platforms.
While upgrading the oracle probe from version 4.56 and earlier to version 4.60 or later, ensure that the PPM, service_host, and MPSE
are restarted for the RAC functionality to become usable.
Users cannot create custom alarm messages; they must select an existing alarm message.
The oracle probe must not be configured on both the Infrastructure Manager (IM) GUI and Admin Console (AC) GUI.
The probe configuration for both the IM GUI and AC GUI is separate. For example, any profile that is created in the IM GUI is not
available on the AC GUI and must be recreated.
Dynamic population of the Message Text field with the message selected in the Message drop-down list at runtime is a limitation of the PPM. When creating a checkpoint with a new threshold, first select the message and save it. After the reload operation, the Message Text field is updated with the corresponding message text.
If you see the PPM-023 error and Unable to Retrieve Configuration issues, click Retry or reopen the AC GUI.
In Oracle RAC, from Oracle 11.1 onwards, the term global cache is replaced by gc in all checkpoints. Therefore, if any custom query in
Oracle 11g or 12c includes the term global cache, then no value is returned.
To discover RAC-specific nodes, the ppm probe should be deployed on the same subnet as the oracle probe.
On 64-bit Windows platforms, the probe cannot be installed into the default directory, "Program Files (x86)". A bug in the Oracle Client causes connection errors if the application home directory name includes special characters, such as "(" (Oracle Bug 3807408).
The error ORA-12705, together with the log entry "OCIEnvCreate failed with rc = -1", can occur if the environment variable NLS_LANG is set. To resolve this, set the variable to an empty value in the controller environment.
On 64-bit Linux, users may receive a warning message about insufficient access rights when a connection test is performed, even if all the required access rights are provided. The connection can still be used to schedule the profile. Ensure that all the required access rights are provided to the user.
If the oracle probe is deployed on a non-Windows OS, the Unified Management Portal (UMP) displays the metrics and alarms on the
robot and not on the Oracle server.
In custom checkpoints, if a query fetches data from a table with more than 32 columns, the probe limits the number of columns to 32.
If a custom QoS is added to an existing monitoring profile, the Unified Management Portal (UMP) creates a separate node, Custom, in
the Metric section. It does not display the user-defined description and unit.
If a custom checkpoint is added to an existing monitoring profile, the Unified Management Portal (UMP) creates a separate node,
Dynamic, in the Metric section. It does not display the user-defined description and unit.
If a checkpoint has multiple QoS and you activate or deactivate one QoS, the other QoS of the checkpoint are also activated or
deactivated.
If a checkpoint has multiple alarms and you activate or deactivate one alarm, the other alarms of the checkpoint are also activated or
deactivated.
If you enable a checkpoint in the Checkpoints node, the checkpoint is enabled for all the monitoring profiles and all alarms are
generated.
In custom checkpoints, editing a message variable adds a new message variable to the Message Variable table. To change a message variable in a custom checkpoint, delete it and create a new one.
Output queues are objects defined in the IBM iSeries system to contain the spooled files or printer output files that are queued for printing. Output queues are created by a user or by the system. Spooled files are created from an application program, from a system program, or by pressing the Print key on your keyboard.
The iSeries Output Queue Monitoring (outqs) probe uses profiles to monitor output queues in IBM iSeries systems. You can create and activate profiles to monitor the number of spooled files present in an output queue. When you configure a profile, you can set a threshold value to generate an alarm or QoS message. The probe generates alarms or QoS messages when the total number of files in an output queue equals or exceeds the threshold value specified in the profile. The probe scans all the output queues in the system against the profile. This ensures that the system is not overloaded with continuous print requests.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the outqs probe.
Version
Description
State
Date
1.16
What's New:
GA
September 2015
GA
September 2014
What's New:
Added support for Static Threshold alarm.
Added support for configuring the probe through the Admin Console (web-based) GUI.
Fixed Defects:
Fixed a defect in which DevID and MetID were not matching for QoS.
1.02
December 2012
1.01
October 2012
1.00
Initial release.
July 2012
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.61
Fixed Defect:
GA
June 2015
Beta
June 2015
GA
June 2014
The probe returned an incorrect value of the LDAP Read Time monitor for exchange_monitor. Salesforce case
00162939
1.60
What's New:
Added support for internationalization.
1.53
What's New:
Added a callback (get_all_values) which returns all values of the given perfmon instance. This callback accepts
counter name, object name (maximum two objects), and the instance name and returns a table containing values of the
specified object. If there are multiple instances with the same name, then values of all instances are returned. The probe
uses this callback while monitoring the application pool.
1.52
October
2013
1.51
Changed remote authentication order: 'user/password', 'impersonation', 'implicit'. Improved error handling on performance
object parsing. Fixed crash issue.
September
2012
1.50
March
2011
1.33
1.32
December
2008
September
2008
September
2007
Changed order of remote connect attempts - try with specified credentials first.
1.16
August
2007
Configurable ping_timeout. Default set to 500 ms. Setting this to 0 turns ping completely off.
Fixed hang situation related to finding processes.
1.09
November
2006
March
2006
Corrected problem with large instance names. Modified program logic to ensure fast response to interactive requests.
December
2005
Revision History
Requirements
Hardware Requirements
Software Requirements
Installation Considerations
Revision History
This section describes the history of the revisions for the policy_engine probe.
Version
Description
State
Date
8.2
Initial release.
GA
March 2015
Requirements
Hardware Requirements
The policy_engine probe should be installed on systems with the following minimum resources:
Memory: 1024 MB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Software Requirements
This section describes the software required to use the Policy Editor and create policies.
For policy_engine v8.2 running with CA UIM v8.31
Robot version 5.23 or later
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM 8.0 and later)
ace v8.3
Admin Console v8.31
Alarm Server (nas) v4.73 and alarm_enrichment v4.73
Probe Provisioning Manager (ppm) v3.22
udm_manager v8.31
Verify CA Unified Management Portal (UMP) v8.31 is installed and running. With UMP, the following required probes are also deployed:
wasp (Web Application Server Probe) v8.31
For policy_engine v8.2 running with CA UIM v8.2
Robot version 5.23 or later
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM 8.0 and later)
ace v3.5
Admin Console v8.2
Alarm Server (nas) v4.6 and alarm_enrichment v4.6
Installation Considerations
policy_engine is installed on the primary hub with CA UIM Server installer v8.2 or later.
Note: An administrator cannot upgrade ppm to a new version. Instead, deploy a newer version of the ppm probe after deleting existing
versions of ppm.
Contents
Revision History
Probes That Use PPM
ppm Hardware Requirements
ppm Software Requirements
ppm Dependencies
Functions Supported
Installation Considerations
Upgrade to a Newer Version of ppm
Backward Compatibility
Performance Considerations
Revision History
This section describes the history of the revisions for the ppm probe.
Version
Description
State
Date
3.22
What's New:
This version of ppm is installed on the primary hub when you run UIM Server Installer v8.31.
GA
August
2015
If you deploy ppm v3.22 to hub robots, baseline_engine v2.6 and prediction_engine v1.31 are also deployed to the
same robot.
You can enter ${ in the Custom Alarm Message and Custom Alarm Clear Message fields to select variables for the
custom alarm message from a drop-down list.
3.20
3.11
What's New:
Settings for alarm severity now includes more operators.
GA
April 2015
GA
January
2015
GA
December
2014
GA
September
2014
You can create a custom alarm message or a custom clear alarm message.
The help icon for the Compute Baseline, Dynamic Alarm, or Time To Threshold Alarm check boxes provides a message that indicates when the baseline_engine or prediction_engine probes are not installed or running.
3.05
What's New:
Minor documentation updates.
3.0
Modified the static alarm, dynamic alarm, and Time To Threshold alarm threshold GUI for monitoring probes.
The alarm threshold GUI includes Compute Baseline, Publish Alarm, and Publish Data check boxes. Select these check boxes to configure alarm thresholds and the severity of the alarm generated when a threshold is breached.
2.38
2.32
5 new Adapters
GA
June 2014
2.27
GA
April 2014
2.26
GA
March
2014
GA
December
2013
GA
May 2013
2.21
Fixed defects
Miscellaneous UI text/messaging/L10N cleanup
15 New Adapters
dirscan, logmon and printers enhancement with CTD V2
2.03
Fixed defects
Miscellaneous UI text/messaging/L10N cleanup
exchange_monitor adapter now includes support for the DAG feature
2.02
GA
April 2013
2.01
First general release of the probe included as part of the NMS 6.5 Installer.
GA
March
2013
Probes That Use PPM
Probe    Minimum Version
ad_response    v1.60
ad_server    v1.43
adevl    v1.60
adogtw    v2.71
apache    v1.51
automated_deployment_engine    v8.1
casdgtw    v2.34
cdm    v4.81
celerra    v1.50
cisco_ucm    v1.80
cluster    v3.12
controller    v5.70
cuegtw    v1.04
data_engine    v7.90
db2    v4.05
dhcp_response    v3.22
dirscan    v3.05
distsrv    v5.30
diskstat    v1.05
dns_response    v1.64
e2e_appmon    v2.20
email_response    v1.42
emailgtw    v2.70
exchange_monitor    v4.01
ews_response    v1.02
fetchmsg    v1.50
file_adapter    v1.40
fsmounts    v1.22
ha    v1.44
history    v1.04
hpsmgtw    v1.01
hub    v5.82
ica_response    v3.0
ica_server    v1.3
iis    v1.53
informix    v4.11
jboss    v1.33
jdbcgtw    v1.01
jobqs    v1.12
jobs    v1.36
jobsched    v1.23
journal    v1.06
jvm_monitor    v1.44
ldap_response    v1.33
logmon    v3.23
lync_monitor    v1.0
mysql    v1.47
nap    v4.10
net_connect    v2.90
nexec    v1.35
notes_response    v2.31
notes_server    v1.52
nsdgtw    v1.21
ntevl    v3.90
ntp_response    v1.31
ntperf    v1.87
ntperf64    v1.87
ntservices    v3.15
oracle    v4.54
outqs    v1.14
perfmon    v1.51
printers    v2.53
processes    v3.75
rsp    v2.93
sharepoint    v1.51
smsgtw    v3.01
sngtw    v2.0
spooler    v5.70
sql_response    v1.60
sqlserver    v4.82
sybase    v4.13
sysloggtw    v1.40
sysstat    v1.10
tcp_proxy    v1.10
v1.0
tomcat    v1.22
url_response    v4.13
webgtw    v1.00
weblogic    v1.35
webservicemon    v1.25
websphere    v1.63
ppm Dependencies
The following table shows the probes required to configure dynamic, static, Time Over Threshold, and Time To Threshold alarms for each release
of CA UIM v8.0 and later. It is recommended to deploy and run the versions of the probes supported by the version of CA UIM you are running on
your UIM server. This ensures that you can configure alarm and threshold settings, and the alarms are successfully generated when thresholds
are breached.
CA UIM Version    Version of ppm    Version of baseline_engine    Version of prediction_engine    Version of nas    Version of alarm_enrichment
8.31              3.22              2.6                           1.31                             4.73              4.73
8.2               3.11              2.5                           1.2                              4.67              4.67
8.1               3.0               2.4                           1.1                              4.6               4.6
8.0               2.38              2.3                           1.01                             4.4               4.4
Functions Supported
As shown in the ppm Dependencies table, ppm, baseline_engine, prediction_engine, nas, and alarm_enrichment probes have inter-dependencies
that require certain versions of each probe to be installed on hub robots. Starting with ppm v2.38, the supported versions of the ppm,
baseline_engine, and prediction_engine probes must be running on all hub robots to successfully generate QoS alarm messages.
ppm v3.22
CA UIM v8.31 should be running on the UIM Server. If you run ppm v3.22 with an earlier version of CA UIM, you might not see all of the
alarm and threshold fields, and as a result the alarms may not be generated correctly.
UIM Server Installer v8.31 installs ppm v3.22, baseline_engine v2.6, and prediction_engine v1.31 to the primary hub.
You must manually deploy ppm v3.22 separately on all hub robots. When you deploy ppm v3.22, baseline_engine v2.6 and
prediction_engine v1.31 are also deployed to the same robot.
Alarm Server (nas) and alarm_enrichment probes must be deployed and running on the primary hub if you want to configure Time Over
Threshold alarms. Install nas v4.73 and alarm_enrichment 4.73 with the nas package on the primary hub.
If baseline_engine and prediction_engine are deployed on a hub but these probes are not running, no alarms are generated.
Select the Compute Baselines, Dynamic Alarm, or Time To Threshold Alarm check boxes to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings.
If alarm or threshold fields are inactive on a monitoring probe's GUI, click the field help icons next to the Compute Baselines, Dynamic Alarm, or Time To Threshold Alarm check boxes to determine which required probe is not deployed or deactivated on a robot.
To configure the Time to Threshold alarm and threshold, select the Time To Threshold Alarm check box and configure the remaining
settings.
If you deploy baseline_engine to a secondary hub, make sure you create a new hub queue or amend existing hub queues to forward the
new QOS_BASELINE messages to the primary hub. See the Recommended Multiple Hub Probe Deployment section for details.
You can create custom alarms.
Enter ${ in the custom alarm fields to see a list of variables that baseline_engine will substitute. These variables are probe-specific.
ppm v3.11
CA UIM v8.2 should be running on the UIM Server. If you run ppm v3.11 with a version of CA UIM earlier than CA UIM v8.2, you might
not see all of the alarm and threshold fields, and as a result the alarms may not be generated correctly.
You must manually deploy ppm v3.11 separately on all hub robots. When you deploy ppm v3.11, baseline_engine v2.5 and
prediction_engine v1.2 are also deployed to the same robots.
ppm, baseline_engine, and prediction_engine must all be deployed and running if you want to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings for monitoring probes.
Alarm Server (nas) and alarm_enrichment probes must be deployed and running on the primary hub if you want to configure Time Over
Threshold alarms. Install nas v4.67 and alarm_enrichment 4.67 with the nas package on the primary hub.
If baseline_engine and prediction_engine are deployed to a hub but these probes are not running, no alarms are generated.
Select the Compute Baselines, Dynamic Alarm, or Time To Threshold Alarm check boxes to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings.
If alarm or threshold fields are inactive on a monitoring probe's GUI, click the field help icons next to the Compute Baselines, Dynamic Alarm, or Time To Threshold Alarm check boxes to determine which required probe is not deployed or deactivated on a robot.
To configure the Time to Threshold alarm and threshold, select the Time To Threshold Alarm check box and configure the remaining
settings.
If you deploy baseline_engine to a secondary hub, make sure you create a new hub queue or amend existing hub queues to forward the
new QOS_BASELINE messages to the primary hub. See the Recommended Multiple Hub Probe Deployment section for details.
You can create custom alarms.
ppm v3.0
CA UIM v8.1 should be running on the UIM Server. If you run ppm v3.0 with a version of CA UIM earlier than CA UIM v8.1, you might not
see all of the alarm and threshold fields, and as a result the alarms may not be generated correctly.
You must manually deploy ppm v3.0 separately on all hub robots. In addition, manually deploy baseline_engine v2.4 and
prediction_engine v1.1 to the same robots.
ppm, baseline_engine, and prediction_engine must all be deployed and running if you want to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings for monitoring probes.
Select the Compute Baselines, Dynamic Alarm, or Time To Threshold Alarm check boxes to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings.
Alarm Server (nas) and alarm_enrichment probes must be deployed and running on the primary hub if you want to configure Time Over
Threshold alarms. Install nas v4.6 and alarm_enrichment 4.6 with the nas package on the primary hub.
If baseline_engine and prediction_engine are deployed to a hub but these probes are not running, no alarms are generated.
To configure the Time to Threshold alarm and threshold, select the Time To Threshold Alarm check box and configure the remaining
settings.
If you deploy baseline_engine to a secondary hub, make sure you create a new hub queue or amend existing hub queues to forward the
new QOS_BASELINE messages to the primary hub. See the Recommended Multiple Hub Probe Deployment section for details.
Run ppm v3.0 on hub robots in a CA UIM v8.1 environment. If you run ppm v3.0 with earlier versions of CA UIM, you might not see all of
the alarm and threshold fields, and as a result the alarms may not be generated correctly.
ppm v2.38
CA UIM v8.0 should be running on the UIM Server. You cannot run ppm v2.38 with NMS systems.
You must manually deploy ppm v2.38 separately on all hub robots. In addition, manually deploy baseline_engine v2.24 and
prediction_engine v1.0 to the same robots.
ppm, baseline_engine, and prediction_engine must all be deployed and running if you want to configure dynamic, static (if applicable), or
Time To Threshold alarm and threshold settings for monitoring probes.
Alarm Server (nas) and alarm_enrichment probes must be deployed and running on the primary hub if you want to configure Time Over
Threshold alarms. Deploy nas v4.4 and alarm_enrichment 4.4 to the primary hub.
If baseline_engine and prediction_engine are deployed to a hub but these probes are not running, no alarms are generated.
To configure the Time to Threshold alarm and threshold, select Time To Threshold Alarm in the Predictive Alarm drop-down menu and configure the remaining settings.
If you deploy baseline_engine to a secondary hub, make sure you create a new hub queue or amend existing hub queues to forward the
new QOS_BASELINE messages to the primary hub. See the Recommended Multiple Hub Probe Deployment section for details.
Installation Considerations
ppm must be installed on every hub robot that hosts a probe that will be configured using Probe Provisioning. If ppm is missing, Monitoring
Services generates an error message that is displayed in Admin Console.
Note: When you manually deploy ppm to hub robots, you must also download and install JRE 7 separately on each hub robot.
Backward Compatibility
Versions of ppm, starting with v2.38, are supported on UIM servers running CA UIM v8.0 and later. It is recommended that you run the version of
ppm released with the version of CA UIM you are running on the UIM Server.
Each release of ppm, starting with ppm v2.38, can run with CA UIM v8.0 and higher. However, you see only the features supported by the release of ppm running on the hub robot that manages the probe you are configuring.
The features available in earlier versions of ppm are brought forward and are available in newer versions. The following list shows the version of
ppm released with CA UIM and indicates the new features introduced in each version of ppm.
CA UIM v8.31 with ppm v3.22
You can enter '${' to see a drop-down list of variables available for custom alarm messages. The variables displayed in the drop-down list
are variables the baseline_engine will substitute or are probe-specific variables.
CA UIM v8.2 with ppm v3.11
You can create custom alarm messages. If the Compute Baseline, Publish Alarm, or Publish Data check boxes are not selectable, click
the help icon for these check boxes. The help message indicates when the baseline_engine or prediction_engine probes are not installed
or not running.
CA UIM v8.1 with ppm v3.0
Provides Compute Baseline, Publish Alarm, and Publish Data check boxes. Select the Compute Baseline check box if you want
baseline_engine to compute baselines. Select the Publish Alarm and Publish Data check boxes to allow the QoS messages and alarms to be
placed on the UIM message bus.
CA UIM v8.0 with ppm v2.38
Includes a redesign for static and dynamic alarm fields, and allows you to configure Predictive Alarm and Time Over Threshold alarm
settings.
Note: ppm v2.38 (and later) is not supported on servers running NMS 7.6.
Performance Considerations
Items that can affect the performance of ppm include:
With CA UIM v8.2 and earlier, running more than 250 instances of the cdm probe in your environment might impact performance.
Note: The number of instances of the cdm probe running in your environment is no longer an issue when you install CA UIM
v8.3 and cdm v5.31.
ad_response Probe Does Not Have the Same Metric ID and Only Two Counters Can be Added
Symptom:
The ad_response Probe Adapter UI currently does not use the same metric ID for the different categories (search, replication, and response).
In the ad_response probe Adapter GUI, only two counters can be added, compared to the VB GUI of the ad_response probe.
Solution:
Use Infrastructure Manager.
cdm Probe UI Does Not Support Changing the 'Internal Alarm' Message
Symptom:
The Internal Alarm message cannot be changed through the CDM Probe Provisioning UI.
Solution:
Use either Raw Configure or Infrastructure Manager. Raw Configure is supported through both the Admin Console and the Windows based IM.
controller Probe UI Always Allows Setting of QoS Source to Robot Name Instead of Computer Hostname
Symptom:
The "QoS source to robot name instead of computer hostname" setting for the controller can be set through the Probe Provisioning UI. This
setting should only be enabled if controller has been configured to use a Specific Name for the Robot Name.
Solution:
Ensure that a Specific Name is being used for the Robot Name prior to setting this option.
db2 Probe UI Does Not Expose 'Status Tab/Group'
Symptom:
The db2 probe supports the usage of 'Status Tab/Group.' At this time, the db2 Probe Provisioning UI does not expose this capability.
Solution:
Use Infrastructure Manager.
dirscan Probe UI Does Not Expose Rechecksum on Pattern Files
Symptom:
The dirscan probe supports rechecksum on pattern files, which is still not exposed in the dirscan Probe Provisioning UI.
Solution:
Use Infrastructure Manager.
dns_response Probe UI Allows all Port Numbers Rather Than Allowing 53 for UDP Protocol Only Feature
Symptom:
The dns_response Probe Provisioning UI currently allows the user to enter any port number for the UDP protocol; however, it works only with port 53.
Solution:
As noted in the help documentation, use port 53 for UDP.
e2e_appmon Probe UI Does Not Support Script Recording
Symptom:
The e2e_appmon probe has the capability to record scripts. However, script recording is currently not available when configuring this probe with
Admin Console.
Solution:
Use Infrastructure Manager.
emailgtw Probe Provisioning UI does not Support Viewing or Editing Template Files
Symptom:
Viewing and editing of template files for the emailgtw probe is not currently supported using Probe Provisioning.
Solution:
None within Probe Provisioning (alternate is to use Infrastructure Manager).
file_adapter Probe UI Does Not Support Adding Custom QoS
Symptom:
The file_adapter probe supports the feature of adding custom QoS. At this time the file_adapter Probe Provisioning UI does not expose this
capability.
Solution:
Use Infrastructure Manager.
iis Probe Adapter UI Does Not Currently Have the Same Metric ID for Localhost With IP Profile
Symptom:
The iis Probe Adapter UI does not have the same metric ID for localhost with IP profile.
Solution:
Create a profile with the localhost name, or use Infrastructure Manager.
Jboss Adapter UI Does Not Support Creation of Templates, Auto Monitor and Auto Configuration
Symptom:
The Jboss Probe Provisioning UI does not provide the ability to create Templates, Auto Monitor and Auto Configuration.
Solution:
Use Infrastructure Manager.
Jboss Adapter UI Does Not Support Any Other QoS Except Default
Symptom:
The Jboss Probe Provisioning UI does not support any other QoS except Default.
Solution:
Use Infrastructure Manager.
jvm_monitor Adapter UI Does Not Support the Creation of Templates, Auto Monitor and Auto Configuration
Symptom:
The jvm_monitor Probe Provisioning UI does not provide the ability to create Templates, Auto Monitor and Auto Configuration.
Solution:
Use Infrastructure Manager.
logmon Probe UI Does Not Support Add New QoS Definition Option in QoS Tab for QoS Name Field
Symptom:
The logmon Probe Provisioning UI does not support the Add New QoS Definition option for the QoS Name field drop-down list in the QoS tab.
Solution:
When using Infrastructure Manager, new QoS definitions can be added only through the Quality of Service Definitions section.
logmon Probe UI Does Not Support File Not Found Alarm Feature
Symptom:
The logmon Probe Provisioning UI does not support File Not Found Alarm Feature.
Solution:
Use Infrastructure Manager.
To update the state of the service, reload the configuration after changes are made to the service.
ntservices Probe UI Does Not Remove Service from Services List When Added to Profiles
Symptom:
When adding services to a profile in the ntservices Probe Provisioning UI, the list of available services is not pruned down by the selection
process.
Solution:
Restart the probe to update the list of services.
Solution:
Use Infrastructure Manager if this feature is needed.
perfmon Probe UI Does Not Show Some Fields Under Status Section
Symptom:
The perfmon Probe Provisioning UI does not show some fields, like Object, Counter, and Instance, under the Status tab.
Solution:
If you need it, use Infrastructure Manager.
smsgtw Probe UI Does Not Provide Editable Drop Down for Sending Messages
Symptom:
The smsgtw probe does not support an editable drop-down menu for sending SMS messages.
Solution:
To send SMS messages for smoke-testing purposes, create a profile with the test number needed, or use Infrastructure
Manager.
Solution:
Use Infrastructure Manager if this feature is needed.
sql_response Probe UI Does Not Support SQL Query 'from file' Functionality
Symptom:
The sql_response probe supports getting the details of a file in the SQL query tab. At this time, the sql_response Probe Provisioning UI does not
expose this capability.
Solution:
Use Infrastructure Manager if this feature is needed.
sql_response Probe UI Does Not Support Running the Profile and Showing its Features
Symptom:
The sql_response probe supports getting the details of a profile by right clicking on it and selecting "run now". At this time, the sql_response
Probe Provisioning UI does not expose this capability.
Solution:
Use Infrastructure Manager if this feature is needed.
sqlserver Probe UI Does Not Expose Schedules
Symptom:
The sqlserver probe supports the usage of 'schedules.' At this time, the sqlserver Probe Provisioning UI does not expose this capability.
Solution:
Use Infrastructure Manager if this feature is needed.
sybase Probe UI Does Not Expose Schedules
Symptom:
The sybase probe supports the usage of 'schedules.' However, this feature is currently not visible in the sybase Probe Provisioning UI.
Solution:
Use Infrastructure Manager if this feature is needed.
url_response Probe UI Does Not Allow Test Profiles In Proxy Environment in UNIX
Symptom:
The url_response Probe Provisioning UI does not allow test profiles in a proxy environment on UNIX.
Solution:
If you need it, use Infrastructure Manager.
Websphere Adapter UI Does Not Support Creation of Templates, Auto Monitor and Auto Configuration
Symptom:
The Websphere Probe Provisioning UI does not provide the ability to create Templates, Auto Monitor and Auto Configuration.
Solution:
Use Infrastructure Manager.
Websphere Adapter UI Does Not Support Any Other QoS Except Default
Symptom:
Websphere Adapter UI does not support any other QoS except Default.
Solution:
Use Infrastructure Manager.
Websphere Adapter UI Does Not Support the Rescan Host Option
Symptom:
The Websphere Probe Provisioning UI does not support the rescan host option, which allows the user to scan the profiles in the host server after a
pre-configured interval (15 minutes) and load any profiles available in the host server under the respective Resources nodes in the probe GUI.
Solution:
Use Infrastructure Manager.
Symptom:
The SSL authentication screen can only be viewed using the Adapter UI.
Solution:
Use the Raw Configure options to configure SSL in Infrastructure Manager.
Weblogic Adapter requires Truststore password in order to generate Certificate Expire Alarms
Symptom:
The Weblogic Probe Provisioning UI requires Truststore password to generate Certificate Expire alarms.
Solution:
Use the Thin Client.
Weblogic Adapter UI Does Not Support Creation of Templates, Auto Monitor and Auto Configuration
Symptom:
The Weblogic Probe Provisioning UI does not provide the ability to create Templates, Auto Monitor and Auto Configuration.
Solution:
If you need it, use Infrastructure Manager.
The prediction_engine probe gathers the trending information used to calculate when a particular event might occur. The primary function of the
prediction_engine probe is to let you configure Time To Threshold predictive alarms for monitoring probes. You can configure a Time to Threshold
predictive alarm by accessing a probe's GUI in Admin Console or by entering the time to threshold (TTT) Command from within the
prediction_engine probe directory.
To configure Time To Threshold settings for applicable probes, make sure ppm v2.38 (or later), baseline_engine v2.34 (or later), and
prediction_engine v1.01 (or later) probes are installed and running on the hub robot.
Revision History
Requirements
Hardware Requirements
Software Requirements
Probe Dependencies
Supported Platforms
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for the Prediction Engine probe.
Version 1.31 (GA, August 2015)
What's New:
Minor changes to ensure the Time To Threshold configuration is saved properly.
Version 1.3 (GA)
What's New:
Software enhancements to extend functionality to probes that were previously blacklisted.
The prediction_engine v1.3, baseline_engine v2.6, and ppm v3.20 probes are included in the CA UIM Server v8.3 installer and are installed on the
primary hub during installation. Deploy ppm v3.20 on all hub robots that are controlling monitoring probes. When you deploy ppm v3.20,
baseline_engine v2.6 and prediction_engine v1.3 are included and installed on the same hub.
Version 1.2 (GA, March 2015)
What's New:
When you deploy ppm v3.11 to hub robots that are controlling monitoring probes, baseline_engine v2.5 and prediction_engine v1.2 are included
and installed on the same hub.
A Time To Threshold Prediction graph is displayed in Unified Service Manager for metrics that have a configured predictive alarm threshold. For
each metric, two predictive QoS messages are generated: one that indicates when the predictive alarm might be breached, and another that
indicates the predictive value at the current time.
Version 1.1 (GA, December 2014)
What's New:
Configuring the Time To Threshold settings on the probes' GUI has been slightly modified.
The probe GUI for configuring static, dynamic, and Time To Threshold alarms has changed. To configure Time To Threshold settings, select the
Publish Data and Publish Alarms (new) check boxes to allow the probe to send its QoS metrics and generate an alarm. To configure Time To
Threshold, instead of selecting Time To Threshold from the Predictive Alarm drop-down menu, you now select the Time To Threshold Alarm
check box. Then select the appropriate alarm and threshold settings.
ppm v3.0 and baseline_engine v2.4 must be deployed and running on the hub robots where prediction_engine v1.1 is deployed in order to get the
correct QoS alarms.
Version 1.01 (GA, September 2014)
Requirements
Hardware Requirements
The Prediction Engine probe should be installed on systems with the following minimum resources:
Memory: 512 MB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Software Requirements
The prediction_engine probe requires the following software environment:
CA Unified Infrastructure Management (UIM) 8.0 or later
Robot version 5.23 or later
Java 7 (java_jre1.7) - the hubs where the required probes are running should have java_jre1.7 loaded in the Installed Packages in Admin
Console (typically installed with UIM 8.0 and later)
Probe Dependencies
The prediction_engine probe uses the baseline generated by the baseline_engine probe to generate predictive QoS messages. Then
prediction_engine relies on the nas and alarm_enrichment probes to process and forward predictive alarms on the UIM message bus when
configured predictive alarm thresholds are breached. In addition, prediction_engine requires the ppm probe running on the hub robot to enable
Admin Console to display the appropriate configuration fields.
The following table shows the versions of the baseline_engine, ppm, nas, and alarm_enrichment probes that you should be running with the
different versions of prediction_engine. If the versions of these probes are mismatched (for example, if you deploy prediction_engine v1.3 with an
earlier version of baseline_engine and ppm on a hub robot), the system cannot properly produce predictive alarms.
Released with CA UIM   prediction_engine   baseline_engine   ppm    nas (with alarm_enrichment)
8.31                   1.31                2.6               3.22   4.73
8.2                    1.2                 2.5               3.11   4.67
8.1                    1.1                 2.4               3.0    4.6
8.0                    1.01                2.34              2.38   4.4
Supported Platforms
Refer to the Compatibility Support Matrix for the latest information about supported platforms. See also the Support Matrix for Probes for more
specific information about the probe.
Installation Considerations
The versions of ppm, baseline_engine, and prediction_engine probes you deploy to hub robots should match the versions of these
probes running on the primary hub.
ppm, baseline_engine, and prediction_engine must all be deployed and running on hub robots if you want to configure dynamic, static (if
applicable), Time To Threshold alarm and threshold settings for monitoring probes.
Known Issues
Restart Required After Upgrading to UIM 8.1 From UIM 7.5
java_jre1.7 Required
java_jre1.7 Required
Problem:
The prediction_engine v1.0 (and later) and baseline_engine v2.34 (and later) probes require the hub on which they are installed to have a Java
environment pointing to java_jre1.7. If there is a mismatch between the version of Java the secondary hub's environment points to and the
probe's Java dependency, some of the Java-based probes (including prediction_engine and baseline_engine) might not start.
In addition to the probe not starting, you might also see an error similar to the following:
Sep 19 17:23:10:955 [2624] Controller: Max. restarts reached for probe 'baseline_engine' (command = <startup java>)"
This error appears in the probe's log file:
baseline_engine.log, located at /Nimsoft/Probes/SLM/baseline_engine
prediction_engine.log, located at /Nimsoft/Probes/SLM/prediction_engine
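As a quick check, you can search both log files for this error from the hub robot's shell. A minimal sketch, assuming a Linux hub (the install root below is a placeholder; adjust it to where your UIM installation actually lives):

```shell
NIM_ROOT=/opt/nimsoft   # placeholder; set to your UIM installation root

# Search the analytics probes' logs for the controller's max-restart error.
for log in "$NIM_ROOT/probes/SLM/baseline_engine/baseline_engine.log" \
           "$NIM_ROOT/probes/SLM/prediction_engine/prediction_engine.log"; do
    [ -f "$log" ] && grep -H "Max. restarts reached" "$log"
done
true  # informational check; do not fail when a log file is absent
```

If either command prints a match, redeploy java_jre1.7 and restart the robot as described below.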
In some instances, if prediction_engine and baseline_engine are already running on a secondary hub and you deploy another java-based probe
that requires the environment to point to a version of Java earlier than java_jre1.7, prediction_engine and baseline_engine might fail to start after
the deployment. In this case, no errors appear in the log files for prediction_engine and baseline_engine.
Solution:
In either situation, redeploy java_jre1.7 to the secondary hub and then restart the entire robot.
Important! If prediction_engine and baseline_engine were calculating baselines or new metrics and an error condition arises, these
calculations will be inaccurate from the time the error condition began. After prediction_engine and baseline_engine are restarted,
baselines generated by baseline_engine will be accurate after sufficient samples have been collected and the predictive alarm metrics
will begin again at the top of the next hour after restarting the robot or the prediction_engine probe.
Revision History
Probe Specific Software Requirements
Note: For SOC functionality, CA Unified Infrastructure Management Server 5.6 or later and UMP 2.5.2 or later are required.
Revision History
This section describes the history of the revisions for the printers probe.
Version 2.53 (GA, November 2012)
Version 2.52 (GA, January 2011)
Version 2.51 (GA, December 2010)
Version 2.50 (GA, September 2010)
Version 2.40 (GA, May 2010)
Version 2.31 (GA, April 2009)
Version 2.22 (GA, December 2008)
From version 3.9 onwards, the processes probe monitors IPC counters, which share data across multiple, commonly specialized, processes
using communication protocols. The probe allows you to configure QoS and alarms for the following counters:
Semaphores
Message Queues
Shared Memory Segments
Monitoring of IPC counters is supported on the AIX, Linux, Solaris, and Windows platforms.
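On UNIX-like systems, the same IPC resources the probe monitors can be inspected with the standard ipcs utility, which is a quick way to sanity-check what the counters should report (output format varies by platform):

```shell
# List the System V IPC resources that correspond to the probe's counters.
ipcs -s   # semaphore sets
ipcs -q   # message queues
ipcs -m   # shared memory segments
```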
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Upgrade Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version 4.01 (GA, July 2015)
Fixed Defects:
The probe was not using customized clear message text in alarms. Salesforce case 00163151
The probe was unable to monitor processes and generate alarms if the used memory was greater than 100 MB. Salesforce case 00162654
When the probe reported a process as down, it did not generate a process down alarm with the modified message text. Salesforce cases:
00161389, 00160183
The processes profile script was not executing properly. Salesforce case 00158405
On the UNIX platform, the QoS definitions for the IPC counters QOS_IPC_SHARED_MEMORY_SEGMENTS_UTILIZATION,
QOS_IPC_MESSAGE_QUEUES_UTLIZATION, QOS_IPC_SEMAPHORE_SETS_UTILIZATION, and QOS_IPC_PROCESS_UTILIZATION sent
an incorrect value for the hasmax field. Salesforce case: 00167456
Version 4.00 (GA, June 2015)
What's New:
Upgraded OpenSSL to version 1.0.0m.
Version 3.92 (March 2015)
Fixed Defects:
Fixed a defect where all eight IPC counters were applied to the template on the Windows OS instead of only the two supported ones
('Number of Processes' and 'Number of Semaphore Sets').
Version 3.91 (March 2015)
What's New:
Added IPC counter monitoring for Message Queues, Semaphores, and Shared Memory Segments for the Linux, Solaris, AIX, and
Windows platforms.
Fixed Defects:
Fixed a defect where the probe did not clear the process up and down alarms. Salesforce cases: 00156008, 00156866,
00157484, 00154035, 00156940, 00156992, 00156510
Note: While upgrading the probe from version 3.83 to 3.91, process up and down alarms are not cleared from the
alarm console automatically. Users must acknowledge these alarms manually.
Fixed a defect where the probe did not clear the CPU usage alarm from the alarm console. Salesforce cases: 00155806,
00153144
Version 3.83 (February 2015)
Fixed Defects:
Fixed a defect where probe version 3.81 did not clear the process up or down alarms on the Alarm Console. Salesforce case 00146533
Version 3.82 (December 2014)
Fixed Defects:
Fixed a defect where only one process_down alarm was generated for all closed instances of a process. Salesforce case 00135021
Fixed a defect where the suppression key was missing for the process restart alarm. Salesforce case 00143378
Fixed a defect where probe version 3.81 did not clear the process up or down alarms on the Alarm Console when the process was
stopped. Salesforce case 00146533
Fixed a defect where probe version 3.81 did not clear the Expected Instances alarms when the process returned to the expected
instance number. Salesforce case 00148604
Fixed a defect where the probe did not clear the alarms when the process returned to the expected instance number. Salesforce case
00151678
Version 3.81 (October 2014)
Fixed an issue where certain alarms were being suppressed on selecting the Track Processes by Process Identifier check box while
creating a profile. Salesforce case 00144501
Version 3.80 (June 2014)
Version 3.77 (April 2014)
Fixed Defects:
Fixed an issue where a message was not deleted from the message pool even when the user tried to delete the message without
clicking Apply/OK.
Fixed an issue where an alarm was generated with the default text, not the modified text, when the user edited the default
MsgProcessRestart message and then restarted the probe and the process.
Version 3.76 (February 2014)
Fixed Defects:
Users were not able to add custom messages with clear severity in the Message Override list.
Users were not able to get alerts with the correct username (the UID was displayed instead of the username) when the username
was more than 8 characters.
Version 3.75 (November 2013)
Version 3.74 (November 2013)
January
2013
Version 3.70 (June 2012)
Fixed an issue for the process stop action alarm related to expanding the $errmsg variable.
June 2011
3.62
June 2011
3.61
June 2011
3.60
May 2011
Added support for process handle count monitoring in Windows environment deployments. This feature is not applicable to
non-Windows platforms.
Added support for clearing alarms on restart for profiles that are no longer in the alarm state.
Fixed an issue related to $expected_user alarm variable expansion.
Added support for monitoring thread count for processes on AIX, Linux, and Solaris.
Added support for overriding QoS target at profile level.
Fixed a crash in thread count monitoring.
Fixed Service Oriented Configuration defect.
Additional error checking on fetching performance data.
3.52
3.51
January
2011
December
2010
Applied fix to resolve the issue: TNT2 metric different for clear alarms.
Fixed Desktop Handle leak.
3.30
3.21
August
2010
April 2010
April 2010
March
2010
Added support for alarm clear when the monitored process goes down.
March
2010
3.13
February
2010
3.12
October
2009
Added the missing MsgCpuUsageMin and MsgCpuUsageRange alarms to the cfx file.
Added support for LINUX_22.
Overwrite of subsystem ID for all alarms is now possible.
Handling of process names greater than 32 characters in case of Solaris.
Script execution on alarm conditions.
Fixed the exclude functionality. No alarms or QoS will be sent for processes that are excluded.
Support for $robot variable in messages.
Added 'default' flag for alarm messages.
Improved error handling for fetching performance data.
Added default clear message. Fixed configuration tool and probe to enable the use of the clear message.
Fixed a minor GUI bug when using the new grouping of profiles feature. Sometimes, when the user double-clicked a process
in the status tab, and that process was monitored by a profile, the profile would not be displayed unless the group was active
and selected in the profiles tab.
2.90
Added option to group profiles together. Group name can be used as part of alarm messages. Groups can be
activated/deactivated without having to activate/deactivate all profiles within a group. Profiles can be moved between
groups.
April 2009
Added option to clone a profile (create a new profile with same settings as the source).
Added option to select many rows in profiles list box.
Added option to 'track processes by process ID'. This opens up a new feature, to alert when a process has been restarted.
It also opens the possibility to send in individual QoS samples for otherwise similar processes, which you cannot normally
separate using standard methods like regexp or command line arguments.
Added option to monitor process instances between a given range.
Added option to alert if avg. cpu falls below a given threshold (similar to thread count and memory usage).
Added option to invert the test for process owner (to be able to alert when a process is NOT running under a given user);
this is easier than inverting a regexp method.
Fix: The GUI and the probe were treating the "proc_cmd_line" key differently, which caused confusion. The probe accepted the
string without the key "scan_proc_cmd_line", while the GUI did not. Now the GUI interprets these keys the same way
the probe does.
Fix: Improved input validation in various GUI fields; for example, text fields that accept only numerical values now
enforce that.
Improvement: When the GUI is talking to a processes probe running on a Windows system, it displays both the
working set (VM) and the pagefile memory in two separate columns for processes.
Fix: Trailing blank space in command line arguments caused profile matching to fail on Windows platforms.
2.73
December
2008
2.72
Probe update
November
2008
2.71
September
2008
2.53
October
2007
2.52
September
2007
Known issues:
The probe is unable to get the command line of 64-bit processes on Windows and of all processes on Windows Vista.
Since version 2.40, the probe can retry retrieving process information if it believes the data to be corrupt.
The retry limit defaults to 1 (try one more time, then give up) and can be set in Raw Configure to any number
between 0 and 10.
2.51
May 2007
Fix: Improved logic to detect corrupted process information in some rare situations on HPUX. If the probe detects
corrupted data, a retry function tries to retrieve the process information again.
January
2007
Fix: The test button in the GUI, for testing a profile against running processes did not respect the 'Excluded processes' list.
This has been fixed.
Fix: The GUI sometimes didn't show that a process was being monitored, even if the probe was monitoring the process.
The reason was that the GUI was trying to match profiles against the process list from the probe, and the process list
returned to the GUI was a snapshot of currently running processes. The profile-matching logic has been moved to the
probe, so the list displayed in the GUI should always show the processes that are actually being monitored by the
probe. A process may also be monitored by more than one profile, which previous versions of the GUI were unable to
handle.
GUI: New input field to allow custom log file size.
GUI: The help button in the dialog for Profile Monitoring dialog is now pointing to Profile setup in the documentation.
Fix: Probe should no longer be case sensitive on all windows platforms when matching process owner, process name or
command line arguments.
GUI: Added a new menu item on the right-click context menu of the listview control that displays profiles. It prompts you
for a new profile name. If you select OK and the name does not already exist, the profile is renamed once you save
your configuration file.
GUI: The 'Expected Instances' field in the Profile Monitoring dialog has been changed to allow more processes. The limit
has been increased from 999 to 9999.
Windows: skip command line detection for 'System' process to avoid intermittent memory access violations.
2.35
Solaris: logging printed the errno variable even when no error had occurred, making it seem as if there was a problem when there
wasn't.
September
2006
Unix: Improved handling of child processes (fixes a problem with the signal handler hanging the probe in some cases)
Unix: Improved handling of forked processes
2.32
February
2006
2.30
March
2005
Message variables are made available to more alarm situations. The current set of variables is: arguments,
command, description, errmsg, executable, max_restarts, pid, proc_cmd_line, process, start_dir, watcher
Instances
instances_expect
instances_found
instances_op
process_count
process_count_type
Processes
cpu_average
cpu_total
expected_cpu_usage
expected_size
expected_user
max_samples
op
samples
size
thread_limit
threads
user
time_delta
which
Window
window_name
window_class
window_text
Important! If you install a 32-bit OS on a Solaris system with 64-bit architecture, the probe stops functioning and the status indicator
on the GUI turns red.
Upgrade Considerations
While upgrading the probe from version 3.83 to 3.91, process up and down alarms are not cleared from the alarm console automatically.
Users must acknowledge these alarms manually.
To view the metrics that are available in the processes probe version 3.9 on the USM portal, perform any one of the following
actions:
Upgrade your NMS version to NMS 7.6 or CA UIM 8.0 or later.
Install the ci_defn_pack probe version 1.02. You must restart the nis_server when you deploy the ci_defn_pack.
Important! You can install the ci_defn_pack probe from https://support.nimsoft.com
Revision History
Version 1.25 (GA, August 2015)
What's New:
Minor software modifications.
Version 1.24 (GA, December 2014)
Version 1.23 (GA, September 2014)
Version 1.22 (GA, June 2014)
Defect fixes.
Version 1.20 (GA, March 2013)
Fixed a defect where qos_processor failed to enrich some QoS data due to database deadlocks in MySQL.
Version 1.10 (GA, May 2012)
Initial implementation.
Supported Platforms
The qos_processor probe is supported on the same operating systems as CA Unified Infrastructure Management.
You should modify the default configuration parameters according to your requirements before activating the probe.
Revision History
This section provides the history of revisions for the reboot probe.
Version
Description
State
Date
1.41
What's New:
GA
November
2015
1.30
Improved logging
December
2010
June 2010
September
2009
1.11
June 2009
December
2008
September
2008
Note: For version 1.1 and higher of this probe, NimBUS Robot version 3.00 (or higher) is a prerequisite. Carefully review the
"Upgrading the NimBUS Robot" document before installing or upgrading.
1.05
December
2006
April 2003
Note: The rsp probe supports only password-based and key-based authentication. Keyboard-interactive and authentication-less
methods are not supported. If password-based or key-based authentication is not enabled on the UNIX-based remote server, the rsp probe
cannot discover the remote host.
Contents
Revision History
Monitoring Capabilities
Monitoring Support
Supported Hosts
Threshold Configuration Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Windows Operating System
SSH Monitoring to UNIX Like Operating System
UNIX System Utilities for Data Collecting
Known Issues
Revision History
This section describes the history of revisions for this probe.
Version 5.11 (GA, October 2015)
Fixed Defects:
The time stamp in the probe configuration file did not match the time of the last event that was generated on the
monitored host, so the probe sent duplicate alarms for the event. Salesforce case 00160027
If a process restarted with a different command line parameter, the probe displayed the process in down state. Salesforce
case 00162366
On the Linux platform, the probe displayed an incorrect value for the Total CPU monitor. Salesforce case 00167398
Version 5.10 (GA, June 2015)
What's New:
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Added support for factory templates.
Fixed Defects:
The probe displayed the wrong subsystem ID for alarms. Salesforce case 00166698
The probe accepted only 512 bytes in the event description and was storing the first 512 bytes in the database. Salesforce
case 00161142
Version 5.01 (GA, March 2015)
What's New:
The probe generates priority alarms for all the metrics.
Fixed Defects:
Fixed a defect where, if a process does not exist on the monitored remote system, alarms other than the process down
alarm do not expand the variables in the alarm message.
Version 5.00 (Beta, March 2015)
What's New:
Added localization support for B-Portuguese, Chinese (simplified and traditional), French, German, Italian,
Japanese, Korean, and Spanish languages from both the IM and Admin Console GUI. For localization support through
the Admin Console GUI, the probe must run with PPM 2.38 or a later version.
Updated the probe Infrastructure Manager GUI and Admin Console GUI for specifying the character encoding in
different locales.
Note: Do not use the Raw Configure GUI for updating the probe configuration in non-English locales because it can
corrupt the probe configuration file.
Fixed Defects:
Fixed a defect where the probe was not discovering hosts on the SuSE r11 OS. Salesforce cases:
00132878, 00145831
4.03
Fixed Defects:
October
2014
Fixed a defect where the probe was collecting and storing data for all the processes, ntevents, and services of all the
hosts. Salesforce case 00130220
Fixed a defect where the configuration details of a section, instance, or monitor were not displayed in the Profile
Templates node in the Advanced Configuration section of the monitor. Salesforce case 00139452
Fixed a defect in which the probe made three consecutive attempts to discover and establish a WMI connection with
the host. Salesforce case 00143692
4.01
Added support for monitoring remote systems over Internet Protocol version 6 (IPv6).
March
2014
4.00
What's New:
December
2013
Implemented the probe restart feature. The probe no longer reloads the full configuration when a profile is added,
deleted, or modified; it now reloads the configuration without closing the probe window.
Implemented the Custom WMI feature, which enables users to monitor any WMI classes and objects.
Fixed Defects:
Fixed a probe crash that occurred with specific non-English locale process/service names.
Fixed a mismatch in the reported time of an event on different operating systems. The Event tab now shows local time
irrespective of the operating system.
Fixed a defect where the probe was not alerting on Critical events in Japanese locales.
Fixed a defect where the probe showed a large number of process bulbs when a process was down.
Fixed a defect in the upgrade of the probe from version 2.92, where the probe was missing two key values. This
defect affected users who created custom groups in 2.92 and upgraded to 3.0 or later.
3.07
3.06
3.05
October
2013
September
2013
September
2013
Fixed a SOC issue where the probe crashed when the cycle was left blank.
Fixed an issue where the probe did not discover Solaris systems on which the prtconf command returned an error.
3.04
On Windows, QoS values for CPU system, wait, idle, and user time were sent as NULL.
Optimized the discovery callback to improve host discovery time.
August
2013
3.02
Added monitoring of system wait and idle CPU time for Windows.
July 2013
Fixed a defect where swap memory was not reported on Windows.
Fixed a defect where processes were not properly monitored if the command line of any process contained a single quote (').
Fixed a defect where the probe was not alerting when swap memory was not present.
Fixed broken passwordless SSH functionality using a passphrase.
Fixed broken CPU monitoring functionality on AIX.
Fixed a crash in the expand_ipv6 callback when input parameters were not specified.
Fixed a crash that occurred when the command line of a process was very long.
Fixed issues related to an incorrect number of instances shown in the GUI.
Fixed an issue where folders with # in their names were not monitored.
Fixed an issue where the probe showed the process name instead of the command line when a profile was created as name +
command line.
3.01
June 2013
Fixed a defect where a process selected with the process + command line option from the GUI did not appear.
Fixed a defect where rsp crashed on Windows after one day.
3.00
June 2013
Report Viewer functionality is discontinued from the GUI. (However, users can use UMP.)
The Lua load and clear alarms at startup have been removed.
The interval is now supported at the host level instead of the monitor level, and all monitors are governed by cycle. For example,
a profile with a 5-minute interval and a disk cycle of 3 runs the disk monitor every 15 minutes.
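The interval and cycle relationship above is a simple multiplication; as a minimal sketch (the function name is illustrative, not part of the probe):

```python
def effective_interval_minutes(profile_interval_min: int, monitor_cycle: int) -> int:
    """Each monitor runs every (profile interval x monitor cycle) minutes."""
    return profile_interval_min * monitor_cycle

# A profile with a 5-minute interval and a disk cycle of 3
# runs the disk monitor every 15 minutes.
print(effective_interval_minutes(5, 3))  # 15
```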
2.93
2.92
February
2013
January
2013
2.90
October
2012
September
2012
Updated the discover_host callback to add architecture and address_width values to CPU entries.
Fixed false positive alerts for monitored processes on Linux.
Added support for passwordless SSH (RSA/DSA).
2.81
Added support for IPv6 address range expansion for discovery feature.
March
2012
2.80
Fixed an issue related to incorrect QoS and alarms for process instances. The discovery feature of address range for IPv6 is not
supported in this version.
December
2011
2.72
November
2011
Added support for fetching machine name via SSH and WMI.
Updated libraries.
2.71
October
2011
Added support for fetching machine name via SSH and WMI.
2.68
March
2011
2.66
January
2011
2.65
December
2010
2.61
2.60
December
2010
October
2010
Optimized code to improve performance when reading a large amount of WMI data.
Fixed a crash that occurred on NTevent profile deletion.
September
2010
2.51
July 2010
2.50
June 2010
2.40
June 2010
Note: Profiles that monitor event logs from machines that are not in the RSP probe's time zone must be
reconfigured by deleting the section from the affected profiles.
2.31
2.30
March
2010
February
2010
February
2010
2.20
2.17
December
2009
December
2009
December
2009
In case of a data collection failure, the checkpoint icon in the GUI changes to the data-not-available icon.
Fixed nimlog messages.
Included NTEvents_helper in the database cleanup.
Fixed the issue of an incorrect NTEvent QoS count.
2.16
Fixed the bug in CPU.lua that wrongly reported 100% CPU utilization on some Linux machines.
December
2009
2.15
Added support for deleting the monitored CPU and Disk devices from the main user interface.
December
2009
2.14
Changed the RSP services QoS definition to match the QoS definitions of the NTServices probe.
Added a new key to hide irrelevant UI controls for the default and inst_default templates, specifically in the processes,
Windows services, and Windows events template settings.
December
2009
2.13
July 2009
March
2009
2.10
March
2009
2.02
Added functionality similar to the Processes, NTEvents, and NTH probes to the existing RSP probe functionality.
Data parsing is now done using Lua scripts instead of C code.
December
2008
Added a job queue to fix the DB locking issue that occurred when multiple threads inserted into the DB simultaneously.
Modified the DB to support Lua scripting.
Upgrade from 1.18 to 2.02.
1.18
1.17
May 2008
March
2008
'Close' of the authentications window now works even when no authentication profiles exist.
1.16
1.15
January
2008
November
2007
September
2007
Modified the advanced setup configuration tree so that opened tree branches do not close when others are opened.
February
2007
1.02
December
2006
Monitoring Capabilities
The rsp probe gathers the following statistics on the CPU utilization, disk, memory usage, and load:
CPU usage (%) with multi-CPU support
The high severity (error) and low severity (warning) alarms for the total CPU usage.
The high severity (error) alarms for the individual CPU usage in multi-CPU systems.
Disk
Automatic local disk discovery
Disk Usage (MB or %)
Information retrieval on the local disk (Total disk space, Free disk space, and Used (%) disk space).
Memory
Total memory usage (%)
Physical memory usage (%)
The page file usage (%)
Swapping or the paging activity page/s
Load
Load at a specific moment
Processes
Alarms for the invalid process owner
Alarms for the invalid CPU usage
Alarms for invalid process size
Alarms for the invalid thread count
Alarms for the wrong number of process instances
Alarms on process up/down
Services (Windows Specific)
Alarm on the state of a service
NTEvents (Windows Specific)
Alarms on specified NTevents
Custom WMI (Windows Specific)
Alarm specific to an object of WMI class
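The high and low severity alarms listed above follow a two-level threshold pattern; for total CPU usage, the check can be sketched as follows (the threshold values and function name are illustrative, not the probe's defaults):

```python
def cpu_alarm_severity(usage_pct: float, low: float = 75.0, high: float = 90.0) -> str:
    """Return the alarm level for a total CPU usage sample:
    'error' at or above the high threshold, 'warning' at or above the low one."""
    if usage_pct >= high:
        return "error"    # high severity alarm
    if usage_pct >= low:
        return "warning"  # low severity alarm
    return "clear"        # no alarm

print(cpu_alarm_severity(95.0))  # error
print(cpu_alarm_severity(80.0))  # warning
```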
The rsp probe monitors thresholds for all the data points it gathers and allows you to also gather the QoS data. The QoS names match those
used by the CDM probe since the gathered data is the same. To gather remote data, the probe uses WMI on Windows systems and native
commands on UNIX/Linux systems, either through SSH or telnet.
Note: These commands are run as the root user on UNIX or Linux systems. Therefore, it is strongly recommended that you use SSH to
avoid the root password being transferred without encryption over the network.
The rsp probe is multithreaded and allows simultaneous data gathering from several servers. You can monitor up to 50 servers simultaneously.
This number depends on several factors, which are as follows:
Monitoring Windows computers is less time consuming than monitoring UNIX computers.
Using SSH is faster than using telnet.
The capacity and speed of the network
The capacity of the computer hosting the probe
The capacity of the computers being monitored
Monitoring Support
The rsp probe runs on Windows and Unix or Linux operating systems as specified in the Support Matrix for CA UIM Probes. You cannot monitor
Windows machines from an instance of rsp running on a UNIX or Linux robot.
The host and the corresponding monitored platforms that are supported for the rsp probe are:
Windows->Windows
Windows->Unix
Windows->Linux
Unix->Linux
Linux->Unix
Linux->Linux
Unix->Unix
Important! The rsp probe cannot be deployed on AIX systems but can monitor remote AIX systems running versions 4, 5, or 6 only.
Supported Hosts
From rsp version 5.0 and later, the probe supports hosts with the following encodings:
UTF-8: Unicode (UTF-8)
UTF-16BE: UnicodeBigUnmarked
UTF-16LE: UnicodeLittleUnmarked
UTF-32BE: Unicode (UTF-32 Big endian)
UTF-32LE: Unicode (UTF-32 Little endian)
Shift_JIS: Japanese (Shift-JIS)
ISO-2022-JP: Japanese (JIS)
ISO-2022-CN: Chinese(ISO)
ISO-2022-KR: Korean (ISO)
GB18030: Chinese Simplified (GB18030)
GB2312: Chinese Simplified (GB2312)
Big5: Chinese Traditional (Big5)
EUC-JP: Japanese (EUC)
EUC-KR: Korean (EUC)
ISO-8859-1: Western European (ISO)
ISO-8859-2: Central European (ISO)
windows-1250: Central European (Windows)
windows-1252: Western European (Windows)
Important!
Use a text editor like Notepad++ on Windows systems or gedit on Linux systems to edit your configuration file directly. If you
use Notepad as an editor, it appends a BOM (Byte Order Mark) to a UTF-8 encoded file, and the probe does not start with a
configuration file that includes a BOM.
Do not use Raw Configuration GUI when the probe is deployed in a non-English locale.
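The BOM problem described above can be checked programmatically; a small sketch that detects and strips a UTF-8 BOM from configuration-file bytes (the sample content is a placeholder, not an actual probe configuration):

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM, which prevents the probe from starting."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

raw = codecs.BOM_UTF8 + b"<setup>\n   loglevel = 0\n</setup>\n"
clean = strip_utf8_bom(raw)
print(clean.startswith(b"<setup>"))  # True
```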
After migration of disk alarms, the threshold values and the configured alarm messages change.
Note: For more information, refer to Disk metrics in the rsp Metrics article.
The QOS_PROCESS_STATE monitor, which monitors the state of the process, is not migrated.
The higher threshold is migrated for any monitor that has the same severity for both the high and low thresholds.
Any profile created with a group does not display in the Admin Console GUI under the Templates drop-down.
If the monitors selected from the Admin Console GUI are not part of a group, QoS and alarms are not generated for them.
Installation Considerations
The installation considerations for different operating systems are mentioned in the following sections.
/usr/sbin/swapon
/usr/bin/uptime
/bin/vmstat
ps -efo
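The probe derives its metrics by parsing the text output of commands such as these. As an illustration only (not the probe's actual Lua parser), the 1-minute load average can be read from uptime output like so:

```python
import re

def parse_load_average(uptime_output: str) -> float:
    """Extract the 1-minute load average from `uptime` output."""
    match = re.search(r"load averages?:\s*([\d.]+)", uptime_output)
    if not match:
        raise ValueError("no load average found")
    return float(match.group(1))

sample = " 10:15:02 up 12 days,  3:04,  2 users,  load average: 0.42, 0.35, 0.30"
print(parse_load_average(sample))  # 0.42
```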
Known Issues
The known issues of the probe are:
The rsp probe expects data to be returned by the monitored hosts. If the probe fails to collect data from a host, it
raises an alarm and sends NULL QoS. In some cases, where the max values in QoS are dynamic, the data engine discards
the QoS data because the data is not available from the hosts.
At startup, the rsp probe looks up the host name and sends data integrity alarms for each host. If these alarms are generated
slowly (a delay of 4-5 seconds between alarms) instead of in a burst, there might be a problem in the DNS settings of the server
where the probe is running.
The probe does not support monitoring of forwarded events.
You may see an inconsistency between the cycles shown in the GUI and in the template.
The probe cannot monitor a remote Windows server if the probe is deployed on a Linux or Solaris system.
The probe does not support host names and user credentials with non-ASCII characters.
The Admin Console GUI of the probe has the following additional limitations:
Profiles created from the Admin Console GUI are not available in the IM GUI.
The probe does not support creating custom WMI profiles.
The probe does not support monitoring through templates.
Note: If the probe is migrated using threshold_migrator 2.01, the template configurations are also available on the Admin
Console GUI.
Salesforce is a cloud-computing platform that provides physical computing infrastructure and a common application platform that runs on the
infrastructure. The salesforce cloud infrastructure consists of a set of datacenters (referred to as salesforce instances). The datacenters can be
located anywhere in the world.
The salesforce probe monitors the following parameters on the Salesforce cloud using the page http://trust.salesforce.com/trust/status/:
Average transaction speed in milliseconds: The transaction speed is calculated as the difference between the entry and exit time-stamps
of a user from the Salesforce application server. The transaction speed is an average of all transactions across all salesforce instances.
The average is calculated for a 24-hour period defined by UTC time.
Number of transactions: The total number of transactions executed within the cloud.
Instance Status: Each salesforce instance displays its current operating status. Typical values are: Instance Available (operating
normally), Performance Issues (instance performance degraded), Service Disruption (severe service degradation), Informational
Message, and Status not Available (offline).
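The average transaction speed described above is the mean of per-transaction entry/exit timestamp differences over a 24-hour UTC window; a sketch of that calculation (the sample data and function name are illustrative):

```python
from datetime import datetime, timedelta

def average_transaction_ms(pairs):
    """Mean entry-to-exit duration in milliseconds over a set of transactions."""
    durations_ms = [(exit_ts - entry_ts) / timedelta(milliseconds=1)
                    for entry_ts, exit_ts in pairs]
    return sum(durations_ms) / len(durations_ms)

t0 = datetime(2015, 1, 1, 0, 0, 0)
pairs = [(t0, t0 + timedelta(milliseconds=120)),
         (t0, t0 + timedelta(milliseconds=180))]
print(average_transaction_ms(pairs))  # 150.0
```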
The salesforce probe can monitor any number of organizations. For each organization, the probe can measure the following QoS parameters:
Web Login Logout timing: The time taken to log in and log out from the selected organization through the Web.
Data Storage Used: The amount of data currently stored for the organization. Organizations typically have contractual limits as to how
much data they are permitted to store.
File Storage Used: The number of files currently stored for the organization. Organizations are contractually limited as to how much file
storage they are allowed to use.
Number of API calls in the last 24 hours: The count of web service API calls made to the organization in the last 24 hours. Organizations
have a contractual limit as to how many API calls they are permitted in one day.
API Login Logout timing: The amount of time the system takes to log in and log out from the organization through a web service API call.
Query Execute Time: The amount of time the system takes to execute a query against an organization. The probe user can define any
number of custom queries to execute.
Number of Objects: The number of objects returned by the specified query.
Query Returned Value: The probe can extract a specific value from the result objects of a query and can use that value for generating
QoS.
Important! The salesforce probe from version 2.0 and later is accessible only through the Admin Console GUI and not through the
Infrastructure Manager GUI. Upgrade from any previous version of the probe to version 2.0 is not supported.
Contents
Revision History
Upgrade Considerations
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.0
GA
January
2015
The probe is now available only through the Admin Console (AC) GUI and not through the Infrastructure Manager (IM)
GUI.
Upgrade from previous versions to version 2.0 and later is not supported.
Note: The probe is supported on CA Unified Infrastructure Management 8.0 and later only.
Upgrade Considerations
This section contains the considerations for the salesforce probe.
The salesforce probe version 2.0 and later is available through Admin Console (AC) GUI only and not through the Infrastructure Manager
(IM) GUI.
Upgrade from previous versions to version 2.0 is not supported.
If you have been using a probe version prior to version 2.0, then you must manually transfer the existing configurations to the
new version.
You must remove all the versions of the salesforce probe that are older than version 2.0 as upgrade to version 2.0 is not supported.
Revision History
Prerequisites
System Requirements
Hardware Requirements
Software Requirements
Limitation
Troubleshooting
Revision History
This table describes the history of probe updates.
Version
Description
State
Date
1.01
What's New:
Beta
Oct 2015
Beta
September
2015
Initial release
Prerequisites
The CA Unified Infrastructure Management Alarm Server (nas) probe must be installed and activated.
System Requirements
Hardware Requirements
The probe should be installed on a system with the following minimum resources:
Memory: 2-4 GB of RAM
CPU: 3 GHz dual-core processor, 32-bit, or 64-bit
Software Requirements
The probe requires CA Unified Infrastructure Management v8.1-8.3.
The probe is compatible with the following service desk applications:
Service Desk Application
Version
BMC Remedy
8.1
Bamboo Release
12.9
HP Service Manager
9.30
7.1 SP12
ServiceNow
Winter 2015
Limitation
If the Launch in Context URL from a Service Desk contains single quotes ('), the single quotes are replaced with the HTML
character reference (&#39;) in the configured custom field of CA Unified Infrastructure Management.
For example, if the Launch in Context URL is as follows:
http://<host name>:<port>/arsys/servlet/ViewFormServlet?form=HPD:Help
Desk&server=<host name>&qual='1000000161' == "INC000000000658"
Then the URL is displayed in the custom field of CA Unified Infrastructure Management as follows:
http://<host name>:<port>/arsys/servlet/ViewFormServlet?form=HPD:Help
Desk&server=<host name>&qual=&#39;1000000161&#39; == "INC000000000658"
To access the URL from a browser, replace the HTML character reference (&#39;) with single quotes (').
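The substitution in the limitation above can be reversed before opening the URL; a minimal sketch that restores single quotes from the HTML character reference &#39; (the function name is illustrative):

```python
def restore_single_quotes(url: str) -> str:
    """Replace the HTML character reference for a single quote with the quote itself."""
    return url.replace("&#39;", "'")

encoded = 'qual=&#39;1000000161&#39; == "INC000000000658"'
print(restore_single_quotes(encoded))
# qual='1000000161' == "INC000000000658"
```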
Troubleshooting
Symptom:
The probe fails to save the configuration details when the probe is deployed on a robot which is not present on a primary hub machine.
Solution:
Change the probe security settings as follows:
1. Open Admin Console.
2. Select Settings, Probe Security.
3. In the Probe Security page, change the Access level setting for "ppm" and "sdgtw" probes to "admin".
4. Restart ppm and sdgtw probes.
Symptom:
The probe generates an alarm in CA Unified Infrastructure Management because incident creation or updating failed in a service desk.
Solution:
Check the connection details of the Service Desk in the sdgtw.cfg file.
Symptom:
The probe generates an alarm in CA Unified Infrastructure Management because it failed to update the custom alarm fields in CA Unified
Infrastructure Management.
Solution:
Restart the probe.
Symptom:
The probe fails to create an incident.
Solution:
The probe can fail to create an incident because of a mismatch between the alarm field and the service desk field. Check if any of the service
desk fields require specific values to be entered in the alarm field that is mapped to them. For example, the impact field of a service desk accepts
only High, Medium and Low as values. Map the impact field to an alarm field that contains the same values of High, Medium and Low.
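A field mapping mismatch like the one above can be caught with a simple pre-check; a sketch using the impact values from the example (the function and value set are illustrative, not part of the probe):

```python
# Values the service desk impact field accepts, per the example above.
ALLOWED_IMPACT_VALUES = {"High", "Medium", "Low"}

def valid_impact_mapping(alarm_values) -> bool:
    """True when every alarm field value is accepted by the service desk impact field."""
    return all(value in ALLOWED_IMPACT_VALUES for value in alarm_values)

print(valid_impact_mapping(["High", "Low"]))       # True
print(valid_impact_mapping(["High", "Critical"]))  # False
```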
Symptom:
For SFDC Service Cloud, I configured the probe to change the status of incident to Resolved when an alarm is deleted in CA Unified
Infrastructure Management. But the incident resolution fails.
Solution:
For SFDC Service Cloud, for the Incident Resolved Status field, select Closed instead of Resolved. The incident in SFDC Service Cloud does not
have a Resolved status.
Symptom:
The probe fails to subscribe to a configured queue.
Solution:
Check the connection details and the status of the relevant service desk. The probe subscribes to a queue only when you configure a connection
to at least one service desk. If the probe cannot connect to a service desk, the probe does not subscribe to the queue.
Symptom:
The probe unsubscribes from a queue.
Solution:
If the probe fails to update a custom alarm field due to a problem in CA Unified Infrastructure Management, it is unable to subscribe to the
affected queue. Restart CA Unified Infrastructure Management. If that does not resolve the issue, contact your CA Unified Infrastructure
Management administrator.
Symptom:
For SAP Solution Manager (SAPSolManager), I changed the status of an incident to Closed. But the corresponding alarm in CA Unified
Infrastructure Management is not deleted.
Solution:
Synchronization of incidents to alarms is not possible for SAP Solution Manager. If an incident is closed in SAP Solution Manager, then the
corresponding alarm is not deleted in CA Unified Infrastructure Management.
Revision History
Threshold Configuration Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.70
What's New:
GA
June 2015
GA
March 2013
The probe can now be migrated to standard static alarm thresholds using the threshold_migrator probe.
Added a checkbox on the Infrastructure Manager GUI to enable and disable alarms for thresholds that have QoS
data.
1.60
1.51
1.50
Fixed Issue: SharePoint was sending the hostname in the source field instead of the IP address.
Fixed Issue: The SharePoint probe was not working on SharePoint Server 3.0.
July 2012
April 2012
December
2011
June 2011
June 2011
1.30
December
2010
1.21
Added fix to make default service profiles compatible with SharePoint 2010 setup.
November
2010
1.20
June 2010
Updated the latest Nimbus Dot Net API with SSL support.
June 2010
Added profile name to QoS target in all QoS to differentiate between profiles.
1.02
1.01
If the controller is configured to use the robot name for QoS, the probe sends the configured name in the QoS source;
otherwise, the probe sends the host name in the QoS source.
Added a fix in the eventlog profile to minimize CPU usage.
September
2009
August 2009
Added a fix for a crash on probe restart. Updated bmake and the package file for win64 support.
1.00
Initial release.
June 2009
Probe-specific alarm configurations in the probe monitors will be replaced by Static Alarm, Time To Threshold, and Time Over
Threshold configurations.
The variable syntax will change from $<variableName> to ${<variableName>}.
The alarms will be sent by the baseline_engine probe.
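The variable syntax change above can be illustrated with a small rewrite; the regex and sample message below are illustrative only (the threshold_migrator probe performs the actual conversion):

```python
import re

def migrate_variable_syntax(message: str) -> str:
    """Rewrite $<variableName> references to the ${<variableName>} form."""
    return re.sub(r"\$(\w+)", r"${\1}", message)

print(migrate_variable_syntax("CPU on $host is $value%"))
# CPU on ${host} is ${value}%
```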
The sharepoint probe requires the following software environment to migrate with threshold_migrator probe:
CA Unified Infrastructure Management 8.3 or later
CA Unified Infrastructure Management Robot 7.5 or later (recommended)
Java JRE version 7 or later
Probe Provisioning Manager (PPM) probe version 3.21 or later
Baseline Engine (baseline_engine) version 2.60 or later
Note: To obtain proper time zone calculations, the sla_engine must be deployed in the same time zone as the data_engine
probe.
Contents
Revision History
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
3.72
Fixed Defects:
sla_engine calculates and reports data automatically without occasionally dropping SLA periods. Salesforce cases:
148390, 154959, and 159677
GA
August
2015
Controlled
Release
July
2015
3.71
3.7
Service Level Objectives (SLOs) can now use QoS metrics with names up to 255 characters. Previously, it was limited to 64
characters.
GA
Dec
2014
3.63
GA
Sep
2014
3.62
Fixed issue in which sla_engine would calculate some SLAs incorrectly in Interval mode.
GA
Mar
2014
3.60
Fixed defects:
Fixed setting any arbitrary timepoint in the GUI as the starting point for an SLA interval.
GA
Jun
2013
Jan
2013
3.58
Oct
2012
3.57
Fixed getting the wrong time zone when multiple locales use the same time zone description.
Jul
2012
3.56
May
2012
Added the updated mysql_nis_base_create.sql script. This file updates the database to improve the startup performance of
the sla_engine.
Mar
2012
3.54
Added a check for the database version. If the database is up-to-date, the create database script is not run.
Feb 28
2012
3.53
This version corrects the time zone calculation for SLAs that are created for time zones other than the one in which the
sla_engine and data_engine are deployed. Changed the MySQL driver to version 5.1.18. Requires robot version 5.51 or
later.
Feb 21
2012
3.49
Increased the calculation timeout so that calculations complete with large amounts of data.
Jun
2011
3.48
Fixed the SLA Engine not updating the operating period for SLAs of more than one month.
Set the application name in the Oracle database connection.
Mar
2011
Minor fixes.
Dec
2010
3.34
Minor fixes.
Nov
2010
3.29
Oct
2010
3.03
2.13
2.10
The sla_engine no longer fails with errors in the log under certain period conditions.
Added a section for changing default alarm settings in the configuration file.
Added support for custom alarms per SLA/SLO; this is configured in the SLM.
Jul
2010
Mar
2010
Aug
2009
The multi-series plugin can now treat missing data as down in the calculation.
Mar
2007
Installation Considerations
Plan to deploy the sla_engine probe in the same time zone as the data engine.
Deactivate the sla_engine probe before performing an upgrade of the probe.
Deploy the sla_engine probe from the archive in the normal manner.
by the SLM hub. The additional sla_engine is automatically defined as a slave to the master sla_engine and is assigned SLA calculations in a
cooperative manner.
Note: Once the sla_engine is set to run in either Master or Slave mode, this setting can only be changed in Raw Configure. For more
details, see the sla_engine probe article.
The Short Message Service Gateway (smsgtw) probe provides a powerful tool to send alerts and messages over GSM digital cellular telephone
networks. Mission critical systems and applications alerts can be forwarded automatically to a mobile telephone. The bi-directional communication
enables the user to request "status" from the monitored systems.
Extensive high-availability features are built into version 2.x and above, such as Hot-Standby and Fail-Over. It is also possible to
handle more than two adapters (for example, an SMSC and an external GSM phone) on a single gateway. A smsgtw client is also
available, enabling any PC to send SMS messages from the desktop environment using the same gateway.
Contents
Revision History
Prerequisites
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
3.10
What's New:
GA
September
2014
GA
October
2013
3.02
GA
April 2013
GA
December
2012
GA
June 2007
3.01
2.14
Added reinitialization of devices at restart and a default generic modem for USB connections, increased the number of
COM ports to 16, and updated the documentation.
Added support for MultiTech modems and for Cingular SMS services.
The SMS gateway now does a cold start if it is connected to the secondary device for more than 5 minutes without switching
back to the primary device.
Added support for DATEK SMSC version 2 messages.
2.10
GA
February
2004
Prerequisites
A connection to the PC, normally provided by the modem manufacturer (for example, the Multi-Tech modem driver). This can be
RS232 based or USB based.
One or more robots.
The ServiceNow Gateway probe is used to generate an incident in the Incident Management process of the ServiceNow platform from an NMS
alarm. Generating an incident helps the service desk user take corrective actions to resolve an issue. The incident is generated when an
alarm is assigned to the designated NMS user.
The Service Now Gateway probe is currently certified on the Calgary version of ServiceNow. The probe supports the Basic authentication type
to manage connections to a ServiceNow instance through a proxy server.
Contents
Revision History
Hardware Requirements
Software Requirements
Upgrades and Migrations
Revision History
This section describes the history of the revisions for sngtw probe.
Version
Description
State
Date
2.12
Fixed Defects:
GA
June 2014
The Custom2 field stored the wrong Service Now URL for the created incident.
Alarms were cleared automatically every hour by the Service Now account, causing new incidents to be created.
Alarms were not cleared for incidents resolved or closed in Service Now.
2.01
Fixed Defects:
GA
April 2014
Fixed an issue for generating a dynamic URL to ServiceNow incident when the ServiceNow URL is based on IP
address.
Fixed the field mapping functionality to map the date time field of the NMS alarm to the String type field of the
ServiceNow incident.
Fixed the Time Arrival, Time Origin, and Time Received fields for supporting a user defined format through Raw
Configure.
2.00
June 2013
1.21
Fixed an issue to close the NMS alarm on both Resolved and Closed events on the Service Now side.
Users can configure the Close/Resolve incident request XML using the SngtwRequestXML.properties file.
Users can configure the Service Now field that contains the NMS alarm ID for the close incident query.
Fixed an issue to disable the retry functionality.
Fixed issues to support certain characters (& and $) in the message text during incident creation.
April 2013
1.20
Service Now Gateway can now retry ticket creation on failure. Retries are configurable from the probe
configuration.
Users can configure which alarm severities trigger ticket creation.
Service Now Gateway can now query resolved incidents in order to close alarms.
Fixed an issue where the probe configuration window opened directly to the field mapping tab.
Fixed issues with the field mapping window where a duplicate service desk field mapping was added if an existing mapping
was opened for editing and the source alarm field was updated.
Corrected handling of last_incidentcheck_timestamp. Previously, if the probe could not find a ticket closed by the alarm, it
skipped the last_incidentcheck_timestamp update. Additionally, if last_incidentcheck_timestamp was present in the
configuration file, the probe generated an unrealistic future date to fetch closed incidents.
September
2012
1.10
June 2012
1.01
August
2011
Hardware Requirements
The sngtw probe must be installed on systems with the following minimum resources:
Memory: 2-4 GB of RAM. The OOB configuration of the probe requires 256 MB of RAM.
CPU: 3-GHz dual-core processor, 32-bit or 64-bit
Software Requirements
The sngtw probe requires the following software environment:
Nimsoft Monitor Server 7.6 or CA Unified Infrastructure Management 8.0 or later
Robot 7.6 or later (recommended)
Java JRE version 6 or later (for Admin Console only)
Incident Management process of the ServiceNow platform
UMP 6.6.0 for assigning alarms and communication between the UMP and ServiceNow platform
Revision History
This table describes the history of probe updates.
Version
Description
State
Date
2.26
What's new:
GA
October
2015
GA
August
2015
GA
July 2015
What's new:
The Component User-Defined Property field now persists with rediscovery.
IP addresses that incorrectly appeared in hex now appear as IP addresses.
Corrected issues with metric families that caused device discovery to poll too many OIDs.
Corrected issues where changes to metric families caused template upgrades to fail.
Corrected an issue where agent restarts caused the probe to incorrectly calculate metrics with delta values.
2.21
What's new:
The custom properties on device profiles now appear correctly in the probe configuration GUI.
The default templates are now read-only and inactive. Activate these templates through the template editor to apply
the default metric settings to your devices.
2.2
What's new:
GA
June 2015
GA
April 2015
GA
March
2015
GA
December
2014
Added the ability for users to define their own custom properties on device profiles. Use these properties to further
refine the rules for applying a template filter.
Converted metrics from polled to nonpolled so they no longer appear incorrectly in monitoring templates.
ManagementRedundancyRole and LastRestartReason in the Server Statistics metric family
SourceIpPort, DestinationIpPort, IpProtocol, EtherType, and MediaRingType in the System Ip Policer metric family
PolId and PolPriority in the Acl Stats metric family
2.11
What's new:
Added the ability to set a fixed time for data collection within the polling interval. The RandomFixedSchedule
configuration option is in the pollagent fh.conf file (<Nimsoft>\probes\network\pollagent\conf). The default probe
behavior is to collect data at a random point within the interval. When RandomFixedSchedule=true, data collection
occurs at a fixed time.
The probe now publishes data in USM by host name. Previously, the default behavior was to publish data by IP
address. You can disable this behavior in the probe Raw Configure setup options.
The device Availability metric allows you to determine if the device was available within a polling interval. The metric
value is 100 percent if the device was operating the entire time since last polling cycle, or 0 percent if the device
stopped operating for some time since the last polling cycle. This fix corrects the known issue, "The Availability
metric value is either 100 or No Value (null)".
Corrected an issue with deprecated metric families that was causing template migrations from v2.0 to v2.1 to fail.
Updated metrics in the following metric families:
DNS metric family has the CacheHitRatio metric
System Session Information metric family has the SessionRate1Min metric
License Management metric family contains the TotalLicenses and UsedLicenses metrics
Updated memory and disk requirements for v2.1 snmpcollector and pollagent.
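As a sketch, the RandomFixedSchedule option described above might appear in fh.conf as follows; the release notes do not show the rest of the file, so the placement within it is an assumption:

```
# <Nimsoft>/probes/network/pollagent/conf/fh.conf
# Assumption: the key sits at the top level of the file.
# true  = collect at a fixed point in each polling interval
# false = collect at a random point (default behavior)
RandomFixedSchedule=true
```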
2.1
What's new:
Created a Self-Certification portlet in USM. The portlet is a wizard that allows you to add new device and MIB OID
support to the snmpcollector probe. Use this tool if existing device metrics are not sufficient, or you have an
unsupported SNMP enabled device. For more information, see SNMP Device Self-Certification.
Added support for multiple SNMP Credentials for CA SystemEDGE devices.
Reduced probe memory requirements.
Created a probe meta-package to install snmpcollector and its dependent packages.
Deprecated the Alternate Interface, Cisco UCS Switch Fan, Cisco UCS Switch Power Supply, and Environmental
Sensor Temperature Status metric families. The deprecated metric families contained metrics that were duplicated
in the Interface, Fan, Power Supply, and Temperature metric families.
Alarms are generated when thresholds are set within the Reachability metric family.
Corrected an issue where duplicate interfaces appear due to the grouping of devices in the probe inventory.
2.0
What's new:
Created the snmpcollector Device Support tool. This website allows you to view supported SNMP devices, object
identifiers (OID), vendors, vendor certifications, metric families, and associated metrics. For more information, see snmpcollector Device Support.
Added the ability to create monitoring configuration templates. These templates allow you to apply consistent
monitoring configurations across multiple devices by using filters.
Added options to modify the polling frequency on devices, interfaces, and other components with template filters.
Added a default monitoring configuration template that supports At a Glance Reports.
Added discovery filters to control the list of devices that are retrieved from the discovery server. Discovery filters
govern the available monitoring targets that appear under the Profiles node.
The following settings are available in the probe configuration GUI:
Configure the SNMP port number in a device profile.
Override interface network speed.
Configure the polling frequency on a device, network interface, or component.
Removed the Metric Families node from the probe configuration GUI.
View raw interface table attributes when you select the Interfaces category for a discovered device. Attributes
include ifIndex, ifName, ifDescr, ifPhysAddress, ifAlias, ifType, ifAdminStatus, ifOperStatus, and ifMtu.
In UIM, added the ability to view the QoS metrics by interface.
1.61
What's new:
GA, July 2014
GA, June 2014
GA, May 2014
GA, April 2014
GA, March 2014
GA, March 2014
GA, February 2014
GA, January 2014
Metrics with a RollupStrategy of sum are now calculated as a rate over time since the last polling cycle.
The probe now supports floating-point numbers on robots where decimal points are represented as commas.
Added more metric families.
Added support for dynamic indexes to the Response Path Test ICMP, Response Path with ICMP Jitter, and Response Path with Jitter metric families. The dynamic index MIBs are cisco-ipsla-ethernet, cisco-rttmon, and cisco-rttmon-icmp. This change was added to support Cisco network device monitoring tests. In these tests, new rows are periodically created to store data with a new timestamp in the index.
Fixed an issue where changes to a component threshold value resulted in the removal of all other threshold values.
1.6
What's new:
Added a servlet to view the internal status of the pollagent probe. The URL to access the servlet is <robotSystem>:9715/Dump/Help.
Added automatic device rediscovery when an index shift occurs.
Added functionality for monitoring probe self-health.
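As a quick check of the status servlet mentioned above, the URL can be assembled and fetched from a shell; the ROBOT hostname here is a placeholder you would substitute:

```shell
# Query the pollagent internal-status servlet (v1.6 and later).
# ROBOT is a placeholder -- substitute the robot system's hostname.
ROBOT="myrobot.example.com"
URL="http://${ROBOT}:9715/Dump/Help"
echo "$URL"
# curl -s "$URL"    # uncomment to fetch the status page
```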
1.42
What's new:
Added the ability to override the default speed in and out settings for interface devices. For more information, see the
updated Known Issues and Workarounds section.
Fixed an issue with the calculation of metric counter values.
1.41
What's new:
Fixed an issue where large values for the QoS metrics were calculated incorrectly.
Fixed an issue with inventory corruption when loading USM devices.
Removed the obsolete metric family Generic System CA. Metrics were moved into specific device metric families.
1.4
What's new:
Support for SNMP AES-192 and AES-256 privacy protocols.
Support for polling 50,000 interfaces every 5 minutes.
Added full support for Juniper devices.
Improved handling of index shift for router interfaces.
Improvements in performance for Discovery Server and detection of device components.
Corrected issues with invalid or duplicate device names appearing after rediscovery.
Updated memory and disk requirements, software requirements, and supported platforms.
Unsupported metrics no longer appear for device components. Previously, unsupported metrics would appear as
configurable but never collect data.
1.33
What's new:
Updated for use with SnapCA Unified Infrastructure Management 7.5.
Fixed an issue with QOS_pctDiscardsOut not appearing in USM.
Fixed an issue where SNMP V3 devices with AES encryption would appear in USM without default monitors.
1.32
What's new:
Minor bug fixes and performance enhancements.
Name changes for System Management Info-Router\Director and Anti-Virus Info metric families.
1.3
What's new:
Support for SNMP V3 3-DES and AES-128 privacy protocols.
Support for CPU and physical memory metric families on Juniper devices.
Added the NMS RULE AdminStatus Down rule to filter interfaces that are administratively down.
Updated memory and disk requirements, software requirements, and supported platforms.
1.2
What's new:
GA, December 2013
GA, September 2013
What's new:
Improvements in performance, especially discovery. Usability improvements to UI.
Added the ability to monitor Cisco devices, similar to the functionality of the cisco_monitor probe.
Added the ability to set rules for monitoring configurations. These rules allow you to set different thresholds that are
based on nonpolled configuration information (for example, based on ifType).
1.01
1.0
More information:
Install snmpcollector
snmpcollector Known Issues and Workarounds
snmpcollector (SNMP Data Monitoring)
Contents
Note: The provided values are based on the probe collecting 400 metrics per device. If you collect this number of metrics from more devices, allocate additional memory.
Value        snmpcollector Xmx / Disk    pollagent Xmx / Disk
10,000       1.5 GB / 4 GB               1 GB / 6 GB
50,000       4 GB / 8 GB                 3 GB / 12 GB
100,000      6 GB / 12 GB                4 GB / 24 GB
500,000      18 GB / 28 GB               16 GB / 40 GB
1,100,000    28 GB / 36 GB               24 GB / 56 GB
Value        snmpcollector Xmx / Disk    pollagent Xmx / Disk
1000         2 GB / 4 GB                 4 GB / 6 GB
5000         4 GB / 8 GB                 8 GB / 12 GB
10,000       6 GB / 12 GB                12 GB / 24 GB
20,000       8 GB / 18 GB                16 GB / 40 GB
50,000       20 GB / 26 GB               16 GB / 56 GB
Value        snmpcollector Xmx / Disk    pollagent Xmx / Disk
1000         1 GB / 2 GB                 2 GB / 3 GB
5,000        2 GB / 4 GB                 4 GB / 6 GB
10,000       3 GB / 6 GB                 6 GB / 12 GB
20,000       4 GB / 9 GB                 8 GB / 20 GB
50,000       6 GB / 13 GB                13 GB / 28 GB
Value        snmpcollector Xmx / Disk    pollagent Xmx / Disk
1000         1 GB / 2 GB                 2 GB / 3 GB
5,000        2 GB / 8 GB                 4 GB / 12 GB
10,000       3 GB / 12 GB                6 GB / 24 GB
20,000       4 GB / 20 GB                8 GB / 40 GB
Do not assign the same priority to more than one rule. If you do so, the existing rule with that priority is overwritten (deleted).
Data Storage
Depending on your configuration, the snmpcollector probe can generate a large amount of data. For example, just one probe collecting 500,000 metrics every 5 minutes for a week can fill a hundred gigabytes of database space. Allocate and maintain sufficient space in your database.
Install snmpcollector
This article contains recommendations and considerations for installing the snmpcollector probe software.
Contents
Installation Considerations
Performance and Scalability Considerations
Install snmpcollector
Adjust Probe Memory Settings
pollagent Memory
snmpcollector Memory
Hub Configuration
Installation Considerations
Consider the following information before you install the probe:
The recommended way to install the snmpcollector probe for a production environment is with the meta-package. The meta-package
installs the snmpcollector probe with all software packages that are required to enable all the snmpcollector probe features.
Note: If required by your organization, you can still install each of the probe's required software components individually. Download the probes from the internet archive into your local archive and then install the packages on the appropriate hub. For more information about required software components, see snmpcollector Software Requirements.
The minimum software packages that are required to run snmpcollector to generate only QoS data are:
On the primary hub (CA Unified Infrastructure Management server):
You might require updates to ci_defn_pack, mps_language_pack, and wasp_language_pack
The following minimum probe packages in the local archive on a hub:
snmpcollector
pollagent
ppm (automatically installs the required probes prediction_engine and baseline_engine)
If you want to enable alarms, install on a remote (not primary) hub:
prediction_engine
baseline_engine
NAS to install the alarm_enrichment package
Warning! These packages are required to enable alarm functionality in the snmpcollector probe. Do not uninstall
prediction_engine and baseline_engine once installation is complete.
The threshold alarm options only appear in the probe configuration GUI if you install these probes. We recommend that you install these probes. They provide useful features such as Time to Threshold and Time Over Threshold alarms.
On Linux/Unix systems, the /etc/hosts file should contain an entry with the FQDN for the installation system.
Avoid installing snmpcollector on a hub with other probes that can also consume a large amount of system resources. Some examples of these probes are vmware, icmp, and ibmvm.
Use filters as much as possible rather than creating multiple templates. Filters let you control how the probe applies monitors based on
attributes of the target device. Every template that you create is read separately by the probe. The probe uses a large amount of system
resources to read each template.
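For the /etc/hosts requirement above, a minimal sketch follows; the address and hostnames are placeholders, not values from these notes:

```
# /etc/hosts on the installation system
127.0.0.1    localhost
192.0.2.10   uimhub01.example.com   uimhub01
```

The first name after the address should be the fully qualified domain name, followed by any short aliases.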
Install snmpcollector
Follow these steps:
1. Review the installation considerations and performance and scalability considerations. These sections contain important information
about where and how to deploy the snmpcollector probe and information about alternative installation options.
2. Verify that the local archive has the packages for all the minimum probe versions. For specific information about your snmpcollector
version, see snmpcollector Software Requirements.
3. Verify that you installed any required mps_language_packs, ci_defn_packs, or wasp_language_packs.
4. Restart nis_server and wasp.
5. (Optional) Import the probe meta-package (snmpcollector_hub_metapackage.zip) into the local archive.
6. Choose one of the following options:
(Versions 2.1 and later) Deploy the probe meta-package on the appropriate hub. Use the meta-package to install the current version
for snmpcollector probe packages that exist in the local archive. The meta-package installs the snmpcollector probe with all the
software packages required to enable all the snmpcollector probe features. Do not use the meta-package to install individual software
components. The meta-package installation fails if all the packages are not present in the local archive.
pollagent Memory
Follow these steps:
1. In Infrastructure Manager, select and right-click on pollagent.
2. Select Edit from the menu.
3. In the Arguments box, change the number in -Xmx1536m. The number following Xmx is the maximum amount of memory in megabytes
consumed by the probe.
4. Click OK.
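As an illustration of step 3, raising the pollagent maximum heap from the default 1536 MB to 2 GB changes only the -Xmx value; any other arguments in the box stay as they are:

```
Before:  -Xmx1536m
After:   -Xmx2048m
```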
snmpcollector Memory
Follow these steps:
1. Enter the Raw Configure menu for the snmpcollector probe.
2. Select the startup node.
3. Select options.
4. Change the number in -Xmx1024m. The number following Xmx is the maximum amount of memory in megabytes consumed by the
probe.
5. Click Apply.
Hub Configuration
You set up queues on the hub to transfer snmpcollector data as part of the CA Unified Infrastructure Management configuration process. The
snmpcollector probe can send a large amount of data through these queues.
If the size of a get or post queue never shrinks to zero or if it always has many messages, increase the Bulk Size on the queue. The bulk size
setting allows the hub to transfer multiple messages in one packet. For more information about hub configuration, see the appropriate guide for
your hub version.
(snmpcollector v2.x and later) If NAS is installed on a remote (not primary) hub:
Set up queues on the remote hub to post data to the primary hub. The remote hub requires a queue for QOS_MESSAGE,
QOS_DEFINITION, QOS_BASELINE, and a queue for probe_discovery to export messages.
Configure NAS forwarding as "All alarm events in both directions" on the remote hub. The destination is the primary hub.
Upgrade snmpcollector
This article provides information about upgrading snmpcollector.
Contents
Upgrade Considerations
Upgrades with Templates
Memory and Disk Settings
Performance Reports Designer Data
Migration from 1.x to 2.x
Version 2.x Migrating Deprecated Metric Family Configuration
Upgrade Environment
Upgrade the Probe
Upgrade Considerations
Review the following considerations before you upgrade the snmpcollector probe or pollagent.
Note: You can disable the publishing of data by host name in the probe Raw Configure setup options.
There is no migration path built into v2.0. Upgrading from a 1.x version of snmpcollector to v2.x or later requires a reinstallation of the probe. Existing data is lost during this process. Before you install v2.x, remove the current 1.x version and existing configuration.
Uninstall the current version of snmpcollector and pollagent.
Clear the associated probe directories:
<Nimsoft>/probes/network/snmpcollector
<Nimsoft>/probes/network/pollagent
Use Infrastructure Manager to upgrade the probe on the remote hub.
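The directory cleanup above can be sketched as shell commands on Linux/UNIX. NIMSOFT_HOME is an assumed variable for the installation root (often /opt/nimsoft); verify both paths before deleting anything:

```shell
# Assumption: NIMSOFT_HOME points at the UIM/Nimsoft installation root.
NIMSOFT_HOME="${NIMSOFT_HOME:-/opt/nimsoft}"

# Remove the probe directories left behind by the 1.x installation.
rm -rf "${NIMSOFT_HOME}/probes/network/snmpcollector"
rm -rf "${NIMSOFT_HOME}/probes/network/pollagent"
```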
Alternate Interface
Interface
Fan
Power Supply
Temperature
Upgrade Environment
Deactivate snmpcollector and pollagent when you upgrade the probes. This will minimize possible issues related to the corruption of the probe
configuration.
to monitored devices.
Issue:
The speed for high speed interfaces (> 4 Gbits/sec) that appears on the interfaces tab in USM is incorrect.
Workaround:
Do not use the value for interface speed. This number is an approximate static maximum value.
Issue:
The various threshold settings for a metric are used to generate baseline data in USM. The scale of the monitoring environment has a
direct impact on the calculation of baseline data. In larger scale environments, it could take significantly longer for the data to appear in
USM.
Workaround:
Only enable the threshold settings that are necessary. No workaround exists at this time.
Issue:
The discovery query can take a long time to complete if there is a discovery filter with a subnet mask.
Workaround:
It can take some time for the devices to load if the discovery scope is for a large number of devices. Entering range scopes larger than
65,536 addresses (subnets greater than /16) impacts discovery performance. Wait for the discovery process to complete.
To verify the status of the device import, select the snmpcollector node, and view the Profile Import Status field.
To verify the status of subcomponent discovery, select snmpcollector > Profiles > profile name >device name, and view the Component
Discovery field.
Avoid changing an existing monitoring template. Create a new template (which uses the default values) and apply the new template to the devices.
Issue:
In version 1.0, discovery can take longer than expected. The snmpcollector version 1.1 probe greatly improves performance of discovery.
Workaround:
To verify discovery progress:
Version 1.0
In the probe GUI, click the SNMP Collector Probe node in the tree and view the Discovery Status field.
Version 1.1 and later
In the probe GUI, click a device name in the tree and view the Component Discovery field.
Issue:
I see an I/O communication error in the snmpcollector.log file with the following message:
Note: SNMPv3 requires that your network community strings are at least eight characters long.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Compatibility
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.90
Fixed Defects:
GA, October 2015
GA, May 2015
For new hosts, the probe applied the last used timeout value instead of the default timeout value. Salesforce case
00164867
The probe sent alarms for individual samples even before the required number of samples were collected. Salesforce
case 00168336
Note: The samples are collected when a profile calculates the average value of the monitored metric.
The probe configuration interface displayed incorrect values when a numeric value was extracted from a string value. Salesforce case 70002968
1.89
What's New:
Added feature to modify the host not found error message and the host found clear message from the Setup window.
1.88
Fixed Defects:
April 2015
The probe did not display metrics for some OIDs until all the MIB files were deleted from the MIBS directory. Salesforce
case 00142442
Device Id was not displayed for the Agent Not Responding alarm, due to which maintenance mode was unable to suppress the SNMP failure alarm. Salesforce case 00142524
1.87
Fixed Defects:
October 2014
Fixed a defect where the user was not able to specify any limit for log file size. Salesforce case 00144902
1.86
Fixed Defects:
July 2014
Implemented the regex support for comparing string values on IM probe GUI and displaying appropriate color of the
counter. This option works only with = and != operators. Salesforce case 00122120
1.85
Fixed Defects:
April 2014
When the Threshold was set as , the expected result for operator != must be . The probe was issuing an alarm even with the expected result.
While configuring an OID for timetick, a runtime error was thrown if the Enable Monitoring check box was unchecked after entering an invalid threshold value.
1.84
Fixed the issue of SNMP V3 authentication where you must restart the probe for applying the SNMP V3 authentication
credentials. Now, the updated SNMP V3 credentials take effect without restarting the probe.
January 2014
Fixed the issue where the probe GUI showed a duplicate OID. The probe was not able to validate that a static OID was already configured in the profile when the same OID was configured again through a template. Now, the probe shows a duplicate OID error message.
1.83
June 2013
Fixed a defect where value was wrong on GUI (value was always "0").
1.82
February 2013
1.81
June 2012
Fixed the issue of appending "?" for wrong OIDs. Fixed runtime error "424".
Fixed: the value for templates with "regex to numeric value" enabled was always 0.
Fixed incorrect timeticks value in QoS.
1.80
Added new callback to return all active profiles and their active children.
June 2012
Reverted to the Robot address as the default QoS source (as used pre-1.66).
September 2011
1.66
Provided option for selecting different QoS Source and Alarm Source.
August 2011
1.65
Rebuilt with latest versions of snmp libraries. (v1.63/v1.64 had problems with SNMPv3 (authNoPriv, authPriv) on 32 bit
Linux systems).
May 2011
1.64
March 2011
1.63
February 2011
1.62
January 2011
January 2011
June 2010
1.51
May 2010
1.50
April 2010
1.41
Fixed "Variable has bad type" value displayed in GUI (and fetched by probe) when OID is actually missing.
January 2010
Added fix to copy all OID parameters while dragging from template to profiles.
October 2009
1.39
Fixed error while reading new QoS definition which contains has_max value.
September 2009
1.38
August 2009
Note: Previously the group was hardcoded to use QOS_SNMP_VARIABLE even if a group value was specified by the user. Now the group uses the value specified by the user in the GUI.
1.37
July 2009
1.36
The OID available alarm clear is sent only when OID missing alarm is previously raised.
July 2009
Note: Upgrade from version 1.33, 1.34 and 1.35 to 1.36 will change the QoS target from "profile.oid" to "profile.oid name" OR
"oid" to "oid name".
1.35
Added fix to save encrypted community string if 'Encrypt community string' option is selected.
June 2009
1.34
June 2009
1.33
June 2009
October 2007
1.22
August 2007
1.12
February 2007
December 2006
Fixed issue regarding "large" Counter32 values not being sent correctly to QoS db.
1.10
Fixed issue regarding dashboards (and Probe Utility) not working correctly when communicating with probe.
March 2006
1.09
December 2005
Added possibility to set a user defined message string when a CLEAR message is sent.
Added between and not between threshold operators.
Fixed issue related to incorrect status of device without oids.
Added possibility for changing of Subsystem ID.
Fixed issue related to QoS not being sent.
Fixed various UI issues.
1.07
Fixed issue with many concurrently executing profiles and source name-resolution.
November 2005
Fixed issue with OID template override. Fixed various minor UI problems.
Added variable monitor to UI to ease configuration and device troubleshooting.
Compatibility
Probes that support SNMP on Linux (interface_traffic, snmptd, and snmpget) use an SNMP library. This library can cause newer Linux systems to
issue the following message in the Linux console log:
The SNMP library supports older versions of glibc which require the SO_BSDCOMPAT flag for sockets to work correctly. The network section of
the glibc library sends this message. The message shows that an unsupported flag is being sent to the setsockopt function. The library ignores
the SO_BSDCOMPAT flag, so you can also ignore it.
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.34
What's New:
GA, September 2015
1.33
What's New:
GA, April 2015
GA, February 2015
Fixed Defects:
Fixed the issue of trailing .0 at the end of the Variable Binds in SNMP Traps generated by the probe. Salesforce case
00151851
1.31
Fixed Defects:
June 2014
Fixed a defect where the log file size is not user-configurable. Salesforce case 131654
1.30
May 2014
1.22
Fixed the issue showing incorrect source detail while receiving trap in third party Trap Receiver.
August 2012
1.21
Fixed a defect in reading user tags (if present) from the udata section of alarm messages.
December 2010
1.20
September 2010
Fixed the issue of nas IP being sent as alarm source in case of repost messages.
Fixed the issue of probe not sending traps on clear severity alarms.
Fixed the issue of probe not working for post_message.
1.07
Initial release.
May 2010
December 2009
April 2005
Known Issues
The known issues of the probe are:
The snmpgtw version 1.11 and later supports sending traps with trap_type 0. Hence, all the pre-configured profiles with traps mapped to '0' start sending traps with trap_type 0. If traps with trap_type 0 are not required, modify the configured profiles manually. See the respective configuration article for details.
snmptd (Simple Network Management Protocol Trap Daemon Monitoring) Release Notes
The Simple Network Management Protocol Trap Daemon Monitoring (snmptd) probe enables you to receive SNMP trap messages from other
monitoring tools. Based on these messages, you can generate alarms using the probe.
The probe acts as a gateway from the SNMP environment to CA Unified Infrastructure Management (CA UIM). It also converts SNMP-TRAPs to
CA UIM alarm messages. Network devices, such as routers, switches, and bridges are SNMP driven. The devices report error conditions in the
form of SNMP-TRAP, which are sent to a directed UDP port (Default - 162) in the network. The SNMP-TRAPs can be sent to a management
station such as HP OpenView Network Node Manager. The probe listens to the specified port and converts the incoming traps according to the
defined profiles. The probe monitors the incoming trap messages from other monitoring tools and converts them to CA UIM messages.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the snmptd probe.
Version
Description
State
Date
3.18
Fixed Defects:
GA, October 2015
GA, August 2015
GA, May 2015
When creating a profile, the probe did not retrieve the specific trap number for a vmware trap. Salesforce case
70003379
The probe displayed only the default message text in profiles if multiple profiles were created simultaneously. This
situation occurs when the Enhanced MIB Parsing feature is enabled. Salesforce case 00170395
The probe assigned the default severity to new traps when the Enhanced MIB Parsing feature was enabled. Salesforce case 00166577
The probe GUI became unresponsive when performing any action such as saving the probe configuration. Salesforce
cases 00162781, 00156506
Updated Known Issues and Workarounds for an issue where active profiles are not displayed in the probe configuration
interface. Salesforce case 00125427
3.17
What's New:
Added ability to add PDU variables from traps to profiles automatically.
For more information, see the General Setup Window section in the v3.1 snmptd IM GUI Reference article.
Fixed Defect:
Multiple trap profiles cannot be created from the same MIB file with different enterprise identifiers. Salesforce cases:
00163170, 00161291, 00161615
For more information, see the Known Issues and Workarounds section.
3.16
What's New:
Added support for pure IPv6.
Fixed Defect:
If the alarm suppression key is too long, the probe restarts every time that a trap is captured. Salesforce case 00153961
3.15
Fixed Defects:
April 2015
Probe was unable to create more than one new profile for CISCO-BGP4 from the MIB Trap Browser. Salesforce case 00158238
The IF-MIB:linklUp was displayed under NS-ROOT-MIB in the probe when importing NS-ROOT-MIB.txt from the MIB
Setup Wizard. Salesforce case 00150607
Probe did not display the QoS on number of traps if the probe configuration was saved until a new trap is generated. Salesforce case 00153813
Default alarm was not generated if alarm text in PDU is blank. Salesforce cases: 00155958, 00156323
Probe did not start if Enhance MIB Parsing was enabled. Salesforce cases: 00154624, 00156648
The SNMP v2 trap profiles in folders moved to the SNMP v2 Traps Unknown MIB folder if Enhance MIB Parsing was
enabled and probe was restarted. Salesforce case 00157560
3.14
Fixed Defects:
February 2015
sysUpTime and snmpTrapOID were included as numbered variables. Now they are visible as configurable variables for
v2 and v3 traps. Salesforce case 00127360
All the trap profiles with the same name but associated with different MIBs are not displayed. Salesforce case 00133041
MIB Trap Browser does not display modules or traps if Enhance MIB Parsing is selected or deselected in General Setup
and probe is restarted from within the snmptd interface. Salesforce cases: 00141428, 00151195
Alarms for traps have modified and garbled variable values when Remove Double Quotes is selected in General Setup
when deployed on the Community Enterprise Operating System (CentOS). Salesforce cases: 00138829, 00147481
Probe converts only the alphanumeric characters and not all ASCII values. Salesforce cases: 00148310, 00149058,
00149872
3.13
Fixed a defect where new trap profiles were displayed in different groups after saving the probe configuration and
restart. Salesforce case 00138814
October 2014
Fixed a defect where trap variables were displayed in hexadecimal. Salesforce case 00138835
Fixed a defect where probe GUI was not working while adding similar Engine ID with different case due to case
sensitivity. Salesforce case 00141346
3.12
Fixed Defects:
July 2014
Added default varbinds for SNMP V2 profiles, which are created from the MIB trap browser. Salesforce case 00127360
Fixed the issue of incorrect parsing of IP address when saving the IP address in a String type variable. Salesforce case
00130094
Fixed the issue of displaying the IP address of the system instead of hostname of the source. Salesforce case 00116197
3.11
Fixed an issue in the SNMP Trap Monitor dialog where alarms from the snmptd probe displayed IP address instead of
hostname for the Hostname field.
April 2014
For SNMP traps, the Rename functionality available on right-clicking a profile is removed. If you must rename a profile, use the Edit option.
3.10
What's New
January 2014
A customized way to read MIBs that enables auto-populating the severity and message text for traps of supported MIBs. By default, this feature is turned off.
Fixed Defects
Fixed a defect related to the import of trap descriptions.
Fixed a defect related to garbled text in alarms.
3.03
October 2013
3.02
September 2013
June 2013
3.00
Added a feature to prepopulate the OIDs to be monitored for unidentified received traps while creating the profile.
May 2013
Added a feature to use the source address or the agent address to associate a trap to a device.
Merged the cim_traps profiles and dom_traps profiles with the snmptd probe.
Added a feature to substitute value codes with predefined value meanings, that is, to evaluate the varbinds.
Added a feature to load/compile a trap MIB (and its MIB dependencies). Added the profile that is based on the trap definition in the MIB.
Added a feature to interpret Compaq Insight Manager traps correctly.
Added a variable $MIB_DESCR to provide the actual trap description that is defined in the MIB.
2.14
July 2012
Rebuilt with latest versions of snmp libraries. (v2.10 had problems with SNMPv3 (authNoPriv, authPriv) on 32-bit Linux
systems).
March 2011
2.00
December 2010
Fixed the crash issue due to variable 0 defined in variable rules of any profile.
Fixed the issue of MIB loading errors in the net-snmp library.
October 2010
Added fix to unlock the config file while unloading the configurator tool.
Added the fix in the PDU variable window to avoid zero in 'variable' field.
Added the fix in the trap details window to display all characters in the variable.
Added code to set a trap name as the profile name while creating a profile.
1.91
Added the missing "Remove double-quotes" functionality for string variables in alarm messages.
October 2010
1.90
October 2010
Added fix to provide the option to remove double-quotes from variable values.
Added fix to support regular expressions in variable rules.
Added fix to send OID information in alarm messages.
Added support to log errors while parsing MIBs.
1.80
Added support for the generic profile (to trap all messages).
June 2010
May 2010
1.70
May 2010
May 2010
Fixed the suppression key in cases where a custom suppression key is specified and PDU variable matching is
performed with "process all rules" turned off. Probe no longer appends the variable number to the suppkey on a PDU
variable rule match. If "process all rules" is turned on, there is no change (variable number is appended to suppkey on
the match).
May 2010
To revert to pre-v1.65 suppression key behavior when using PDU Variables Rules with the custom suppression key,
create key /setup/pre165suppkeys = 1 using raw configure.
Fixed the suppression key when no variable rules are defined and no custom suppkey is specified. Previously the probe
sent NULL as the suppression key when using the described settings.
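The suppression-key fixes above can be modeled in a short sketch. The function name and the "default" fallback value are our illustration, not the probe's actual implementation:

```python
from typing import Optional

def suppression_key(custom_key: Optional[str], var_number: Optional[int],
                    process_all_rules: bool) -> str:
    """Illustrative model of the v1.65-v1.70 suppression-key behavior."""
    if custom_key is None:
        # No variable rules and no custom suppkey: emit a non-NULL
        # default key instead of NULL (one of the v1.70 fixes).
        return "default"
    if var_number is not None and process_all_rules:
        # With "process all rules" turned on, the matched PDU variable
        # number is still appended to the custom key.
        return "%s.%d" % (custom_key, var_number)
    # With "process all rules" turned off, the custom key is unchanged.
    return custom_key

print(suppression_key("disk_alarm", 2, True))   # disk_alarm.2
print(suppression_key("disk_alarm", 2, False))  # disk_alarm
```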
1.64
Fixed threshold checking of PDU variables (no translation of enums before processing).
May 2010
1.62
December
2009
1.61
September
2009
Added the translation of PDU variables from integers to enumerated string values in alarm messages. For example,
DiskStatus is defined in a MIB as 1=ok, 2=failed, 3=other. "Disk Status is 1" is now "Disk Status is ok".
Fixed logical grouping of SNMPv2 traps (groups SNMPv2 traps on the MIB module that defined the trap).
Allowed multiselect of the MIB files when adding MIBs using the MIB Setup Wizard.
Added support for Counter64 in the varbinds.
Fixed various minor issues in the GUI. Fixed $C (community) variable for use in the alarm messages.
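The enumeration translation in the 1.61 changes above amounts to a lookup with a fallback to the raw integer; a minimal sketch, where the map is a hypothetical MIB-derived example:

```python
# Hypothetical enumeration as it might be parsed from a MIB definition.
DISK_STATUS = {1: "ok", 2: "failed", 3: "other"}

def translate_enum(value, enums):
    # Fall back to the raw integer when no enumerated meaning matches.
    return enums.get(value, str(value))

print("Disk Status is " + translate_enum(1, DISK_STATUS))  # Disk Status is ok
```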
1.55
March
2009
1.53
Fixed trap source decoding. snmp_pdu transport_data_length changed after update of nimsnmp.
July 2008
1.52
June 2008
1.51
Fixes:
December
2007
The IP mask entered for host deny could be interpreted as numeric, which caused a crash of the configurator.
Fixed the configurator hang for specific traps when creating a profile from the trap monitor.
The Nimbus SNMP-TRAP is now posted only if the convert flag is set.
Added the ability to delete MIB files.
The default message was always present. Added a check box for not sending the default message if there is no match on PDU variable
rules.
Features added:
In the SNMP monitor, selected entries can be copied to the clipboard with Ctrl+C or from the right-click menu.
Ctrl+A selects all. QoS added.
The source is the IP, and the target is the trap name or OID and the specific trap number if available (enterprise-specific with the OID of the
enterprise and the specific trap type; version 2 with the trap object identifier). A QoS is the number of times a specific trap has
been received during the interval. The default interval is 1 minute and can be set in the general setup.
The snmptd probe can now listen on multiple ports.
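The QoS described above is a per-interval counter keyed by source and trap; a minimal sketch, in which the class and method names are ours rather than the probe's:

```python
from collections import Counter

class TrapQoSCounter:
    """Counts how many times each (source_ip, trap_id) pair is
    received during the interval (default 1 minute)."""
    def __init__(self):
        self._counts = Counter()

    def record(self, source_ip, trap_id):
        self._counts[(source_ip, trap_id)] += 1

    def flush(self):
        # Called once per interval to emit QoS samples and reset.
        samples = dict(self._counts)
        self._counts.clear()
        return samples

qos = TrapQoSCounter()
qos.record("10.0.0.5", ".1.3.6.1.4.1.232.0.6001")
qos.record("10.0.0.5", ".1.3.6.1.4.1.232.0.6001")
print(qos.flush())  # {('10.0.0.5', '.1.3.6.1.4.1.232.0.6001'): 2}
```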
Fixed an error on Linux that caused high CPU usage.
Fixed GUI display error of large values for the "Specific trap number" field in the SNMP Trap Monitor window.
Fixed GUI run-time error when specifying numeric community string.
1.43
Added the possibility to "Process all rules" in the PDU variables section. Added support for 20 PDU variables per
trap (previously 10).
Added a right-click menu in the PDU variables section that allows the user to move PDU variables up and down.
Added support for use of variables on the "Message" field in the PDU variable section.
Added support for use of variables in "Source" and "Suppression key" fields.
December
2005
Installation Considerations
Consider the following point when installing the probe:
Ensure that the ports used are free (UDP/162 is the default port). You can issue the netstat -an command to confirm the port
status. For example, if the UDP 0.0.0.0:162 port is present, then some other application, such as HP OpenView or Compaq Insight
Manager, is using this port.
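Besides netstat, the same check can be scripted by attempting to bind the port. This is a sketch, not part of the probe; note that on most systems binding a port below 1024 also requires root, so a free privileged port may still be reported as unavailable:

```python
import socket

def udp_port_available(port, host="0.0.0.0"):
    """Return True if we can bind the UDP port, that is, no other
    application (HP OpenView, Compaq Insight Manager, ...) holds it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((host, port))
    except OSError:
        # Port in use, or insufficient privileges for a port < 1024.
        return False
    finally:
        s.close()
    return True

if __name__ == "__main__":
    print("UDP/162 available:", udp_port_available(162))
```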
Upgrade Considerations
Consider the following point when upgrading the probe:
Upgrading the probe from version 1.9x to version 2.0x or 3.0x
The probe merges the SNMPv3 and SNMPv2 checkpoints on upgrade. If the checkpoints have the same names in the SNMPv2 and
SNMPv3 sections, the SNMPv2 checkpoint configuration takes precedence. As a result, SNMPv2 settings override the SNMPv3 settings.
Contents
Revision History
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.39
Fixed Defects:
GA
October
2015
GA
April 2015
GA
October
2014
The probe was unable to display all the profiles for a group if the group was renamed. Salesforce case 00167300
Note: See Upgrade Considerations section for more information.
The probe returned an OID error if the OID value (string) had double quotes, even after setting the DTA file correctly. Salesforce case 00166165
Updated the Known Issues and Workarounds section in the Release Notes to state that the probe displays the same
ambient temperature of all instances with the same name. Salesforce case 00138234
1.38
Fixed Defects:
Fixed a defect in which incorrect status alarms were generated if a period (.) was not placed at the beginning of the OID.
Salesforce case 00150796
Note: It is recommended to place a period (.) at the beginning of the OID.
Fixed a defect in which QoS returned null values if a dynamic variable (X) was placed in the OID. Salesforce case 00152053
Note: The dynamic variable is only supported at the end of the OID.
Fixed a defect in which the probe crashed when loading pools data. Salesforce case 00155852
Fixed a defect in which the probe crashed on Windows Server 2008 r2. Salesforce case 00144480
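The recommendation to begin the OID with a period can be enforced with a trivial normalization step; a sketch, in which the helper is ours and not part of the probe:

```python
def normalize_oid(oid: str) -> str:
    # The probe expects OIDs such as .1.3.6.1.2.1.1.3.0 to begin
    # with a period; prepend one when it is missing.
    return oid if oid.startswith(".") else "." + oid

print(normalize_oid("1.3.6.1.2.1.1.3.0"))  # .1.3.6.1.2.1.1.3.0
```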
1.37
Fixed Defects:
Fixed a defect in which the probe was restarting in loop when deployed on CentOS 5 and 6. Salesforce case 00127396
Fixed a defect in which the probe was adding multiple duplicate entries on adding new devices for monitoring. Salesforce
case 00120888
1.36
Fixed Defect:
January
2014
The probe was unable to fetch the interfaces (dynamic OIDs) at a port other than the default port (161).
1.32
1.31
Fixed a defect in which the snmptoolkit configuration tool's alarm state was incorrect for some types of checkpoints.
Added a feature to specify wildcard or regex in profile_name value in callback.
Fixed a defect where the probe was adding extraneous carriage returns to the DTA files uploaded by the probe.
January
2013
December
2012
Fixed a defect where the host severity was not properly loaded in the UI.
1.2
June
2012
1.04
December
2011
The snmptoolkit probe should be installed on systems with the following minimum resources:
Memory: 2-4 GB of RAM. The probe's OOB configuration requires 256 MB of RAM.
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Upgrade Considerations
This section lists the upgrade considerations for the snmptoolkit probe.
In the probe version 1.38 and earlier, the probe was unable to display all the profiles for a group, if the group was renamed. When you
upgrade the probe from a previous version to version 1.39, you must delete the profiles (existing in the group that was renamed in the
earlier versions) from the raw configuration section and then create new profiles.
hdb.
See the controller Release Notes for release information about the robot.
Revision History
Version
State
Date
7.80
GA
June 2015
7.70
GA
March 2015
7.63
GA
December 2014
7.62
GA
November 2014
7.60
GA
June 2014
7.05
GA
March 2014
7.10
GA
December 2013
Fixed Defects
If ssl_mode=0 when the spooler starts, the SSL server setup should not execute.
In version 7.70, if ssl_mode=0 when the spooler starts, the SSL server setup executes improperly.
In version 7.80, if ssl_mode=0 when the spooler starts, the SSL server setup does not run. This behavior is consistent with the
behavior of the hub and controller.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of revisions for this probe.
Version
Description
State
Date
1.65
Fixed a defect where the probe did not accept passwords with semicolon (;) while creating an OLEDB connection. Salesforce
case 00159821
GA
December
2015
1.64
Fixed Defects
GA
January
2015
GA
December
2013
Fixed a defect where the probe did not process any NULL values. Salesforce case 00140172
Fixed a defect where the probe did not clear the alarms when the connection was configured as <server IP>,<port>. Salesforce case 00147592
Fixed a defect where the user had to test the connection when editing any profile. A user can now test a query
without testing the connection after editing a profile. Salesforce case 00134063
1.63
Fixed Defects:
Fixed a defect in which the met_id and dev_id did not appear in the case of ODBC.
Fixed a defect in which the probe did not work when the Windows Authentication check box was selected.
1.62
Fixed Defect:
Fixed a defect by removing the extra trailing zeros that appeared in the alarm threshold.
GA
June 2013
1.60
GA
March
2013
1.61
GA
February
2013
GA
August
2012
1.60
Added a callback function to fetch active profile using wild card or regex expressions in profile_name.
Fixed an issue where "datetime" type variables were not properly displayed in alarm messages.
1.53
June 2012
1.52
December
2011
1.51
Fixed an issue where row key column value was not getting expanded for alarm suppression key and message variable.
June 2011
1.41
Fixed an issue where the probe treated the SQL connection timeout as 0, irrespective of the value set in the
config file.
December
2010
September
30, 2010
Fixed a QoS issue where definitions, when configured in the value section, were not sent if no data was returned by the query.
1.40
September
13, 2010
June 2010
1.28
Removed validations for the initial catalog field for database connections
December
30, 2009
1.27
Fixed security token leak by closing the security tokens when they are not required.
December
18, 2009
1.26
Changed the GUI to allow specifying the name for the QoS.
Fixed the issue of QoS definition not sent for all the QoS in the QoS list.
1.25
Fixed an error that occurred on Test connection for OLEDB and ODBC when a data source was specified.
Fixed a problem with the probe restarting for both the Apply and OK actions.
October
2009
August
2009
March
2009
October
2008
1.21
August
2008
1.20
July 2008
December
2007
1.16
September
2007
July 2006
Changed scheduling functionality to prevent the combination of 'run once' and 'exceptions'.
For 'value' checking, all query results can be used in alarm messages, using column names as variables preceded by $
(case sensitive!).
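The $-variable expansion described above can be sketched as a case-sensitive substitution over one query-result row; the regex, fallback behavior for unknown names, and example columns are our assumptions:

```python
import re

def expand_alarm_message(template, row):
    """Replace $ColumnName tokens with values from the query result row.
    Column names are matched case-sensitively; unknown names are kept."""
    def repl(match):
        name = match.group(1)
        return str(row[name]) if name in row else match.group(0)
    return re.sub(r"\$(\w+)", repl, template)

row = {"DbName": "master", "FreeMB": 512}
print(expand_alarm_message("Database $DbName has $FreeMB MB free", row))
# Database master has 512 MB free
```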
1.07
June 2006
April 2006
Known Issues
The probe has the following known issue:
Connection to ODBC is not successful if a semicolon (;) is used in the password.
September
2005
Revision History
Supported Versions
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Setting Permissions
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
4.94
Fixed Defects:
GA
January
2015
GA
October
2015
GA
July 2015
GA
December
2014
Probe was not generating QoS Messages at the specified poll interval. Support case number 246108
Probe was not displaying the value in Bytes when lock_memory, connection_memory, optimizer_memory,
sqlcache_memory, and total_memory checkpoints were executed. Support case number 246026
4.93
Fixed Defects:
Probe was reporting a connection timeout in a time less than that specified in the checkpoint interval. Salesforce case 00154897
Probe was displaying an error message while executing the agent_job_failure checkpoint using Windows authentication. Salesforce case 00155101
4.92
Fixed Defects:
Fixed a defect where the template checkpoint was overriding the static checkpoint. Salesforce case 00165905
Fixed a defect where the QOS was not always coming through Use FQDN As QoS Source. Salesforce case 00144129
Fixed a defect where the remote probe was reporting a connection timeout before the scheduled checkpoint interval. Salesforce case 00154897
Added user setting information for SQL Server 2014. Salesforce case 00162883
4.91
Fixed Defects:
Fixed a defect where the Use Excludes check box is disabled when a new custom checkpoint is created. Salesforce
case 00151079
Fixed a defect where the user_cpu checkpoint returns the CPU usage value greater than 100%. Salesforce case
00147822
4.90
What's New:
Beta
December
2014
GA
January
2014
GA
March
2013
GA
March
2013
Fixed Defects:
On creating a threshold for alarms from the Status tab, the object name did not appear as expected.
On unauthorized query changes for the custom checkpoints, the alarm was generated and the message was repeated
again and again. This defect has been fixed.
The alert message is now generated once and the checkpoint is deactivated as expected.
4.81
4.80
4.72
GA
January
2013
4.71
GA
January
2013
4.71
Updated the mirror state values in the hints section of mirror_state checkpoint.
GA
December
2012
GA
December
2012
4.70
Added a feature in custom checkpoints that allows thresholds to be defined for multiple columns.
Added functionality to set up a schedule for each alarm in custom as well as built-in checkpoints.
Added functionality to set up key specific alarms in custom checkpoints.
Added a new checkpoint (fg_freeSpace_with_avail_disk) which monitors free space in filegroups after considering the
available disk size.
Fixed check_dbalive checkpoint to send value "0" instead of "NULL" as QoS for connection failure.
Fixed an issue related to incorrect calculation for fg_free_space checkpoint.
Fixed an issue related to negative values for the checkpoints fg_free_space and free_space.
Fixed an issue in custom checkpoints where NULL QoS with invalid keys were generated in case of any database-related errors.
Fixed an SOC issue of CM Authentication Failure.
4.6
Fixed an issue where the "Windows Authentication" method let you access the server with or without providing a valid
user account.
GA
September
2012
GA
June 2012
GA
June 2012
GA
March
2012
GA
March
2012
Fixed an issue where the "Windows Authentication" method on the local system connected using the system account rather
than the domain account configured in the connection.
Added a "suppress all alarms" feature so that all alarms can be suppressed.
Added a feature so that a monitoring profile does not run concurrently, and a delay alarm is raised whenever the profile
run is delayed.
Fixed an issue where the profile showed template checkpoints even though the group was selected.
Fixed an issue with the logfile_usage checkpoint, which was reporting incorrect values.
Fixed an issue where the probe crashed if "detect domain automatically" was selected on a machine that is not on a
domain.
4.41
4.40
4.30
Fixed an issue where the probe was failing to pick up the number of samples (overridden) correctly for static checkpoints.
Fixed a logging issue where the probe was incorrectly logging the sqlserver password in plain text.
Fixed an issue where in some cases the probe was failing to return any rows for custom checkpoint queries.
Fixed a QoS V2 compatibility issue; earlier, the probe was not able to send QoS according to the V2 QoS specification.
Fixed an issue in the manual signed stored procedure feature related to permissions.
Fixed an issue where the probe was incorrectly converting metric units (KB, MB, GB, and so on) for some checkpoints.
Fixed an issue where the probe was not able to return any rows in case of complex custom checkpoint queries.
Fixed an issue in the logfile_size and logfile_usage checkpoints where the checkpoints were failing when any database was in
the middle of recovery. The probe now skips the databases that are being recovered until the recovery is complete
and the database is online.
4.22
4.21
GA
December
2011
4.20
SOC support added. Added support for signed stored procedures for standard and custom checkpoint queries. The probe can
run in standard as well as in signed mode. Added new checkpoints mirror_state, mirror_witness_server, and mirror_sqlinstance
for monitoring the database mirroring state, the status of the witness server, and the status of the SQL server instance hosting the mirroring database.
Fixed an issue where the sqlusr_cpu stored procedure was not deleted after executing queries in the case of SQL Server 2000.
Modified the qos_key value for user_cpu checkpoints to avoid a large amount of QoS. Fixed an issue related to the subsystemid
field where subsystemid showed a wrong value. Fixed an issue where the long_jobs checkpoint did not send any alarms. Fixed an
issue where the logic_fragment checkpoint gave a "Lock request time out" error. Fixed a handle leak issue. Added support for
configuring the unit as minutes, hours, and days in the backup_status, transaction_backup_status, and differential_backup_status
checkpoints. Added a new error alarm message that is sent in case of checkpoint query execution failure.
GA
August
2011
GA
April 2011
GA
September
2010
4.11
4.01
4.00
Added a new checkpoint logfile_size for reporting database log file size in MB.
GA
September
2010
Fixed security token leak by closing the security tokens when they are not required.
GA
August
2010
3.13
In case of custom checkpoints, the query password was not always saved properly. Fixed the query password encryption in
GUI.
Note: If any custom checkpoints are deactivated by the probe, those checkpoints will have to be deleted from the GUI and
will have to be added again in the probe.
GA
September
2009
3.11
GA
January
2009
GA
December
2008
GA
October
2008
3.07
3.05
3.03
GA
September
2008
3.02
GA
August
2008
3.00
GA
May 2008
GA
June 2007
GA
November
2006
GA
April 2006
GA
November
2005
Problem with database names containing "-" or other special characters solved.
Support for case-sensitive databases added.
"No response" parameter added to the GUI.
2.12
2.10
2.09
Supported Versions
The sqlserver probe supports the following SQL Server versions:
SQL Server 2005
SQL Server 2008
SQL Server 2008 R2
SQL Server 2012
SQL Server 2014
Notes:
The sqlserver v4.9 probe supports SQL Server 2014.
From April 2013 onward, Microsoft has discontinued support for SQL Server 2000. Therefore, all enhancements
made for sqlserver probe v4.8 or later are not supported for SQL Server 2000.
Installation Considerations
You need to install the probe on a robot to remotely monitor an MS SQL server, or on a PC installed with the client software for SQL Server.
Setting Permissions
For SQL Server versions 2005, 2008, 2012, and 2014, set the VIEW SERVER STATE permission on the master database. Also, map the user for the
following databases:
master
model
msdb
ReportServer
ReportServerTempDB
tempdb
Usually, the default database roles are used to grant access to the users. However, you can edit and grant different roles and permissions to
different users. User mapping is done to GRANT, REVOKE, and DENY permissions for the user in the SQL Server.
Follow these steps:
1. Right-click on the table and go to Properties > Permissions Tab.
2. Select Add and browse for the user to grant permission.
3. Click OK.
4. Select the permission type under the Grant column.
User is mapped for the required permission. User mapping is required for the following tables:
master.sys.databases
master.dbo.sysperfinfo
msdb.dbo.sysjobsteps
msdb.dbo.sysjobs
msdb.dbo.syscategories
msdb.dbo.log_shipping_monitor_secondary
msdb.dbo.log_shipping_monitor_primary
msdb.dbo.sysjobhistory
.sys.database_files
.sys.partitions
.sys.allocation_units
.sys.internal_tables
.sys.filegroups
For Windows authentication, perform the following:
user_cpu
workspace_memory
If you have created the QoS definition for any of these checkpoints under V2 sqlserver probe, you must enable the checkbox "QoS V2
compatibility" in the General section of the probe. This ensures that all data is inserted correctly into the QoS database. If you want to use the V3
format (with the has_max flag), delete the V2-generated QoS definitions for these checkpoints (all data for these checkpoints will be deleted).
Preconfiguration Requirements
Configure Sybase Library Path
Installation Considerations
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.42
Fixed Defects:
GA
October
2014
GA
September
2014
GA
September
2014
Fixed a defect in which the probe was creating core dump files on Solaris platform. Salesforce case 00144984
1.41
What's New:
Added support for configuring the probe through the Admin Console (web-based) GUI.
1.40
What's New:
Added support for TNT2 compliance where Device Id and Metric Id are generated correctly.
Added support for encrypted authentication between the probe and the sybase replication database server for Linux and
Solaris OS.
1.32
Fixed Defects:
GA
January
2014
GA
March
2013
GA
March
2011
The order of keys written in the cfg file was in a fixed sequence. Any deviation from this sequence caused the probe to
skip certain important keys, resulting in improper configuration of the probe. This defect is fixed, and the keys can now be
written in any sequence.
1.30
1.20
1.11
GA
September
2010
1.10
GA
July 2010
1.03
GA
December
2009
1.01
GA
October
2009
1.00
GA
May 2008
Preconfiguration Requirements
Software: Sybase OCS client 15.x, Sybase Replication Server 15.x, and Sybase ASE Server 15.x
Sybase Replication Server: Sybase Replication Server 15.x/Sybase ASE Server 15.x (only on Solaris sparc v9 and Linux 64-bit systems).
Libraries and Variables: Sybase libraries and environment variables must be set in the system PATH. For more information, refer to the Configure
Sybase Library Path section.
Adaptive Server Enterprise 15.7: The property net password encrypt must be set to a value of "2" to enable the encrypted
communication between the probe and the Sybase database server.
Note: Ensure that the Sybase server is running and is accessible from the terminal.
SYBASE_WS: WS-15_0
SYBROOT: /opt/sybase
Sybase library path variables are configured.
Installation Considerations
The probe must be installed on a Linux/Solaris Robot with the Sybase server. For monitoring replication latency, the probe creates tables in both
primary and secondary database. Remove the table RSProbeLatency manually if you are uninstalling the probe.
1. Get the package sybase_rs.zip from Internet Updates.
2. Get a valid license from http://www.nimsoft.no.
3. Install sybase_rs.zip on the system running Sybase Client (use drag-and-drop from the CA Unified Infrastructure Management
Probes Infrastructure Manager Archive).
4. Double-click the sybase_rs probe for the probe configuration.
5. Go to the Connection tab and set up the ASE Server and Replication Server connections.
6. Define and activate at least one monitoring profile.
Important! Delete the utility folder from the installation directory of the probe if you are only upgrading the probe. The utility folder
contains reference links from the previous version.
Note: CA Unified Infrastructure Management 8.0 or later is required for Admin Console GUI.
Known Issues
The known issues of the sybase_rs probe are:
The sybase_rs probe must be configured on either the Infrastructure Manager (IM) GUI or the Admin Console (AC) GUI.
The probe configuration for both the IM GUI and AC GUI is separate. For example, any profile that is created in the IM GUI is not
available on the AC GUI and must be recreated.
You can also configure these profiles to generate alarms and QoS when the specified threshold of an event is breached and identify performance
issues in servers. You can then diagnose and resolve these issues, and take preventive measures to ensure an optimal server run time.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Configure Sybase Library Path
User ID Authorization
ASE Configuration
Migration Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
4.30
What's New:
GA
October
2015
GA
January
2015
GA
September
2014
GA
January 2014
GA
November
2012
Fixed Defects:
Fixed a defect in which the default threshold values for some alarms were incorrect. Salesforce case 00145161
Fixed a defect in which the probe GUI was not displaying the correct custom checkpoint schedules. Salesforce case
00145051
4.20
What's New:
Added support for encrypted authentication between the probe and the sybase database server for Linux and Solaris
OS.
Added support for Windows 2012 R2.
Fixed Defects:
Fixed a defect where DevID and MetID are not generated for alarm and QoS.
4.14
Fixed Defects:
Fixed a defect in which the sybase probe could not display the configuration correctly: schedules were not visible in custom
checkpoints on reopening the probe GUI.
4.11
4.10
Fixed an issue with the database_size checkpoint where it was returning incorrect values when some DB has mixed
devices.
GA
September
2012
GA
May 2011
Fixed an issue where the descriptions of the checkpoints buffer_memory, stp_memory, and total_memory were incorrect.
Added functionality to set up a schedule for each alarm in custom and built-in checkpoints.
Added functionality to set up key-specific alarms in custom checkpoints. Added functionality to add custom
checkpoints.
Added functionality to generate QoS metrics from multiple columns that are returned by a query in custom checkpoints.
Thresholds on multiple columns can be created for custom checkpoints.
Added support for AIX 5.3 64-bit.
Added support for Solaris 64-bit sparcv9.
3.53
Applied temporary workaround to memory leak issue by making the probe restart every midnight.
3.50
Added a checkpoint suspect_pages for reporting if suspect pages are logged for databases.
February
2011
Added a checkpoint agent_job_failure for reporting failed agent jobs within a defined interval.
Added new checkpoints ls_primary_status, ls_secondary_status, ls_primary_time_since_last_backup,
ls_secondary_time_since_last_copy, ls_secondary_time_since_last_restore and ls_secondary_last_restored_latency
checkpoints for monitoring Log Shipping in SQLServer 2005 and above.
Added a callback that can take a wildcard or regex in the profile_name value to fetch active profiles.
Fixed an issue where the active_connection_ratio checkpoint was not working.
Fixed an issue where the checkpoint schedule was running for an extra minute.
3.42
Fixed an issue where the probe was failing to pick up the number of samples (overridden) correctly for static checkpoints.
GA
September
2010
GA
September
2012
GA
August
2010
GA
August
2010
Fixed a logging issue where the probe was incorrectly logging the sqlserver password in plain text.
Fixed an issue where sometimes the probe was failing to return any rows for custom checkpoint queries.
Fixed a QoS V2 compatibility issue; earlier, the probe was not able to send QoS according to the V2 QoS specification.
Fixed an issue in the manual signed stored procedure feature that is related to permissions.
Fixed an issue where the probe was incorrectly converting metric units (such as KB, MB, or GB) for some checkpoints.
Fixed an issue where the probe was not able to return any rows for complex custom checkpoint queries.
Fixed an issue in the logfile_size and logfile_usage checkpoints where the checkpoints were failing when any database was in
the middle of recovery. The probe now skips the databases that are being recovered until the recovery is complete
and the database is online.
3.41
3.40
3.31
SOC support added. Added support for signed stored procedures for standard and custom checkpoint queries. The probe
can run in standard and in signed mode. Added new checkpoints mirror_state, mirror_witness_server, and
mirror_sqlinstance for monitoring the database mirroring state, the status of the witness server, and the status of the SQL server instance
hosting the mirroring database. Fixed an issue where the sqlusr_cpu stored procedure was not deleted after executing queries in
SQL Server 2000. Modified the qos_key value for user_cpu checkpoints to avoid a large amount of QoS. Fixed an issue
related to the subsystemid field where subsystemid showed a wrong value. Fixed an issue where the long_jobs
checkpoint did not send any alarms. Fixed an issue where the logic_fragment checkpoint gave a "Lock request time-out"
error. Fixed a handle leak issue. Added support for configuring the unit as minutes, hours, and days in the backup_status,
transaction_backup_status, and differential_backup_status checkpoints. Added an error alarm message that is sent on
checkpoint query execution failure.
3.30
March
2010
GA
November
2008
3.24
Added a checkpoint logfile_size for reporting database log file size in MB.
GA
August
2008
Fixed security token leak by closing the security tokens when they are not required.
3.22
In case of custom checkpoints, the query password was not always saved properly. Fixed the query password encryption in
GUI.
July 2008
GA
June 2008
Note: If any custom checkpoints are deactivated by the probe, those checkpoints must be deleted from the GUI and added
again in the probe.
3.20
GA
April 2008
3.12
GA
February
2008
3.11
GA
December
2007
GA
September
2007
Note: During the migration from sybase probe V2.xx, it can happen that the checkpoint "check_dbalive" threshold does not
get translated correctly into the new value. In that case, the probe incorrectly reports that the database server is not alive, even if it
is running.
Solution: the old threshold value "ONLINE" must be corrected to the new value "1", using either the GUI or raw config.
3.10
2.06
The probe is now built with the new Sybase ASE 15 libraries, in addition to the ASE 12 libraries. A post-install program
determines which version of Sybase is running to ensure the correct version of the probe is used.
GA
August
2006
2.05
cfg parameter "noResponse_severity" introduced to change "Report Generation Time exceeded..." alarm severity
GA
November
2005
Installation Considerations
The sybase probe has the following prerequisites for installation:
Libraries and Variables: The probe requires the following library configurations:
libstdc++ 5 library must be present on the robot platform.
Sybase libraries and environment variables must be set in the system path. For more information, refer the Configure Sybase Library
Path section.
Software: Sybase OCS client 15.x or ASE 15.x.
Advanced Monitoring: Sybase Monitoring Server or Monitoring Tables must be installed and enabled.
Sybase: Sybase Monitoring Server must be up and running.
Adaptive Server Enterprise 15.7: The property net password encrypt must be updated to a value of "2" to enable the encrypted
communication between the probe and the Sybase database server.
Note: Ensure that the Sybase server is running and is accessible from the Linux system.
User ID Authorization
The probe operates in basic and advanced modes. In the basic mode, the probe collects information from the sybase tables accessible to the user.
In the advanced mode, the probe uses monitoring tables from the Sybase Adaptive Server Enterprise (ASE) to collect monitoring information of
the database from the ASE.
Basic Mode
Access to following tables is needed to run the probe in basic mode:
sysdatabases
spt_values
sysusages
sysprocesses
syscurconfigs
sysconfigures
Advanced Mode
Consider the following points to connect to the database in advanced mode:
User credentials to access the Sybase server require 'mon_role' authorization to connect to the database using Monitoring Tables.
A Sybase System Administrator account, such as 'sa', is used to run the probe using the Monitoring Server API.
Note: For configuring the probe on Sybase server, Sybase Monitoring Server or Monitoring Tables must be installed and activated.
ASE Configuration
You must configure the following values in the ASE for monitoring the data collection of tables:
buf_cachehit_ratio
enable monitoring = 1
lock_requests, lock_requests_db, lock_requests_granted_db, lock_requests_waited_db
enable monitoring = 1
per object statistics = 1
object lockwait timing = 1
total_disk_io
enable monitoring = 1
stp_cachehit_ratio
enable monitoring = 1
locked_users (advanced, with sql text)
enable monitoring = 1
max SQL text monitored = 1024 or more
SQL batch capture = 1
sql text pipe active = 1
sql text pipe max messages = 256 or more (depends on interval length and server activity)
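The required ASE settings above can be expressed as a small table and checked mechanically; a sketch in which the mapping and helper function are ours, covering only a subset of the checkpoints listed:

```python
# Subset of the ASE settings listed above, keyed by checkpoint.
REQUIRED_ASE_SETTINGS = {
    "buf_cachehit_ratio": {"enable monitoring": 1},
    "total_disk_io": {"enable monitoring": 1},
    "lock_requests": {
        "enable monitoring": 1,
        "per object statistics": 1,
        "object lockwait timing": 1,
    },
}

def missing_settings(checkpoint, current):
    """Return the options whose current value is below the requirement."""
    required = REQUIRED_ASE_SETTINGS.get(checkpoint, {})
    return {opt: val for opt, val in required.items()
            if current.get(opt, 0) < val}

print(missing_settings("lock_requests", {"enable monitoring": 1}))
```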
Migration Considerations
The probe has the following migration considerations:
Before upgrading the probe, delete the utility folder from the probe installation directory. The utility folder contains reference links
from the previous version.
If you migrate the probe from previous releases, only the old configuration file (sybase_monitor.cfg) is migrated into a release 3
configuration file (sybase_monitor_v3.cfg). Every instance from V2 is converted into one connection and one monitoring profile in V3.
Every profile starts one thread for SQL queries and one process as Monitor Server data collector.
From version 4.2 onward, the probe does not support advanced monitoring using Monitoring Server for Linux 64-bit systems. This is
because the Monitoring Server is not a part of the Adaptive Server Enterprise v15.7.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Known Issues
Revision History
This section describes the history of the revisions for the sysloggtw probe.
Version
Description
State
Date
1.41
Fixed Defects:
GA
February
2015
GA
December
2012
GA
September
2010
Fixed an issue in which the dev ID was coming from local robot rather than actual device. Salesforce cases 00147258,
00147305, 00147258
Fixed an issue in which there was no alarm for the backup file at the required interval, unless there was an incoming log. Salesforce case 00147052
Fixed an issue in which the probe was not using the configured path to delete the file. Salesforce case 00149998
1.40
What's New:
Added probe defaults for the probe.
Fixed Defects:
Fixed an issue in which log files were not getting deleted after the correct number of days set.
1.30
Added support for Windows 64, Linux 32/64 and Solaris platforms.
Fixed memory leak.
Added proper configuration file reading mechanism on probe restart.
Added proper thread termination and start functionality on probe restart and stop.
Added support to redirect messages in separate file and to provide variable names ($logsource and $date) in that
file-name.
Added support to store logs in separate log files as per the source (IP address) they are coming from.
Added support for probe logfile truncation.
Implemented file rotation algorithm (for message file). Added support for file rotation based on size or time along with
file cleanup after certain period.
1.21
GA
June 2010
1.20
GA
March
2010
1.12
GA
November
2003
Installation Considerations
Ensure that port 514/udp is free. You can check this by issuing the netstat -an command and looking for an entry such as UDP 0.0.0.0:514. If such
an entry is present, then something else, for example a syslog daemon, is using this port.
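The check above can also be scripted. The following is a minimal sketch, not part of the probe itself: it tries to bind the UDP port and reports it busy if the bind fails. Note that binding ports below 1024 typically requires root privileges, so run the check accordingly.

```python
import socket

def udp_port_free(port: int = 514) -> bool:
    # Hypothetical helper: attempt to bind the UDP port on all interfaces.
    # A failed bind means another process (for example, a syslog daemon)
    # already owns the port. Binding ports below 1024 usually needs root.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind(("0.0.0.0", port))
    except OSError:
        return False
    finally:
        s.close()
    return True
```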
Upgrade Considerations
When upgrading from versions prior to 1.41 on Unix platforms, the log files must be manually cleared before upgrading.
Known Issues
Post-processing of SYSLOG-IN messages using the logmon probe is possible only when the matched pattern consists of ASCII characters.
The sysloggtw probe cannot interpret incoming multi-byte syslog messages from remote devices.
For example, consider four remote devices that send syslog messages to the sysloggtw probe with four different encodings:
Device (A) - Shift_JIS encoding
Device (B) - UTF-8 encoding
Device (C) - EUC encoding
Device (D) - ASCII encoding
The logmon probe monitors the SYSLOG-IN queue only for remote device D.
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the sysstat probe.
Version
Description
State
Date
1.14
What's New:
GA
September 2015
GA
August 2012
1.11
GA
January 2011
1.10
1.08
1.07
July 2009
November 2007
May 2007
1.05
May 2007
1.04
Default QoS source when hostname is not found is set to the robot name.
February 2006
1.03
March 2005
Revision History
Requirements
Hardware Requirements
Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.10
GA
Jun 2010
Requirements
Hardware Requirements
None.
Software Requirements
CA Unified Infrastructure Management Robot 3.00 or later.
Contents
Revision History
Supported Probes
Changes After Migration
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for the threshold_migrator probe.
Version
Description
State
Date
2.11
What's New:
GA
September 2015
GA
August 2015
What's New:
Added support to migrate the net_connect probe to standard static thresholds.
Added support to migrate dynamic alarm variables in dirscan 3.13 or later.
2.01
What's New:
CR
June 2015
Beta
May 2015
GA
March 2015
What's New:
Added support to migrate the url_response probe to standard static thresholds.
Added support for robot specific migration.
1.00
Supported Probes
The 1.00 and later versions of the probe migrate the threshold configuration for the following probes:
dirscan (File and Directory Scan) version 3.12 or later
logmon (Log Monitoring) version 3.49 or later
ntperf (Performance Collector) 1.90 or later
The 1.10 and later versions of the probe also migrate the threshold configuration for the following probes:
url_response (URL Endpoint Response Monitoring) 4.20 or later
The 2.01 and later versions of the probe also migrate the threshold configuration for the following probes:
ad_server (Active Directory Server Monitoring) 1.80 or later
iis (IIS Server Monitoring) 1.71 or later
lync_monitor (Microsoft Lync Server Monitoring) 2.20 or later
rsp (Remote System Probe) 5.10 or later
sharepoint (Microsoft SharePoint Server Monitoring) 1.70 or later
The 2.10 and later versions of the probe also migrate the threshold configuration for the following probe:
net_connect (Network Connectivity Monitoring) 3.11 or later
The 2.11 and later versions of the probe also migrate the threshold configuration for the following probe:
exchange_monitor (Microsoft Exchange Monitoring) version 5.20 or later
The probe does not support rollback. A backup of the configuration file for each instance of the probe is created in the threshold_migrator
installation directory for reference.
Some changes to alarms after migration are as follows:
Alarm messages that are sent by the baseline_engine do not yet support internationalization. The alarm messages are not automatically
translated or displayed in non-English languages. However, it is possible to use non-English characters to customize alarm messages.
The $ variables that are used in probe alarm messages are migrated.
The variable syntax changes from $<variableName> to ${<variableName>}.
Static message variables are also migrated and replaced with standard variables such as $value and $threshold.
When you create new profiles for monitoring, customize the alarm messages in the profile, as needed. If no alarm message is configured,
baseline_engine uses a default alarm message string. The default string can be different from the probe alarm messages in other
profiles.
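The variable-syntax change described above can be illustrated with a small sketch. This is a hypothetical illustration, not the actual migrator code: each bare $variableName is rewritten as ${variableName}, and names that are already braced are left untouched.

```python
import re

def migrate_alarm_message(msg: str) -> str:
    # Rewrite $name to ${name}; a negative lookahead skips occurrences
    # that are already in the ${name} form.
    return re.sub(r"\$(?!\{)(\w+)", r"${\1}", msg)
```

For example, an alarm message such as "value $value exceeds $threshold" would become "value ${value} exceeds ${threshold}".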
For more information about probe-specific changes, see the threshold_migrator Supported Probe Considerations article.
Important! The IM GUI of a migrated probe is not available after the probe is migrated using threshold_migrator. The migrated probe
can only be configured using the Admin Console GUI.
Refer to the Known Issues section in the ppm probe documentation for the migrated probes to view the limitations of using Admin Console.
comparison checks on the page contents. The probe supports proxies and user authentication for accessing the requested web URL. In addition,
the probe generates quality of service (QoS) messages for analyzing the URL performance.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Threshold Configuration Migration
Installation Considerations
Limitations
Known Issues
Revision History
This section describes the history of the revisions for url_response probe.
Version
Description
State
Date
4.23
What's New:
GA
December
2015
GA
September
2015
GA
May 2015
Added support to disable alarms when a string is specified in Look for substring in page content field.
Fixed Defects:
An invalid time was retrieved while checking the SSL certificate of a URL. Support case number 00246519
The probe was crashing due to memory corruption. Support case number 00164669
An error was generated in the log file when the number of samples for calculating the average was specified as zero. Support case
number 70005470
4.22
What's New:
Added support for the following authentication types in Windows platform:
NTLM
DIGEST
GSSNEGOTIATE
ANYSAFE
Fixed Defects:
Defaults were overwritten in message section in the cfx file. Salesforce case 00164613
The probe crashed as it had reached the maximum restarts. Salesforce case 00164669
Variables specified in the Source Override field in the Advanced tab did not expand to their actual values. Salesforce
case 00165815
4.21
What's New:
Added support of dynamic variables (code, description, and timer) in static and dynamic threshold messages. This is
applicable on CA Unified Infrastructure Management 8.3 or later.
Fixed Defects:
Updated alarm priority logic information in probe document. Salesforce case 00160102
Fixed a defect where url_response 4.18 did not report days to certificate expiry for SSL connections. Salesforce case 00159461
4.20
What's New:
April 2015
The probe can be migrated to standard static thresholds using the threshold_migrator probe.
The device ID key (useHostNameForDeviceIdentification) can be set through Raw Configure. The default value of the key is
no. Set this key to yes to generate the device ID based on hostname instead of IP address.
Separate source override text boxes added for Alarm and QOS in Advanced tab.
Fixed Defects:
Authentication did not work for profiles and the user received a 401 error code. Salesforce case 00135591
The user was unable to monitor a URL and received a 500 error code. Salesforce case 00153888
The user was unable to connect to HTTPS websites. Salesforce cases 00146412, 00148664, 00146769
4.18
Fixed Defects:
March
2014
A new profile was not being created when the URL was an IP address.
For some URLs, the substring match (with or without regex) was unsuccessful.
4.17
Fixed Defects:
January
2014
The probe always displayed the Clear Pending Alarm message for any alarm being cleared, irrespective of the clear
message configured for that alarm.
Fixed a defect to prevent creating a profile with a blank URL.
4.16
Fixed an issue in which the url_response probe failed to alarm after a profile was set up to not alarm on the first sample.
Updated probe defaults. Fixed an issue in which the url_response probe failed to populate variables properly in UMP when
internationalization is ON in nas.
October
2013
4.15
June 2013
4.14
Fixed: file dump content was partially visible on screen. Fixed: websites with Windows authentication no longer worked.
April 2013
4.13
4.12
Fixed issues related to SSL certification. Fixed a certification error related to severity. Fixed an issue related to file dump.
Fixed an issue related to SSL connect error 35. Fixed an issue related to form authentication.
November
2012
September
2012
January
2012
December
2011
August
2011
Fixed an issue with QoS data when upgrading from 3.63 to 3.91.
Fixed SOC defects
3.91
March
2011
January
2011
3.90
December
2010
Fixed an issue where the probe waited for URL timeouts even when restart or stop was called.
3.82
October
2010
Fixed an issue in content comparison. Earlier, the probe failed to compare the content when the search criteria started
with the first character, even if the string match was ok.
3.81
July 2010
July 2010
June 2010
May 2010
May 2010
3.63
Changed the response time to total time provided by curl instead of nimbus timer.
September
2009
3.62
September
2009
3.60
August
2009
3.52
Fixed problem running on Linux without dynamic libraries present. (This was a problem in new installations of version 3.5x.
Upgrades worked OK).
August
2009
January
2009
July 2008
July 2008
June 2008
Modified configuration tool to disallow specifying an empty string as a profile name as this resulted in removal of all
non-modified profiles.
Added option to set user agent string.
3.31
Modified configuration tool to disallow specifying an empty string as a profile name as this resulted in removal of all
non-modified profiles.
April 2008
Known issues: This version and previous versions running on Linux have a bug in a library which can cause the probe
to hog CPU and hang or crash. A new version solving this problem will be made available as soon as the old functionality
has been ported and verified using the new library.
3.29
3.27
3.26
Updated the underlying library to resolve a problem with an OK return code on Linux when the web server was actually not available.
Fixed occasional program failure on probe restart.
Fixed alarm message handling for the 'config_error' alarm situation.
3.25
All platforms: Fixed serious memory leak when called from logmon.
LINUX: Explicit dependency on Robot >= 2.60 added to the package to ensure that a suitable glibc is available on
installation.
February
2008
November
2007
September
2007
September
2007
3.23
February
2007
Added support for saving page to disk if status indicates that fetch was unsuccessful.
Added support for encrypted passwords when fetching pages on behalf of another probe or GUI (test_url callback).
Linux version added. Note: Glibc v2.3 is required to run this probe! The test_url callback is threaded to allow pages to be fetched
on behalf of other probes.
3.20
Fixed expansion of variables again. Now all available variables are expanded for all message types.
November
2006
3.19
Additional flags set when using Windows Authentication to fix issue with redirection from a non-SSL to an SSL connection.
August
2006
3.18
July 2006
March
2006
November
2005
It is now possible to insert regular expressions when looking for substrings in page content.
3.13
July 2004
Note: If you want to migrate the probe to standard static thresholds using the threshold_migrator probe, set the device ID key
useHostNameForDeviceIdentification to yes in the probe Raw Configuration.
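A minimal sketch of the Raw Configure change described in the note, assuming the usual Nimsoft key = value layout under the setup section (placement of the key within the configuration file may vary by probe version):

```
<setup>
   useHostNameForDeviceIdentification = yes
</setup>
```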
Installation Considerations
The url_response probe can be deployed on a local system to monitor a website URL. The probe requires the following installation
considerations:
The default installation directory for url_response probe has changed to /probe/application from /probe/network. This is applicable for
versions 3.7x and above. If you downgrade to probe version lower than 3.71, the configuration files will not merge.
Some web servers require user authentication. The url_response probe transmits user information as specified in the HTTP / HTTPS
protocols. This works with webpages, where on interactive use, the browser prompts for logon information.
When the web pages implement a logon screen, you will normally have to use the E2E Application Response Monitoring.
In a Microsoft environment, authentication with web servers and proxy servers is of the 'Basic' type. If 'Windows Integrated Authentication'
is required, the 'Windows NT Authentication' check box in the profile must be selected. Proxy settings are then retrieved from the
registry, as saved by Internet Explorer.
Because of limitations in the libraries used, problems can occur on multi-processor computers. In such cases, the probe can be run in
synchronous mode (fetching one URL at a time) by setting the force_synchronous flag in the setup section of the configuration file.
Note that this will limit the number of profiles handled by the probe.
In the case of fast networks, if the ignore connection time option is selected, the response time of the URL may sometimes be below
1 millisecond. This is reported as 0 by the probe.
Limitations
By default, the probe is configured to limit the number of concurrent profiles to 100. That is, more profiles can be defined, but the probe checks for
the profiles sequentially. This may potentially lead to profiles not being executed at the scheduled time. The probe raises an alarm in this situation.
You can modify this limit by editing the max_threads setting in the configuration file.
Important! You must check the resource usage of the probe when updating the max_threads setting in the configuration file. This
avoids running into machine-dependent limitations for memory, threads, or connections.
The probe initially allocates 'min_threads' threads for profile execution. If there are more profiles defined, the probe may gradually increase the
number of threads. Profiles that are not immediately allocated a thread are rescheduled 20 seconds later.
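The settings discussed above (force_synchronous, min_threads, max_threads) live in the setup section of the probe configuration file. The following is a hedged sketch, assuming the usual Nimsoft key = value layout; the thread values shown are illustrative, not recommendations (100 matches the stated default concurrent-profile limit):

```
<setup>
   force_synchronous = no
   min_threads = 10
   max_threads = 100
</setup>
```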
If the Windows authentication option is selected, the alarms and QoS are not available for the following:
Time to first byte
Time to last byte
Known Issues
The Admin Console GUI of the probe has the following known issues:
The probe does not provide schedulers to the profiles.
The probe does not allow test profiles in proxy environment in Unix.
Revision History
Deployment Considerations
Revision History
Version
State
Date
8.00
GA
September
2014
2.11
Addressed a SQL Constraint Violation that could occur during report generation if scans do not complete before the start
of the next day.
GA
June 2013
GA
March 2013
2.10
2.00
Scans monitored systems for UIM component usage and stores the data. The billing report uses the data to generate
billing reports.
GA
November
2012
1.30
GA
November
2011
Deployment Considerations
The billing probe is typically deployed to the primary hub. An instance of usage_metering must reside on the same robot as the billing probe, and
must be configured as the primary instance of usage_metering. See usage_metering for information on primary and secondary instance types.
Revision History
Probe Specific Application Requirements
Probe Specific Environment Requirements
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Upgrade Considerations
Upgrading from v6.41 to v6.50
Upgrading from earlier versions to v6.41 or later
Unified Dashboards or Summary Views in UMP
Migration
Considerations for the VMware Web Services API
Considerations for Viewing Information in USM
Best Practices
Performance
Optimizing Memory for the Probe
Hardware Monitoring
SOC Support for Hosts and VMs
Known Issues and Workarounds
Raw Configure In Place of IM
Windows 2008 Socket Exception
Delay in QoS Data Publication
Inconsistent Data for Memory Shared Metric
Passive Robot
IPv6 Support
Trappable Stack Trace in Log File
Appearance of VMs in USM
Errors with Polling Intervals under 20 Seconds
Collection Times are Extremely Slow
Probe Values Don't Match the vSphere Client
For Event Monitors, All Events are Coming Across as Part of a Single Alarm
Limited Multibyte Character Support
QoS Data Fails to be Submitted into the SLM DB
State of ESXi host services out of date
Known Issues and Workarounds for Infrastructure Manager Users Only
No Value is Listed for a Monitor
Out of Memory Errors
Error When Closing the Configuration GUI
UI Fails to Load Auto/All Monitors Nodes
UI Slowly Loads Nodes Within the Inventory Tree
Revision History
This section describes the history of the probe updates.
Version
Description
State
Date
6.60
What's New:
Added support for VCenter and ESX 6.0.
Target strings are changed for host metrics of type HOST_SYSTEM. Target strings for HOST_SYSTEM are found
under the System node on hosts and are named "host/system/kernel." For example, the old target string was:
"Resource CPU Usage in Mhz (% of Mhz*NumCpuCores)". The new target string is "host/system/kernel/Resource CPU
Usage in Mhz (% of Mhz*NumCpuCores)."
Fixed Defects:
Fixed an issue in which, if a message pool identification name included an underscore, the high and low threshold
status display order was reversed. Salesforce case 00168994
Fixed an issue in which, if a metric was disabled in Infrastructure Manager, it was not disabled in Admin Console. Salesforce case 00169165
Fixed an issue in which the probe was inconsistent in which name it used to identify the source for a QoS message. Salesforce case 00170267
Fixed an issue in which, for certain alarms, the device ID incorrectly displayed as the default connector. Salesforce
00170025
Fixed an issue in which, for users with CA UIM v8.0-8.1, the Unified Dashboards for VMware were missing QoS that
were collected by the probe. Salesforce case 00154590
Fixed an issue in which the probe did not send device ID in alerts through the CA UIM Connector for CA SOI. Salesforce case 00170025
Fixed an issue in which the template editor in Admin Console offered a superfluous option to apply a log level filter.
Fixed an issue in which the probe stopped generating QoS messages after running continuously for one day.
Fixed an issue in which a MissingTargetException occurred.
Fixed an issue in which, when running on a Windows 2008 server with a socket leak, the probe displayed the following
error message: "Java.net.SocketException: No buffer space available."
Fixed an issue in which, when the user was using Japanese characters, the QoS source and target were garbled when
they were stored in the database.
GA
October
2015
6.53
What's New:
Added support to configure and apply all monitoring, manually or with templates, in Admin Console
Added documentation about how to format IPv6 addresses when used for a Uniform Resource Identifier (URI); use the
Java convention of enclosing an IPv6 address in square brackets.
For more information, see the v6.5 vmware AC Configuration and v6.5 vmware IM Configuration guides.
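As a hypothetical illustration of the bracket convention mentioned above, an IPv6 address used in a resource URI would be enclosed in square brackets (the address and path shown here are examples only, not taken from the product documentation):

```
https://[2001:db8::10]/sdk
```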
Added new monitors to the datastore resource:
number read averaged
number write averaged
number read rate
number write rate
Added the ability to monitor a datastore disk, including the following monitors:
number read averaged
number write averaged
number read rate
number write rate
Added the ability to monitor a datastore under a host, including the following monitors:
number read averaged
number write averaged
number read rate
number write rate
total read latency
total write latency
Added the ability to monitor a datastore under a vm, including the following monitors:
number read averaged
number write averaged
number read rate
number write rate
total read latency
total write latency
For more information about these metrics, see vmware Metrics.
Fixed Defects:
Fixed an issue in which QoS monitoring configurations were lost when the user password changed. Salesforce case
00162857
Fixed an issue in which delta values were incorrectly calculated.
Fixed an issue in which the numeric sensor monitor was not updating the power supply status. Salesforce case 00155626
Fixed an issue in which automonitors using the value average*n did not work. Salesforce case 00160619
Note: Monitors of the enumerator type only support current values and cannot calculate delta values.
GA
August
2015
6.41
What's New:
GA
March
2015
Added Admin Console GUI element: the Detached Configuration folder in the left-hand navigation tree for the probe
displays resources that have been deleted in the VMware vSphere and which still have configuration in the probe.
Added monitoring of Distributed Virtual Port Groups and Switch.
Added support for monitoring the same VCenter with multiple credentials.
Added support for using non-English characters in the following fields: template name, template description, message
name, message error text, message OK text, and alarm thresholds.
Enabled the probe to indicate to baseline_engine (or other probes performing a similar computational function) to
compute baselines, publish QoS data to the message bus, and publish alarms when configured alarm threshold criteria
is met.
Improved discovery time and HEAP memory usage.
Fixed Defects:
Fixed an issue in which, when using two user profiles with different permissions on the same ESXi host, incorrect
credentials were set on both resources and both resources monitored the same set of objects from the vCenter. Salesforce case 00152061
Fixed an issue in which, when monitoring numeric sensors on ESXi hosts, the profile did not work and the tooltip
displayed an "unable to connect" error message.
Fixed an issue in which, when monitoring QoS metric for host service availability, the service description was missing
from the CI naming.
Fixed an issue in which internationalization (translation from English) caused the QoS source/target text to be garbled
when it was stored in the database.
6.30
GA
December
2014
6.12
Added support for migration to current release from 4.01 and forward.
All group names are localized in Probe Configuration.
Corrected an issue where non-clustered VMware server instances are falsely reported as a vCenter.
Fixed default messages that reported "memory" instead of "network."
GA
July 2014
Fixed incorrect QoS target name format for Guest Disks: QOS_DISK_FREE //Free (in % of Capacity) is correct;
QOS_DISK_FREE GuestDisk///Free (in % of Capacity) was incorrect.
6.10
Added requirements for VMware VirtualCenter 5.5 and VMware ESX/ESXi 5.5
Described appearance of VMs in USM.
Support for IPv6 environments.
Restored VMwareApiAvailable metric
GA
March
2014
6.01
GA
December
2013
6.00
Performance enhancements
Support for VMware 5.5
Fixed: Provisioning VMs could cause spurious duplicate UUID alarms
Fixed: Expiration of the probes session with vCenter triggers spurious duplicate UUID alarms.
Beta
December
2013
5.10
GA
October
2013
5.03
GA
June 2013
5.02
GA
May 2013
5.01
March
2013
4.20
June 2012
4.11
March
2012
4.03
Polling interval alarms are now cleared when the collection cycle takes less time than the configured interval.
Target string corrected for DS, Cluster, and Network auto-monitors created from static monitors.
Target string corrected to contain topology path for Resource Pools and Networks.
Corrected the robot dependencies setting in the VMware probe package.
Restored missing VM Memory usage metric in the inventory tree.
Corrected reported FM snapshot size.
Added support for tracking VMs by instanceUUIDs for environments with duplicated UUIDs (e.g., Lab Manager & vCloud)
January
2012
4.02
Updated the UMP Metrics template to match the latest Dashboard content
December
2011
4.01
November
2011
4.00
Added support for SOC management of Hosts, and VMs with VMTools installed.
Corrected target naming for Datastores contained within folders.
Fixed enabling of static monitors for the Top Level API monitors
October
2011
3.53
June 2011
3.52
June 2011
3.51
June 2011
3.42
Fixed auto-monitor behavior around wildcarded monitors (e.g., Services, CPUs, CpuStatusInfo, Disks, etc.).
StorageStatusInfo, CpuStatusInfo, MemoryStatusInfo, and NumericSensorInfo monitors now support wildcarding in
auto-monitors and templates.
The probe allows the ESX Hosts vnic property to not be set.
May 2011
3.41
May 2011
3.40
Maintenance release that includes additional message types and QoS entries.
March
2011
3.31
Corrected target resolution process for submitted QoS and Alarm messages
March
2011
3.30
The probe startup process has been migrated to the standard controller mechanism.
Resource alarms are now sent for unavailable resources.
Additional CPU and Memory metrics.
Configuration UI and probe performance improvements.
Corrected the source for Resource Pool AutoMonitors.
Updated the UMP metrics template monitor Memory Grants.
Deleting all monitors via Applying Templates is no longer supported. Use the "all Monitors" section to select and
delete all monitors (e.g., select the first monitor, then use the CTRL+SHIFT+END shortcut to select all, then finally use the
context menu's delete to perform the delete).
Warning level Alarm is now emitted when the data collection time for a resource is greater than the polling interval.
The probe now substitutes calculated values for network metrics that are missing or unavailable on some ESX servers (e.g., latest ESXi
4.1 servers). When substitute values are used, the probe logs a warning message in the log file.
Corrected Source As IP behavior with template-deployed monitors.
March
2011
3.29
February
2011
3.28
February
2011
3.27
December
2010
3.26
November
2010
3.25
November
2010
3.24
October
2010
3.23
Fix metrics that don't display properly in the GUI when a host name is not resolvable.
October
2010
3.22
Fix auto configuration matching defect that caused auto monitors to match non-running VMs
Fix session timeout defect that caused null data when long intervals were used
Retry queryAvailablePerfMetrics since it sometimes fails
September
2010
3.21
September
2010
3.20
August
2010
3.10
Scale improvements
NIS2 enabled
April 2010
3.02
November
2009
3.01
Fixed problems with VMware subject detection and reconnection after the connection to the server is lost.
September
2009
3.00
June 2009
2.73
February
2009
2.71
February
2009
2.70
Added support for CPU, Disk and Network instance performance monitoring.
Added the possibility to average a monitor point over the 2-5 last measurements.
Optimized start-up time for configurations with many active resources.
Added sortable column of alarm severities.
Fixed deployment of templates in clustered environments.
Fixed potential GUI issues when probe is about to restart.
December
2008
2.55
October
2008
2.54
October
2008
2.53
October
2008
2.52
October
2008
2.51
October
2008
2.50
September
2008
2.20
June 2008
2.10
April 2008
2.08
Upgrade libraries
October
2007
2.07
Fixed login
August
2007
2.06
August
2007
2.05
Fixed templates
June 2007
2.01
April 2007
2.00
March
2007
1.16
December
2006
1.15
November
2006
1.00
Initial version
September
2006
Note: Microsoft .NET framework 4.0 or later is not supported. To determine your version, go to the Microsoft documentation
and search for "How to: Determine Which .NET Framework Versions Are Installed."
Installation Considerations
The probe requires the following type of account access to the VMware environment:
Read access on all monitored entities
Upgrade Considerations
The following sections apply to users accessing the probe configuration GUI using CA Unified Infrastructure Management Infrastructure Manager
or Admin Console.
Note: Active default templates are upgraded automatically when you upgrade the probe. Any new or deprecated filters, rules, and
monitors, in the latest version of the active default template are applied automatically as part of the upgrade process. If you want to
save the monitoring configuration of your current active default template and to prevent them from being overwritten when you upgrade
the probe, copy the active default template, rename it, and apply this renamed copy before you upgrade the probe.
Note: If you are upgrading from v4.23 and want to retain your configured monitors, we recommend upgrading to v6.50 because of its
improved upgrading performance.
Migration
Migration from 4.2x versions of the probe is supported, but with important caveats. Please read carefully if you intend to perform migration.
CPU Extra (% of available) and CPU Guaranteed (% of available) were both removed from the VMware API after ESX 3.5. They have
been removed from the probe.
VMwareApiAvailable has been removed as a monitorable metric in versions 5.01 to 6.01. A self-monitoring alarm has replaced it.
This functionality was restored in version 6.10 and may be used in addition to the self-monitoring alarm.
Reservation (vm) was a duplicate of the monitor CPUReservation. To avoid confusion, reservation has been removed.
Host metrics VMNonTemplateCount, VMNamesActive, and VMNames were all intended for an abandoned feature. They have been
removed.
ToolsStatus (deprecated) was deprecated for a release and is now removed.
TemplateState has been removed in favor of having fully distinct types for VMs and templates. Static monitors that were created against
entities reclassified from VM to template will need to be recreated.
Attempting to migrate any of these monitors will result in alarms for each during every polling cycle.
The original configuration file is automatically backed up in the probe installation directory.
Any of CpuStatusInfo, MemoryStatusInfo, or NumericSensorInfo that were manually enabled in the old probe will need to be enabled in
the new probe by editing the 'mondef.cfg' file.
The 5.x versions of the probe use more memory. If you were close to the memory limit on the previous version of the probe, it is very
likely you will need to increase the available memory.
Note: This step enables the majority of the monitors required for viewing VMware information in USM. However, to view all of
the VMware specific information available, you must also perform the next step.
2. Select the Publish Data check box for the following monitors:
Lists for Hosts and vCenters
Top 5 Resource Pools - CPU
CPUOverallUsage
MemoryOverallUsage
Capacity
GuestMemoryUsage
vCenter Properties
Storage vMotions
VM vMotions
Num VMotions
Total Memory
TotalMemory
Total CPU
TotalCPU
OverallMemoryUsage
Best Practices
This section contains general best practices for using the probe. See the documentation for your version and configuration interface for further
best practices specific to your situation.
Performance
The default settings should provide sufficient performance for most environments. The following items can affect performance:
IMPORTANT! The Raw Configure tool has no error checking. Take care that any changes you make are valid before you
continue.
Hardware Monitoring
Note: This topic applies to users accessing the monitoring configuration GUI using CA Unified Infrastructure Management Infrastructure
Manager or Admin Console.
The probe provides numerous virtualization monitoring metrics and alerts for performance and availability of the VMware environment through the
VCenter and ESXi web services APIs. The VMware APIs also provide metrics for the underlying server hardware that runs the ESXi hypervisor.
The probe collects most of these server hardware monitoring metrics. The metrics available are limited by the data provided by the VMware APIs.
The metrics also differ in coverage and accuracy depending on the server hardware.
Therefore, we recommend using other CA Unified Infrastructure Management probes for server hardware monitoring, especially SNMP-based
probes such as snmptd. As a best practice, use the snmptd probe to catch traps caused by fault or damage events from the underlying server
hardware device that runs the ESXi hypervisor. Follow the instructions provided by VMware or the server hardware vendor to enable SNMP on
the server hardware device. To configure the CA Unified Infrastructure Management snmptd probe, see the probe documentation or contact
support for assistance.
Alternatively, the probe can forward alarms from the vCenter. Configure appropriate hardware alarms in the vCenter, and turn on alarm monitors
for the related devices in the probe to forward any triggered alarms.
Note: This topic applies to users accessing the probe monitoring configuration GUI using Infrastructure Manager or Admin Console.
Starting with v4.0, the probe supports configuration through Service-Oriented Configuration (SOC). Please be aware of the following behaviors
when using the probe with SOC:
Configuring a connection to a vSphere resource is performed through the Probe UI.
Template Groups for VMs or Hosts can be formed by using the dedicated setting. For VMs, the field is VirtualMachine; for hosts, it is
HostSystem.
The probe performs an initial discovery of manageable Hosts and VMs and populates this content to the local robot's niscache for Discovery
Server processing.
When using SOC, the probe monitors its configuration file for changes. When a change is detected, the probe waits for a 5-minute
quiet period before restarting.
Configuration of the probe must be done through the Probe UI or SOC exclusively. Attempting to manage a probe through both
mechanisms at the same time will result in collisions within the configuration.
Only Virtual Machines with VMtools installed can be managed by SOC. All Hosts can be managed by SOC.
Upgrading and converting an existing probe setup to SOC may disrupt expected data collection as a side effect of the SOC
configuration overwriting the pre-existing configuration.
Symptom:
The probe returns inconsistent data for the Memory Shared (% of MemorySize) metric.
Solution:
v6.21 and 6.30:
1. Go to the vmware.cfg file.
2. In the vmware.cfg file, search for:
Memory Shared (% of Memory Size).
3. Replace every instance of Memory Shared (% of Memory Size) with:
Memory Shared (% of Memory)
IMPORTANT: The Raw Configure tool has no error checking. Take care that any changes you make are valid before you
continue.
Note: If you have any templates configured to monitor Memory Shared (% of Memory Size), you must also change the metric
within the template to read Memory Shared (% of Memory).
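On a Linux robot, the replacement can also be scripted. The following is a minimal sketch on a scratch file; the actual vmware.cfg lives in the probe installation directory, whose path varies per install, and you should back it up first:

```shell
# Demonstration on a scratch copy; on the robot, run the same sed line
# against the real vmware.cfg (path varies per installation).
cfg=$(mktemp)
printf 'qos = Memory Shared (%% of Memory Size)\n' > "$cfg"
# Replace every instance of the old metric name, keeping a .bak backup.
sed -i.bak 's/Memory Shared (% of Memory Size)/Memory Shared (% of Memory)/g' "$cfg"
cat "$cfg"   # now reads: qos = Memory Shared (% of Memory)
```

Restart the probe after editing the configuration file so the change takes effect.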
Passive Robot
Symptom:
When configuring a probe deployed on a passive robot, you see an error message:
Configuration was unable to be retrieved.
Message: Request Error: Probe vmware on robot...passivehub is busy.
Resolution: Please wait and retry loading configuration later.
Error Code: PPM-023
Solution:
You can ignore the message. The hub will retrieve configuration information from a probe deployed on a passive robot the next time that the
hub_update_interval has elapsed.
Or, if you want to decrease the time it takes for a hub to retrieve configuration information from a probe deployed on a passive robot, decrease the
hub_update_interval from the default value (900). For more information about how to configure a hub with a passive robot, see the hub guide
on the CA Unified Infrastructure Management wiki.
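As a sketch, the key is a single entry in the hub's configuration file; the section placement below assumes the default hub.cfg layout, and the value is in seconds:

```
<hub>
   ...
   hub_update_interval = 300
   ...
</hub>
```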
IPv6 Support
Symptom:
When I configure a profile using an IPv6 address, I get a stack trace error that includes the exception: Caused by:
java.lang.NumberFormatException: For input string: "f0d0:1002:0051:0000:0000:0000:0004:443".
Solution:
Follow the Java standard of enclosing the IPv6 address in square brackets.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that
includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Symptom:
When using probe v6.10 or later, IPv6 addresses are not shown in USM for CA Unified Infrastructure Management 7.0 or 7.1.
Solution:
Upgrade to CA Unified Infrastructure Management 7.5.
Symptom:
Infrastructure Manager UI fails for an IPv6-only robot host
Solution:
The Infrastructure Manager user interface for the probe fails to display when the robot hosting the probe is on an IPv6-only network.
The workaround is to provide IPv4 access to the robot host.
Solution:
Ignore this error; data is still collected correctly.
memory, CPU, disk or database access, or another resource that is insufficient at peak load.
Users accessing the probe monitoring configuration UI via Infrastructure Manager can also disable collection of metrics that rely heavily on
database operations. To do this, use Raw Configure to set the 'include_summary_perf_metrics' key to 'no'.
IMPORTANT: The Raw Configure tool has no error checking. Take care that any changes you make are valid before you continue.
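As a sketch, the resulting entry in vmware.cfg looks like the fragment below; the section placement is an assumption and may differ per probe version:

```
<setup>
   ...
   include_summary_perf_metrics = no
   ...
</setup>
```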
For Event Monitors, All Events are Coming Across as Part of a Single Alarm
This behavior is as designed.
-Xms256m -Xmx<nnnn>m
where <nnnn> is heap space up to 2048 MB or greater. For example,
to increase the heap space to 1024 MB, enter the following:
-Xms256m -Xmx1024m
Ensure the machine where the robot and probe are deployed has enough RAM.
6. Click OK and Apply.
Solution:
If the configuration GUI has been open for an extended period of time, the probe does not automatically release its lock on the configuration file (if
configuration file locking is enabled). This only occurs if the GUI is left open longer than the controller-configured session expiration time
and the 'Cancel' button or 'X' button is used to close the GUI.
The error is harmless and can safely be ignored.
Symptom:
In environments with a large inventory, the UI may take several minutes to load content within the Inventory Tree. For example, for a vCenter with
10,000 VMs - loading the VMs will take several minutes.
Solution:
Where possible, avoid navigating to such nodes. In the example of the VMs node, try finding the VMs under the hosting HyperVisor as opposed
to relying on the VMs node.
Cache Memory
CPU
Fibre channel (FC) COM Port
Front End Ports
IP
Memory
Storage volume
Virtual volume
Note: You can configure the probe through Admin Console only.
The probe supports the standard static alarm threshold parameters when used with CA Unified Infrastructure Management 8.2 or later.
Revision History
EMC VPLEX Supported Versions
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.00
What's New:
GA
December
2015
USM portal. You must restart the nis_server after you deploy the ci_defn_pack.
Install the mps_language_pack probe version 8.38 or later to view the metric type on the Admin Console. You must restart the
service_host probe after you deploy the mps_language_pack.
The web gateway (webgtw) probe automatically transfers customer billing reports to CA Technologies. This helps customers fulfill their probe
usage reporting obligations by eliminating the need to manually email the reports.
Revision History
Version
State
Date
8.00
GA
September 2014
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Known Issues and Workarounds
Revision History
This section describes the history of the revisions for the weblogic probe.
Version
Description
State
Date
1.41
What's New:
GA
June 2014
GA
March 2014
Added support for monitoring the WebLogic server using Secure Socket Layer (SSL).
1.35
What's New:
Added support for jre 1.7x.
Fixed Defects:
Fixed an issue of generating duplicate QoS for the same monitor.
1.34
Fixed Defects:
GA
January 2014
Fixed an issue in which the probe disconnected every time the server restarted.
Fixed an issue in which the metrics were appearing without a value.
1.33
GA
January 2013
1.32
GA
December 2011
1.31
GA
June 2011
1.21
GA
December 2010
1.10
GA
June 2010
1.03
GA
February 2010
1.02
Fixed a defect with tree node-name collisions that caused missing nodes and data.
Also changed cluster presentation.
GA
December 2009
1.01
Upgraded libraries
GA
November 2007
1.00
Initial version
Beta
September 2007
Installation Considerations
Follow these steps:
1. Install the package into your local archive.
2. To ensure a successful installation of the probe package (drag-and-drop), a java.exe (version not critical) must exist in the
PATH. If no Java runtime is present, install Java JRE 6 or higher.
3. Drop the package from your local archive onto the targeted robot.
4. Double-click the probe for initial configuration. On first-time probe configuration, initiated by double-clicking the probe in CA UIM, the
installation wizard is launched automatically. The wizard prompts you for the path to the java.exe of the version required by the
probe.
Note: There is no problem running multiple Java versions on a robot, but it is important that the probe is set up to reference the correct
Java runtime.
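A quick pre-flight check on the target robot can confirm the PATH requirement before you drop the package. This is a minimal sketch assuming a POSIX shell; on Windows robots, run `where java` instead:

```shell
# Verify that some java executable (version not critical) resolves from PATH
# before dropping the probe package onto the robot.
if java_path=$(command -v java 2>/dev/null); then
    echo "java found at: $java_path"
else
    echo "no java in PATH - install JRE 6 or higher first"
fi
```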
Revision History
Supported Platforms
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Known Issues
Revision History
Version
Description
State
Date
1.25
What's New:
GA
December
2014
GA
January
2014
GA
November
2012
GA
November
2012
GA
June
2012
1.24
Fixed Defects:
Fixed an issue in which, after a network connection went down and came back up, the probe sent a clear message only after
it was restarted.
1.21
1.20
Fixed rendering of enumerated values while capturing data for the SOAP Request from the GUI, at step 3 of 6.
Added Group functionality through which you can create groups and then create profiles in groups.
Fixed generation of SOAP Requests for RPC-style web services.
Fixed customized alarm messages on the UMP alarm console.
Fixed proxy settings so they work properly.
Fixed basic URL authentication so it works properly with valid credentials at Step 1 of 6.
Fixed handling of HTTPS-based web services that do not require a certificate.
1.11
1.10
Added Custom Header, WSDL Input through File System, and UI validation.
GA
May 2012
Fixed an issue in which the default time unit was set to ms in the response time QoS and alarm. Salesforce case 71296
GA
December
2011
1.00
Initial release
Beta
November
2011
Supported Platforms
Please refer to the:
Compatibility Support Matrix for the latest information on supported platforms.
Support Matrix for Probes for additional information on the probe.
Known Issues
The known issues of the probe are:
New group creation functionality is not supported through the Admin Console GUI.
RegEx Edit functionality is not supported through the Admin Console GUI.
In USM view, the heading of Response Time Alarm is displayed incorrectly.
When you create more than one regular expression and configure alarms and QoS for all of them, the default Metric Id for all the regular
expressions is the same. As per RegEx, the Metric Id must be different for each regular expression.
Contents
Revision History
Upgrade Considerations
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Supported Products
Subsystem ID Considerations
Override the Subsystem ID
Update NAS Subsystem ID
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
2.12
What's New:
CR
January
2016
GA
January
2015
Added support for WebSphere MQ 8.0.x. For more information, see Supported Products.
Added Client Mode support for connecting to WebSphere MQ. For more information about configuring Queue Managers
with Client Mode, see Set Up Client Connection Mode.
Ability to create monitoring configuration templates. The templates allow you to apply consistent monitoring
configurations across multiple profiles using filters.
Added factory template for monitors available on the Websphere MQ Unified Dashboard.
2.02
What's New:
Added support for AIX 6.x OS for WebSphere MQ 7.5.x.
2.01
What's New:
Added support for monitoring Topics and Subscriptions.
Added new metrics for monitoring in doubt status of the channels.
Added support for monitoring Remote, Alias, Model, and Cluster queues.
Added support for monitoring inhibit put and inhibit get status of the queue.
Added support for Multi-Instance queue manager monitoring.
Added support for monitoring WebSphere MQ on Solaris 11 and Windows Server 2012 R2 operating systems.
September
2014
1.01
June 2014
The probe configuration is available only through the Admin Console GUI and not through the IM probe GUI.
The probe runs with NMS 7.6 or later versions and PPM 2.34 or later versions.
Upgrade Considerations
The probe has the following upgrade considerations:
Permissions for the setmq_auth.sh file revert to the default when the probe is upgraded to a new version. Change the permissions, as required,
after upgrading the probe to a new version.
If you are upgrading the probe from 2.02 to a later version, consider the following points:
The existing Queue Managers in binding state remain as is in the upgraded probe. Newly discovered Queue Managers are in
client mode.
A separate node called Queue Manager is created under Queue Manager where all metrics for a queue manager are listed.
If you are upgrading the probe to 2.01 or later, consider the following points:
To view the new metrics that are introduced in the websphere_mq probe version 2.01 or later on the Unified Service Management
(USM), you can perform any one of the following actions:
Upgrade NMS 7.6 to CA UIM 8.0 or later
Install the ci_defn_pack probe version 1.00 or later and restart the nis_server probe.
The Queue metrics that were previously generated by the probe are displayed at the LocalQueue level on the USM in versions 2.01 or
later. However, the existing metrics already displayed at the Queue level remain as is on the USM.
The Queue monitors, which are activated on the probe prior to version 2.01, move to the Detached Configuration node after
upgrade. You can activate the monitors for each queue again from the Local Queue node.
Probe versions earlier than 2.01 used the IP address of the system as the QoS and alarm source, while versions 2.01 and later use the
hostname as the default source. After migration, you can update the Source Override field value at the websphere_mq node to the IP
address; otherwise, two views are displayed on the USM for the same metric.
The QMs and their corresponding components, which are part of a cluster, are now displayed under the Cluster Queue Manager node,
and their activated monitors are moved to the Detached Configuration node. Before version 2.01 of the probe, all QMs were
displayed under the Queue Manager node.
Supported Products
The probe can monitor the following WebSphere MQ products:
IBM WebSphere MQ version 8.0.X (tested on IBM WebSphere MQ versions 8.0.0.4 for 64-bit versions of RHEL 6, AIX 6, and Windows
2012)
IBM WebSphere MQ versions 7.5.X or 7.0.X (tested on IBM WebSphere MQ versions 7.5.0.3 and 7.0.1.10) for all supported OS (64-bit)
IBM WebSphere MQ version 7.5.x for AIX 6.x (64-bit) OS using robot 7.80. For more information, see Set Up Environment for AIX
Monitoring.
Subsystem ID Considerations
Alarms are classified by their subsystem ID, identifying which part of the system the alarm relates to. You can perform the following operations in
probe versions 2.01 and earlier:
Override the Subsystem ID
Update NAS Subsystem ID
Important! This step is not required if you are using CA UIM 8.0 or later.
Key Name    Value
2.3.7       WebSphere_MQ
2.3.7.1     Resource
2.3.7.2     QueueManager
2.3.7.3     Queue
2.3.7.4     Channel
2.3.7.5     Subscription
2.3.7.6     Topic
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, open the NAS probe configuration interface.
2. Click the Subsystems folder.
3. Click the New Key Menu item.
4. Enter the Key Name in the Add key window and click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all the required subsystem IDs for your probe.
7. Click Apply.
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right-click on the NAS probe, and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key... button.
4. Enter the Key Name and Value and click OK.
5. Repeat this process for all of the required subsystem IDs for your probe.
6. Click Apply.
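Both procedures write the same key/value pairs into the NAS configuration. The fragment below is a sketch of the expected result; the section name and exact formatting are assumptions and may vary by NAS version:

```
<subsystems>
   2.3.7 = WebSphere_MQ
   2.3.7.1 = Resource
   2.3.7.2 = QueueManager
   2.3.7.3 = Queue
   2.3.7.4 = Channel
   2.3.7.5 = Subscription
   2.3.7.6 = Topic
</subsystems>
```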
Web-based businesses depend on a high degree of quality and availability. Therefore, the security, availability, and performance of the
IBM WebSphere Application Servers (WAS) must be as high as possible. Monitoring and managing the servers, components, and services are
essential to ensure optimal performance of WAS.
The WebSphere Monitoring (websphere) probe handles all the common monitoring and data collection tasks on the WAS. The probe collects and
stores data and information from the monitored system at customizable intervals. The probe generates alarms when the specified thresholds are
breached. The websphere probe monitors the WAS version 7.0 or later.
Contents
Revision History
Probe Specific Hardware Requirements
Probe Specific Software Requirements
Installation Considerations
Install Java JRE
On UNIX
On Windows
PMI
WAS Versions
Known Issues
Revision History
This section describes the history of the revisions for the websphere probe.
Version
Description
State
Date
1.73
Fixed Defects:
GA
June 2015
Fixed a defect where the log size kept on increasing even after setting the log size key value. Salesforce case 00164635
1.72
Fixed Defects:
GA
April 2015
Probe did not handle a period (.) in the deployed application name and did not generate metrics for the application. Salesforce case 00151138
The probe kept restarting and eventually crashed if different credentials were provided to connect to the same server
and same port. Salesforce case 00157088
The probe took time to load checkpoints for J2EE applications. Salesforce case 00157088
The probe was not generating alerts with the Advanced Alarming feature. Salesforce case 00158147
In probe version 1.70, a new resource could not be added to the probe configuration. Salesforce cases: 00145653,
00149026, 00148816
The probe did not display metrics under the automonitor node when applied through auto-configuration. Salesforce case
00149110
The user credentials are no longer stored in the probe logs. Salesforce case 00151834
1.71
GA
November
2014
1.70
GA
June 2014
GA
April 2014
GA
March
2014
1.65
Fixed the defect where the probe displays an error in the Portuguese environment while adding resource information, or
fails to launch the probe GUI if a resource is already added.
Fixed the defect where the probe generates two device IDs for one resource.
Fixed the defect where the TNT2 data does not match the Metric Definition table data.
1.64
1.63
Fixed server name change on auto scan of profiles. Fixed GUI error in multiple groups. Fixed template functionality for
templates from different profiles.
GA
October
2013
1.62
GA
November
2012
GA
September
2012
GA
August
2012
GA
December
2010
GA
June 2010
GA
April 2009
GA
December
2007
GA
September
2007
1.61
1.60
1.51
1.40
1.31
1.25
Upgrade libraries.
Added security file.
1.23
1.21
GA
December
2006
1.20
GA
November
2006
1.18
GA
October
2006
GA
September
2006
GA
July 2006
GA
May 2006
1.17
1.16
1.15
Java JRE 6 or later. By default Java comes with WAS 7 and above. Use the same path for java_home.
PMI: Performance Monitoring Infrastructure (PMI) must be enabled on the WebSphere Application Server for the probe to gather
performance data.
WebSphere Application Server version 7.0 or higher.
Note: WebSphere Community Edition (CE) is not supported.
Product Info: Some checkpoints of older versions of the probe cannot be upgraded to this version of the probe.
Installation Considerations
The websphere probe can be installed on either a server running WebSphere or on a remote computer.
If installed on a server running WebSphere, install the relevant environment files on the server.
If installed on a remote computer, install the relevant environment files on that computer.
Note: Install the WAS files/AppServer directory on the machine where the probe is deployed, for all WAS versions.
For WebSphere versions 7.0, 8.0, and 8.5, copy the runtimes and plugins folders from the WebSphere directory of the WebSphere server.
Notes:
Copy the etc and lib folders from the WebSphere server to the remote computer, if the OS on the remote computer (running
websphere probe) and server (running WebSphere software) are different.
If the websphere probe is running on Windows OS, use the AppServer directory for Windows. Similarly, if the probe is running
on Linux or Solaris, use the AppServer directory from the respective environment.
Note: Java JRE 6 and above is available with WAS 7.0 and above and you can use the same path for java_home.
On UNIX
You can configure the JVM and Java Home settings for installing the websphere probe on a computer running the UNIX OS.
Follow these steps:
1. Set the JAVA_HOME environment variable to the directory in which IBM JVM is installed and export JAVA_HOME. For example,
export JAVA_HOME=/usr/lib/jvm/ibm-java2-i386-50/jre/bin
2. Make sure that the PATH variable includes $JAVA_HOME. For example,
export PATH=$JAVA_HOME:$PATH
3. Open a shell as user root and use the command java -version.
The output shows the IBM java version installed.
On Windows
You can configure the JVM and Java Home settings for installing the websphere probe on a computer running the Windows OS.
Follow these steps:
1. To set the Java path, right-click My Computer and select Properties.
PMI
Enable Performance Monitoring Infrastructure (PMI) on the WebSphere Application Server so that the probe can gather performance data.
Read more about PMI:
http://www-01.ibm.com/support/knowledgecenter/SSTVLU_8.6.0/com.ibm.websphere.extremescale.doc/txsenablepmi.html?lang=en
WAS Versions
Certain WebSphere Application Server (WAS) versions prevent external PMI clients like the websphere probe from obtaining correct PMI values
from the server. The internal error is fixed in the following WAS versions:
For both WAS 5 and 6, the error is corrected.
For WAS 5, the error is fixed in version 5.1.1.10.
For WAS 6, the error is fixed in version 6.0.2.9, older versions like 6.0.2.5 and 6.0.2.7 contain the error.
Known Issues
This section describes the known issues of the probe.
The probe stops functioning if you save the probe configuration with an invalid Java Home path or Libraries path.
The Admin Console GUI of the probe has the following additional limitations.
The probe stops functioning if you do not refresh the web page after adding new resource.
The probe does not provide the ability to create Templates, Auto Monitor, and Auto Configuration.
The probe does not support any other QoS except Default.
The probe does not support the Rescan host option, which allows the user to scan the profiles in the host server after a pre-configured
interval (15 minutes) and load any profiles available in the host server under the respective Resources nodes in the probe GUI.
Revision History
Requirements
Hardware Requirements
Software Requirements
The WINS Server Response monitoring probe can monitor the WINS response for one or more Windows Internet Name Service (WINS) servers,
based on individual monitoring profiles for the different WINS Servers. The probe can send Quality of Service messages on response time and
alarms if the service is unavailable or the lookup failed.
Revision History
Version
Description
State
Date
1.20
GA
Dec 2010
GA
Oct 2006
GA
Aug 2003
1.02
Requirements
This section contains the requirements for this probe.
Hardware Requirements
There are no additional hardware requirements for this probe.
Software Requirements
A WINS server is required.
Revision History
Requirements
Prerequisites
Hardware Requirements
Software Requirements
Considerations
Installation Considerations
General Use Considerations
Known Issues
Known Issues with Workarounds
Increasing Probe Heap Space for Large Implementations
Probe Not Collecting Data
Out of Memory Errors
Probe Delays in Fetching XenApp Data
Probe Delays in Fetching Farm Data
Probe Not Fetching Performance Metrics for Some Servers in Farm
Probe Not Fetching Performance Metrics even when WinRM Enabled
Probe Not Fetching ICA Latency Metrics
Troubleshooting Connection Issues
Verify the WinRM Connection
Verify PowerShell Access
Set Up Multi-Hop Authentication
Revision History
This table describes the revision history for the probe.
Version
Description
State
Date
1.12
Minor release
GA
Sept
2015
What's New:
When the probe is configured for localhost monitoring (that is, when the probe is deployed on a robot present in a Citrix XenApp
Server machine), the probe functions when you also enter localhost in the Hostname field while creating a resource.
1.10
Minor Release
GA
July
2014
1.00
Initial Release
Beta
June
2014
Requirements
Prerequisites
The probe requires these prerequisites:
CA Unified Infrastructure Management version 5.1.1 or later environment.
Windows PowerShell command line interface on the XenApp server. The latest version of PowerShell that comes with Windows Server
2008R2 or later is required. For information on configuring PowerShell for use with the XenApp Monitoring probe, see the Citrix XenApp
Guide.
Windows Remote Management (WinRM) enabled on the XenApp server. For instructions on enabling WinRM, see the XenApp Guide.
Hardware Requirements
The probe should be installed on a system with the following minimum resources:
Memory: 2-4 GB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Software Requirements
The probe requires the following software environment:
Citrix XenApp v 6.5.
CA Unified Infrastructure Management Server 5.1.1 or later
CA Unified Infrastructure Management Robot 5.23 or later
Java Virtual Machine 1.6 or later (typically installed with CA Nimsoft Monitor server 5.0 and later)
Infrastructure Manager 4.02 or later
Microsoft .NET Framework 3.5 on the system where the Infrastructure Manager application is running
Important! On 64-bit Linux systems, the Java JRE included in the XenApp Monitoring probe package does not install successfully when
you deploy the XenApp Monitoring probe on a robot.
Considerations
This section describes important information to know about the probe.
Installation Considerations
Installation follows the standard probe distribution process.
Known Issues
This section describes known issues. These issues apply to the latest version of the probe unless otherwise noted.
Applies to all versions.
1. In locales where the system settings are in languages other than English, the probe may not be able to fetch data for metrics with decimal
points.
2. Authentication failure can happen intermittently when the SSL option is enabled while configuring a domain user for a resource.
If you see this message, enter the following command on the XenApp server:
winrm set winrm/config/winrs @{MaxMemoryPerShellMB="nnnn"}
where nnnn is a number greater than 1024.
Solution 3:
You might get the Kerberos error, "Clock Skew too great while getting initial ticket".
This error occurs when there is a time difference between the XenApp server and the probe machine. Verify that both the XenApp server and
probe machines are in the same time zone, and the time difference between the machines is not more than 5 minutes.
more time.
Solution:
To resolve this issue, you can either provide the computer with internet access so it can verify the Authenticode signature, or disable the
Authenticode signature checking feature for Microsoft Management Console as described below.
1. On the XenApp server added under Resources, open Internet Options in the Control Panel or Internet Explorer.
2. Click the Advanced tab.
3. Scroll down to Security.
4. Uncheck Check for publisher's certificate revocation.
5. Uncheck Check for server certificate revocation.
6. Click OK.
Note: This is an optional configuration; in the absence of this file, metrics are fetched from all the servers in the farm. More information:
Server List Configuration.
Solution:
Go to the probe installation directory and check if the file serversForMetrics_<ResourceName>.txt is present. If the file is present, open the file and
check if the server name (from which metrics are not coming) is present in the list. If the name is not present, append the name to the list,
save the file, and close it.
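Appending a missing server name can also be scripted. This is a minimal sketch using a hypothetical resource name and server name; the real file name depends on your configured resource:

```shell
# Add XENAPP-SRV-02 to the per-resource server list only if it is not
# already present (file name and server name are illustrative examples).
list="serversForMetrics_MyFarm.txt"
touch "$list"
grep -qx "XENAPP-SRV-02" "$list" || echo "XENAPP-SRV-02" >> "$list"
```

The `grep -qx` guard makes the operation idempotent, so rerunning it never duplicates an entry.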
Problem:
WinRM configuration is not enabled on the server from which metrics are not fetched. WinRM configuration must be enabled in all servers in the
farm from which metrics have to be collected.
Solution:
Enable WinRM configuration on the server from which metrics are not fetched. More information: Configure WinRM and PowerShell.
Problem:
The probe fetches all performance metrics except those related to Citrix ICA Latency.
Solution:
1. On Windows servers, open Windows Registry Editor (go to Start->Run, and type "regedit" and Enter).
2. In the Registry Editor, traverse to the path mentioned below:
Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\CitrixICA\Performance
Note: If any of the following commands having single quotations fail, then try to run the command without the quotation marks.
1. Enable CredSSP on the WinRM client system, either by setting it manually or through a Group Policy setting.
To set it manually enter the following command:
winrm set winrm/config/client/auth '@{CredSSP="true"}'
To set it through a Group Policy, follow these steps:
a. Enter the following command in a Command Prompt window to open the Group Policy dialog:
gpedit.msc
b. Navigate to Computer Configuration\Administrative Templates\Windows Components\Windows Remote Management
(WinRM)\WinRM Client.
c. Double-click on the Allow CredSSP authentication policy in the right pane to open its configuration dialog.
d. Edit the policy as necessary.
2. Enable CredSSP on the WinRM service, either by setting it manually or through a Group Policy setting.
To set it manually, enter the following command:
winrm set winrm/config/service/auth '@{CredSSP="true"}'
To set it through a Group Policy, follow these steps:
a. Enter the following command in a Command Prompt window to open the Group Policy dialog:
gpedit.msc
b. Navigate to Computer Configuration\Administrative Templates\Windows Components\Windows Remote Management
(WinRM)\WinRM Service.
c. Double-click on the Allow CredSSP authentication policy in the right pane to open its configuration dialog.
d. Edit the policy as necessary.
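The manual route in steps 1 and 2 boils down to two winrm commands. The sketch below only emits them as a checklist (run the first on the WinRM client system and the second on the WinRM service host); per the note above, the single quotation marks are omitted here:

```shell
# The two CredSSP commands from steps 1 and 2, emitted as text.
# Run the first on the WinRM client, the second on the WinRM service host.
client_cmd='winrm set winrm/config/client/auth @{CredSSP="true"}'
service_cmd='winrm set winrm/config/service/auth @{CredSSP="true"}'
printf '%s\n%s\n' "$client_cmd" "$service_cmd"
```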
Revision History
Requirements
Prerequisites
Hardware Requirements
Software Requirements
Considerations
Installation Considerations
General Use Considerations
Known Issues and Workarounds
Auto-Discovery Time for Large Implementations
Deploying the Probe on a 64-bit Linux Machine
Increasing Probe Heap Space for Large Implementations
xendesktop Probe is Not Collecting Data
xendesktop Probe Delays in Fetching Site Data
xendesktop Probe Not Fetching HealthInfo Metrics even when WinRM Enabled
xendesktop Probe Out of Memory Errors
xendesktop Probe Known Issues
Revision History
Version
Description
State
Date
3.03
What's New:
Beta
May 2015
Added support for metrics from v1 ODATA APIs for Citrix Xendesktop 7.6.
3.02
Fixed Defects:
Beta
April 2015
Beta
March
2015
What's New:
Version 3.0 of the probe supports monitoring of both Citrix XenApp v7.5 and Citrix XenDesktop v7.5 and is compatible with
only v8.0 and v8.1 of CA Unified Infrastructure Management.
Added the ability to look up the localhost for automated deployments.
Added monitoring capability for the following components:
Applications
Load Indexes
Added new metrics (collected through ODATA APIs) to the following resources:
Administrator
Broker Session
Catalog
Desktop
Desktop Group
Load Index
Machine
Hypervisor
Added the ability to query metrics for:
one specified Citrix server (controller) OR a list of specified controllers
one or more specified site metric groups for one or more controllers
Removed ICA session metrics that were available in prior releases.
For more information about specific metrics, see the Metrics page.
2.10
GA
July 2014
2.00
GA
June 2014
2.00
Beta
September
2013
Beta
December
2012
1.00
Requirements
This section describes the requirements for the probe.
Prerequisites
This section describes the prerequisites for the probe:
Kerberos Authentication
XenDesktop version 7.5.
Windows PowerShell command-line interface on the XenDesktop Delivery Controller (DDC) server. The latest version of PowerShell that
comes with Windows Server 2008 R2 or later is required. For information about configuring PowerShell for use with the XenDesktop
Monitoring probe, see the XenDesktop Monitoring Guide.
Windows Remote Management (WinRM) enabled on the DDC server. For instructions on enabling WinRM, see the xendesktop Guide.
Hardware Requirements
The probe should be installed on a system with the following minimum resources:
Memory: 2-4 GB of RAM
CPU: 3 GHz dual-core processor, 32-bit or 64-bit
Software Requirements
The probe requires the following software environment:
v3.0 requires CA Unified Infrastructure Management v8.0 or v8.1
Earlier versions require:
CA Unified Infrastructure Management Server 5.1.1 or later
CA Unified Infrastructure Management Robot 5.23 or later
Java Virtual Machine 1.6 or later (typically installed with UIM server 5.0 and later)
Infrastructure Manager 4.02 or later
Microsoft .NET Framework 3.5 on the system where the Infrastructure Manager application is running
Important! On 64-bit Linux systems, the Java JRE included in the probe package does not install successfully when you deploy the
probe on a robot. You must manually install the glibc.i686 library or compatible 32-bit libraries on 64-bit Linux systems where you
deploy the probe.
Considerations
This section describes important information to know about the probe.
Installation Considerations
Installation follows the standard probe distribution process.
If you see the StackOverflowException message on the DDC server added as a resource or alternate resource, open a command prompt and
enter the following command:
winrm set winrm/config/winrs '@{MaxMemoryPerShellMB="nnnn"}'
where nnnn is a number greater than 1024.
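The command above can be built with a concrete quota; the sketch below only constructs the command string, using 2048 as a hypothetical value (any nnnn greater than 1024 works):

```shell
# Build the winrm command with a concrete memory quota.
# 2048 is a hypothetical value; any nnnn > 1024 satisfies the requirement.
mem_mb=2048
cmd="winrm set winrm/config/winrs '@{MaxMemoryPerShellMB=\"$mem_mb\"}'"
printf '%s\n' "$cmd"
```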
Solution 3:
Note: This solution is only valid when Kerberos Authentication is used in the probe. Kerberos Authentication is required for v3.0 and later.
Note: In the probe log, if you get the Kerberos error, "Clock Skew too great while getting initial ticket," there is a time difference between
the XenDesktop server and the probe machine. Verify that both the XenDesktop server and probe machines are in the same time zone,
and the time difference between the machines is not more than 5 minutes.
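The five-minute limit described above can be sketched as a quick check. The epoch timestamps below are hypothetical; on real hosts you would compare the probe machine's clock with the XenDesktop server's:

```shell
# Kerberos rejects tickets when clocks differ by more than 300 seconds.
# Hypothetical epoch timestamps for the probe machine and XenDesktop server.
probe_time=1700000000
server_time=1700000400   # 400 seconds ahead in this example
skew=$(( probe_time > server_time ? probe_time - server_time : server_time - probe_time ))
if [ "$skew" -le 300 ]; then
  echo "clock skew OK (${skew}s)"
else
  echo "clock skew too great (${skew}s)"
fi
```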
xendesktop Probe Not Fetching HealthInfo Metrics even when WinRM Enabled
Problem:
The probe will not fetch HealthInfo metrics even though WinRM is enabled.
Solution:
1. On the server from which you want the probe to fetch the HealthInfo metrics, at the Windows prompt, run the following command:
If you have more than 3000 virtual desktops for a DDC server (represented as a resource in the probe), you may need to increase the heap space
for the probe. For instructions, see Increasing the Heap Space for the Probe.
Revision History
Requirements
Prerequisites
Hardware Requirements
Software Requirements
Considerations
Guest Disk Usage Metrics Are Static
Installation Considerations
Known Issues and Workarounds
QoS Collisions
The CA Unified Infrastructure Management XenServer Monitoring (xenserver) monitoring probe handles all common monitoring and data
collection tasks for Citrix XenServer systems. The probe collects and stores data and information from the monitored systems at customizable
intervals. You can easily define alarms to be raised and propagated to the CA Unified Infrastructure Management Alarm Console when specified
thresholds are breached.
Contents
Revision History
Requirements
Prerequisites
Hardware Requirements
Software Requirements
Considerations
Guest Disk Usage Metrics Are Static
Installation Considerations
Known Issues and Workarounds
QoS Collisions
Revision History
This section describes the history of the revisions for the probe.
Version
Description
State
Date
What's New:
GA
November
2015
2.30
Added support for Admin Console.
Added support for multiple resources to the same host.
Enhanced metric processing performance.
Added a metric for the HOST_CPU_GROUP: Host CPU Average Utilization. Salesforce case 00037522
Added the ability to send QoS values for the Virtual Machine State. Salesforce case 00071197
Fixed Defects:
Fixed a defect in which the probe puts a host prefix on the source field for hypervisors. Salesforce case 00071197
Fixed a defect in which the probe created monitors for devices that were not configured to have them. Salesforce case
70003268
Fixed a defect in which session connections were not closed at the end of a session. Salesforce case 0000131675
Fixed a defect in which the Unified Dashboards for the probe were not correctly populating with data.
Note: After you upgrade to v2.30, reapply the UMP templates for the Unified Dashboards to populate correctly.
Fixed a defect in which the Virtual Machine State metric for the VM and Host Template had a QOS value of "Default."
The correct possible values are: -1 (Unknown), 0 (Unrecognized), 1 (Halted), 2 (Paused), 3 (Running), 4 (Suspended).
2.03
Added ability to configure the probe using the Admin Console or Snap UI.
GA
December
2013
2.01
Fixed the master failover issue. Improved CPU metric collection. Improved performance. Added metrics.
GA
August
2013
1.22
GA
January
2013
1.21
GA
September
12 2012
1.10
Commercial Release
GA
September
30 2012
Requirements
This section contains the requirements for the probe.
Prerequisites
The probe requires:
An account with access to the XenServer pool master (if monitoring a pool) or host
A collection of metrics that are enabled on each XenServer host
Typically metrics are enabled by default. However, metrics are not enabled on some versions of XenServer. See the Citrix XenServer
documentation for information about your version. Most metrics are obtained using the Citrix XenServer RRD, and the rest are obtained
using the xenapi. Some versions of XenServer do not collect CPU utilization metrics by default. If this data is not being collected, enable
the collection of CPU utilization metrics (and the related probe checkpoints). Then restart the system. To enable the collection of CPU
utilization metrics, run the following command on the XenServer system or a remote xe client:
Substitute your host uuid for <HOST_UUID>. The xe host-list command can be used to find the uuid of the host.
Installation of XenServer tools on each VM:
We recommend creating XenCenter templates that have the XenServer tools. The probe collects some metrics without the XenServer
tools on each VM. However, some metrics, such as guest memory usage, can only be obtained with the XenServer tools.
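The uuid lookup mentioned in the prerequisites (the xe host-list command) can be scripted. The sample output below is a hypothetical sketch of the xe host-list format; on a real XenServer, pipe the actual output of `xe host-list` in instead:

```shell
# Extract the host uuid from `xe host-list` output so it can be substituted
# for <HOST_UUID>. The sample text and uuid below are hypothetical.
sample_output='uuid ( RO)                : 3f6a0000-0000-0000-0000-000000000001
    name-label ( RW): xenserver-host1
    name-description ( RO): Default install'
host_uuid=$(printf '%s\n' "$sample_output" | awk -F': ' '/^uuid/ {print $2; exit}')
printf '%s\n' "$host_uuid"
```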
Hardware Requirements
Deploy the probe only on robots with the following minimum resources:
Note: We recommend that you deploy storage and virtualization probes on robots, not on primary hubs.
Software Requirements
The probe requires the following software environment:
XenServer versions up to v6.5, service pack 1
CA Unified Infrastructure Management v8.2 or later
CA Unified Infrastructure Management Monitor Server 5.1.1 or later
CA Unified Infrastructure Management Robot 5.23 or later
Java Virtual Machine 1.7 or later (typically installed with CA Unified Infrastructure Management server 5.0 and later)
Microsoft .NET Framework 3.5.x on the host where Infrastructure Manager is running
Considerations
This section lists important information about the probe.
Installation Considerations
Install the package into your local archive.
Drop the package from your local archive onto the targeted robot.
As part of the distribution to the target robot, the CA Unified Infrastructure Management JRE package is included for the probe to use.
The master failover issue is fixed in this release of the probe. However, when you configure the resource pool for the first time, verify that you
connect to the correct master.
The first time that you apply a configuration with a new resource, the probe can take several minutes to start.
Key       Value
asynch    no
hasmax    100
isbool    no
unit      percent
5. Click Apply.
You added the missing fields.
Symptom:
I installed the vmware probe on my CA UIM system on which the xenserver probe was already installed. Now I observe: 1) data is missing for
my memory monitoring, and 2) the data_engine log has errors for QOS_MEMORY_PERC_USAGE.
Solution:
If you install the vmware probe on a CA UIM installation, on which the xenserver probe is installed, QoS collisions occur. Collisions can
occur because both probes have a QOS_MEMORY_PERC_USAGE monitor but the vmware probe has extra fields for the monitor. To fix the
problem, delete these extra fields from the vmware monitor configuration.
Follow these steps:
1. In Admin Console, select the vmware probe.
2. Click Raw Configure.
3. Click QOS_MEMORY_PERC_USAGE.
4. Click Remove key and remove each of the following key/value pairs:
Key       Value
asynch    no
hasmax    100
isbool    no
unit      percent
5. Click Apply.
You removed the extra fields.
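The same cleanup can be expressed against a text export of the raw configuration. The fragment below is a hypothetical illustration of the key = value layout; in practice, use Raw Configure as described in the steps above:

```shell
# Strip the four extra keys (asynch, hasmax, isbool, unit) from a
# QOS_MEMORY_PERC_USAGE fragment. Hypothetical layout for illustration only.
fragment='<QOS_MEMORY_PERC_USAGE>
   asynch = no
   hasmax = 100
   isbool = no
   unit = percent
   qos = QOS_MEMORY_PERC_USAGE
</QOS_MEMORY_PERC_USAGE>'
cleaned=$(printf '%s\n' "$fragment" | grep -vE '^[[:space:]]*(asynch|hasmax|isbool|unit) =')
printf '%s\n' "$cleaned"
```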
Description
Modified the code for the output field and field names.
State
Date
GA
Dec 2010
GA
Nov 2010
GA
Sep 2010
GA
Jun 2010
1.11
1.0
Initial Version
Requirements
The xmlparser probe has no additional software or hardware requirements.
Revision History
Probe Specific Software Requirements
Installation Considerations
Known Issues
Fixed Issues
Revision History
This section describes the history of the revisions for the zdataservice probe.
Version
Description
State
Date
1.11
This version fixes an issue that was recently detected. For more information, see Fixed Issues.
GA
December
2015
1.10
Updated Probe Specific Software Requirements for the z/VM Data Collector in the zdataservice (Data Service Probe)
Release Notes.
The z/VM Data Collector now includes support for the following operating systems:
GA
October 2015
GA
August 2015
Initial version
To implement the probe, verify that your system meets the following minimum system requirements:
z/VM Data Collector
Required privilege classes: G
Memory:
Red Hat: 256 MB
SUSE: 1.5 GB
Free disk space: 220 MB (70 MB for installation, 150 MB of free disk space for logging)
Supported Operating Systems
Red Hat Enterprise Linux 6.5, 6.6, or 7.0
SUSE Linux Enterprise Server 11 SP3 or 12
IBM Java Runtime Environment:
Java SE Version 7 SR1 FP1, or
Java SE Version 8 SR1
CIM Data Collector
z/OS 1.13 with CIM version 2.11.2
z/OS 2.1 with CIM version 2.12.1
Supported Operating Systems for the Data Service Probe
Windows Server 2012 (Memory: 800 MB)
Windows Server 2008 R2 SP1 (Memory: 800 MB)
Red Hat Enterprise Linux 6.6
CA Unified Infrastructure Management Server
CA Unified Infrastructure Management server version 8.2 or later
Installation Considerations
The following considerations affect the zops (zMonitoring) Probe, the zstorage (zStorage Monitoring) Probe, and the zvm (zvm Monitoring) Probe.
Verify that the following tasks are complete before you install any of the aforementioned probes.
The CIM server is configured.
The zdataservice (Data Service) probe is deployed and connected to CIM Server.
The following packs are installed:
ci_defn_pack is installed on the robot hosting the nis_server infrastructure probe.
mps_language_pack is installed on the robot hosting service_host.
wasp_language_pack is installed on the robot hosting WASP.
For more information, see the following articles:
zops (zops Monitoring) Release Notes
zstorage (zstorage Monitoring) Release Notes
zvm (zvm Monitoring) Release Notes
Known Issues
The following issues are known to exist in this release of the probe:
Under various conditions, CIM Server reports the file system size for network and hierarchical file systems incorrectly. This behavior
causes the Percentage Utilized metric to equal 100% when the Available Space metric is greater than zero. IBM released an APAR to
address this problem; to correct this behavior, apply that APAR.
Under various conditions, CIM Server reports empty Names for Address Spaces and the incorrect type of the address space.
If the probe queries the CIM server when it is in the process of starting, the CIM server can shut down unexpectedly. As a best practice,
we recommend that you start the CIM server before you configure the probe to collect data.
When you add and then immediately delete rows in the CIM Server Connections or z/VM Server Data Collector Connections table on the
zdataservice Probe Configuration window, and then save the configuration, the deleted rows may not be deleted, or some existing rows
may be replicated. To avoid this situation, always reopen the configuration window, delete the rows, and then click Save.
Fixed Issues
The following issue was fixed in version 1.11 of the probe:
Excessive, unclosed TCP/IP connections to the z/VM data collector host, the LPAR where CIM server is running, or both may cause the
Data Service server to stop responding. This behavior can prevent other products from connecting to the TCP/IP service on z/VM and
z/OS systems. To correct this behavior, download and upgrade to the 1.11 version of the Data Service Probe.
Revision History
Software Requirements
Revision History
This section describes the history of the revisions for this probe.
Version
Description
State
Date
1.31
Fixed Defects:
GA
February
2015
GA
December
2012
GA
May 2012
Corrected an issue where the probe version was incorrectly displayed. The update version message is now correct. Salesforce case 00104805
1.30
What's New:
It is no longer necessary to use SSH to connect to the zones host.
Added default templates.
1.23
The probe can now use public/private key-pair authentication for SSH login to the zones host. Enhanced the Help documentation.
1.20
GA
April 2010
1.11
GA
April 2010
1.10
Commercial Release
GA
September
2009
Added the choice to gather information from non-global zones using a direct SSH connection or using zlogin from the
global zone.
Replaced run queue length with vmstat pages scanned.
QoS for Resource Controls is not enabled by default.
Software Requirements
CA Unified Infrastructure Management Robot 3.02 or newer
Java 1.5 or newer
Sun Solaris 10 (release level - Intel x86: 08/07 & Sparc: 11/06) with zones pre-configured.
One of the following:
SSH access to the Solaris Global zone
SSH access to the individual zones
A robot installed on the Solaris Global zone
Revision History
Probe Specific Software Requirements
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for the zops probe.
Version
Description
State
Date
1.00
Initial version
GA
August 2015
UIM Server
CA Unified Infrastructure Management server version 8.2 or later
Installation Considerations
Ensure the following before you install the zops probe.
The CIM Server is configured.
The zdataservice (Data Service) probe is deployed and connected to CIM Server.
Note: For details about CIM Server configuration and Data Service deployment, refer to zdataservice (Data Service) Probe.
Known Issues
The following issue exists in version v1.0 of the probe:
When working with the tree view in the USM portlet, some alarms related to the resource inventory may not appear. You can, however,
display all alarms in the Integrated Alarm view when you click the Alarm View icon in USM.
Note: See zdataservice (Data Service Probe) Release Notes for a complete list of the issues that affect the Data Service Probe.
Revision History
Probe Specific Software Requirements
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for the zstorage probe.
Version
Description
State
Date
1.00
Initial version
GA
August 2015
To implement zstorage v1.0, verify that your primary hub meets the following minimum requirements.
Operating System
Windows Server 2008 R2 SP1 or Windows Server 2012 R2
Or
Red Hat Enterprise Linux 6.6 (RHEL 6.6) on 64-bit x86 systems
UIM Server
CA Unified Infrastructure Management server version 8.2 or later.
Installation Considerations
Verify the following requirements before you install the zstorage probe.
The CIM Server is configured.
The zdataservice (Data Service) probe is deployed and connected to CIM Server.
Note: For details about CIM Server configuration and Data Service deployment, refer to zdataservice (Data Service) Probe.
Known Issues
The following issue exists in this release of the probe:
When working with the tree view in the USM portlet, some alarms related to the resource inventory may not appear. You can, however,
display all alarms in the Integrated Alarm view when you click the Alarm View icon in USM.
Note: See the zdataservice (Data Service Probe) Release Notes for a complete list of the issues that affect the Data Service Probe.
Revision History
Probe Specific Software Requirements
Installation Considerations
Known Issues
Revision History
This section describes the history of the revisions for the zvm probe.
Version
Description
State
Date
1.10
Revised the default Port on the Add New Profile dialog. The default is now 7161, which corresponds to the default port for the
zdataservice (Data Service) probe.
GA
October
2015
GA
August
2015
Initial version
Installation Considerations
Note: For details about z/VM Data Collector configuration and Data Service deployment, refer to zdataservice (Data Service)
Probe.
Known Issues
The following issue exists in this release of the zvm probe:
When working with the tree view in the USM portlet, some alarms related to the resource inventory may not display. You can display all
alarms in the Integrated Alarm view when you click the Alarm View icon in USM.
Note: See the zdataservice (Data Service Probe) Release Notes for a complete list of the issues that affect the Data Service Probe.
Configure ace
This probe has no configuration GUI and requires little to no user interaction, so there are no user instructions for this probe.
Contents
Verify Prerequisites
Configure Replication Objects Monitoring
Configure Response Time Monitoring
Verify Prerequisites
Verify that the required hardware and software is available before you configure the probe. For more information, see ad_response (Active
Directory Response Monitoring) Release Notes.
Note: The Test Write option is enabled only if you select the Write mode option.
12. In the Counter Name section, specify the following values to generate alarms and thresholds for a counter.
Enable to publish alarms for the counter.
Enable or disable the defined threshold for the counter.
Define the threshold value and severity for the selected counter, and click Save.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe version installed at the hub level. The alarm threshold
settings allow the probe to send alarms when a threshold is breached.
ad_response Node
Active Directory Node
Replication
<Profile Name> Node
<Server Name>_<Profile Name> Node
<Profile Name> Node
Response
<Profile Name> Node
<Server Name>_<Profile Name> Node
<Profile Name> Node
Search
<Profile Name> Node
<Server Name>_<Profile Name> Node
<Profile Name> Node
ad_response Node
This node lets you view the probe information and configure the log level information of the probe.
Navigation: ad_response
ad_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ad_response > Log Level Configuration
This section lets you configure the log level of the probe.
Log level: Specifies the detail level of the log file. Log as little as possible during normal operation to minimize disk consumption, and
increase the amount of detail when debugging.
Default: 0-Normal
ad_response > Message Properties
This section is read-only and displays a list of alarm messages available in the probe.
Message: Indicates the variables that are used in the message.
SubSystem: Indicates the alarm subsystem. All profiles using the same connection use this ID as their alarm subsystem.
Replication
This node lets you add a profile to the Replication node. The added profile monitors replication of modified objects between servers, sites, and
domains.
Navigation: ad_response > Active Directory > Replication
Replication > Options (icon) > Add Profile
This section lets you add a profile in the Replication node.
Profile Name: Defines the name of the profile to be added in the Replication node.
Server Address / Domain Name: Defines the AD server address or the domain name that the profile monitors.
User: Defines the user name that the profile uses to connect to the AD server.
Password: Defines the password of the user name to connect to the AD server.
Domain: Defines the domain name with which the user connects to the AD server.
Default: Deselected
Server Address / Domain Name: Defines the AD server address or the domain name that the profile monitors. Specify the AD server
address of the profile if the Bind to Specific Server option is selected. Otherwise, specify the fully qualified domain name of the profile.
User: Defines the user name that the profile uses to connect to the AD server.
Password: Defines the password of the user name to connect to the AD server.
Domain: Defines the domain name with which the profile connects to the AD server.
Actions > Test Connection: Tests the connection and displays the response time of connecting to the server.
profile name > Object
This section lets you define the object the profile monitors and is available only for the Replication profiles.
Container: Specifies the container in which the profile looks for the specified object and attribute.
Object: Defines the object, which the profile searches.
Attribute: Defines the attribute the profile monitors.
Write Mode: Provides the write access to the selected object and attribute.
Actions > Test Read: Checks the existence of the specified object and attribute. The function also checks the age of the attribute
since the last modification.
Actions > Test Write: Writes the current time to the selected attribute of the object.
Note: Test Write is enabled only if you select the Write Mode check box.
Response
This node lets you add a profile to the Response node to calculate and monitor the response time and the connect time of the server.
Navigation: ad_response > Active Directory > Response
Response > Options (icon) > Add Profile
This section lets you add a profile in the Response node.
Note: The field descriptions are same as described in the Add Profile section in the Replication node.
The <Server Name>_<Profile Name> node is used to identify the server that the probe monitors, and does not contain any field or section.
Note: The field descriptions for General, Connection and Counter Name sections are same as described in the <Profile Name>
Node section in the Replication node.
Search
This node lets you create a profile, which calculates the search time and number of objects found.
Navigation: ad_response > Active Directory > Search
Response > Options (icon) > Add Profile
This section lets you add a profile in the Search node.
Note: The field descriptions are same as described in the Add Profile section in the Replication node.
Navigation: ad_response > Active Directory > Search > Profile Name > Server Name_Profile Name > Profile Name
profile name > General
This section lets you configure the general properties of the profile.
profile name > Connection
This section lets you configure the connection details for the profile to connect to the AD server.
profile name > Query
This section lets you define the search query, which the profile executes for monitoring and is available only for Search profiles.
Search Root Container: Specifies the root container in which the profile searches for objects.
Include Subcontainers: Executes the query in the subcontainers under the root container.
Filter: Defines an LDAP query for the search.
Examples:
User id starting with "e": (sAMAccountName=e*)
User id NOT starting with "e": (!sAMAccountName=e*)
Last name starting with "a" and first name starting with "b": (&(sn=a*)(givenName=b*))
Last name starting with "a" or "b": (|(sn=a*)(sn=b*))
Actions > Test Query: Tests the query by performing a search in the root container. If the objects are found, the response time and
number of records are displayed.
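The filter examples above compose mechanically from smaller clauses. A small sketch with hypothetical helper functions (these helpers are for illustration only and are not part of the probe):

```shell
# Compose LDAP search filters like the examples above.
# ldap_and/ldap_or/ldap_not are hypothetical helpers for illustration.
ldap_and() { printf '(&%s%s)' "$1" "$2"; }
ldap_or()  { printf '(|%s%s)' "$1" "$2"; }
ldap_not() { printf '(!%s)' "$1"; }

ldap_and '(sn=a*)' '(givenName=b*)'; echo   # last name a*, first name b*
ldap_or  '(sn=a*)' '(sn=b*)'; echo          # last name a* or b*
ldap_not '(sAMAccountName=e*)'; echo        # user id NOT starting with e
```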
profile name > Counter Name
This section lists the counters available for the object the profile monitors.
Note: The field descriptions for General, Connection and Counter Name sections are same as described in the <Profile Name>
Node section in the Replication node.
Contents
Verify Prerequisites
Configure Replication Objects Monitoring
Configure Response Time Monitoring
Configure Object Monitoring
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ad_response (Active Directory
Response Monitoring) Release Notes.
7. Click the Counters tab that lists the counters available for the object being monitored by the profile.
Select a counter in the list and click Add.
The Response time threshold dialog appears.
Define the threshold value and severity for the selected counter, and click OK.
8. Define the following field values in the Object tab.
Container: The container where the profile searches for the specified object and attribute.
Object: The object that the profile searches. Example: a user in the AD server, such as an administrator.
Attribute: The attribute that the profile monitors. Example: the department of an administrator user.
9. Select the required options for the following fields:
Write mode option to enable the profile to write to the selected attribute.
Test Read to verify if the object and attribute exists.
The profile checks and displays a message indicating whether the object and attribute exist. The message also displays the age of the
attribute since the last modification.
Test Write to add current time to the selected attribute of the object.
Note: The Test Write option is enabled only if you select the Write mode option.
configuration tool for the probe, containing two window panes, a menu bar and a status bar. This article describes the fields and features of the
ad_response probe.
Contents
Probe GUI
Profile Properties
General tab
Connection Tab
Object Tab
Counters Tab
Query Tab
Probe GUI
Menu bar
The menu bar provides options to activate, deactivate, restart and exit the probe. You can also click Probe > Options to define the Log level.
Log level
Specifies the detail level of the log file. Log as little as possible during normal operation to minimize disk consumption, and increase the
amount of detail when debugging.
Default: Normal (0)
The probe also lets you manage the alarm messages available in the probe. Click Tools > Message pool manager to display Message Pool
Editor. The Editor allows you to create a new message, edit or delete an existing message.
Name: Indicates the message name, which is specified when you define thresholds in the probe.
Text: Indicates the variables that are used in the message.
Subsystem: Indicates the alarm subsystem. All profiles using the same connection use this ID as their alarm subsystem.
Left pane
The left pane contains a node called Active Directory, which further contains three groups. These groups are used to place functional profiles
together.
Replication
Response
Search
Right pane
The contents of the right pane depend on your selection in the left pane. The pane lists all the profiles of the group selected in the left pane.
The following information is displayed about a profile:
The name of the profile
The server address (the fully qualified domain name)
If the profile is enabled or not
The poll interval (how often the profile collects data)
A short description of the profile
If you select the Active Directory node in the left pane, all the profiles are listed.
Icons
The icon next to the profile name in the list indicates the profile status. When you restart the probe, no icon is visible during the short initialization
period. After a few seconds, the following icons appear:
indicates an active profile, but no data is yet sampled. The sample period is defined in the profile properties.
indicates an inactive profile.
After the first sample period, one of the following icons should appear. Apart from green, other colors indicate the severity level of the threshold
breach.
Green means OK (Clear)
Information
Warning
Minor
Major
Critical
Profile Properties
Select one of the folders in the left pane, right-click in the right pane and select New to open the Properties dialog for a new profile. The
Properties dialog lets you specify the profile parameters.
General tab
The General tab lets you define the general properties of the profile.
The fields in the tab are explained as follows:
Name
A descriptive name of the profile.
Description
A short description of the profile to identify what you are monitoring.
Sample data every
Specifies how often the profile collects data. You can specify the interval in seconds, minutes, hours or days.
Default: 1 minute
Enabled (runs at specified interval)
Select the check box to activate the profile. If activated, the profile samples the data at the specified poll interval.
Default: Deselected
Connection Tab
The Connection tab lets you define the connection properties for the profile.
The fields in the tab are explained as follows:
Options
Type:
Specifies the connection type used by the profile to connect to the AD server. The available options are: LDAP and Global Catalog.
Default: LDAP
Use secure connection
Lets you use a secure connection ensuring that all communications are encrypted when connecting to the AD server.
Default: Selected
Bind to specific server
Binds the connection to the AD server specified in the Server address field. If the defined server is unavailable, the connection fails. If
you deselect the Bind to specific server option, the Domain name field replaces the Server address field. The connection is successful
if at least one domain controller (LDAP) or Global Catalog is available on the specified domain.
Default: Selected
Server address: (Fully qualified domain name)
Defines the AD server address or the fully qualified domain name that the profile connects to for monitoring objects.
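The Type and Use secure connection options together determine which well-known Active Directory port the probe connects to. As a hedged illustration (the helper function is our own; the port numbers are the standard AD defaults):

```python
# Sketch: map the Type and secure-connection options to the standard
# Active Directory ports and build an LDAP URL. The helper is hypothetical;
# the port numbers are the well-known defaults.
PORTS = {
    ("LDAP", False): 389,             # plain LDAP
    ("LDAP", True): 636,              # LDAP over SSL
    ("Global Catalog", False): 3268,  # Global Catalog
    ("Global Catalog", True): 3269,   # Global Catalog over SSL
}

def ldap_url(server, conn_type="LDAP", secure=True):
    scheme = "ldaps" if secure else "ldap"
    return "%s://%s:%d" % (scheme, server, PORTS[(conn_type, secure)])
```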
Authentication
User:
Defines the user name that the profile uses to connect to the AD server.
Password:
Defines the password of the user name to connect to the AD server.
Domain:
Defines the domain name with which the user connects to the AD server.
Test Connection
Tests the connection and displays the response time of connecting to the server.
Object Tab
The Object tab lets you define the object and attribute that the profile monitors. This tab is only applicable for the Replication profiles.
The fields in the tab are explained as follows:
Container
Specifies the container in which the profile looks for the object and attribute.
Object
Defines the object that the profile searches for.
Attribute
Defines the attribute that the profile monitors.
Write mode
Provides the write access to the selected object and attribute.
Default: Deselected
Test Write
Writes the current time to the selected attribute of the object.
Test Read
Checks the existence of the specified object and attribute. The function also checks the age of the attribute since the last
modification.
Note: The Test Write field is enabled only if you enable the Write mode option.
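The Test Write / Test Read pair can be pictured as a timestamp round trip: Test Write stores the current time in the attribute, and Test Read computes its age on a later read. The sketch below is illustrative and makes no real LDAP connection; the helper names are our own:

```python
# Sketch of the Test Write / Test Read round trip. No real LDAP connection
# is made; the helpers are illustrative.
from datetime import datetime, timezone

def write_timestamp():
    """Test Write: the value the probe would store in the attribute."""
    return datetime.now(timezone.utc).isoformat()

def attribute_age_seconds(stored):
    """Test Read: seconds elapsed since the attribute was last written."""
    written = datetime.fromisoformat(stored)
    return (datetime.now(timezone.utc) - written).total_seconds()
```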
Counters Tab
The Counters tab lists the counters available for the object that the profile monitors.
The fields in the tab are explained as follows:
Counters
Lists the counters available for the object.
Thresholds
Define one or more thresholds for the selected counter.
Select a counter in the list and click the Add button to set or modify the threshold value and severity for the selected counter.
Add
Lets you define a threshold value and associated severity level for the selected counter. You can define several thresholds and
also define the alarm messages.
Remove
Removes the selected threshold. This button is enabled only when you select a threshold in the list.
Send QoS
Sends QoS data on the counters defined in the profile.
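Threshold evaluation as described above can be sketched as follows. The severity ordering and the operator set are assumptions for illustration, not the probe's exact implementation:

```python
# Sketch: evaluate a sampled counter value against the thresholds configured
# on the Counters tab. Severity ordering and operators are assumptions.
SEVERITY_ORDER = ["information", "warning", "minor", "major", "critical"]

OPS = {
    ">": lambda v, limit: v > limit,
    ">=": lambda v, limit: v >= limit,
    "<": lambda v, limit: v < limit,
    "=": lambda v, limit: v == limit,
}

def evaluate(value, thresholds):
    """thresholds: list of (operator, limit, severity) tuples.
    Return the most severe breached level, or None (Clear)."""
    breached = [sev for op, limit, sev in thresholds if OPS[op](value, limit)]
    if not breached:
        return None
    return max(breached, key=SEVERITY_ORDER.index)
```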
Query Tab
The Query tab lets you define the search query that the profile executes for monitoring. This tab is available only for Search profiles.
The fields in the tab are explained as follows:
Search root container:
Specifies the container in which the query searches for objects.
Include subcontainers
Executes a query in the subcontainers under the root container.
Default: Deselected
Filter:
Defines an LDAP query for the search.
Examples:
User id starting with "e": (sAMAccountName=e*)
User id NOT starting with "e": (!(sAMAccountName=e*))
Last name starting with "a" and first name starting with "b": (&(sn=a*)(givenName=b*))
Last name starting with "a" or "b": (|(sn=a*)(sn=b*))
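The filter examples follow RFC 4515 filter syntax. Small helpers like these (the function names are our own, for illustration) can compose such filters programmatically:

```python
# Helpers (hypothetical names) to compose RFC 4515-style LDAP filter
# strings like the examples above.
def f_eq(attr, pattern):
    return "(%s=%s)" % (attr, pattern)   # equality / wildcard match

def f_not(flt):
    return "(!%s)" % flt                 # negation

def f_and(*flts):
    return "(&" + "".join(flts) + ")"    # conjunction

def f_or(*flts):
    return "(|" + "".join(flts) + ")"    # disjunction
```

For example, `f_and(f_eq("sn", "a*"), f_eq("givenName", "b*"))` yields `(&(sn=a*)(givenName=b*))`, matching the third example above.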
Test Query
Tests the query by performing a search in the root container. If the objects are found, the response time and number of records are
displayed.
ad_response Metrics
The following table describes the checkpoint metrics that can be configured using the ad_response probe.
Monitor Name | Units | Description | Version
QOS_AD_CONNECT_RESPONSE | Milliseconds | Measures the response time of connecting to the Active Directory (AD) server. | 1.0
QOS_AD_REPLICATION_AGE | Seconds | | 1.0
QOS_AD_SEARCH_OBJECTS | Count | | 1.0
QOS_AD_SEARCH_RESPONSE | Milliseconds | Measures the response time of the search query performed on the Active Directory (AD) server. | 1.0
Note: The probe does not support the WMI Reference datatype.
Health Monitor: monitors the status and response time for all the objects. For example, it monitors the Operations Master Schema, fetches
the number of lost and found objects, and fetches the AD replication partner synchronization status.
Each of these groups can have more than one monitoring profile. The active directory server probe is delivered with a default configuration of a
selected set of profiles to be monitored. You can also define your own profiles containing more than one counter.
More information:
ad_server (Active Directory Server Monitoring) Release Notes
Note: The active directory server probe is a local probe, which monitors the AD server of the host system only.
The following diagram outlines the process to configure the probe to monitor Active Directory servers.
Contents
Verify Prerequisites
Create Monitoring Profile(s)
Add Monitoring Object Details
Add Monitor(s)
Alarm Thresholds
Precedence for Threshold Alarms
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see ad_server (Active Directory
Server Monitoring) Release Notes.
Note: Select an existing profile and edit the details in the General Configuration section.
Eventlog
Enter the details of the Eventlog to be monitored.
Follow these steps:
1. Specify the log file to fetch the events for monitoring.
2. Enter the computer name where the event has occurred.
3. Define the source or the publisher of the event.
4. Specify the severity of the event.
5. Enter the Windows user account with which the event is generated.
6. Define the event category.
7. Define the unique identification number of the event.
8. Specify the event message text.
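The filtering steps above amount to a field-by-field literal match: only the fields you fill in are checked, and blank fields match everything. A minimal sketch (the function name is hypothetical):

```python
# Sketch: an event-log filter built from the steps above. Only the fields
# you fill in are checked; blank fields match everything. Matching is
# literal, since the probe does not support regex for event-log filtering.
def event_matches(event, criteria):
    """event and criteria are dicts of event-log fields; empty criteria
    values are treated as 'match anything'."""
    return all(event.get(field) == wanted
               for field, wanted in criteria.items() if wanted)
```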
File
Enter the details of the File to be monitored.
Follow these steps:
1. Define the path of the file to be monitored.
File System
Enter the details of the File System to be monitored.
Follow these steps:
1. Define the directory path or file system to be monitored.
2. Enter a regular-expression pattern to search for within the file system.
3. Select Include Subdirectories to search and include subdirectories of the file system.
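A sketch of the file-system search described above, assuming the pattern is matched against file names (the helper is our own, not the probe's code):

```python
# Sketch: match file names against a regular-expression pattern,
# optionally descending into subdirectories.
import os
import re

def find_files(root, pattern, include_subdirs=False):
    rx = re.compile(pattern)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        matches.extend(os.path.join(dirpath, name)
                       for name in filenames if rx.search(name))
        if not include_subdirs:
            break  # stop after the top-level directory
    return matches
```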
Health Monitors
Performance Counters
Enter the details of the Performance Counter to be monitored.
Follow these steps:
1. Specify the performance object of the host to monitor.
2. Select the performance object instance to fetch data.
Process
Enter the details of the Process to be monitored.
Follow these steps:
1. Select the process to be monitored.
Service
Enter the details of the Service to be monitored.
Follow these steps:
1. Select the service to be monitored.
WMI
Enter the details of the WMI to be monitored.
Follow these steps:
1. Select the WMI namespace to be monitored.
2. Specify a class of the selected namespace to be monitored.
3. Enter the instance of the selected class to be monitored (when more than one instance is found).
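The namespace, class, and instance selections above correspond to a WQL query against WMI. The builder below is illustrative; Win32_Service and Dnscache are standard Windows names used purely as examples:

```python
# Sketch (hypothetical helper): translate a WMI class and optional
# instance selection into a WQL query string.
def wql(wmi_class, key=None, instance=None):
    query = "SELECT * FROM %s" % wmi_class
    if key and instance:
        query += " WHERE %s='%s'" % (key, instance)
    return query
```

For example, `wql("Win32_Service", "Name", "Dnscache")` builds the query for one service instance.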
Add Monitor(s)
A counter defines the area monitored by the profile. There is a list of predefined counters for each type of profile. You can add more than one
counter to a profile.
Important! The probe does not support counters that return a datetime value. The probe GUI displays unexpected results when
such a counter is added to the profile.
You can configure only two thresholds in probe version 1.71 or later. However, if more than two thresholds were configured in
the previous version(s), they are also available in the later versions.
Monitors can only be created for Performance Counter and WMI categories.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: The ad_server probe is a local probe, which monitors the Active Directory (AD) server of the host system only.
Important! If the probe is migrated using threshold_migrator, it displays the standard static threshold block instead of the probe-specific
alarm thresholds.
Contents
ad_server Node
Event Logs Node
<Profile Name> Node
File Node
<Profile Name> Node
File System Node
<Profile Name> Node
Health Monitor Node
<Profile Name> Node
Performance Counter Node
<Profile Name> Node
Process Node
<Profile Name> Node
Service Node
<Profile Name> Node
WMI Node
<Profile Name> Node
ad_server Node
The ad_server node lets you view the probe information and configure the log level of the probe.
Navigation: ad_server
Set or modify the following values if needed:
ad_server > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ad_server > General Configuration
This section is used to configure the log level of the probe.
Log Level: specifies the detail level of the log file.
Default: 0 - Fatal
Note: Specify only those fields that are required for filtering the event logs and keep all other fields blank. Do not use an
asterisk (*) in any other field because regex is unsupported for filtering event logs.
File Node
The File node is used for monitoring host system files. Each monitoring profile is displayed under the File node.
Note: The fields of the General and the Monitors sections are the same as those described in the profile name node under the Event
Logs node.
Object Found: calculates the number of objects in the target root folder of the server. For example, use this counter for fetching the AD
lost objects and number of domain controller replication partners.
Status: monitors the object availability. For example, use this counter for fetching status of the lost and found object container and AD
replication partners' synchronization status.
Note: The probe contains a default set of health monitoring profiles for demonstrating counter usage. All monitoring profiles
are deactivated by default; you can activate them manually.
Notes:
The fields of the General and the Monitors sections are the same as those described in the profile name node under the Event Logs
node.
Use the Add Monitor option on the profile name node for adding any monitor to the list.
Severity: specifies the alarm message severity.
Operator: specifies the threshold operator for comparing the actual and threshold value.
Value: defines the counter threshold value for generating alarms.
Message: specifies the alarm message when the threshold value is breached.
Similarly, you can configure the second threshold for the counter in the corresponding Severity 2, Operator, Value,
and Message fields.
Process Node
The Process node lets you monitor the running processes of the host system (for example, notepad.exe).
Note: The fields of the General and the Monitors sections are the same as those described in the profile name node under the Event
Logs node.
Service Node
The Service node lets you monitor running services of the host system (for example, DNScache).
Note: The fields of the General and the Counter: Counter Name sections are the same as those described in the profile name node
under the Event Logs node.
WMI Node
The WMI node is used for creating a monitoring profile for fetching WMI-related data from the host system. The WMI data is used for
consolidating the management of devices and applications in a network from the Windows environment.
Note: Use the Add Monitor option on the profile name node for adding any monitor to the list.
Severity: specifies the alarm message severity.
Operator: specifies the threshold operator for comparing the actual and threshold value.
Value: defines the counter threshold value for generating alarms.
Message: specifies the alarm message when the threshold value is breached.
Similarly, you can configure the second threshold for the counter in the corresponding Severity 2, Operator, Value,
and Message fields.
Note: The active directory server probe is a local probe, which monitors the AD server of the host system only.
The following diagram outlines the process to configure the probe to monitor Active Directory servers.
Content
Verify Prerequisites
Create Monitoring Profile
Adding Counters
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see ad_server (Active Directory
Server Monitoring) Release Notes.
Notes:
When you rename a profile, save the configuration and restart the probe to activate it.
You cannot enter slash (/) or comma (,) in the profile name.
The General and Counters tabs are common to all profiles, while the third tab is specific to the selected monitoring category.
The Counters list is generated only after the name and description have been entered for the profile.
The Description does not support multi-line text.
You cannot copy content into Name and Description fields.
Adding Counters
This tab lists the counters available for the category being monitored by the profile. This tab is valid for all types of profiles.
Follow these steps:
1. Select a Counter from the list.
2. Select Send QoS data to enable QoS.
3. Select Send Alarm to send alarms on the counters defined in the profile.
4. Click Add to specify a threshold value and associated severity level for the selected counter.
5. Select the Value of the counter.
6. Select the Severity of the counter.
7. Select the Message for the counter.
Note: The Message Text and the Subsystem are displayed automatically according to the message selected.
Probe GUI
The Application Window
General Tab
Variable Tab
Probe GUI
The probe is configured by double-clicking the probe in the Infrastructure Manager, which brings up the GUI (configuration tool).
The window consists of two panes, a menu bar at the top and a status bar at the bottom.
Notes:
Select Help > Show help from the menu bar or from the Help button at the bottom right for online help documentation.
Alternatively, you can press the F1 key for quick access.
Click the Apply button to activate any configuration modifications done.
Files
Filesystems
Health Monitor
Performance counters
Processes
Services
WMI
If you select the active directory server node, all default profiles are displayed in the right pane. You can also define new profile(s).
Each profile can contain several counters.
The right pane
This pane lists the profiles. Select the Active Directory Server node in the left pane to open profiles in the right pane.
Select the active directory server node in the left pane to open categorized profiles under groups. The following information can be found in the
list:
The name of the profile
If the profile is enabled or not
The last status message of the profile
The group name for each profile is shown in the list.
Profile Status
The icons indicate the profile status.
Note: The profile status icon appears after a short initialization period.
indicates an active profile, but data is not sampled yet. The sample period is defined in the profile property dialog.
indicates an inactive profile.
After the first sample period, one of the following icons appears.
Green means OK (Clear).
Other colors indicate the severity level for breaching the threshold value of profile:
Information
Warning
Minor
Major
Critical
The Menu bar
The options in the menu bar help in managing the probe.
File
Note: You can also save configuration modifications by simultaneously pressing Ctrl+S on your keyboard.
Exit
Quits the configuration tool. If you have modified your configuration, save the changes before exiting.
Probe
This menu entry contains four options:
Activate
Starts the probe again and the status Started is displayed in the Status bar.
This option is enabled only when the probe is stopped/deactivated.
Deactivate
Stops the probe and the status Stopped is displayed in the Status bar.
This option is enabled only when the probe is activated.
Restart
Stops and then starts the probe.
Options
Sets the log level details of the probe. Keep the log level as low as possible to minimize disk consumption. Increase the log level
temporarily for debugging purposes.
Help
This Help menu contains two options:
Show help
Displays the probe documentation.
You can also press the F1 key or click the Help button at the bottom right.
About
Displays the version of the probe.
General Tab
You can define the general properties for the profile.
This tab is valid for all types of profiles.
The fields in the dialog are explained below:
Name
Defines the name of the profile.
Notes:
Do not enter characters like slash (/) or comma (,) in the profile name.
If you rename a profile, save the configuration and restart the probe to activate it.
Description
Provides a short description of the profile.
This description is not displayed in the list of profiles in the right pane.
Sample data every
Specifies the profile poll interval in seconds, minutes, hours or days.
Startup delay
Defines a delay in seconds, minutes, hours or days, before a profile starts polling.
Enabled (runs at specified interval)
Enables the profile to sample the data at the poll interval specified.
Variable Tab
The variable tab is the third tab in the New Profile window. The name of this tab depends on the selected monitoring category.
The Eventlog tab
Note: Specify only those fields that are required for filtering the event logs and keep all other fields blank. Do not use an asterisk (*)
in any other field because regex is unsupported for filtering event logs.
Description
Indicates a read-only field containing a description of the selected performance object.
Instance
Specifies the instance of an object available on the system.
The field is disabled if the instance is not available.
The Process tab
This tab lets you define the process the profile will monitor.
This tab is valid for Processes profiles only.
The fields in this dialog are explained below:
Process name
Specifies the process that the profile will monitor.
The Service tab
This tab lets you define the service that the profile will monitor.
This tab is valid for Service profiles only.
The fields in the dialog are explained below:
Service
Specifies the service that the profile will monitor.
Display name
Identifies the name of the service.
Description
Indicates a brief description of the selected service.
The WMI tab
This tab lets you define the WMI that the profile will monitor.
This tab is valid for WMI profiles only.
The fields in the dialog are explained below:
Namespace
Specifies the namespace of the WMI to be monitored.
Class
Specifies the class to be monitored.
Instance
Specifies the instance when more than one is found.
The Health Monitor tab
This tab lets you monitor the health of the ad_server.
The fields in the dialog are explained below:
Search Root Criteria
Defines the search root criteria to fetch the Response Time, Objects Found, and Status counter values. Enter replication for a
replication profile; leave the field blank for the PDC profile. For a normal profile, enter the search criteria in the format
DC=demodomain,DC=local.
Note: The probe does not support counters that return a datetime value. The probe displays an error when you try adding a threshold
for an unsupported counter.
Thresholds
Select a counter in the list and click the Add button to set or modify the threshold value and severity for the selected counter.
Note: The probe contains a default set of health monitoring profiles for demonstrating counter usage. All monitoring
profiles are deactivated by default; you can activate them manually.
WMI
As given for the ad_server configuration.
Default monitored WMI counters:
Microsoft_DomainTrustStatus (exists only on Windows 2003 Server), TrustIsOK, false is critical.
MSAD_ReplNeighbor (exists only on Windows 2003 Server), ModifiedNumConsecutiveSyncFailures, > 0 is critical.
Select a message to be issued if the threshold specified is breached. The Text and Subsystem fields are read-only.
Send QoS data
The probe sends QoS data on the counters defined in this profile.
Send Alarm
The probe sends alarms on the counters defined in this profile.
Add
Allows you to define a threshold value and associated severity level for the selected counter (see above).
You can define several thresholds.
Remove
Removes the selected threshold. This button is enabled only when a threshold is selected in the threshold list.
Note: The Active Directory Server probe is a local probe, which monitors the AD server of the host system only.
The following diagram outlines the process to configure the probe to monitor Active Directory servers.
Note: The Counters for each profile are specific to monitoring category.
Adding Counters
A counter defines the area that the profile monitors. For each type of profile, there is a list of predefined counters. You can add more than one
counter to the profile.
Important! The probe does not support counters that return a datetime value. The probe GUI shows unexpected results
when any such counter is added to the profile.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Delete Profile
You can delete a monitoring profile when you no longer want the probe to monitor it.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Click the Delete Profile option.
3. Click Save.
The profile is deleted.
Note: The active directory server probe is a local probe, which monitors the AD server of the host system only.
Content
ad_server Node
Event Logs Node
<Profile Name> Node
File Node
<Profile Name> Node
File System Node
<Profile Name> Node
Performance Counter Node
<Profile Name> Node
Process Node
<Profile Name> Node
Service Node
<Profile Name> Node
WMI Node
<Profile Name> Node
Health Monitor Node
<Profile Name> Node
ad_server Node
The ad_server node lets you view the probe information and configure the log level of the probe.
Navigation: ad_server
Set or modify the following values as required:
ad_server > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ad_server > General Config
This section is used to configure the log level of the probe.
Log Level: specifies the detail level of the log file.
Default: 0 - Fatal
Notes:
This node is user-configurable and is named after the profile.
You cannot enter characters like slash (/) or comma (,) in the profile name.
Note: Specify only those fields that are required for filtering the event logs and keep all other fields blank. Do not use an
asterisk (*) in any other field because regex is unsupported for filtering event logs.
File Node
The File node is used for monitoring host system files. Each monitoring profile is displayed under the File node.
Note: The fields of the General and the Monitors sections are the same as those described in the profile name node under the Event
Logs node.
Note: Use the Add Monitor option on the profile name node for adding any monitor to the list.
Severity: specifies the alarm message severity.
Operator: specifies the threshold operator for comparing the actual and threshold value.
Value: defines the counter threshold value for generating alarms.
Message: specifies the alarm message when the threshold value is breached.
Similarly, you can configure the second threshold for the counter in the corresponding Severity 2, Operator, Value,
and Message fields.
Process Node
The Process node lets you monitor the running processes of the host system (for example, notepad.exe).
Note: The fields of the General and the Monitors sections are the same as those described in the profile name node under the Event
Logs node.
Service Node
The Service node lets you monitor running services of the host system (for example, Dnscache).
Note: The fields of the General and the Counter: Counter Name sections are the same as those described in the profile name node
under the Event Logs node.
WMI Node
The WMI node is used for creating a monitoring profile for getting WMI-related data from the host system. The WMI data is used for consolidating
the management of devices and applications in a network from the Windows environment.
Object Found: calculates the number of objects in the target root folder of the server. For example, use this counter for fetching the AD
lost objects and number of domain controller replication partners.
Status: monitors the object availability. For example, use this counter for fetching the status of the lost and found object container and AD
replication partners' synchronization status.
Note: The probe contains a default set of health monitoring profiles for demonstrating counter usage. All monitoring profiles
are deactivated by default; you can activate them manually.
Note: The Active Directory Server probe is a local probe, which monitors the AD server of the host system only.
The following diagram outlines the process to configure the probe to monitor Active Directory servers.
Note: You can also edit an existing profile. Refer to Editing Monitoring Profile for more information.
4. Create a Counter for the new monitoring profile. You can also edit the counters for existing profiles.
Refer to Adding Counters to a Profile.
5. Save the configuration to start monitoring.
Notes:
The General and Counters tabs are common to all types of profiles, while the third tab is specific to the monitoring category.
The Counters tab is enabled only after a profile has been created.
Probe GUI
The General Tab
The <variable> Tab
Probe GUI
The active directory server probe is configured by double-clicking the probe in the Infrastructure Manager, which brings up the GUI
(configuration tool).
The application window
The window consists of two panes, a menu bar at the top and a status bar at the bottom.
Notes:
Select Help > Show help from the menu bar or from the Help button at the bottom right for online help documentation.
Alternatively, you can press the F1 key for quick access.
Click the Apply button to activate any configuration modifications done.
Files
Filesystems
Performance counters
Processes
Services
WMI
Health Monitor
If you select the active directory server node, all default profiles are displayed in the right pane. You can also define new profile(s).
Each profile can contain several counters.
The right pane
This pane lists the profiles. Select the active directory server node in the left pane to open profiles in the right pane.
Select the active directory server node in the left pane to open categorized profiles under groups. The following information can be found in the
list:
The name of the profile.
If the profile is enabled or not.
The last status message of the profile.
The group name for each profile is shown in the list.
Profile Status
The following icons indicate the profile status.
The profile status icon appears only after a short initialization period.
indicates an active profile, but data is not sampled yet. The sample period is defined in the profile property dialog.
indicates an inactive profile.
After the first sample period, one of the following icons appears.
Green means OK (Clear).
Other colors indicate the severity level for breaching the threshold value of profile:
Information
Warning
Minor
Major
Critical
The Menu bar
The options in the menu bar help in managing the probe.
File
Note: You can also save configuration modifications by simultaneously pressing the Ctrl+S keys on your keyboard.
Exit
Quits the configuration tool. If you have modified your configuration, save the changes before exiting.
Probe
This menu entry contains four options:
Activate
Starts the probe again and the status Started is displayed in the Status bar.
This option is enabled only when the probe is stopped/deactivated.
Deactivate
Stops the probe and the status Stopped is displayed in the Status bar.
This option is enabled only when the probe is activated.
Restart
Stops and then starts the probe.
Options
Sets the log level details of the probe. Keep the log level as low as possible to minimize disk consumption. Increase the log level
temporarily for debugging purposes.
Help
This Help menu contains two options:
Show help
Displays the probe documentation.
You can also press the F1 key or click the Help button at the bottom right.
About
Displays the version of the probe.
Notes:
You must not enter characters like slash (/) or comma (,) in the profile name.
If you rename any profile, you must save the configuration and restart the probe to activate the profile.
Description
Provides a short description of the profile.
This description will not be displayed in the list of profiles in the right pane.
Sample data every
Specifies the profile's poll interval in seconds, minutes, hours, or days.
Startup delay
Defines a delay in seconds, minutes, hours or days, before a profile starts polling.
Enabled (runs at specified interval)
Enables the profile to sample the data at the poll interval specified.
Note: Specify only those fields, which are required for filtering the event logs and keep all other fields blank. Do not use asterisk (*) in
any other field because the regex is unsupported for filtering event logs.
Object
Specifies the performance object available on the host to be monitored.
Description
Indicates a read-only field containing a description of the selected performance object.
Instance
Specifies the instance of an object if available on the system. Otherwise, the field is grayed out.
The Process tab
This tab lets you define the process to be monitored by the profile.
This tab is valid for Processes profiles only.
The fields in the above dialog are explained below:
Process name
Specifies the process to be monitored by this profile.
The Service tab
This tab lets you define the service to be monitored by the profile.
This tab is valid for Service profiles only.
The fields in the above dialog are explained below:
Service
Specifies the service to be monitored by the profile.
Display name
Identifies the name of the service.
Description
Indicates a brief description of the selected service.
The WMI tab
This tab lets you define the WMI to be monitored by the profile.
This tab is valid for WMI profiles only.
The fields in the above dialog are explained below:
Namespace
Specifies the namespace of the WMI to be monitored.
Class
Specifies the class to be monitored.
Instance
Specifies the instance, if more than one is found.
The Health Monitor tab
This tab lets you monitor the health of the ad_server.
The fields in the above dialog are explained below:
Search Root Criteria
Defines the search root criteria to fetch the Response Time, Objects Found, and Status counter values. Enter replication for a
replication profile; leave the field blank for the PDC profile. For a normal profile, enter the search criteria in the format
DC=demodomain,DC=local.
The fields of the Counters Tab are explained as follows:
Note: The probe does not support counters that return a datetime value. The probe displays an error when you try adding a threshold
for an unsupported counter.
Thresholds
Select a counter in the list and click the Add button to set or modify the threshold value and severity for the selected counter.
The available counters depend on the category:
EventLog
EventFound (true/false)
NumberOfEventsFound
Files
Changed: seconds since file was changed
Created: seconds since file was created
NotFound (true/false)
Filesystems
Directories: Number of directories in the directory specified.
NotFound (true/false). For default Filesystem profiles, NotFound set to true is critical.
FileAgeNewest: Age in seconds of the newest file in the directory.
FileAgeOldest: Age in seconds of the oldest file in the directory.
Files: Number of files in the directory specified.
TotalSize: Size of all the files in the directory specified, in bytes.
The ad_server configuration (pre-configured):
C:\, TotalSize > 50000000000
C:\Windows\NTDS\ntds.dit, TotalSize > 50000000000
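The Filesystems counters above can be reproduced with standard library calls. This is an illustrative sketch of what each counter measures, not the probe's implementation:

```python
import os
import time

def filesystem_counters(directory):
    """Compute values analogous to the Filesystems counters described above."""
    entries = os.listdir(directory)
    files = [e for e in entries if os.path.isfile(os.path.join(directory, e))]
    dirs = [e for e in entries if os.path.isdir(os.path.join(directory, e))]
    now = time.time()
    mtimes = [os.path.getmtime(os.path.join(directory, f)) for f in files]
    return {
        "Directories": len(dirs),            # directories in the directory
        "Files": len(files),                 # files in the directory
        "TotalSize": sum(os.path.getsize(os.path.join(directory, f))
                         for f in files),    # size of all files, in bytes
        "FileAgeNewest": now - max(mtimes) if mtimes else None,  # seconds
        "FileAgeOldest": now - min(mtimes) if mtimes else None,  # seconds
    }
```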
Performance counters
The thresholds vary with the type of counter. As given for the ad_server configuration:
LSASS: % Processor time, default threshold: > 50 is warning.
IO Read bytes per sec: > 100000 is major.
IO Write bytes per sec: > 100000 is major.
Page faults/sec: > 1000 is major.
NTFRS: the same as for LSASS.
Processes
As given for the ad_server configuration: none pre-configured.
Services
As given for the ad_server configuration.
Default monitored services:
LAN Manager, File Replication Service, DNS Client, Security Accounts Manager, Intersite Messaging, Kerberos Key Distribution Center,
Net Logon.
Threshold for all services: State != Running is minor.
Health Monitor
The Health Monitor tab lets you configure counters for monitoring the AD Server health. These counters help you verify that the
current activities of the AD Server are within normal and healthy parameters. The Health Monitor tab contains the following counters:
Response Time: calculates the connection (bind) time for getting a response from the object. For example, use this counter for
fetching the last bind time of the Operations Master Infrastructure, Operations Master Domain naming, Operations Master
Primary Domain Controller (PDC), Operations Master Relative Identifier (RID), and Operations Master Schema objects.
Important! The Response Time counter is not applicable for replication profiles and returns a NULL value.
Note: The probe contains a default set of health monitoring profiles for demonstrating counters usage. All monitoring
profiles are deactivated, by default; you can activate them manually.
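Conceptually, the Response Time counter times how long a connection (bind) takes to answer. The following sketch times an arbitrary callable in milliseconds; the `bind` argument is a placeholder, since a real probe binds to the Operations Master object being checked:

```python
import time

def response_time_ms(bind):
    """Return the elapsed milliseconds for a bind-like callable to complete.

    Illustrative only: `bind` stands in for the probe's actual LDAP bind.
    """
    start = time.perf_counter()
    bind()
    return (time.perf_counter() - start) * 1000.0
```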
WMI
As given for the ad_server configuration.
Default monitored WMI counters:
Microsoft_DomainTrustStatus (exists only on Windows Server 2003): TrustIsOK, false is critical.
MSAD_ReplNeighbor (exists only on Windows Server 2003): ModifiedNumConsecutiveSyncFailures, > 0 is critical, and
NumConsecutiveSyncFailures, > 0 is critical.
Select a message to be issued if the specified threshold is breached. The Text and Subsystem fields are read-only.
Send QoS data
The probe sends QoS data on the counters defined in this profile.
Send Alarm
The probe sends alarms on the counters defined in this profile.
Add
Allows you to define a threshold value and associated severity level for the selected counter (see above).
You can define several thresholds.
Remove
Removes the selected threshold. This button is enabled only when a threshold is selected in the threshold list.
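Several thresholds can be attached to one counter, each with its own severity. The following sketch shows one plausible evaluation rule (report the highest severity whose threshold is breached); this is an illustration of the concept, not the probe's internal logic:

```python
SEVERITY_ORDER = ["information", "warning", "minor", "major", "critical"]

def evaluate(value, thresholds):
    """Return the highest severity whose threshold the value breaches.

    `thresholds` is a list of (limit, severity) pairs, mimicking the
    thresholds added per counter in the dialog above.
    """
    breached = [sev for limit, sev in thresholds if value > limit]
    if not breached:
        return None
    return max(breached, key=SEVERITY_ORDER.index)

# e.g. the pre-configured IO Read bytes per sec rule: > 100000 is major
print(evaluate(150000, [(100000, "major")]))  # major
```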
ad_server Metrics
The following section describes the metrics that can be configured with the Active Directory Server Monitoring (ad_server) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the QoS metrics that can be configured using the ad_server probe.
Resource | Monitor Name | Units | Description | Version
Eventlogs | QOS_NUMBEROFEVENTSFOUND | | | v1.1
Eventlogs | QOS_CHANGED | | | v1.1
Eventlogs | QOS_CREATED | | | v1.1
Filesystems | QOS_DIRECTORIES | | | v1.1
Filesystems | QOS_FILEAGENEWEST | | | v1.1
Filesystems | QOS_FILEAGEOLDEST | | | v1.1
Filesystems | QOS_FILES | | Number of files. | v1.1
Filesystems | QOS_TOTALSIZE | | | v1.1
Performance Counters | QOS_SQLCLIENT:_CURRENT_#_CONNECTION_POOLS | | | v1.1
Performance Counters | QOS_SQLCLIENT:_CURRENT_#_POOLED_AND_NONPOOLED_CONNECTIONS | | | v1.1
Performance Counters | QOS_SQLCLIENT:_CURRENT_#_POOLED_CONNECTIONS | | | v1.1
Performance Counters | QOS_SQLCLIENT:_PEAK_#_POOLED_CONNECTIONS | | | v1.1
Performance Counters | QOS_SQLCLIENT:_TOTAL_#_FAILED_COMMANDS | | | v1.1
Performance Counters | QOS_SQLCLIENT:_TOTAL_#_FAILED_CONNECTS | | | v1.1
Performance Counters | QOS_WORKFLOWS_ABORTED | | Workflows aborted. | v1.1
Performance Counters | QOS_WORKFLOWS_ABORTED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_COMPLETED | | Workflows completed. | v1.1
Performance Counters | QOS_WORKFLOWS_COMPLETED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_CREATED | | Workflows created. | v1.1
Performance Counters | QOS_WORKFLOWS_CREATED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_EXECUTING | | Workflows executing. | v1.1
Performance Counters | QOS_WORKFLOWS_IDLE/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_IN_MEMORY | | Workflows in memory. | v1.1
Performance Counters | QOS_WORKFLOWS_LOADED | | Workflows loaded. | v1.1
Performance Counters | QOS_WORKFLOWS_LOADED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_PENDING | | Workflows pending. | v1.1
Performance Counters | QOS_WORKFLOWS_PERSISTED | | Workflows persisted. | v1.1
Performance Counters | QOS_WORKFLOWS_PERSISTED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_RUNNABLE | | Workflows runnable. | v1.1
Performance Counters | QOS_WORKFLOWS_SUSPENDED | | Workflows suspended. | v1.1
Performance Counters | QOS_WORKFLOWS_SUSPENDED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_TERMINATED | | Workflows terminated. | v1.1
Performance Counters | QOS_WORKFLOWS_TERMINATED/SEC | | | v1.1
Performance Counters | QOS_WORKFLOWS_UNLOADED | | Workflows unloaded. | v1.1
Performance Counters | QOS_WORKFLOWS_UNLOADED/SEC | | | v1.1
Performance Counters | QOS_FRAGMENTATION_FAILURES | | Fragmentation failures. | v1.1
Processes | QOS_EXECUTIONSTATE | | Execution state. | v1.1
Processes | QOS_HANDLECOUNT | | Handle count. | v1.1
Processes | QOS_KERNELMODETIME | 100ns | | v1.1
Processes | QOS_MAXIMUMWORKINGSETSIZE | Kilobytes | | v1.1
Processes | QOS_MINIMUMWORKINGSETSIZE | Kilobytes | | v1.1
Processes | QOS_OTHEROPERATIONCOUNT | | | v1.1
Processes | QOS_OTHERTRANSFERCOUNT | Bytes | | v1.1
Processes | QOS_PAGEFAULTS | | Page faults. | v1.1
Processes | QOS_PAGEFILEUSAGE | Kilobytes | | v1.1
Processes | QOS_PARENTPROCESSID | | | v1.1
Processes | QOS_PEAKPAGEFILEUSAGE | Kilobytes | | v1.1
Processes | QOS_PEAKVIRTUALSIZE | Bytes | | v1.1
Processes | QOS_PEAKWORKINGSETSIZE | Kilobytes | | v1.1
Processes | QOS_PRIORITY | | Process priority. | v1.1
Processes | QOS_PRIVATEPAGECOUNT | | | v1.1
Processes | QOS_PROCESSID | | Process Id. | v1.1
Processes | QOS_QUOTANONPAGEDPOOLUSAGE | | | v1.1
Processes | QOS_QUOTAPAGEDPOOLUSAGE | | | v1.1
Processes | QOS_QUOTAPEAKNONPAGEDPOOLUSAGE | | | v1.1
Processes | QOS_QUOTAPEAKPAGEDPOOLUSAGE | | | v1.1
Processes | QOS_READOPERATIONCOUNT | | | v1.1
Processes | QOS_READTRANSFERCOUNT | Bytes | | v1.1
Processes | QOS_SESSIONID | | Session Id. | v1.1
Processes | QOS_THREADCOUNT | | Thread count. | v1.1
Processes | QOS_USERMODETIME | 100ns | | v1.1
Processes | QOS_VIRTUALSIZE | Bytes | Virtual size. | v1.1
Processes | QOS_WORKINGSETSIZE | | | v1.1
Processes | QOS_WORKINGOPERATIONCOUNT | | | v1.1
Processes | QOS_WRITETRANSFERCOUNT | | | v1.1
WMI | QOS_COUNTERVALUE | | Counter value. | v1.1
Health Monitor | QOS_RESPONSETIME | milliseconds | | v1.1
Health Monitor | QOS_OBJECTFOUND | count (number) | | v1.1
Health Monitor | QOS_STATUS | count (number) | Status of connection (4 indicates failure and 0 indicates success). | v1.1
Alert Metrics Default Settings
The following table describes the alert metrics default settings that can be configured using the ad_server probe.
Resource | Alert Metric | Warning Threshold | Warning Severity | Error Threshold | Error Severity | Description | Version
Eventlogs | Event Found | | | TRUE | Critical | Event matched | v1.1
Eventlogs | | | | | Critical | Event matched | v1.1
Files | Not Found | | | TRUE | Critical | File failure | v1.1
Files | Changed | | | | Critical | File failure | v1.1
Files | Created | | | | Critical | File failure | v1.1
Filesystems | Directories | | | | Critical | | v1.1
Filesystems | | | | | Critical | | v1.1
Filesystems | | | | | Critical | | v1.1
Filesystems | Files | | | | Critical | | v1.1
Filesystems | Total Size | | | | Critical | | v1.1
Filesystems | Not Found | | | True | Critical | | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | % Processor Time | 50 | Warning | | | Performance failure | v1.1
Performance Counters | | | | 100000 | Major | Performance failure | v1.1
Performance Counters | | | | 100000 | Major | Performance failure | v1.1
Performance Counters | | | | 1000 | Major | Performance failure | v1.1
Performance Counters | Fragmentation failures | | | | Critical | Performance failure | v1.1
Performance Counters | Active Lines | | | | Critical | Performance failure | v1.1
Performance Counters | Active Telephones | | | | Critical | Performance failure | v1.1
Performance Counters | Client Apps | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Lines | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Telephone Devices | | | | Critical | Performance failure | v1.1
Performance Counters | Active Sessions | | | | Critical | Performance failure | v1.1
Performance Counters | Inactive Sessions | | | | Critical | Performance failure | v1.1
Performance Counters | Total Sessions | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Aborted | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Completed | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Created | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Executing | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows in Memory | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Loaded | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Pending | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Persisted | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows runnable | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows suspended | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Terminated | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | Workflows Unloaded | | | | Critical | Performance failure | v1.1
Performance Counters | | | | | Critical | Performance failure | v1.1
Performance Counters | HiPerf Classes | | | | Critical | Performance failure | v1.1
Performance Counters | HiPerf Validity | | | | Critical | Performance failure | v1.1
Processes | Caption | | | | Critical | Process failed | v1.1
Processes | Command line | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | CS Name | | | | Critical | Process failed | v1.1
Processes | Description | | | | Critical | Process failed | v1.1
Processes | Executable Path | | | | Critical | Process failed | v1.1
Processes | Execution State | | | | Critical | Process failed | v1.1
Processes | Handle Count | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Name | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | OS Name | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Page Faults | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Parent Process ID | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Priority | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Process ID | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Session ID | | | | Critical | Process failed | v1.1
Processes | Status | | | Degraded | Critical | Process failed | v1.1
Processes | Thread Count | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | Virtual Size | | | | Critical | Process failed | v1.1
Processes | Windows Version | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Processes | | | | | Critical | Process failed | v1.1
Services | State | | | Continue Pending | Critical | | v1.1
WMI | Counter Value | | | | Critical | WMI failure | v1.1
Health Monitor | ResponseTime | 15000 | Major | 30000 | Critical | | v1.1
Health Monitor | ObjectFound | 10 | Warning | 100 | Major | | v1.1
Health Monitor | status | | Warning | | | Connection status | v1.1
Contents
Prerequisites
Configure General Properties
Select Event Log File
Add Profiles
(Optional) Add Exclude Profile
Alarm Thresholds
Prerequisites
Verify that required hardware, software, and information is available before you configure the probe. For more information, see adevl (Active
Directory Events Monitoring) Release Notes.
5. Configure the various non-English event severity strings with appropriate severities in English in the Language String Configuration
section.
Note: The Language String Configuration is applicable when the probe is deployed on Windows Vista or Windows Server
2008 R2 or a later version.
Note: The Event Log drop-down displays only those log files that are selected in the adevl > Log Files Configuration section.
4. Click Submit.
The logs of selected event log files are displayed in the Event Log Status section.
5. Select the appropriate event in the Event Log Status list to display the event details below the list.
6. Select the New Profile option from the Actions drop-down list.
7. Define the New Profile Name and click Submit.
The new profile appears under the Profile node. You can select the profile name node and can verify in the Event Selection section that
event details are already filled.
Similarly, select the Exclude Profile option from the Actions drop-down list for creating an exclude profile for the event. Select Clear
Log to delete the log from the list.
Add Profiles
You can add a monitoring profile which is displayed as a child node under the Profiles node.
Follow these steps:
1. Click the Options (icon) beside the Profiles node.
2. Click the Add New Profile option.
3. Add the field information, activate the profile, and click Submit.
The profile is saved and you can configure the profile properties to monitor the event log status.
Important! Do not use a slash (/) in the profile name; otherwise the probe trims the profile name at the slash (/) character and
discards the profile properties. For example, if the profile name is My/Profile, the probe only saves My as the profile name.
4. Add criteria for event selection in the Event Selection Criteria section.
5. Select QoS properties and alarm messages in QoS and Alarm sections, as applicable.
6. Click the New button to define variables with a set of conditions for each profile in the Variables section.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
adevl Node
<Host Name> Node
Exclude Node
<Exclude Profile> Node
Profiles Node
<Profile Name> Node
adevl Node
The adevl node is used to configure the general settings of the probe. These settings are applicable to all monitoring profiles of the probe.
Navigation: adevl
Set or modify the following values if needed:
adevl > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
adevl > Properties
This section lets you configure the general properties of the Active Directory Events probe.
Description Delimiter: defines an ASCII character to replace the existing character as the delimiter. The recommendation is to use a
special character as the delimiter.
Remove Recurring Delimiter: removes the repetition of delimiter.
Default: Not selected
Generate New Metric Id: generates a different metric ID when the adevl and ntevl probes are deployed on the same robot.
Run Type: specifies the condition when the probe is triggered for updating the events list. You can select one of the following options:
Poll: lets you configure the time interval (in the Poll interval (Seconds) field) for fetching the details of new events.
Default: 30
Alarm Timeout (Seconds): lets you configure the field for generating an alarm when the probe is unable to fetch new event details
within the specified time limit.
Default: 10
Event: updates the event list when a new event is logged in the event log file. Configure the Alarm Timeout (Seconds) field for
generating an alarm when the probe is unable to fetch new event details within the specified time limit.
Default: 10
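The Poll run type described above can be sketched as a loop that fetches events at the configured interval and raises an alarm when a fetch exceeds the timeout. This is an illustrative sketch of the documented behavior; `fetch` is a placeholder for the probe's event read:

```python
import time

def poll_events(fetch, poll_interval=30, alarm_timeout=10):
    """Yield ('event', e) items fetched every poll_interval seconds,
    and an ('alarm', ...) item when a fetch exceeds alarm_timeout seconds.
    """
    while True:
        start = time.monotonic()
        events = fetch()
        if time.monotonic() - start > alarm_timeout:
            yield ("alarm", "event fetch exceeded the alarm timeout")
        for event in events:
            yield ("event", event)
        time.sleep(poll_interval)
```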
Default post subject: defines the default subject of the alarm messages, which the probe generates. This default post subject can be
overridden while creating a monitoring profile.
Default: adevl
The following subjects are used internally in CA UIM for alarm messages and cannot be used in this field:
alarm
alarm_new
alarm_update
alarm_close
alarm_assign
alarm_stats
QOS_MESSAGE
QOS_DEFINITION
If any of these subjects is used, the probe uses evl_ as the message subject. If the field is left blank, the probe uses adevl as the
default post message subject.
Note: This field only defines the default post message subject. Select the Post Message option in the Alarm section of the profile name
node to send the message. You can also override the message subject there.
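The fallback rules above can be sketched as follows. Note the evl_ handling is an assumption about how the reserved-subject substitution is applied (the text says only that the probe "uses evl_ as the message subject"); here it is sketched as prefixing the configured subject:

```python
RESERVED_SUBJECTS = {
    "alarm", "alarm_new", "alarm_update", "alarm_close",
    "alarm_assign", "alarm_stats", "QOS_MESSAGE", "QOS_DEFINITION",
}

def effective_post_subject(configured):
    """Apply the documented fallbacks: blank -> adevl, reserved -> evl_ subject."""
    if not configured:
        return "adevl"
    if configured in RESERVED_SUBJECTS:
        return "evl_" + configured  # assumption: evl_ prefixes the subject
    return configured
```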
Column Prefix: defines a text that is prefixed to each column name of the alarm message.
Default: evl_
Log File: defines the log file where the probe logs information about its internal activity.
Default: adevl.log
Log File Size (KB): sets the size of the log file.
Default: 100
Log Level: defines how much information is written to the log file.
Default: 0-fatal
Maximum Events to fetch: defines the maximum number of latest events that the probe fetches from each event log file and displays in
the Event Log section. If the field is left blank, the probe displays all the events.
Default: 1000
Important! Do not configure the field value to more than 1000; otherwise the probe can stop responding. Refer to the Known Issues
section of adevl (Active Directory Events Monitoring) Release Notes for more details.
Output Encoding: specifies the character encoding for generating alarms and QoS messages when the probe is deployed in a
non-English locale.
Default: blank
System Encoding: specifies the system and output encoding where the probe is installed.
Default: blank
Note: The probe auto-detects the system and output encoding when these field values are blank. However, the recommendation is to
specify the appropriate encoding in the fields. You can use UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE, Shift_JIS,
ISO-2022-JP, ISO-2022-CN, ISO-2022-KR, GB18030, GB2312, Big5, EUC-JP, EUC-KR, ISO-8859-1, ISO-8859-2, windows-1250, and
windows-1252 encodings.
Alarm List Size: defines the buffer size for storing the event details that match the monitoring profile criteria. This field is useful when a
profile generates an alarm after a certain number of events are found. For example, a monitoring profile generates an alarm when the
matching events count reaches 50. Until the event count reaches 50, the probe keeps the event details in the buffer.
Default: 1000
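The buffering behavior described for Alarm List Size can be sketched with a bounded buffer. This is an illustration of the documented behavior only; the class and its parameters are hypothetical:

```python
from collections import deque

class AlarmBuffer:
    """Hold matching events in a bounded buffer (Alarm List Size) until the
    configured event count is reached, at which point an alarm fires.
    """
    def __init__(self, alarm_at=50, list_size=1000):
        self.alarm_at = alarm_at
        self.events = deque(maxlen=list_size)  # oldest entries drop first

    def add(self, event):
        self.events.append(event)
        return len(self.events) >= self.alarm_at  # True when the alarm fires
```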
WMI Query Timeout: defines the time-out interval of WMI query for fetching the monitoring data. The probe uses WMI queries when
hosted on operating systems earlier than Windows Server 2008.
Default: 1
Note: The WMI service must be enabled on the host system for this option to work.
WMI Timeout Interval: specifies the unit of WMI query time-out interval.
Default: Seconds
adevl > Log Files Configuration
This section lets you select the log files, which the probe monitors. Select a log file from the Available list and add it to the Selected list.
adevl > Subsystems Configuration
This section lets you define a different alarm subsystem ID for each monitored log file. Use the New button and define the new subsystem ID by
configuring the following fields:
Subsystem Key: defines a subsystem key for the appropriate log file. This key must be identical to the corresponding log file name,
contain only lowercase characters, and have the slash (/) character replaced with $$. For example, the key is microsoft-iis-configuration$$adm
inistrative for the Microsoft-IIS-Configuration/Administrative log file.
Subsystem Value: defines a different alarm subsystem ID for each monitored log file. The recommendation is to use the default
subsystem ID pattern (1.1.11.1.X) for other log files too. This pattern is mandatory to view the metric details under the Event Log node of
the Unified Management Portal (UMP).
You can also define an appropriate name for the newly defined subsystem value in the nas probe; otherwise, the subsystem value is displayed
as is on UMP.
The default configuration of the probe monitors dns, filerep, and directory log files with subsystem ID 2.1.2.
Important! Do not delete or modify any of the default subsystem IDs.
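The Subsystem Key derivation rule above (log file name, lowercased, with the slash replaced by $$) can be written directly:

```python
def subsystem_key(log_file_name):
    """Derive a Subsystem Key per the rules above: identical to the log
    file name, lowercase, with the slash (/) replaced by $$.
    """
    return log_file_name.lower().replace("/", "$$")

print(subsystem_key("Microsoft-IIS-Configuration/Administrative"))
# microsoft-iis-configuration$$administrative
```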
Note: The Language String Configuration is applicable when the probe is deployed on Windows Vista or Windows Server 2008 R2 or a
later version.
Note: Use the Options icon of the adevl node and specify the log file for displaying the events list in the Event Log Status grid.
This option allows you to select an event log that you want the probe to monitor.
<Host Name> Node
The host name node is used to identify the host of the system on which the probe is deployed. This node does not contain any field or section and
is used for classifying the exclude and monitoring profiles.
Exclude Node
The Exclude node is used to create a profile for excluding the events from monitoring by the probe. This node does not contain any field or
section, but contains only child nodes where each child node is a different exclude profile.
Navigation: adevl > Exclude
Set the following values if needed:
Exclude > Options (icon) > + Exclude Profile
This option allows you to create and activate an exclude profile using the Options icon.
<Exclude Profile> Node
The exclude profile node is used to define the event selection criteria. The Active Directory Events excludes the matching events from monitoring.
Note: This node is referred to as exclude profile in this document and is user-configurable.
Profiles Node
The Profiles node is used to create a monitoring profile for generating alarms and QoS for the events that match the monitoring criteria. The
monitoring profile caters to the monitoring requirements and alerts the user when something unexpected happens. This node does not contain
any field or section, but it contains only child nodes where each child node is a different monitoring profile.
Navigation: adevl > Profiles
Set or modify the following values if needed:
Profiles > Options (icon) > Add New Profile
This option allows you to create and activate a monitoring profile by clicking the Options icon.
<Profile Name> Node
This node allows you to configure the event selection criteria of the profiles. You can also configure the QoS settings of the Active Directory
Events probe.
Separator: defines a field separator character for the event message text. This field is useful for segregating the event message text into
multiple columns and then using those column numbers in the Variables section. For example, if your event message text is
ABCD:EFGH:IJKL:MNOP and the separator is : (colon), the probe segregates the message text into four different columns (0 through 3).
You can use these column numbers for fetching the appropriate text to the variable.
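The separator behavior can be shown with the example from the text, where the colon splits the message into columns 0 through 3:

```python
def message_column(message, separator, column):
    """Split an event message on the configured Separator and return the
    requested zero-based column, as used by the Variables section.
    """
    return message.split(separator)[column]

print(message_column("ABCD:EFGH:IJKL:MNOP", ":", 2))  # IJKL
```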
Note: The critical level is only supported on Windows Server 2008.
Subsystem: defines a custom subsystem ID for overriding the default subsystem ID. For example, you can give the profile name for
identifying each alarm source. You can also use variables in this field, which are explained for the Alarm Message field.
Set Suppression Key: activates the message suppression feature to avoid multiple instances of the same alarm event.
Optional Key: defines a suppression key for the alarm messages, which overrides the default key.
Time Frame(Value): specifies the time interval for the monitoring of the events.
Time Frame(Unit): specifies the unit of the time frame value.
Event Count: defines the number of events; the probe generates alarms when this number breaches the threshold limit.
Post message: posts the event log message as the alarm.
Post Message Subject: defines the subject of the alarm. This value overrides the default message subject, which is defined in the adevl n
ode.
profile name > Variables
This section allows you to define variables with a set of conditions for each profile. These conditions populate the variable value in real time
from the selected event log message. These variables are then used for generating the alarm messages.
Notes:
This option is available for Vista and later version of Windows OS only.
The log files DNS Server, Directory Service, and File Replication Service are added by default.
Add Profile
You can define a monitoring profile for the selected log file.
Follow these steps:
1. Click Setup > Profiles.
2. Right-click the left pane and click New.
3. Add a profile name in the New Profile Name dialog and click OK.
The new profile is listed in the left pane.
Note: Two default profiles that appear in the left pane are:
allevents: Monitors all events of the log file, which are selected for monitoring.
allerrors: Monitors all events where the event severity is error.
4. Select the check-box to the left of the profile to enable it for monitoring.
5. Enter a description of the profile in the Description field.
6. Configure the event properties for filtering the Windows events and Active Directory events that the profile monitors. Select the Log that
you want to monitor from the drop-down options using the Event Selection tab.
Note: The event log files, which are selected in the Properties tab are displayed here.
7. Configure the alarms and QoS messages as desired using the Alarm/Post and QoS tabs, respectively.
8. Define the variables with a set of conditions for each profile using the Variables tab.
Refer How to Set Variable for more details.
Note: You can use both ranges and commas in the same entry, such as, 1-5, 9-20.
Events matching all the criteria in an exclude profile are excluded from monitoring by the defined profiles.
The Event ID field does not support regular expressions.
Use format as shown in the following examples:
*
114
1, 5,10
1, 10-12
115-12
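The Event ID formats above (an asterisk, single IDs, comma lists, and ranges, which can be mixed in one entry such as 1-5, 9-20) can be checked with a small parser. This is an illustrative sketch of the documented syntax, not the probe's matcher:

```python
def event_id_matches(expression, event_id):
    """Check an event ID against a filter in the documented formats:
    '*', single IDs, comma-separated lists, and ranges (mixable).
    """
    expression = expression.strip()
    if expression == "*":
        return True
    for part in expression.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
            if low <= event_id <= high:
                return True
        elif part and int(part) == event_id:
            return True
    return False
```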
Probe Defaults
When a probe is deployed for the first time on a robot, some default configuration is deployed automatically. These probe defaults can include
Alarms, QoS, Profiles, and so on, which save the time needed to configure the default settings. These probe defaults are seen on a fresh install,
that is, when no instance of that probe is already available on that robot in an activated or deactivated state.
The Active Directory Events probe has following default properties:
Setup > Properties
Poll Interval: 30 Seconds
Alarm Timeout: 10 Seconds
Log File: adevl.log
Log File Size: 100 KB
Maximum Events to Fetch: 1000
Fetch Alarms on Configurator Startup: Selected
WMI Query Timeout: 1
WMI Timeout Interval Unit: Seconds
Alarm List Size: 1000
Log Files to Be Monitored: Directory Service, DNS Server, File Replication Service
Setup > Profiles
allevents: Monitors all events of the log file, which are selected for monitoring.
allerrors: Monitors all events where the event severity is error.
Setup Tab
Properties Tab
Profiles Tab
Event Selection Tab
Alarm / Post Tab
QoS Tab
Variables Tab
Exclude Tab
Language String Configuration Tab
Setup Tab
When you double-click on the probe name in Infrastructure Manager, the GUI for the adevl probe is displayed with Setup tab (Profiles sub tab)
opened by default.
This tab contains the following tabs:
Properties
Profiles
Exclude
Language String Configuration
Subsystem Configuration
Properties Tab
The Properties tab lets you configure all initial and basic configuration of the probe, which is applicable for all monitoring profiles. The tab
contains the following fields:
Probe Active
Activates the probe if checked. Clear the check box to deactivate the probe.
Default: Selected
Description Delimiter
Defines an ASCII character, including special characters, that replaces the new-line character of the event log message.
Remove Recurring Delimiter
Removes repetition of delimiter, if selected.
Default: Not selected
Run type
Select Event to trigger the probe every time Windows NT puts a new message into the event log. Select Poll and specify a Poll Interval
and Alarm Timeout to check at regular intervals.
Default Poll Interval: 30 seconds
Default Alarm Timeout: 10 seconds
Logging
Specifies the file (Log File) to which the probe logs information about its internal activity and the level of details that are written to the log
file (Log Level). Log as little as possible during normal operation (to minimize disk consumption), and increase the amount of detail when
debugging. Sets the size of the log file (Log File Size). Large log files can cause performance issues, therefore use caution when
changing this size.
Default: 100 KB
Post Event Log Message Setup
When event log messages are posted, the default message subject is specified here (Default Post Subject). Do not use the following
subjects as they are used internally in CA UIM for alarm messages:
alarm
alarm_new
alarm_update
alarm_close
alarm_assign
alarm_stats
QOS_MESSAGE
QOS_DEFINITION
If any of these subjects is used, the probe uses evl_ as the message subject. If the field is left blank, the probe uses
adevl as the default post message subject. The subject can be overridden in a profile.
Note: This field only defines the default post message subject. Select the Post Message option in the Profiles > Alarm / Post
tab to send the message. You can also override the message subject at the profile level.
Column Prefix
Defines the text, which is added with each field name of the event log when the probe posts a message. This prefix and field name are
set for identifying the field in the posted message.
Fetch Event Setup
Maximum Events to Fetch
Specifies the maximum number of events that are fetched from the event log in the Status tab.
Default: 1000
The limit is defined to avoid timeout situations when fetching events from the probe.
Fetch Alarms on Configurator Startup
Fetches all alarms at configuration start-up (select the Status tab to see the alarm list).
If the option is not checked, this list is empty at configurator startup. Click the Refresh button of the Status tab to fetch the alarms.
Default: Selected
Output Encoding
Specifies the character encoding for generating alarms and QoS messages when the probe is deployed in a non-English locale. The
recommendation is to use the same encoding as the monitored system, unless a different encoding is necessary.
System Encoding
Specifies the system encoding where the probe is installed.
Note: The probe auto-detects the system and output encoding when these field values are blank. However, the
recommendation is to specify the appropriate encoding in the fields. You can use UTF-8, UTF-16BE, UTF-16LE, UTF-32BE,
UTF-32LE, Shift_JIS, ISO-2022-JP, ISO-2022-CN, ISO-2022-KR, GB18030, GB2312, Big5, EUC-JP, EUC-KR, ISO-8859-1,
ISO-8859-2, windows-1250, and windows-1252 encodings.
WMI Query Timeout
Defines the time-out interval of WMI query for fetching the monitoring data. The probe uses WMI queries when hosted on earlier than
Windows Server 2008 operating systems.
Default: 1
WMI Timeout Interval Unit
Specifies the unit of WMI query time-out interval.
Default: Seconds
Alarm List Size
Defines the buffer size for storing the event details that match the monitoring profile criteria. This field is useful when a profile generates
an alarm after a certain number of events are found. For example, a monitoring profile generates an alarm when the matching events
count reaches 50. Until the event count reaches 50, the probe keeps the event details in the buffer.
Default: 1000
Generate New Metric ID
Generates a different metric ID when the adevl and ntevl probes are deployed on the same robot.
Default: Selected
Available Log Files
Provides a list of available log files, which you can select for monitoring. Select a log file and click the >> button to start
monitoring. This option is available only on Windows Vista and later versions.
Log Files to be Monitored
Displays the list of log files that the probe monitors. The DNS Server, Directory Service, and File Replication Service log files are added
by default. However, you can add or remove other log files from the Available Log Files list view. This option is available only on
Windows Vista and later versions.
Profiles Tab
When you select the Profiles tab, the GUI displays the list of profiles in the left pane and sub tabs in the right pane.
These sub tabs are used to configure the selected profile.
The Profiles tab contains the following fields:
<List>
Displays all the defined setup profiles. The check box to the left of a profile name must be selected to enable the profile. Select a profile
to display or modify its parameters.
The first profile in the list is processed first, then the next one. Right-clicking in the list allows you to move a profile up or down.
allevents: Monitors all events of the log file, which are selected for monitoring.
allerrors: Monitors all events where the event severity is error.
Description
Defines a text string identifying the watcher.
There are four sub tabs in the Profiles tab:
Event Selection
Alarm / Post
QoS
Variables
The Event Selection tab lets you configure the event properties for filtering the Windows events that the profile monitors. The tab contains the
following fields:
Event Selection Criteria
Defines the event selection criteria for filtering the event list and identifying the event for monitoring. An asterisk (*) in one of these fields
means that the profile processes all log messages regardless of the contents in the field.
No Propagation of Events
Makes an event that matches the selection criteria of one monitoring profile unavailable to the other profiles.
Note: You can change the order of the profiles and the corresponding processing order.
Log
Specifies the log file from which the probe monitors events. The event log files that are selected in the Properties tab are
displayed here.
Computer
Defines the computer name on which the event has occurred.
Note: You can use local host in the Computer field to get only local messages. You can also use both ranges and commas
in the same entry, such as 1-5 and 9-20.
Source/Publisher Name
Defines the source or the publisher from which the event is logged.
Severity
Specifies the event severity.
Note: The audit success and audit failure severity options are applicable only for Windows versions earlier than Vista and 2007.
Microsoft has moved these options to the keyword field from Windows Vista and 2007 onwards. The severity level of these
events is shown as Informational in the event viewer. The current implementation of the adevl probe does not support
monitoring on the basis of the keyword field.
User
Defines the Windows user account for which the event is generated.
Category
Defines the event category, for example, Service State Event.
Event ID
Defines the event ID you are monitoring. Use * for monitoring all events of the selected log file.
Message String
Defines the alarm message text when the event selection criteria matches an event.
Run Command on Match
Allows you to run the command when an event matches the selected criteria.
Command Executable
Specifies the command to execute when an event matches the profile. You can use the Browse button to configure a batch file path.
For example, you can execute a script for sending an email to the support executive for resolving the issue.
Command Arguments
Defines the parameters which are required for executing the command or the batch file. For example, define the email ID of the
support executive for sending an email. This field is optional.
Default: Selected
Time Interval
Specifies the time interval (in seconds) for event detection that is used by the QoS option.
Default: 3600 seconds
Variables Tab
The Variables tab is used for defining variables with a set of conditions for each profile. These conditions populate the variable value in
real time from the selected event log message. These variables are then used for generating the alarm messages.
The Variables tab contains the following fields:
<Variable List>
Lets you view the existing variables list and select any variable for editing the variable definition.
Field Separator
Defines a field separator character for the event message text. This field is useful for segregating the event message text into multiple
columns and then using those column numbers in the Variable settings dialog. For example, if your event message text is
ABCD:EFGH:IJKL:MNOP and the separator is : (colon), the probe segregates the message text into four columns (1-4). You can
use these column numbers for fetching the appropriate text into the variable.
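As an illustration, the column segregation described above can be sketched in a few lines. This is a hypothetical sketch, not the probe's actual implementation, and the function name is invented:

```python
def split_into_columns(message, separator):
    """Split an event message into 1-based numbered columns,
    mirroring the Field Separator behavior described above."""
    parts = message.split(separator)
    return {index + 1: text for index, text in enumerate(parts)}

# The documented example: four columns (1-4) from a colon-separated message.
columns = split_into_columns("ABCD:EFGH:IJKL:MNOP", ":")
print(columns[1])  # ABCD
print(columns[4])  # MNOP
```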
Activate a profile, right-click in the grid, and select New from the context menu in the Variables tab to create a new variable.
The fields in the Variable settings dialog are:
Name
Defines the name for the variable. Duplicate variable names are not allowed.
Default: var
Source Line
The source line of the variable where the threshold alarm needs to be defined. Select the FROM and TO positions.
Default: Not Selected
Source FROM position
The probe must start searching the threshold from this position in the event description.
Column
Select this option to define the column number from where the probe must start searching for the specified threshold in the event
description.
Default: 1
Character position
Select this option to define the character position from where the probe must start searching for the specified threshold in the event
description.
Default: 1
Match expression
Select this check box to define the regular expression that the probe must search in the event description.
Source TO position
The probe must stop searching the event description at this position.
Ignore 'To'
Select this option to allow the probe to search the event description until the end.
To Column
Select this option to define the column number until where the probe must search the specified threshold in the event description.
To End of Line
Select this option to allow the probe to search the event description until the end of defined Source Line.
Threshold Alarm Definition
Operator
Select a comparison operator from the drop-down menu. You can also choose the re option if you want to use regular expressions.
Note: The >, <, >=, and <= operators support only integer and float type values. These operators do not work with string
values. Only the = operator works with string values.
Threshold
Set the threshold value for the variable.
Example: If you set the threshold value for column 0 equal to 10, then an alarm is generated every time the value in column 0 equals 10.
Exclude Tab
The Exclude tab enables you to specify the profiles that you want to exclude from monitoring by the probe.
Right-click in the left-hand section and select New from the context menu to create a new Exclude profile. The entry appears in the left-hand
pane. Also, the fields in the right-hand pane are enabled.
<List>
Displays all the defined exclude profiles. Select a profile to display/modify its parameters.
Event selection criteria
Specify regular expressions identifying the event log messages that you are looking for. An asterisk (*) in one of these fields means that
all log messages are matched regardless of the contents in the field.
Note: You can also use both ranges and commas in the same entry, such as 1-5, 9-20.
Events matching all the criteria in an exclude profile are excluded from monitoring by the defined profiles.
The Event ID field does not support regular expressions.
Use format as shown in the following examples:
*
114
1, 5,10
1, 10-12
115-12
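How such ranges and comma lists expand can be sketched as follows. This is a hedged illustration only; the probe's own parser may differ, and the helper name is invented:

```python
def expand_event_ids(spec):
    """Expand an Event ID specification such as '1, 10-12' into a set
    of IDs. Returns None for '*', which means all event IDs."""
    spec = spec.strip()
    if spec == "*":
        return None  # monitor every event ID
    ids = set()
    for token in spec.split(","):
        token = token.strip()
        if "-" in token:
            low, high = token.split("-", 1)
            ids.update(range(int(low), int(high) + 1))
        else:
            ids.add(int(token))
    return ids

print(expand_event_ids("1, 10-12"))  # {1, 10, 11, 12}
```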
Language String Configuration Tab
The Active Directory Events probe displays all event severities as Information when deployed in a non-English locale. When the probe is
installed on Windows Vista, Windows Server 2008 R2, or a later version, Windows returns the event severity strings in the specific locale,
and the probe is not able to compare these values with the equivalent English strings.
The Language String Configuration tab lets you configure the locale-specific severity strings when the probe is deployed in a non-English
locale. This tab contains the following fields:
Critical
Defines an appropriate string for identifying the event severity as Critical. For example, define critique for the French locale.
Information
Defines an appropriate string for identifying the event severity as Information. For example, define informations for the French locale.
Warning
Defines an appropriate string for identifying the event severity as Warning. For example, define avertissement for the French locale.
Verbose
Defines an appropriate string for identifying the event severity as Verbose. For example, define verbeux for the French locale.
Error
Defines an appropriate string for identifying the event severity as Error. For example, define erreur for the French locale.
Audit Success
Defines an appropriate string for identifying the event severity as Audit Success. For example, define succès de l'audit for the French
locale.
Audit Failure
Defines an appropriate string for identifying the event severity as Audit Failure. For example, define échec de l'audit for the French
locale.
Subsystems Configuration Tab
The Subsystems Configuration tab lists the existing alarm subsystem ID for each monitored log file. You can also define a new subsystem ID for
any custom log file, which is selected for monitoring. The default configuration of the probe monitors Directory Service, DNS Server, and File
Replication Service log files, with the following subsystem IDs:
Directory Service: 2.1.2
DNS Server: 2.1.2
File Replication Service: 2.1.2
Important! Do not delete or modify any of the default subsystem IDs.
You can right-click the subsystem ID list and select New for adding a subsystem ID.
Subsystem Key
Defines a subsystem key for the appropriate log file. This key must be identical to the corresponding log file name, contain only
lowercase characters, and have the slash (/) character replaced with $$. For example, the key is microsoft-iis-configuration$$administrative
for the Microsoft-IIS-Configuration/Administrative log file.
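The key derivation described above (lowercase, with / replaced by $$) can be sketched as follows. The helper is illustrative and not part of the probe:

```python
def subsystem_key(log_file_name):
    """Derive a subsystem key from a log file name: lowercase the
    name and replace the slash (/) character with $$."""
    return log_file_name.lower().replace("/", "$$")

print(subsystem_key("Microsoft-IIS-Configuration/Administrative"))
# microsoft-iis-configuration$$administrative
```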
Subsystem Value
Defines a different alarm subsystem ID for each monitored log file. The recommendation is to use the default subsystem ID pattern
(2.1.2.X) for other log files too. This pattern is mandatory to view the metric details under the Event Log node of the Unified Management
Portal (UMP).
Note: You can also define an appropriate name for the newly defined subsystem value in the nas probe; otherwise, the
subsystem value is displayed as-is in UMP.
Status Tab
The Status tab lets you view the events of the log files that are selected for monitoring in the Setup > Properties tab. This tab displays the
latest event logs when the total event count is greater than the Maximum Events to Fetch field value. If the alarm list remains empty at
start-up, click the Refresh button to fetch the event list. You can control the default behavior for fetching events by configuring the Fetch
Alarms on Configurator Startup option in the Setup > Properties tab.
Important! The probe throws the Failed to get events error while fetching the event list when the event count is high, for example,
3000 or more. The actual event count varies with your system configuration and performance. In such a case, reduce the value of the
Maximum Events to Fetch field in the Properties tab.
The following right-click menu selections are available by right-clicking in the window:
Refresh - Fetches the event log messages again.
New profile - Creates a monitoring profile using values from the current event.
Exclude from monitoring - Creates an exclude profile using values from the current event.
Clear log - Removes all event messages from the current event log.
The posted message contains the following fields, where prefix is the configured field prefix:
prefixwatcher (Text)
prefixlog (Text)
prefixseverity (Text)
prefixseverity_str (Text): Event severity.
prefixsource (Text)
prefixcategory (Text)
prefixevent_id (Number)
prefixuser (Text)
prefixcomputer (Text)
prefixdescription (Text)
prefixdata (Text)
prefixtime_stamp_epoc (Number)
prefixtime_stamp (Date/Time)
Note: The parameters that are displayed depend on the number of variables created in the profile.
adevl Metrics
The following table describes the QoS metrics that can be configured using the Active Directory Events (adevl) probe.
Monitor Name: QOS_EVL_COUNT
Units: Count
Version: v1.1
The following list describes the metacharacters that you can use in regular expressions. The sample log file text in the examples is the
string Nimsoft12 IM is a CA product and close variants of it.
1. [ ] (square brackets): Matches any single character from the set inside the brackets. For example, a class such as [0-9] matches the
digits in Nimsoft12 IM is a CA product and in Nimsoft IM0123456 is a CA product.
2. - (dash): Defines a character range inside square brackets, for example [a-z] or [0-9].
3. ^ (circumflex or caret): Anchors the match to the start of the string. Inside square brackets, it negates the character set.
4. $ (dollar): Anchors the match to the end of the string.
5. . (period): Matches any single character.
6. ? (question mark): Matches zero or one occurrence of the preceding element.
7. * (asterisk): Matches zero or more occurrences of the preceding element. For example, an expression that makes the m in Nimsoft
repeatable matches Nimmsoft IM is a CA product, Nisoft IM is a CA product, and Nimsoft IM is a CA product.
8. + (plus or addition): Matches one or more occurrences of the preceding element. The same expression with + matches Nimmsoft IM
is a CA product and Nimsoft IM is a CA product, but not Nisoft.
9. {n}: Matches exactly n occurrences of the preceding element.
10. {n,m}: Matches at least n and at most m occurrences of the preceding element. For example, a pattern requiring at least two
occurrences of n does not match a sample where the count of n is 1, which is less than 2.
11. {n, }: Matches n or more occurrences of the preceding element.
12. \\ (escape sequence): Escapes a metacharacter so that it is matched literally. For example, the expression /\\\\Technologies/
matches \\Technologies in the target string Nimsoft IM is a CA\\Technologies product, and the expression /\\\Technologies/ matches
\Technologies in Nimsoft IM is a CA\Technologies product. Likewise, *\\\\Technologies* matches the first sample and
*\\\Technologies* matches the second.
13. \ (backslash): Introduces an escape sequence, for example to match "(" or ")" literally.
14. ( ) (parentheses): Groups elements so that quantifiers apply to the whole group. For example, (good){3} matches the target string
...indeed its support is goodgoodgood because good is repeated 3 times, but it does not match a string in which good occurs only once,
because (good){3} is not expanded to goodgoodgood there.
General:
[^<]*
Matches anything (or nothing) that is not a '<'. Useful for getting the text up to the start of the next tag without knowing what it is.
\"([^\"]*?)\"
Capture anything (or nothing) inside quotes. The data inside the parentheses is available for use in variables on the logmon probe. Use these
variables in alarm messages, validate values, and send QoS messages with the contents.
Note: The quotes are also required here and the quote must be escaped with the backslash \ character.
<A[^>]*?>
Matches an anchor tag up until the end of the tag. This pattern generalizes to other tags as required.
<\/A[^<]*?
Matches an end anchor tag and anything up until the start of the next tag.
Note: A slash / must be escaped with a backslash because it is a reserved character in regular expressions.
Note: For adding space, you can use \s on Solaris but it does not work on the Linux and Windows systems.
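These patterns behave the same way in any standard regular expression engine. The following sketch demonstrates them with Python's re module; the sample HTML string is invented for illustration:

```python
import re

sample = '<A HREF="http://example.com">link</A>'

# Capture anything (or nothing) inside escaped quotes, non-greedily.
quoted = re.search(r'\"([^\"]*?)\"', sample)
print(quoted.group(1))  # http://example.com

# Match an anchor tag up until the end of the tag.
anchor = re.search(r'<A[^>]*?>', sample)
print(anchor.group(0))  # <A HREF="http://example.com">

# Match an end anchor tag; the slash is escaped with a backslash.
end_tag = re.search(r'<\/A[^<]*?', sample)
print(end_tag.group(0))  # </A
```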
More information:
adogtw (ADO Gateway) Release Notes
Configure a Node
Add Connection
Manage Profiles
Delete Connection
Delete Profile
The ADO Database Gateway probe is configured to establish an ADO connection between the probe and the database. You can create a
connection using different types of database providers. You can also create profiles for monitoring database transactions.
Configure a Node
This procedure provides the information to configure a section within a node.
Each section within the node lets you configure properties of the probe for connecting to the database.
Follow these steps:
1. Navigate to the section within a node that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured.
Add Connection
You can create an ADO connection of the ADO Database Gateway probe for monitoring a database.
Follow these steps:
1. Click the Options icon next to the adogtw node in the navigation pane.
2. Select Add New Connection.
3. Update the field information and click Submit.
The new ADO connection is available for database monitoring and is visible under the adogtw node in the navigation pane.
Manage Profiles
You can add a profile of the ADO connection in the ADO Database Gateway probe for database monitoring.
Follow these steps:
1. Click Options next to the connection name node in the navigation pane.
2. Select Add New Profile.
3. Update the field information and click Submit.
The new monitoring profile is visible under the Metric or Publish node in the navigation pane.
Delete Connection
You can delete an ADO connection to stop database monitoring.
Follow these steps:
1. Click the Options icon next to the Connection-connection name node that you want to delete.
2. Select Delete Connection.
3. Click Save.
The ADO connection is deleted.
Delete Profile
You can delete a Metric or Publish profile of the connection to stop database monitoring.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Select Delete Profile.
3. Click Save.
The monitoring profile is deleted from the resource.
Note: This node is referred to as connection name node in the document and is user-configurable.
Metric Node
This node represents the Metric type profile for ADO Database Gateway probe. All custom Metric type profiles are displayed under this node.
This node does not contain any fields or sections. The Metric type profile generates the following QoS messages:
Query Response QoS
Row Count QoS
Value QoS
Navigation: adogtw > Connection-connection name > connection name > Metric
Note: This node is referred to as metric profile name node and is user-configurable.
Navigation: adogtw > Connection-connection name > connection name > Metric > metric profile name
Set or modify the following values as required:
metric profile name > General Setup
This section lets you configure the profile properties and set the timeout values.
Active: activates the monitoring profile.
Profile Name: identifies the profile name.
Connection Name: identifies the provider name.
Query Timeout: specifies the time limit for fetching the SQL query output.
Run Interval: specifies the time interval between two successive SQL Query executions.
Default: 10 minutes
Note: The Test option lets you verify if the SQL query runs successfully without displaying any database items or number of
rows returned.
metric profile name > Query Response QoS
This section lets you configure the threshold properties for generating QoS messages and alarms for the SQL query response time.
Severity: specifies the alarm severity.
Default: information
Message: defines the alarm message text.
Subsystem: specifies the alarm subsystem ID that defines the alarm source.
Default: 1.1.13-Database
Threshold: specifies the alarm threshold operator.
Threshold Value (ms): specifies the time interval for fetching SQL query output, exceeding which alarms are issued.
Note: Similarly, you can configure the Row Count QoS and Value QoS.
Note: This node is referred to as publish profile name node and is user-configurable.
Navigation: adogtw > Connection-connection name > connection name > Publish > publish profile name
Set or modify the following values as required:
publish profile name > General Setup
This section lets you configure the profile properties and set the timeout values. Refer to the General Setup section of the metric profile
name node for field description.
publish profile name > SQL Query Setup
This section lets you define the Simple Query for accessing database tables and reading data from them. The Test option under Actions
lets you execute the defined query.
publish profile name > Publish Message Subject
This section lets you configure the Subject field. This field defines the message content.
publish profile name > Publish Message Setup
This section lets you create a variable to map with the message.
Variable Name: defines a unique name in the message.
Variable Mapping: defines the variable value. If no mapping is given, then the column name is used as variable to map with the
message.
Publish Node
This node represents the Publish type profile for ADO Database Gateway probe. All custom Publish type profiles are displayed under this node.
This node does not contain sections or fields.
Navigation: adogtw > Connection-connection name > connection name > Publish
adogtw Node
This node lets you view the probe information and configure log properties of the probe. You can also add an ADO connection to access a
database.
Navigation: adogtw
Note: If you select ODBC as the connection type, then instead of Provider, you can select the Data Source Name (DSN).
DSN is a data structure that contains information about a database.
Initial Catalog: defines the database name.
Data Source: defines the database server.
User ID: defines the database user login ID.
Parameters: defines the additional parameters for establishing an ADO connection.
Probe Configuration
This section describes how to configure the adogtw probe. The adogtw probe contains three tabs:
Setup
Connections
Profiles
You can create four types of profiles - Alarm, Publish, QoS, and Subscribe.
2. Enter the name of the new profile in the Name box, select the profile type as Alarm, and click OK.
The New Alarm [profile name] dialog appears.
Alarm Definition
Publish Message
Quality of Service Definition
Subscribe
There are three different QoS types supported by the adogtw probe.
Query Response: Specifies how long (in milliseconds) it took to run the SQL query.
Row Count: Specifies how many rows the SQL query returned.
Value QoS: Sends the value of the selected column (must be a numeric value) returned by the SQL query.
You can send a QoS message and/or send an alarm if the threshold is breached.
The fields in the above dialog are explained below:
Send QoS Message on query response time
Enables or disables quality of service messages.
Send Alarm
Enables or disables alarm.
Severity
Specifies the severity of the alarm message.
Message
Specifies the alarm message text.
Subsystem
Specifies the alarm subsystem.
Threshold
Specifies the alarm condition and threshold value.
Note: All the fields in the remaining tabs, Row Count QoS and Value QoS, are the same as above, but the QoS generation criteria are different.
QOS_SQL_RESPONSE: Milliseconds
QOS_SQL_ROWS: Rows
QOS_SQL_VALUE: Value
3. Click the General tab and select the connection that you created.
4. Click the Subscribe tab and select Subject as nas_transaction and Table as AlarmTransactionLog.
5. Activate the profile, save it, and watch the table get filled.
adogtw Troubleshooting
adogtw Tips
The query
There are some rules that you should follow when you create a query:
1. Use column names in the query. For example, SELECT a, b FROM table1.
This makes it easier to determine the variables available in the profile. In this case $a and $b.
2. Try to limit the number of rows that are returned by the query. Receiving 12432 alarms every 5 minutes is excessive.
For example: SELECT a, b FROM table1 WHERE somedate < DATEADD(n,-10,GETDATE())
3. Use queries that return one row if you can. For example, SELECT count(*) as rows FROM table1.
Remember that each row returned by a query results in one alarm or one message.
Using column variables
It is possible to use variables in most fields in Alarm and Publish profiles. The number of variables available depends on the select statement
used in the query. Always use column names in the query. For example, SELECT a, b FROM table1. This makes it easier to determine the
variables available in the profile. In this case $a and $b.
The example above could in an Alarm profile result in a Message definition like this: $a contains $b. Each variable will be replaced with the
corresponding value from the select.
Using message variables
This applies to the Subscribe profile only. When you create a table that you are going to use to insert Nimsoft messages, try to use the same
names for the columns as the variables used in the message (PDS). You can use a sniffer tool (the hub) to find out what the message contains.
Some data types are treated specially:
1. Numbers that look like this: 1022649974 are most likely epoch values. This number corresponds to the date Wednesday, May 29, 2002
05:26:14. Epoch time is a system date used by computers and starts on Thursday, January 01, 1970 00:00:00 with the value 0. This value can
be inserted into the database as a datetime value.
2. Formats for number and decimal types must contain only one variable and that value must contain numbers only.
3. The string can contain anything, including multiple variables.
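For instance, the epoch conversion mentioned in the first rule can be verified with a short sketch (illustrative only):

```python
from datetime import datetime, timezone

# 1022649974 seconds after January 01, 1970 00:00:00 UTC (epoch 0).
epoch_value = 1022649974
moment = datetime.fromtimestamp(epoch_value, tz=timezone.utc)
print(moment.strftime("%A, %B %d, %Y %H:%M:%S"))
# Wednesday, May 29, 2002 05:26:14
```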
adogtw Metrics
The following table describes the checkpoint metrics that can be configured using the ADO Gateway Monitoring (adogtw) probe.
Monitor Name: QOS_SQL_RESPONSE, Units: Milliseconds, Version: v2.5
Monitor Name: QOS_SQL_ROWS, Units: Rows, Version: v2.5
Monitor Name: QOS_SQL_VALUE, Units: Value, Version: v2.5
aggregate_alarm
The aggregate_alarm probe lets you build an expression that includes one or more QoS metrics published by any CA UIM probe and a threshold
point for each QoS in the expression. When aggregate_alarm evaluates the expression, if it is true, it generates an alarm of the configured alarm
severity and alarm text.
More Information:
aggregate_alarm Release Notes
Note: An aggregate alarm's expression can have QoS metrics from a single or multiple sources. Therefore, an aggregate alarm is not
associated with a specific source.
The purpose of the aggregate_alarm probe is to allow you to generate a single event alarm based on multiple alarm conditions. Using this probe
lets you customize alarms based on your environment.
You can create an aggregate alarm with one or more QoS metrics that are generated by any CA UIM probe. Select the Publish Data option in a
probe's GUI for all QoS metrics included in an aggregate alarm expression. When this option is selected, the probe puts QoS messages on the
UIM message bus. The aggregate_alarm probe listens on the UIM message bus and gathers all QoS messages identified in all the aggregate
alarm conditions expressions. Probes must be publishing QoS messages for the conditions expression to be evaluated properly.
The following diagram shows how QoS messages from probes are used to generate an aggregate alarm that is based on multiple alarm
conditions.
Configuration Overview
The following diagram shows the process of creating an aggregate alarm.
Sample Configuration
As an example, to generate an aggregate alarm of minor severity (level 3) when Disk Usage for a specified QoS source and target exceeds 85
percent AND CPU Usage for a specified QoS source and target exceeds 50 percent, here are the actions to perform:
1. Using the Admin Console, access the desired probe configuration settings.
a. Access the cdm probe. For Disk Usage > Disk Usage (%), select the Publish Data check box.
b. For cdm probe Processor > Total CPU > Total CPU Usage, select the Publish Data check box.
2. In Admin Console, access the aggregate_alarm GUI and create a new aggregate alarm profile.
a. By default, the new aggregate alarm is Enabled. You can clear this check box and can enable it at a later time.
b. Select the Query UIM database check box (cleared is the default setting) to indicate that you want the QoS Name, Source, Target,
and Origin drop-down lists populated at the next Refresh Interval.
3. In the QoS Conditions table, click New and enter the QoS data for each QoS to be included in the aggregate alarm conditions
expression.
Each condition requires:
a unique identifier
the QoS name
the source host name or IP address included in the emitted QoS message
a target, which can be a host name, IP address, or valid identifier (such as a drive identifier C:\)
a threshold operator
the alarm threshold the aggregate_alarm uses to evaluate in the condition expression.
id1: QoS Name QOS_DISK_USAGE_PERC, Source QA-Test-VM, Target 10.1.1.1, Operator >, Threshold 85
id2: QoS Name QOS_CPU_USAGE, Source QA-Test-VM, Target 10.1.1.1, Operator >, Threshold 50
4. Enter the QoS condition expression the aggregate_alarm evaluates to determine if it needs to generate an alarm. Based on the QoS
conditions that are shown in the previous step, here's a sample expression that evaluates true when QOS_DISK_USAGE_PERC for the
entered source and target goes above 85 percent AND QOS_CPU_USAGE for the entered source and target goes above 50 percent.
id1&&id2
5. Select the alarm severity level for the aggregate alarm and enter the alarm text. Following the sample configuration, a minor alarm
(severity level 3) with the configured alarm text appears in Unified Service Manager when the QoS condition expression evaluates to true.
The aggregate alarm is now defined and will be active if the Enabled option is selected in the aggregate_alarm probe GUI.
Probes generate QoS messages at different intervals. For example, one probe might generate a QoS message every 15 minutes while another
probe generates a QoS message every hour. aggregate_alarm collects QoS messages from the UIM message bus for all QoS metrics added in
all QoS conditions tables and stores the data. When there is at least one collected QoS message for each QoS included in a conditions
expression, aggregate_alarm can start the expression evaluation process.
The aggregate_alarm evaluates the QoS messages included in each conditions expression to determine if the expression is true. When the
conditions expression is true, the aggregate_alarm sends an alarm of the configured severity with the configured alarm text. The generated alarm
and configured alarm text display in Infrastructure Manager and Unified Service Manager.
Note: An identifier may not begin with the character sequence QOS_, as this sequence is reserved for identifying QoS
values published on the UIM message bus.
Note: You can use parentheses to indicate the order of processing. For example, you can enter "id1 || (id2 || (id3&&id4))". This
expression evaluates true when the id3 AND id4 QoS conditions breach a configured threshold, or the id2 QoS condition
breaches a threshold, or the id1 QoS condition breaches a threshold.
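The evaluation of such a conditions expression can be sketched as follows. This is a hedged illustration of the logic only; the function names and the use of Python's eval are illustrative, not the probe's implementation:

```python
import operator

OPERATORS = {">": operator.gt, ">=": operator.ge,
             "<": operator.lt, "<=": operator.le, "=": operator.eq}

def condition_breached(value, op, threshold):
    """Check a single QoS condition against its threshold."""
    return OPERATORS[op](value, threshold)

def evaluate(expression, breaches):
    """Map && and || onto Python's and/or, then evaluate the
    expression with each identifier bound to its breach state."""
    expr = expression.replace("&&", " and ").replace("||", " or ")
    return eval(expr, {"__builtins__": {}}, dict(breaches))

# Using the sample conditions: id1 is disk usage > 85, id2 is CPU usage > 50.
breaches = {
    "id1": condition_breached(90, ">", 85),  # True: 90 > 85
    "id2": condition_breached(40, ">", 50),  # False: 40 is not > 50
}
print(evaluate("id1&&id2", breaches))  # False
print(evaluate("id1 || (id2 && id1)", breaches))  # True
```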
4. Click Submit.
Note: Depending on the size of your infrastructure, it may take several minutes for the drop-down lists to be populated with
data. While the data is populating, you can use Add Item (+ icon) to manually type a QoS Name, a Source, a Target, or an
Origin for a QoS. These fields are case-sensitive. Therefore, you must enter the string correctly to generate an aggregate
alarm. For example, "QoS CPU Usage" is incorrect, but "QOS_CPU_USAGE" is correct.
3. If the Query UIM Database option is selected, this interval indicates how often the data in the drop-down lists is updated.
Default: 60 minutes
4. Click Save.
Note: Deactivate and then Activate the probe to immediately start the process of populating the QoS Name, Source, Target,
and Origin drop-down lists. Otherwise, the process of populating the drop-down lists will occur at the next Refresh Interval.
Probe Interface
aggregate_alarm Probe Setup
Probe Information
Probe Setup
Aggregate Alarm General Settings
Aggregate Alarm Profile
QoS Conditions
Condition Expression
Alarm Configuration
Probe Interface
The probe interface is divided into a navigation pane and a details pane. The left navigation pane contains a hierarchical representation of the
existing aggregate alarm profiles. The right details pane shows how aggregate alarms are configured.
Probe Setup
When you manually add a QoS name, it must match the name displayed in the probe's GUI. When you manually enter a
source, target, or origin, it must match the information emitted in a QoS message. Otherwise, an aggregate alarm will not be
generated because the aggregate_alarm probe will be listening for QoS messages that don't exist.
Note: In environments with a large number of probes, it might take several minutes to populate drop-down lists. However,
selecting this option and then selecting values from the populated drop-down lists ensures that the QoS Name, Source, Target,
or Origin is correct.
Use the QoS Conditions table to define all the QoS metrics to be included in the aggregate alarm condition expression.
Identifier
A unique name to identify a condition, which includes a QoS name, a source, a target, an operator, and a threshold.
QoS Name
The exact name of a QoS. The name must match the name displayed in the probe's configuration pages. For example,
QOS_DISK_AVAILABLE is correct, but QoS_disk_Available is incorrect.
Source
The source host name or IP address emitted in a probe's QoS message.
If you manually enter a source using the Add Item (+ sign), the source data must match the source a probe emits in its QoS message.
The source is case sensitive. To determine a QoS source, you must look at a generated QoS message.
Default: No Selection
Target
The target emitted in a probe's QoS message.
If you manually enter a target using the Add Item (+ sign), the target data must match the target a probe emits in its QoS message. The
target is case sensitive. To determine a QoS target, you must look at a generated QoS message.
Default: No Selection
Note: The QoS name, source, and target must match what is emitted in a QoS message. Otherwise, the aggregate_alarm probe will
fail to generate an aggregate alarm. Incorrect QoS name, source, and target data results in the aggregate_alarm probe listening for the
wrong QoS messages.
Operator
Choose an operator for the alarm threshold.
> An alarm occurs when the metric is greater than the set threshold.
>= An alarm occurs when the metric is greater than or equal to the set threshold.
< An alarm occurs when the metric is less than the set threshold.
<= An alarm occurs when the metric is less than or equal to the set threshold.
= An alarm occurs when the metric is equal to the set threshold.
!= An alarm occurs when the metric is not equal to the set threshold.
Threshold
The user-defined threshold for the alarm. The number entered for this threshold should correspond to the units used for the QoS.
Condition Expression
The Expression field indicates the desired condition for generating an alarm.
Valid operators include:
&& - Evaluates true when both of the relational expressions on either side of the operator evaluate true.
|| - Evaluates true when either one or both of the relational expressions on either side of the operator evaluate true.
! - Inverts the result of the relational expression to the operator's right. For example, "! (id1 || id2)" inverts the result of the composite
expression contained within the parentheses, but "! id1 || id2" inverts only the result of the expression "id1".
Parentheses
Use parentheses to indicate the order in which the expression is evaluated.
For example, the expression "(id1 && id2) || id3" indicates the condition is true (meaning an alarm is generated) when the thresholds
set for both id1 and id2 are reached OR the threshold set for id3 is reached.
Alarm Configuration
Alarm Severity
The level of alarm generated for the aggregate alarm event.
Critical Level 5
Major Level 4
Minor Level 3
Warning Level 2
Information Level 1
Alarm Text
The alarm message on an emitted alarm.
Origin
A system-generated alarm origin or another origin you enter.
$ALARM
A reserved keyword that introduces an alarm definition you intend to emit when the expressions following the alarm definition evaluate
true.
Syntax: Reserved keywords are all uppercase and start with $. $ALARM is the only defined keyword.
alarm name (Alarm_Name1)
The name of the alarm.
Syntax: An upper or lower case letter followed by any combination of zero or more upper and lower case letters (a-z or A-Z), digits (0-9)
or underscores (_).
alarm level (4)
The alarm severity level.
Syntax: An integer between 1 and 5 inclusive.
1 = information level
2 = warning level
3 = minor level
4 = major level
5 = critical level
"alarm text" ("Alarm text message.")
The alarm message as it will appear in the alarm console. Beginning and ending quotes are required in the command but are removed
before sending the alarm.
"origin" ("originhubname")
The system-generated alarm origin or another origin you enter. To display the system-generated origin, leave the origin field blank by
setting the origin to a blank string (""). To display an origin you enter, enclose the origin string in quotes ("hub_name"). Beginning and
ending quotes are required for the origin string but are removed before sending the alarm.
identifier
A unique name that identifies a unique combination of the QoS triple values (QoS name, source, and target). Identifiers are
case-sensitive, meaning Id3 is different than id3. Windows drive identifiers entered as a target are also case sensitive (C:\ is different than
c:\). Windows drive identifiers for a target are not quoted (id3 : QOS_DISK_USAGE, "82dsmMhub5", C:\ is a valid expression).
Syntax: An upper or lower case letter followed by any combination of zero or more upper and lower case letters (a-z or A-Z), digits (0-9)
or underscores (_).
Note: Do not use the string QOS_ as the initial characters in an identifier. However, identifiers such as Qos_1, qos_1, or
QoS_1 are valid.
QoS name
The exact QoS name as it appears in a Qos message in UIM.
Syntax: The string QOS_ followed by any combination of at least one upper or lower case letter, digit, or underscore.
source ("hubsource3")
The host name, IP address, or valid identifier (such as a drive identifier C:\) of a source included in a QoS message.
target ("hubtarget3")
The host name, IP address, or valid identifier (such as a drive identifier C:\) of a target included in a QoS message.
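Combining the identifier, QoS name, source, and target syntax above, condition definitions might look like the following sketches (the host names and hub names are illustrative):

```text
id1:QOS_CPU_USAGE,"hubhost1","hubhost1"
id2:QOS_DISK_USAGE,"hubhost2", C:\
```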
operator
Operators used to compare QoS values to threshold constants or other QoS values.
Valid operators are:
== Operands are exactly equal.
!= Operands are not equal.
! Inverts the result of the relational expression to the operator's right. For example:
! (id31 >= 0.0 || id32 <= 0.0) // inverts the result of the composite expression contained within the parentheses
! id31 >= 0.0 && id32 <= 0.0 // inverts only the result of the expression "id31 >= 0.0"
Comments
The character sequence // begins a single-line comment. For example:
! (id31 >= 0.0 || id32 <= 0.0) // Text from here to the end of this line is discarded
// QOS_CPU_USAGE,"82dsmMhub5", d:\ >= 0
The character sequence /* begins a multi-line comment, and the character sequence */ ends a multi-line comment. Both the beginning
and ending multi-line sequences may be on the same line. For example:
/*
All of the text in this block is discarded during expression evaluation.
$ALARM Alarm1:1, "This is an alarm \"with quoted text\"", ""
id31:QOS_CPU_USAGE,"82dsmMhub3","82dsmMhub3"
id32:QOS_CPU_USAGE,"82dsmMhub4","82dsmMhub4"
id33:QOS_CPU_USAGE,"82dsmMhub5", c:\
Qos_1:QOS_CPU_USAGE,"82dsmMhub5", d:\
id31 >= -1.0 || id32 < 3.14 || id33 != 0.0
*/
! id31 >= 0.0 && id32 <= 0.0
Alarms are cleared when the conditions that originally caused an alarm to be published no longer exist.
IDENTIFIER:QOS_IDENTIFIER, QUOTED_TEXT, QUOTED_TEXT
IDENTIFIER:QOS_IDENTIFIER, QUOTED_TEXT, DRIVE_IDENTIFIER
IDENTIFIER:QOS_IDENTIFIER, DRIVE_IDENTIFIER, QUOTED_TEXT
IDENTIFIER:QOS_IDENTIFIER, DRIVE_IDENTIFIER, DRIVE_IDENTIFIER
Expression Evaluation
When the probe parses an expressions file, any referenced QOS values register themselves to be processed as they are transmitted across the
UIM message bus. Expressions are evaluated at the rate of the slowest occurring QOS. For example, say that QOS_1 is published at 5-minute
intervals, and QOS_2 is published at 1-hour intervals. An expression referencing only QOS_1 would evaluate each time QOS_1 is published as
soon as the probe sees QOS_1 on the UIM message bus. An expression referencing both QOS_1 and QOS_2 would evaluate each time QOS_1
and QOS_2 are published, so the expression would evaluate at QOS_2's publication rate. In addition, the expression evaluates using the latest
values of each referenced QOS. In this example, QOS_1 would be published 12 times for each time QOS_2 is published. The expression would
evaluate using the twelfth QOS_1 value and the first QOS_2 value. After successful evaluation, the referenced QOS values are cleared, so
subsequent evaluations would follow the same schedule as the first evaluation.
Examples
For the examples in this section, the aggregate_alarm probe collects the QoS messages included in each identifier string (for example,
id31:QOS_CPU_USAGE,"82dsmMhub3","82dsmMhub3","") from the UIM message bus. When the probe has collected the QoS messages
needed to evaluate an expression, it executes the evaluation. When the expression evaluates as true, the probe emits a single alarm of the
configured severity level and with the configured alarm text (for example, "This is an alarm.").
After expression evaluation, the aggregate_alarm probe clears any data values used in the evaluation instance, then resumes processing QoS
messages.
Aggregate Alarm With a Single Alarm Threshold
The aggregate_alarm probe collects the QoS messages with QOS_CPU_USAGE, "82dsmMhub3", "82dsmMhub3" from the UIM message bus.
All QOS_CPU_USAGE messages are evaluated against the condition expression (id31 >= -1.0). When the QoS message contains a value
greater than or equal to -1.0, the expression for Alarm1 evaluates true, and the probe emits a level 1 alarm with the text
"This is an alarm".
The aggregate_alarm probe collects the QoS messages with QOS_CPU_USAGE, "82dsmMhub3", "82dsmMhub3" and QOS_CPU_USAGE,
"82dsmMhub4", "82dsmMhub4" from the UIM message bus. When aggregate_alarm evaluates the QoS messages and conditions id31 >= -1.0
and id32 < 3.14 for Alarm1 are true, the probe emits a single level 1 alarm with the text "This is an alarm".
The aggregate_alarm probe collects the QoS messages with QOS_CPU_USAGE, "82dsmMhub3", "82dsmMhub3" and QOS_CPU_USAGE,
"82dsmMhub4", "82dsmMhub4" from the UIM message bus. When aggregate_alarm evaluates the QoS messages, and either condition id31 >=
-1.0 or id32 < 3.14 for Alarm1 are true, the probe emits a single level 1 alarm with the text "This is an alarm". Both expressions are evaluated,
even though it is not strictly necessary to evaluate id32 < 3.14 after id31 >= -1.0 has already evaluated true.
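An expressions file for this example might look like the following sketch, reconstructed from the values described above. The || operator joins the two conditions, so either condition being true is enough to emit the alarm:

```text
$ALARM Alarm1:1, "This is an alarm.", ""
id31:QOS_CPU_USAGE,"82dsmMhub3","82dsmMhub3"
id32:QOS_CPU_USAGE,"82dsmMhub4","82dsmMhub4"
id31 >= -1.0 || id32 < 3.14
```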
For this expression, parentheses are used in the condition expression to specify an evaluation order. The expression within the
parentheses (id32 < 3.14 || id33 != 0.0) is evaluated first. Without parentheses, the expression would be evaluated left to right.
The aggregate_alarm probe collects the QoS messages specified for id31, id32, and id33 from the UIM message bus. When aggregate_alarm
evaluates the QoS messages, and condition id32 < 3.14 or id33 !=0.0 or id31 >= -1 for Alarm1 is true, the probe emits a single level 1 alarm with
the text "This is an alarm".
More Information:
v1.0 aggregate_alarm AC Configuration
alarm_enrichment
The alarm_enrichment probe is a pre-processor probe for the nas probe. Alarm_enrichment attaches itself to a permanent queue and receives
alarm messages distributed by the Hub.
This probe is documented along with the nas probe.
apache AC Configuration
Configure the Apache HTTP Server Monitoring (apache) probe to monitor the status and performance of the Apache web server. You can add the
target Apache web server to the probe and can configure the required checkpoints.
The following diagram outlines the process to configure the probe to monitor Apache web server.
Verify Prerequisites
Create a Profile
Activate the Checkpoints
Alarm Thresholds
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see apache (Apache HTTP Server
Monitoring) Release Notes.
Create a Profile
Add the Apache web server host to this probe. After the host is added, the probe can connect to the Apache server and request the
necessary information.
Follow these steps:
1. Click the Options (icon) next to the apache node.
2. Select Add New Host option.
The General Profile Configuration dialog appears.
3. Set or modify the following values, as required:
Hostname or IP address: Specify the IP address or the hostname of the Apache web server system in the Hostname field
Active: Select Active to activate the profile and start monitoring the Apache server on profile creation.
Alarm Message: Select the alarm message to be generated, when the Apache web server host does not respond.
Override Default Suppression ID: Select this checkbox to override the default suppression ID with the specified suppression ID.
Suppression ID: (If Override Default Suppression ID checkbox is selected) Define the new suppression ID for filtering alarm
messages.
Server Address for HTTP Response and Server Status: Define the Apache web server address in the <server
address>/server-status?auto format. For example: www.apache.org/server-status?auto
Data Collection Interval: Specify the time interval for collecting data from the Apache web server.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: Select this checkbox to collect detailed information on the status (for example, detailed connection and request
information) from the Apache web server. The extended status information is displayed in the right pane, when selecting the Apache
Connection sub-node for the Apache server.
Default: Not selected
Note: Enabling the Extended Status option can result in increased server load.
Use SSL: Select this checkbox to allow the probe to use HTTPS to connect with the Apache web server.
Default: Not selected
Peer Verification: (If Use SSL checkbox is selected) Select this check box to enable peer verification. Peer is the certification
authority which issues the SSL certificates.
Default: Not selected
Certification Authority Bundle Path: (If Use SSL checkbox is selected) Specifies the certification bundle path for the SSL
verification.
Note: The bundle contains certificates of all the issuing authorities. For the verification of SSL certificates, the certification
bundle path is necessary.
Host Verification: Select this check box to enable the host verification. This option verifies whether the hostname matches the names
that are stored in the Apache web server certificate.
Default: Not selected
Host Verification Level: Select one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The host
verification checks if the IP address or host name points to the same Apache web server
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name
does not match the CN field, the session request gets rejected.
Note: Host Verification Level is enabled only when Host Verification is selected.
Activate the Checkpoints
After you create a profile, activate the required checkpoints to fetch monitoring data. The checkpoints allow you to generate alarms when the
specified thresholds are breached and also generate QoS data at specified intervals.
Note: The probe does not generate alarms even if Publish Alarms is selected.
4. Select the Publish Data option, if available, for generating QoS data for the monitor.
5. Click Save to apply these changes.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings to
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
apache Node
Profile-Host Name
Apache Server Node
Connection Node
apache Node
You can configure the general properties of the probe. These properties apply to all the monitoring checkpoints of an Apache web server.
Navigation: apache
apache > Probe Information
This section provides information about the probe name, version, start time, and the probe vendor.
apache > General Configuration
This section allows you to configure the log properties and timeout settings for the Apache HTTP Server Monitoring probe.
Log level: specifies the level of details that are written to the log file.
Default: 0 - Fatal
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Total Operation Timeout (Seconds): defines the time in seconds the probe waits to get a response from the server.
Default: 300
Connect Timeout (Seconds): defines the time limit for establishing a connection between the Apache web server and the probe.
Default: 10
apache > Message Pool: displays default alarm messages that are generated at the different error conditions.
Profile-Host Name
The Profile-host name node is used to configure the host name or the IP address of the system, where the Apache web server is deployed. This
node is displayed as a child node under the group name node.
Navigation: apache > Profile-host name
Profile-host name > Apache Host Information
This section is used to update the host name or IP address of the Apache HTTP server.
Navigation: apache> Options (icon)> Add New Host
The Add New Host option configures the properties of the monitored host. Each Apache host is displayed as a child node under the host
name node.
Navigation: apache > Profile-host name > host name > Application Server
Application Server > Host Configuration
This section configures the basic properties, required for the probe to connect and start a communication with the Apache host.
Hostname or IP address: defines the host name or IP address of the system where the Apache web server is deployed.
Alarm Message: specifies the alarm message to be generated when the Apache web server host does not respond.
Override Default Suppression ID: allows you to override the default suppression ID with the specified suppression ID.
Suppression ID: defines the new suppression ID to filter certain alarm messages. The defined suppression ID will override the default
suppression ID.
Server Address for HTTP Response and Server Status: defines the Apache web server address in the <server
address>/server-status?auto format.
Data Collection Interval: specifies the time interval for collecting data from the Apache web server.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: enables you to collect detailed information on the status (for example, detailed connection and request information)
from the Apache web server. This option works only if the Apache server configuration file is configured for providing the necessary
details.
Default: Not selected
Use SSL: allows the probe to use HTTPS to connect to the Apache web server securely.
Default: Not selected
Peer Verification: enables or disables the peer verification. Peer is the certification authority who issues the SSL certificates.
Default: Not selected
Certification Authority Bundle Path: specifies the certification bundle path for the SSL verification. The server path contains
certificates of all the issuing authorities. For the verification of SSL certificates, the certification bundle path is required.
Host Verification: enables or disables the host verification. This option verifies whether the hostname matches the names that are
stored in the server certificate.
Default: Not selected
Host Verification Level: specifies one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The verification
checks if the IP address or host name points to the same server
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name does
not match the CN field, the session request gets rejected.
Apache Server Node
The Apache Server node (default for each host) configures the checkpoints for the hosts being monitored. These checkpoints are classified
under the following nodes:
Connection
Connection Mode
ScoreBoard
ScoreBoard %
Server
Navigation: apache> Profile-localhost> localhost> Application Server> Apache Server
Notes:
Each category checkpoint is available under the specific category.
For more information on checkpoints, refer to the apache Metrics.
Connection Node
apache IM Configuration
Configure the Apache HTTP Server Monitoring (apache) probe to monitor the status and performance of the Apache web server. You can add the
target Apache web server to the probe and can configure the required checkpoints.
The following diagram outlines the process to configure the probe.
Contents
Verify Prerequisites
Create a Profile
Activate the Required checkpoints
Create Alarm Messages
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see apache (Apache HTTP Server
Monitoring) Release Notes.
Create a Profile
Add Apache web server to this probe to connect and request necessary information from that server.
Follow these steps:
1. Select an existing group. You can also create a new group for the host.
2. Click New Host button.
The General Profile Configuration dialog appears.
3. Set or modify the following values, as required:
Hostname or IP address: Specify the IP address or the hostname of the Apache web server system in the Hostname field
Active: Select Active to activate the profile and start monitoring the Apache server on profile creation.
Alarm Message: Select the alarm message to be generated, when the Apache web server host does not respond.
Override Default Suppression ID: Select this checkbox to override the default suppression ID with the specified suppression ID.
Suppression ID: Define the new Suppression ID for filtering alarm messages.
Notes:
If the Override Default Suppression ID option is selected, it is mandatory to enter the Suppression ID.
The Suppression ID textbox is enabled only if the Override Default Suppression ID option is selected.
Server Address for HTTP Response and Server Status: Define the apache web server address in the <server
address>/server-status?auto. For Example: www.apache.org/server-status?auto
Data Collection Interval: Specify the time interval for collecting data from the Apache web server.
Default: 5 minutes
Notes:
Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
This specified time interval overrules the Common minimum data collection interval that is defined on the Setup dialog.
Extended Status: Select this checkbox to collect detailed information on the status (for example, detailed connection and request
information) from the Apache web server. The extended status information is displayed in the right pane, when selecting the Apache
Connection sub-node for the Apache server.
Default: Not selected
Note: Enabling the Extended Status option can result in increased server load
Use SSL: Select this checkbox to allow the probe to use HTTPS to connect with the Apache web server.
Default: Not selected
Peer Verification: (If Use SSL checkbox is selected) Select this check box to enable peer verification. Peer is the certification
authority which issues SSL certificates.
Default: Not selected
Certification Authority Bundle Path: (If Use SSL checkbox is selected) Specifies the certification bundle path for the SSL
verification.
Information: The certification bundle path is required for the verification of SSL certificates. The bundle contains certificates
of all the issuing authorities.
Host Verification: Select this checkbox to enable the host verification. This option verifies whether the hostname matches the names
that are stored in the Apache web server certificate.
Default: Not selected
Host Verification Level: Select one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The host
verification checks if the IP address or host name points to the same Apache web server
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name does
not match the CN field, the session request gets rejected.
Note: A Host Verification Level is enabled only when Host Verification is selected.
4. Click the Test button to verify the connection between the host and the probe.
5. If the credentials are correct, a connection is established.
6. Click OK.
The profile is created successfully.
Show Summary Database Status: displays the database status of the monitored hosts in the right pane.
The graph displays the backlog and summary data, only of the first and the last date which are recorded in the database.
Create a New Group Folder: enables you to create a folder in the left window pane. Use the folder to group the monitored hosts
logically.
Create a New Host Profile: enables you to create a host to be monitored in the folder selected from the left pane.
Message Pool Manager: enables you to modify the alarm text. You can also create your own messages.
View Apache Summary: displays the Apache Summary Report for the selected host. The graph displays the HTTP response time and
the cache hits and misses per day.
Show/Hide checkpoints that are not available: enables you to hide or show the checkpoints or monitors that are not available for the
selected host.
General Setup
Click the General Setup button to open the Setup dialog. The Setup dialog has three tabs: General, Advanced, and Timeout.
General tab
This section allows you to configure the log level for the probe.
Log-level - specifies the level of details that are written to a log file.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Advanced Tab
Autorefresh GUI: Select this checkbox to automatically update the GUI to display the most recent measured values.
Maximum Summary Storage (used for local monitoring - will not affect QoS data): enables you to select the maximum data storage
time from the drop-down list. The monitored values in the selected time range are used to calculate the average value. The average is
used when setting the alarm threshold. For example, 24 hours means that only the values that are stored within the last 24 hours are
used.
Maximum concurrent threads: specifies the maximum number of profiles that the probe runs simultaneously. The valid range is 0 - 100.
Timeout Tab
Total operation timeout: defines the time (in seconds) the probe waits to get a response from the server.
Connect timeout: specifies the timeout (in seconds) the probe takes to connect with the server.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: allows you to collect extended status (including detailed connection and request information) from the Apache web
server. This option works only if the Apache server configuration file is configured for providing the necessary details.
Right-click in the pane to open a context menu to display the following options:
New Host: allows you to create a host for monitoring
Available only when a group is selected.
New Group: Enables you to create a folder in the left pane. Use the folder to group the monitored hosts logically.
Edit: allows you to modify the host properties.
Available only when a host is selected.
Delete: allows you to delete the selected host or group.
Rename: enables you to rename the selected group or host.
Refresh: displays the current values of the objects that are listed in the right pane.
Note: If you attempt to refresh a host that does not respond, the checkpoint description in the right pane appears in red.
The following icons appear in the right pane, indicating the status of the checkpoint:
Green: the monitor is active.
Red: indicates an error situation.
Black: the monitor is not active.
Yellow: the last measurement failed.
Right-clicking in the pane allows you to perform the following functions:
Edit: enables you to modify the properties of the selected monitor.
Activate: activates the enabled monitors monitored by the probe. You can also select the required monitors.
Deactivate: deactivates the selected monitor.
Monitor: displays the values that are recorded since the probe was started.
Note: The horizontal red line in the graph indicates the alarm threshold (in this case 90 percent) defined for the checkpoint.
If you click inside the graph, a red vertical line appears. Continue to hold the line to read the exact value at different points in the graph.
The value is displayed in the upper part of the graph in the format: <Day> <Time> <Value>
Note: Right-click in the monitor window to select the backlog or the time range. The horizontal blue line in the graph
represents the average sample value. The time range in the graph cannot be greater than the defined Maximum Data
storage time in the Setup dialog.
Contents
Note: The server-status module must be installed and configured on the Apache HTTP server to be monitored. The server-status
module allows the computer hosting the monitoring probe to access the server-status page.
Accessing the URL http://www.apache.org/server-status returns the verbose version of the status page. You can append "?auto" to the end of
the URL to access the less verbose version of the page; for example, http://www.apache.org/server-status?auto. The less verbose version is
more suitable for programmatic use.
Note: The less verbose version does not return connection level details. Therefore, it is not possible to monitor individual resources
using this option.
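As an illustration, the machine-readable output returned by the ?auto URL typically consists of key-value pairs similar to the following (the values shown are illustrative):

```text
Total Accesses: 1563
Total kBytes: 9174
CPULoad: .02
Uptime: 18925
ReqPerSec: .08
BytesPerSec: 496
BytesPerReq: 6010
BusyWorkers: 2
IdleWorkers: 8
Scoreboard: __W_K_______....
```

Each scoreboard character reflects the state of one worker slot (for example, W for sending a reply, K for keepalive), which is the data behind the probe's state-based checkpoints.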
While configuring the server-status page, you can restrict access to authorized users or computers (IP addresses). Restricting access
prevents intruders from obtaining this information.
The Extended Status module must be installed on the Apache HTTP server to retrieve detailed worker thread information. A worker thread is
used to handle individual requested resources. However, this option is not required for server-level monitoring.
Note: All official documentation for the Apache HTTP server is available online and can be found here: http://httpd.apache.org/docs.
ExtendedStatus On
<Location /server-status>
SetHandler server-status
Order Deny,Allow
Deny from all
Allow from .nimsoft.no
</Location>
Note: Replace .nimsoft.no in the preceding example with your domain (or part of it). The ExtendedStatus On in the preceding
example is optional and it is included only if you want to receive extended status. The extended status includes the detailed
connection and request information of the server.
4. Restart the server after updating the configuration file for activating the new configuration settings.
Use the command bin/apachectl -k restart or use the apache service monitor to restart the server.
Note: Select the ExtendedStatus on the probe GUI for each of the Apache servers.
apache Metrics
This article describes the metrics that can be configured for the Apache HTTP Server Monitoring (apache) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the QoS metrics that can be configured using the Apache HTTP Server Monitoring probe.
Metric Name | Units | Description | Version
QOS_APACHE_BUSYWORKERS | Count | - | 1.5
QOS_APACHE_BYTESPERREQ | Bytes/Seconds | - | 1.5
QOS_APACHE_CHILDAVEMBYTES | MBytes | - | 1.5
QOS_APACHE_CHILDMAXMBYTES | MBytes | - | 1.5
QOS_APACHE_CLOSINGCONNECTION | Count | - | 1.5
QOS_APACHE_CLOSINGCONNECTIONPCT | Percent | - | 1.5
QOS_APACHE_CONNAVEKBYTES | KBytes | - | 1.5
QOS_APACHE_CONNMAXKBYTES | KBytes | - | 1.5
QOS_APACHE_CPULOAD | Percent | - | 1.5
QOS_APACHE_DNSLOOKUP | Count | - | 1.5
QOS_APACHE_DNSLOOKUPPCT | Percent | - | 1.5
QOS_APACHE_GRACEFULLYFINISHING | Count | - | 1.5
QOS_APACHE_GRACEFULLYFINISHINGPCT | Percent | - | 1.5
QOS_APACHE_HTTPRESTIME | Milliseconds | - | 1.5
QOS_APACHE_HTTPRESVALUE | State | - | 1.5
QOS_APACHE_IDLECLEANUPOFWORKER | Count | - | 1.5
QOS_APACHE_IDLECLEANUPOFWORKERPCT | Percent | - | 1.5
QOS_APACHE_IDLEWORKERS | Count | - | 1.5
QOS_APACHE_KEEPALIVE | Count | - | 1.5
QOS_APACHE_KEEPALIVEPCT | Percent | - | 1.5
QOS_APACHE_LOGGING | Count | - | 1.5
QOS_APACHE_LOGGINGPCT | Percent | - | 1.5
QOS_APACHE_OPENSLOTNOCURRENTREQUEST | Count | - | 1.5
QOS_APACHE_OPENSLOTNOCURRENTREQUESTPCT | Percent | - | 1.5
QOS_APACHE_READINGREQUEST | Count | - | 1.5
QOS_APACHE_READINGREQUESTPCT | Percent | - | 1.5
QOS_APACHE_REQAVETIME | Millisecond | The average time required to process the most recent request from all current connections. | 1.5
QOS_APACHE_REQMAXTIME | Millisecond | - | 1.5
QOS_APACHE_REQPERSEC | Count/Second | - | 1.5
QOS_APACHE_SENDINGREPLY | Count | - | 1.5
QOS_APACHE_SENDINGREPLYPCT | Percent | - | 1.5
QOS_APACHE_SLOTAVEMBYTES | MBytes | The average value of the total megabytes transferred in this slot from all current connections. | 1.5
QOS_APACHE_SLOTMAXMBYTES | MBytes | - | 1.5
QOS_APACHE_STARTINGUP | Count | - | 1.5
QOS_APACHE_STARTINGUPPCT | Percent | - | 1.5
QOS_APACHE_SSAVETIME | Seconds | - | 1.5
QOS_APACHE_SSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATECSSAVETIME | Seconds | - | 1.5
QOS_APACHE_STATECSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATEDSSAVETIME | Seconds | - | 1.5
QOS_APACHE_STATEDSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATEKSSAVETIME | Seconds | K = Keepalive (read) | 1.5
QOS_APACHE_STATEKSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATELSSAVETIME | Seconds | L = Logging | 1.5
QOS_APACHE_STATELSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATERSSAVETIME | Seconds | R = Reading Request | 1.5
QOS_APACHE_STATERSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_STATEWSSAVETIME | Seconds | W = Sending Reply | 1.5
QOS_APACHE_STATEWSSMAXTIME | Seconds | - | 1.5
QOS_APACHE_WAITINGFORCONNECTION | Count | - | 1.5
QOS_APACHE_WAITINGFORCONNECTIONPCT | Percent | - | 1.5
Alert Metrics Default Settings
The following table describes the default alarm messages for the probe.

Message | Warning Threshold | Warning Severity | Error Threshold | Error Severity | Description
MsgAgentError | - | - | - | Critical | If the server is not running on $host, MsgAgentError is generated. The default value of the MsgAgentError is defined in the Message Properties dialog.
MsgWarning | - | Warning | - | - | If the checkpoint breaches the threshold, MsgWarning is generated. The default value of the MsgWarning is defined in the Message Properties dialog.
MsgError | - | - | - | Critical | If the checkpoint breaches the threshold, MsgError is generated. The default value of the MsgError is defined in the Message Properties dialog.
Verify Prerequisites
Create a Profile
Activate the Checkpoints
Alarm Thresholds
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see apache (Apache HTTP Server
Monitoring) Release Notes.
Create a Profile
Add an Apache web server host to this probe. After the host is added, the probe can connect to the Apache server and request the necessary information from it.
Follow these steps:
1. Click the Options (icon) next to the apache node.
2. Select the Add New Host option.
The General Profile Configuration dialog appears.
3. Set or modify the following values, as required:
Hostname or IP address: Specify the IP address or the hostname of the Apache web server system in the Hostname field.
Active: Select Active to activate the profile and start monitoring the Apache server on profile creation.
Alarm Message: Select the alarm message to be generated when the Apache web server host does not respond.
Override Default Suppression ID: Select this checkbox to override the default suppression ID with the specified suppression ID.
Suppression ID: (If Override Default Suppression ID checkbox is selected) Define the new suppression ID for filtering alarm
messages.
Server Address for HTTP Response and Server Status: Define the Apache web server address in the <server
address>/server-status?auto format. For example: www.apache.org/server-status?auto
Data Collection Interval: Specify the time interval for collecting data from the Apache web server.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: Select this checkbox to collect detailed information on the status (for example, detailed connection and request
information) from the Apache web server. The extended status information is displayed in the right pane, when selecting the Apache
Connection sub-node for the Apache server.
Note: Enabling the Extended Status option can result in increased server load.
Use SSL: Select this checkbox to allow the probe to use HTTPS to connect with the Apache web server.
Default: Not selected
Peer Verification: (If Use SSL checkbox is selected) Select this check box to enable peer verification. Peer is the certification
authority which issues the SSL certificates.
Default: Not selected
Certification Authority Bundle Path: (If Use SSL checkbox is selected) Specifies the certification bundle path for the SSL
verification.
Note: The bundle contains certificates of all the issuing authorities. For the verification of SSL certificates, the certification
bundle path is necessary.
Host Verification: Select this check box to enable the host verification. This option verifies whether the hostname matches the names
that are stored in the Apache web server certificate.
Default: Not selected
Host Verification Level: Select one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The host
verification checks whether the IP address or host name points to the same Apache web server.
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name
does not match the CN field, the session request is rejected.
Note: Host Verification Level is enabled only when Host Verification is selected.
Note: The probe does not generate alarms even if Publish Alarms is selected.
4. Select the Publish Data option, if available, for generating QoS data for the monitor.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
apache Node
Profile-Host Name
Apache Server Node
Connection Node
apache Node
You can configure the general properties of the probe. These properties apply to all the monitoring checkpoints of an Apache web server.
Navigation: apache
apache > Probe Information
This section provides information about the probe name, version, start time, and the probe vendor.
apache > General Configuration
This section allows you to configure the log properties and timeout settings for the Apache HTTP Server Monitoring probe.
Log level: specifies the level of details that are written to the log file.
Default: 0 - Fatal
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Total Operation Timeout (Seconds): defines the time in seconds the probe waits to get a response from the server.
Default: 300
Connect Timeout (Seconds): defines the time limit for establishing a connection between the Apache web server and the probe.
Default: 10
apache > Message Pool: displays the default alarm messages that are generated under different error conditions.
Profile-Host Name
The Profile-host name node is used to configure the host name or the IP address of the system, where the Apache web server is deployed. This
node is displayed as a child node under the group name node.
Navigation: apache > Profile-host name
Profile-host name > Apache Host Information
This section is used to update the host name or IP address of the Apache HTTP server.
Navigation: apache > Options (icon) > Add New Host
The Add New Host option configures the properties of the monitored host. Each Apache host is displayed as a child node under the host name node.
Navigation: apache > Profile-host name > host name > Application Server
Application Server > Host Configuration
This section configures the basic properties, required for the probe to connect and start a communication with the Apache host.
Hostname or IP address: defines the host name or IP address of the system where the Apache web server is deployed.
Alarm Message: specifies the alarm message to be generated when the Apache web server host does not respond.
Override Default Suppression ID: allows you to override the default suppression ID with the specified suppression ID.
Suppression ID: defines the new suppression ID to filter certain alarm messages. The defined suppression ID will override the default
suppression ID.
Server Address for HTTP Response and Server Status: defines the Apache web server address in the <server
address>/server-status?auto format.
Data Collection Interval: specifies the time interval for collecting data from the Apache web server.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: enables you to collect detailed information on the status (for example, detailed connection and request information)
from the Apache web server. This option works only if the Apache server configuration file is configured for providing the necessary
details.
Default: Not selected
Use SSL: allows the probe to use HTTPS to connect to the Apache web server securely.
Default: Not selected
Peer Verification: enables or disables the peer verification. Peer is the certification authority that issues the SSL certificates.
Default: Not selected
Certification Authority Bundle Path: specifies the certification bundle path for the SSL verification. The server path contains
certificates of all the issuing authorities. For the verification of SSL certificates, the certification bundle path is required.
Host Verification: enables or disables the host verification. This option verifies whether the hostname matches the names that are
stored in the server certificate.
Default: Not selected
Host Verification Level: specifies one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The verification
checks whether the IP address or host name points to the same server.
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name does
not match the CN field, the session request is rejected.
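The Loose/Strict distinction described above corresponds to the standard TLS hostname-check switch. A minimal sketch using Python's ssl module, as an analogy for the behavior rather than the probe's own implementation:

```python
import ssl

# Strict: the peer certificate chain is verified, and the certificate's
# CN/SAN must match the host name we connected to, or the session is
# rejected. (A CA bundle path could be supplied via cafile=... here.)
strict = ssl.create_default_context()
strict.check_hostname = True
strict.verify_mode = ssl.CERT_REQUIRED  # peer verification on

# Loose: the certificate chain is still verified against the CA store,
# but the host name is not compared against the certificate.
loose = ssl.create_default_context()
loose.check_hostname = False
loose.verify_mode = ssl.CERT_REQUIRED

print(strict.check_hostname, loose.check_hostname)  # -> True False
```

Either context would then be passed to the socket or HTTPS connection that fetches the server-status page.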
Apache Server Node
The Apache Server node (default for each host) configures the checkpoints for the hosts being monitored. These checkpoints are classified
under the following nodes:
Connection
Connection Mode
ScoreBoard
ScoreBoard %
Server
Navigation: apache > Profile-localhost > localhost > Application Server > Apache Server
Notes:
Each category checkpoint is available under the specific category.
For more information on checkpoints, refer to the apache Metrics.
Connection Node
Contents
Verify Prerequisites
Create a Profile
Activate the Required checkpoints
Create Alarm Messages
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see apache (Apache HTTP Server
Monitoring) Release Notes
Create a Profile
Add an Apache web server host to this probe so that the probe can connect to that server and request the necessary information.
Follow these steps:
1. Select an existing group. You can also create a new group for the host.
2. Click New Host button.
The General Profile Configuration dialog appears.
3. Set or modify the following values, as required:
Hostname or IP address: Specify the IP address or the hostname of the Apache web server system in the Hostname field
Active: Select Active to activate the profile and start monitoring the Apache server, on profile creation.
Alarm Message: Select the alarm message to be generated, when the Apache web server host does not respond.
Override Default Suppression ID: Select this checkbox to override the default suppression ID with the specified suppression ID.
Suppression ID: Define the new Suppression ID for filtering alarm messages.
Notes:
If the Override Default Suppression ID option is selected, it is mandatory to enter the Suppression ID.
The Suppression ID textbox is enabled only if the Override Default Suppression ID option is selected.
Server Address for HTTP Response and Server Status: Define the Apache web server address in the <server
address>/server-status?auto format. For example: www.apache.org/server-status?auto
Data Collection Interval: Specify the time interval for collecting data from the Apache web server.
Default: 5 minutes
Notes:
Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
This specified time interval overrules the Common minimum data collection interval that is defined on the Setup dialog.
Extended Status: Select this checkbox to collect detailed information on the status (for example, detailed connection and request
information) from the Apache web server. The extended status information is displayed in the right pane, when selecting the Apache
Connection sub-node for the Apache server.
Default: Not selected
Note: Enabling the Extended Status option can result in increased server load.
Use SSL: Select this checkbox to allow the probe to use HTTPS to connect with the Apache web server.
Default: Not selected
Peer Verification: (If Use SSL checkbox is selected) Select this check box to enable peer verification. Peer is the certification
authority which issues SSL certificates.
Default: Not selected
Certification Authority Bundle Path: (If Use SSL checkbox is selected) Specifies the certification bundle path for the SSL
verification.
Information: The certification bundle path is required for the verification of SSL certificates. The bundle contains certificates
of all the issuing authorities.
Host Verification: Select this checkbox to enable the host verification. This option verifies whether the hostname matches the names
that are stored in the Apache web server certificate.
Default: Not selected
Host Verification Level: Select one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The host
verification checks whether the IP address or host name points to the same Apache web server.
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name does
not match the CN field, the session request is rejected.
Note: Host Verification Level is enabled only when Host Verification is selected.
4. Click the Test button to verify the connection between the host and the probe.
If the credentials are correct, a connection is established.
5. Click OK.
The profile is created successfully.
View Apache Summary: displays the Apache Summary Report for the selected host. The graph displays the HTTP response time and
the cache hits and misses per day.
Show/Hide checkpoints that are not available: enables you to hide or show the checkpoints or monitors that are not available for the
selected host.
General Setup
Click the General Setup button to open the Setup dialog. The General Setup tab has three sections: General, Advanced, and Timeout.
General Tab
This section allows you to configure the log level for the probe.
Log level: specifies the level of details that are written to a log file.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Advanced Tab
Autorefresh GUI: Select this checkbox to automatically update the GUI to display the most recent measured values.
Maximum Summary Storage (used for local monitoring - will not affect QoS data): enables you to select the maximum data storage
time from the drop-down list. The monitored values in the selected time range are used to calculate the average value. The average is
used when setting the alarm threshold. For example, 24 hours means that only the values that are stored within the last 24 hours are
used.
Maximum concurrent threads: specifies the maximum number of profiles that the probe runs simultaneously. The valid range is 0 - 100.
Timeout Tab
Total operation timeout: defines the time (in seconds) the probe waits to get a response from the server.
Connect timeout: specifies the timeout (in seconds) the probe takes to connect with the server.
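The Maximum Summary Storage setting described above averages only the samples stored within the selected time range. A sketch of that windowed average (the class name and timings are illustrative, not the probe's code):

```python
from collections import deque

class SummaryWindow:
    """Keep (timestamp, value) samples and average only those inside the
    configured storage window, mirroring the Maximum Summary Storage
    behavior: e.g. 24 hours means only values stored within the last
    24 hours contribute to the average used for alarm thresholds."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.samples = deque()  # (timestamp, value), oldest first

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))

    def average(self, now):
        # Drop samples older than the window before averaging.
        while self.samples and now - self.samples[0][0] > self.max_age:
            self.samples.popleft()
        if not self.samples:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)

w = SummaryWindow(24 * 3600)     # 24-hour window, in seconds
w.add(0, 10)                     # ages out of the window below
w.add(80000, 20)
w.add(86000, 40)
print(w.average(now=90000))      # -> 30.0 (only the last two samples)
```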
Create a New Host
Host Name or IP address: defines the host name or IP address of the system where the Apache web server is deployed.
Alarm Message: specifies the alarm message to be generated when the Apache web server host does not respond.
Override Default Suppression ID: allows you to override the default suppression id with the specified suppression id.
Suppression ID: defines the new suppression id overriding the default suppression ID to filter certain alarm messages.
Server Address for HTTP Response and Server Status: defines the Apache web server address in the <server
address>/server-status?auto format.
Data Collection Interval: specifies the time interval for collecting data from the Apache web server.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Extended Status: allows you to collect extended status (including detailed connection and request information) from the Apache web
server. This option works only if the Apache server configuration file is configured for providing the necessary details.
Default: Not selected
Use SSL: allows the probe to use HTTPS to connect to the Apache web server.
Default: Not selected
Peer Verification: enables or disables the peer verification. Peer is the certification authority that issues the SSL certificates.
Default: Not selected
Certification Authority Bundle Path: specifies the certification bundle path for the SSL verification. The path contains certificates of all
the issuing authorities. For the verification of SSL certificates, the certification path is required.
Host Verification: enables or disables the host verification, which verifies the hostname with the names that are stored in the server
certificate.
Default: Not selected
Host Verification Level: specifies one of the following levels for verifying a host:
Loose: The host name is not verified against the CN (Common Name) attribute appearing in the SSL certificate. The verification
checks if the IP address or host name points to the same server
Strict: The host name is verified against the CN (Common Name) attribute appearing in the SSL certificate. If the host name does not
match the CN field, the session request gets rejected.
The Left Pane
The left pane displays the various groups and hosts that belong to a group. Each host has two sub-nodes, which are displayed in the left pane:
Apache Server
Displays all checkpoints that include the measured values displayed on the server status.
Apache Connection
Displays the extended status information (includes the detailed connection and request information) from the server.
The Apache Connection node also displays the connection status for the host as follows:
Indicates that the host responds.
Indicates that the host does not respond.
Initializing, waiting for measured values from the profile.
Indicates that the host responds, but has no monitored interfaces.
Activate the monitors that you want to use. Click the Show/Hide checkpoints that are not available checkbox.
Right-click in the pane to open a context menu to display the following options:
New Host: allows you to create a host for monitoring
Available only when a group is selected.
New Group: Enables you to create a folder in the left pane. Use the folder to group the monitored hosts logically.
Edit: allows you to modify the host properties.
Available only when a host is selected.
Delete: allows you to delete the selected host or group.
Rename: enables you to rename the selected group or host.
Refresh: displays the current values of the objects that are listed in the right pane.
Note: If you attempt to refresh a host that does not respond, the checkpoint description in the right pane appears in red.
The right pane of the probe displays the following options, depending on the selection in the navigation pane:
Hosts: displays the host in a group that is selected in the left-pane.
All checkpoints: displays the checkpoint or monitors of a host that is selected in the left-pane.
The following icons appear in the right pane, indicating the status of the checkpoint:
Green: the monitor is active.
Red: indicates an error situation.
Black: the monitor is not active.
Yellow: the last measurement failed.
Right-clicking in the pane allows you to perform the following functions.
Note: The horizontal red line in the graph indicates the alarm threshold (in this case 90 percent) defined for the checkpoint.
If you click inside the graph, a red vertical line appears. Continue to hold the line to read the exact value at different points in the graph.
The value is displayed in the upper part of the graph in the format: <Day> <Time> <Value>
Note: Right-click in the monitor window to select the backlog or the time range. The horizontal blue line in the graph
represents the average sample value. The time range in the graph cannot be greater than the defined Maximum Data
storage time in the Setup dialog.
The apache summary report displays the HTTP response time for the selected period. You can select a graph from the drop-down that displays
the following values for the selected time period.
Per hour: displays the values of one hour for the period that is selected using the From and To fields.
Per day: displays the values of one day for the period that is selected using the From and To fields.
Last day: displays the values of one hour of the last day.
Last month: displays the values of one hour of the last month.
Show Summary Database Status
The Show Summary Database Status button displays a summarized view of the database status of the monitored hosts. The list displays the
first and the last day summary data that are recorded in the database. The data also displays the number of records in the period.
Delete Summary Data: allows you to delete the database of the selected or complete time period.
The Checkpoint Monitoring Properties
The Checkpoint Monitor Properties dialog for a checkpoint or monitor, enables you to modify the monitoring properties for the monitor.
Description: provides additional information of the checkpoint.
Monitoring Object: defines the name of the monitoring object.
Enable Monitoring: activates the monitoring of the checkpoint.
Value
Last/Current Sample
Uses the last measured value to compare with the specified threshold value.
Compute Average
Computes the average of the measured values in the selected time interval. Select one of the predefined intervals from the drop-down
list, or type your own value in the field.
Operator: specifies the operator to use when defining the alarm threshold.
Examples:
= 90 means generate an alarm if the measured value is 90.
=> 90 means generate an alarm if the measured value is equal to or above 90.
Threshold Value: specifies the alarm threshold value.
Unit: specifies the unit of the monitored value. For example, %, Mbytes.
Message Token: allows you to select the alarm message to be generated if the specified threshold value is breached.
Publish Quality of Service (QoS): allows you to generate QoS data.
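The Operator and Threshold Value fields above combine into a simple breach test. A sketch of that evaluation (the = and => forms follow the examples above; the remaining operators are assumptions added for completeness):

```python
def breaches(value, operator, threshold):
    """Return True if the measured value breaches the alarm threshold
    for the given operator string, as used in the checkpoint dialog."""
    ops = {
        "=":  lambda v, t: v == t,   # alarm if the value equals t
        "=>": lambda v, t: v >= t,   # alarm if equal to or above t
        "=<": lambda v, t: v <= t,   # assumed: equal to or below t
        ">":  lambda v, t: v > t,    # assumed: strictly above t
        "<":  lambda v, t: v < t,    # assumed: strictly below t
    }
    return ops[operator](value, threshold)

# => 90 means generate an alarm if the measured value is equal to or above 90.
print(breaches(92, "=>", 90))  # -> True
print(breaches(89, "=>", 90))  # -> False
```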
Message Pool Manager
The Message Pool Manager displays the list of alarm messages that are available in the probe. You can also create new messages or edit existing ones.
Identification Name: defines the name of the alarm message.
Token: identifies the predefined alarms.
Error Alarm Text: allows you to specify the alarm message text, which is generated if the defined threshold is breached.
Clear Alarm Text: allows you to specify the clear alarm message text.
Severity: specifies the severity level of the alarm message.
Subsystem: identifies the alarm subsystem ID that defines the alarm source.
Related topics:
apmgtw (CA APM Gateway) Release Notes
Overview
Set Up the apmgtw Probe
Define the Probes and Corresponding QoS Measurements for Reporting
Verify apmgtw Probe Function in CA APM
View Reports in CA APM
Supported CA Unified Infrastructure Management Probes
Overview
1. Install the probe only on a robot on a primary hub.
2. Configure the probe in the Admin Console UI.
You can configure the apmgtw probe to send QoS messages from those probes that you select.
Follow these steps:
1. In Admin Console, select the apmgtw probe> Configure.
In the navigation pane, each probe for which you can configure the QoS messages is listed as a separate node.
2. Select the node that corresponds to the probe for which you want to configure the QoS messages.
3. Check or clear the box next to the Enable QoS publishing field to enable or disable QoS message publishing for the probe.
4. In the detail pane below the Enable QoS publishing field for each probe, select which QoS messages you want to enable for the probe
by checking the appropriate boxes.
5. Repeat the previous steps until all the probes and their QoS messages are configured.
6. Click Save.
7. Deactivate and activate the probe for the changes to take effect.
You configured the apmgtw probe to send QoS messages from the desired probes.
This section describes the configuration of the CA APM Gateway (apmgtw) probe in CA Unified Infrastructure Management.
Contents
Overview
Specify the CA APM Enterprise Manager Server
Specify the Port Number of the CA APM Enterprise Manager Server
Set the Probe Log Level
Define Included Origins for the Probe
Specify Included Hosts for the Probe
Display or Hide Probe Origin
Set Expiration for Publishing Data
Set Prefetch Hosts Value
PrimaryHub Host
Manage QoS Messages
Define the Probes and Corresponding QoS Measurements for Reporting
Supported CA Unified Infrastructure Management Probes
Overview
1. In Infrastructure Manager, double-click or right-click the apmgtw probe> Configure.
2. The configuration window opens.
In the configuration window, the following Probe Configurations can be set:
Note: The default value is 3; the minimum value is 1 and the maximum value is 5. The maximum value yields the most detail.
Note: If you define included origins, the probe fetches metrics from only those origins.
Note: If you specify included hosts, then the apmgtw probe sends QoS messages from only those hosts.
3. (Optional) If you select true, enter one or more origin (hub) names separated by a comma.
You set the visibility of the origin.
2. In the configuration window Expiration field, enter the desired value in minutes.
Note: The expiration applies to all QoS messages that the apmgtw probe is configured to send. The probe stops sending QoS
messages to APM after this expiration time is met. However, if a UIM probe under monitor starts sending a QoS message again after
this, the apmgtw probe automatically resumes sending that data to CA APM.
PrimaryHub Host
You can specify the Primary Hub host name of the CA Unified Infrastructure Management Infrastructure Manager. This field is optional when the
apmgtw probe is deployed on a robot on the Primary Hub. This field is mandatory when the apmgtw probe is deployed on a robot on a
Secondary Hub.
You can specify the primary hub host value.
Follow these steps:
1. In Infrastructure Manager, double click or right click the apmgtw probe.
2. In the configuration window, in the PrimaryHub Host field, enter the name or IP Address of the primary hub.
2. Click Config QoS.
Note: Every time you click Config QoS, the probes and QoS present in Nimbus at that instance are fetched and shown in the
ShowQoS window.
3. In the resultant ShowQoS window, each probe is shown as a tab in the window. All the QoS available in Nimbus for each probe are listed
under each of the tabs.
4. Select the probe that you want the apmgtw probe to monitor by clicking the appropriate tab.
5. Under the selected tab, select the first checkbox item in the list, which is Enable QoS publishing for <Probe Name> probe.
6. Select the QoS for this probe by selecting the checkboxes of the QoS messages.
7. Repeat the previous steps until all the probes and their QoS messages are configured.
8. Click the Save button in the ShowQoS window.
9. Click the Save button in the Config UI.
10. Click Yes when the probe asks for a restart.
The apmgtw probe then reports on any metric that matches the format, and includes the target.
In this case, the probe reports on any QOS_MEMORY_USAGE metric from the vmware probe that has a target containing the string
'MemoryOverallUsage':
You configured a metric so that it includes a target.
format.cdm.QOS_CPU_USAGE={host}|CPU Usage:{target:/:1}
format.cdm.QOS_CPU_USAGE={host}|CPU Usage:{source:-:0}
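The two format lines above suggest a {name} / {name:separator:index} placeholder syntax: split the named QoS field on the separator and keep the indexed token. A hypothetical expansion of that syntax, inferred from the examples rather than from a documented grammar:

```python
import re

def expand(fmt, fields):
    """Expand {name} and {name:sep:idx} placeholders using QoS message
    fields. The split form is inferred from the examples above, e.g.
    {target:/:1} = token 1 of the target field split on '/'."""
    def repl(match):
        name, sep, idx = match.group(1), match.group(2), match.group(3)
        value = fields[name]
        if sep is not None:
            value = value.split(sep)[int(idx)]
        return value
    return re.sub(r"\{(\w+)(?::(.):(\d+))?\}", repl, fmt)

# Hypothetical field values for illustration only.
fields = {"host": "web01",
          "target": "total/MemoryOverallUsage",
          "source": "web01-prod"}

print(expand("{host}|CPU Usage:{target:/:1}", fields))
# -> web01|CPU Usage:MemoryOverallUsage
```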
Note: Set the probe to inactive before you run the console.bat utility.
This section describes the configuration of the CA APM Gateway (apmgtw) probe in CA Unified Infrastructure Management.
Contents
Overview
Specify the CA APM Enterprise Manager Server
Specify the Port Number of the CA APM Enterprise Manager Server
Set the Probe Log Level
Define Included Origins for the Probe
Specify Included Hosts for the Probe
Display or Hide Probe Origin
Set Expiration for Publishing Data
Set Prefetch Hosts Value
PrimaryHub Host
Manage QoS Messages
Define the Probes and Corresponding QoS Measurements for Reporting
Supported CA Unified Infrastructure Management Probes
Overview
1. In Infrastructure Manager, double-click or right-click the apmgtw probe> Configure.
2. The configuration window opens.
In the configuration window, the following Probe Configurations can be set:
Note: The default value is 3; the minimum value is 1 and the maximum value is 5. The maximum value yields the most detail.
Note: If you define included origins, the probe fetches metrics from only those origins.
Note: If you specify included hosts, then the apmgtw probe sends QoS messages from only those hosts.
3. (Optional) If you select true, enter one or more origin (hub) names separated by a comma.
You set the visibility of the origin.
1.
Note: The expiration applies to all QoS messages that the apmgtw probe is configured to send. The probe stops sending QoS
messages to APM after this expiration time is met. However, if a UIM probe under monitor starts sending a QoS message again after
this, the apmgtw probe automatically resumes sending that data to CA APM.
PrimaryHub Host
You can specify the Primary Hub host name of the CA Unified Infrastructure Management Infrastructure Manager. This is optional when the
apmgtw probe is deployed on a robot present in Primary Hub. This is mandatory field, when apmgtw probe is deployed on a robot present in a
Secondary Hub.
You can specify the primary hub host value.
Follow these steps:
1. In Infrastructure Manager, double click or right click the apmgtw probe.
2. In the configuration window, in the PrimaryHub Host field, enter the name or IP Address of the primary hub.
2.
Note: Every time you click Config QoS, the probes and QoS present in Nimbus at that instance are fetched and shown in the
ShowQoS window.
3. In the resultant ShowQoS window, each probe is shown as a tab in the window. All the QoS available in Nimbus for each probe are listed
under each of the tabs.
4. Select the probe which apmgtw probe wants to monitor by clicking on the appropriate tab.
5. Under the selected tab, select the first checkbox item in the list, which is Enable QoS publishing for <Probe Name> probe.
6. Select the QoS for this probe by selecting the checkboxes of the QoS messages.
7. Repeat the previous steps until all the probes and their QoS messages are configured.
8. Click Save button in the ShowQoS window.
9. Click Save button in Config UI.
10. Click Yes when the probe asks for restart.
The apmgtw probe then reports on any metric that matches the format and includes the target.
In this example, the probe reports on any QOS_MEMORY_USAGE metric from the vmware probe whose target contains the string
'MemoryOverallUsage'. You have now configured a metric so that it includes a target:
format.cdm.QOS_CPU_USAGE={host}|CPU Usage:{target:/:1}
format.cdm.QOS_CPU_USAGE={host}|CPU Usage:{source:-:0}
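The {field:separator:index} tokens in these format strings can be read as "split the field value on the separator and keep the piece at that index". A minimal sketch of that expansion, assuming split-and-index semantics with zero-based indices (illustrative only, not the apmgtw probe's actual parser):

```python
import re

def expand(fmt: str, msg: dict) -> str:
    """Expand {field} and {field:sep:index} tokens against a QoS message dict."""
    def repl(m):
        field, sep, idx = m.group("field"), m.group("sep"), m.group("idx")
        value = str(msg.get(field, ""))
        if sep is not None:                      # {field:sep:index} form
            return value.split(sep)[int(idx)]
        return value                             # plain {field} form
    token = re.compile(r"\{(?P<field>\w+)(?::(?P<sep>[^:{}]+):(?P<idx>\d+))?\}")
    return token.sub(repl, fmt)

msg = {"host": "web01", "target": "esx1/MemoryOverallUsage", "source": "dc-east"}
print(expand("{host}|CPU Usage:{target:/:1}", msg))   # web01|CPU Usage:MemoryOverallUsage
print(expand("{host}|CPU Usage:{source:-:0}", msg))   # web01|CPU Usage:dc
```

The sample message values (web01, esx1/MemoryOverallUsage, dc-east) are made up for illustration.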
Note: Set the probe to inactive before you run the console.bat utility.
apmgtw Metrics
The CA APM Gateway (apmgtw) probe is a gateway probe that does not generate any QoS. Therefore, there are no probe checkpoint metrics to
be configured for this probe.
Setup
The Setup tab enables you to specify the level of detail at which messages are logged, as well as the interval at which the report is
generated.
Field
Description
Log Level
Sets the level of messages written to the log file. By default, the slider is set to minimum log level.
Report Interval
Choose the time interval at which the report is to be generated. By default, this is set to 4 minutes.
Email configuration
The Email Configuration tab allows you to connect to an email server.
Field
Description
Primary
Mail
Server
Ignore
TLS
Allows you to specify that TLS (Transport Layer Security) negotiation should NOT be attempted even if the server announces that
the capability exists. This is because some servers announce TLS capability even when it is not available, usually due to a missing
certificate. By default, this checkbox is unselected.
Username
Password
Database configuration
The Database Configuration tab allows you to specify the database configuration used by the data_engine probe.
On selecting the Choose Data Engine option, the drop-down menu is populated with all the data engines available in the domain. By default, the
first data engine is displayed.
You can also manually specify the database details by choosing the Set Database Configuration option. Provide the necessary details, such as
the name of the database, initial catalog, data source, user ID, and password, in the respective fields. A default string is preconfigured in the
Parameters field.
You can verify the settings by clicking the Test Connection button.
Report
The Report tab displays the cumulative count of server probes for all the domains found, and the server probe count for individual robots.
assetmgmt Metrics
The Asset Management (assetmgmt) probe does not generate any QoS. Therefore, there are no metrics to be configured for this probe.
audit
The audit probe maintains data structures for UIM Server monitoring. When deployed, the audit probe performs the following tasks:
Creates AUDIT tables in the UIM database.
Adds an audit queue to the hub and stores audit messages in the database.
Monitors changes to your UIM environment and sends audit messages to the Primary Hub.
Audit messages, or events, are generated from the following actions:
User Commands - User activities such as activating probes.
Probe Commands - Probe activities such as distsrv probe package distributions.
On initial probe startup, the audit probe does the following:
1. Checks if an audit queue is available on the hub. If there is no audit queue, the queue is created.
Important! If the audit probe cannot create a permanent hub queue, it subscribes to audit messages and a temporary queue is
created. This can lead to a loss of audit messages if the audit probe is stopped.
2. Retrieves the UIM database connection string from the data_engine probe.
3. Creates the tables for the audit messages in the UIM database.
The audit probe can collect audit messages from robots on one hub or from robots on all of the hubs in a UIM Domain.
More information:
audit Release Notes
Tip: If you need descriptions of the fields in the audit probe GUI, see the article v1.2 audit GUI Reference
Configuration Overview
The following diagram shows the tasks you should complete to configure the audit probe.
Configuration Overview
Verify Prerequisites
Deploy the audit Probe
(Optional) Change the Database Connection
Apply Auditing to the UIM Robots
(Optional) Add Filters
Verify Prerequisites
The audit probe requires the appropriate version of the compat-libstdc++ runtime library to be installed on your system. This is required for C++
runtime compatibility. A distribution-specific copy of the compat-libstdc++ runtime library can be obtained from one of the following sites:
http://rpm.pbone.net
http://rpmfind.net
2. Click Find robots, configure your robot search settings, and click OK. A list of available robots appears in the Audit administration
window. The status of each robot is denoted by the icon next to it:
- The robot does not support auditing.
- The audit probe cannot communicate with the robot. This can occur if the system the robot is deployed to is turned off.
- The robot is being audited.
- The robot is not being audited.
3. Select the robots you want to audit and click Enable audit.
Tip: The Audit administration window supports multi-select using <shift> or <ctrl>.
2. In the Log Setup section, change the Level field to the desired logging level. The log levels that you can specify are:
Level 0 - Fatal - Logs severe messages
Level 1 - Error - Logs errors
Level 2 - Warn - Logs warnings
Level 3 - Info - Logs informational messages
Level 4 - Debug - Logs debugging messages
Level 5 - Trace - Logs tracing/low-level debugging messages
3. Change the Size(KB) field to the desired log size in KB.
2. In the Data administration section, change the Drop data after: field to your desired retention period in days. You can select one of the
defined values or enter your own.
3. Change the Administration time to the desired time of day and interval length. For example, if you enter 03:00:00 interval 24:00:00,
data administration takes place every day at 3AM.
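A schedule specification such as 03:00:00 interval 24:00:00 can be read as "first run at 03:00, then every 24 hours". A small sketch of how such a specification yields run times (illustrative only, not the probe's scheduler; the anchor date is arbitrary):

```python
from datetime import datetime, timedelta

def next_runs(start: str, interval: str, count: int):
    """Yield the first `count` run times of an 'HH:MM:SS interval HH:MM:SS' schedule."""
    h, m, s = map(int, start.split(":"))
    ih, im, isec = map(int, interval.split(":"))
    step = timedelta(hours=ih, minutes=im, seconds=isec)
    t = datetime(2024, 1, 1, h, m, s)            # arbitrary anchor date for illustration
    for _ in range(count):
        yield t
        t += step

runs = list(next_runs("03:00:00", "24:00:00", 3))
print([r.strftime("%Y-%m-%d %H:%M") for r in runs])
# ['2024-01-01 03:00', '2024-01-02 03:00', '2024-01-03 03:00']
```

With a 24-hour interval, data administration lands at 3 AM every day, matching the example in the step above.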
Note: You cannot set the value of the audit key to 1, 2, or 4 alone. A write settings value (1 or 2) must be combined with the enable auditing
base value (4).
The value 8 does not follow this formula; it is a unique value that defers to the hub settings for auditing.
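The combination rule can be sketched as a bit-flag check, assuming the values behave as bit flags (the flag names below are illustrative; the source does not state whether combining both write flags, i.e. 7, is meaningful):

```python
ENABLE_AUDITING = 4          # base flag: auditing enabled
WRITE_SETTING_A = 1          # write-settings flags (exact meanings per the probe docs)
WRITE_SETTING_B = 2
DEFER_TO_HUB    = 8          # special standalone value: defer to the hub settings

def is_valid_audit_key(value: int) -> bool:
    """A write setting (1 or 2) must be combined with the enable base (4); 8 stands alone."""
    if value == DEFER_TO_HUB:
        return True
    has_base = bool(value & ENABLE_AUDITING)
    has_write = bool(value & (WRITE_SETTING_A | WRITE_SETTING_B))
    return has_base and has_write

print([v for v in range(1, 9) if is_valid_audit_key(v)])  # [5, 6, 7, 8]
```

So 5 (4+1) and 6 (4+2) are the documented combined values, while 1, 2, and 4 alone are rejected.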
Controller Events
Hub Events
Infrastructure Manager Events
Note: The file size limit for configuration file comparison is set in the controller configuration file using the audit_max_config_size
key. The default is 0 (unlimited).
Controller Events
Probe
Message
controller
controller
controller
User Command
controller
_stop
controller
_shutdown
controller
sethub
controller
hubcall_robotup
controller
Robot stop
controller
controller
Hub contacted
controller
<probe name>
Probe removed
<probe name>
<probe name>
<probe name>
Audit initiated
<probe name>
<probe name>
File change detected, but file is too large for change details comparison
<probe name>
<probe name>
probe_register
<probe name>
probe_config_set
<probe name>
remote_config_set
<probe name>
inst_execute
<probe name>
probe_verify
<probe name>
probe_unregister
<probe name>
probe_activate
<probe name>
probe_deactivate
<probe name>
probe_config_lock
<probe name>
probe_config_set
<probe name>
probe_start
<probe name>
probe_stop
<probe name>
<parameter>=<value>
probe_change_par
<probe name>
inst_pkg_remove
<package name>
inst_request
controller
maint_until
controller
_audit_type
<probe name>
_audit_restore
<as specified>
_audit_send
Hub Events
Probe
Message
hub
hub
hub
hub
hub
hub
User Command
_audit_send
_audit_send
_audit_send
_audit_send
_audit_send
Setup
Clicking the Setup button opens the Setup dialog. The Setup dialog contains general probe parameters such as the log levels and data
administration settings.
Level - Sets the level of detail written to the log file. We recommend logging as little as possible during normal operation to
minimize disk consumption.
Size (KB) - Sets the size of the probe's log file. The default size is 100 KB. When this size is reached, the contents of the file are cleared.
Data Administration Fields
Drop data after - Specifies the number of days that data is kept in the audit tables before it is deleted. You can select one of the
defined values or enter your own value.
Administration time specification - Specifies the time and interval for data administration using the format 03:00:00 interval 24:00:00.
During data administration, old data is deleted from the audit tables in the database.
Search for data_engine - The audit probe searches for the data_engine probe in your environment and uses its connection string.
Specify data_engine - If the search for data_engine option does not work in your environment, you can specify the data_engine
address.
Specify database connection information - Use this option if you want to store the audit data in a separate database. You should use
this option if you do not want UIM administrators to have modify or delete access for the audit tables.
All other fields are dependent on the database software that you are using:
Microsoft SQL Server
You can use the Find robots button to search for the available robots in either of the following locations:
The upper part of the window lists the events collected by the filter selected in the Navigation pane. The columns in the window can be sorted
in ascending or descending order by clicking the column header, and the columns can also be moved using drag and drop.
Right-clicking in the list opens a small menu that allows you to:
Create a filter.
Select all entries in the list
Copy the selected entries to the clipboard.
The entries in the list are tagged with an icon, depending on type of event:
The icon for events.
The icon for user commands.
The icon for probe commands. For example, a probe package distribution executed by the distsrv probe.
The list has the following columns:
Event type: Describes the type of event; event, user command or probe command.
Note: Events only appear in the lower window if the event caused a configuration change (the column Has config change is set to True)
.
The Navigation pane lists the filters for determining which events are shown in the Main window pane. The probe is delivered with a set of
standard filters, but you can also define your own filters.
Right-clicking in the list lets you add, edit or delete filters.
Note: The small button next to the input fields let you invert the selection (NOT expression). The buttons will turn red when selected.
Description
Event type
Event
description
This is a description of the event. Filtering is done on event messages containing the specified description. If the Event type
field is empty, all valid descriptions for all event types are listed. If one of the three valid event types is selected, the
corresponding descriptions are available.
Origin
Normally the name of the hub from which the event originates (if the origin name has not been overridden on the controller probe).
Domain
Hub
Robot
Probe
For Events: The name of the probe that issued the event (the controller probe).
For User commands: The name of the probe to which the command was issued.
For Probe commands: The name of the probe that issued the command.
User name
User
command
User IP
The IP address of the computer from which the command triggering the event was issued.
Status
Configuration
changes
True or False. Match for events where the configuration file for the probe involved has changed as a result of the event.
Checkpoint
ID
Time
limitations
Select the time limitations for the filter, i.e. the time limitations for the entries shown in the Main window pane when the filter is
selected in the Navigation pane.
Valid entries are:
Last hour
Last day
Last week
Last month
Specify range
If Specify range is selected, the From time and To time must be specified in the format
<day><month><year> <hour><minute><second>.
Calendar functionality is available when selecting from the list. Select the preferred day and optionally edit the time specification
as required.
You can modify the following values using the audit probe Raw Configure menu.
pool_keep_alive - Specify a time-interval in seconds, for which a database connection will be kept open. For example, a value of 10
indicates that the connection(s) will be kept open for 10 seconds.
threads_query_pool - Specify the number of simultaneous connections allowed for the audit probe. For example, a value of 5 indicates
that a pool of 5 database connections can be maintained at the same time.
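Together, the two keys describe a classic bounded connection pool with an idle timeout. A generic sketch of these semantics, using the key names from the text above (this is not the audit probe's implementation; the connection class is a stand-in):

```python
import time

class FakeConn:
    """Stand-in for a database connection (illustrative only)."""
    def __init__(self): self.closed = False
    def close(self): self.closed = True

class Pool:
    """At most `threads_query_pool` pooled connections, each kept
    open for `pool_keep_alive` seconds after release."""
    def __init__(self, threads_query_pool=5, pool_keep_alive=10, connect=FakeConn):
        self.max_size, self.keep_alive, self.connect = threads_query_pool, pool_keep_alive, connect
        self.idle = []                           # (conn, released_at) pairs

    def acquire(self, now=None):
        now = time.monotonic() if now is None else now
        while self.idle:
            conn, released = self.idle.pop()
            if now - released < self.keep_alive:
                return conn                      # still within keep-alive: reuse
            conn.close()                         # idle too long: discard
        return self.connect()                    # open a new connection

    def release(self, conn, now=None):
        now = time.monotonic() if now is None else now
        if len(self.idle) < self.max_size:
            self.idle.append((conn, now))        # return to the pool
        else:
            conn.close()                         # pool full: close immediately

pool = Pool(threads_query_pool=5, pool_keep_alive=10)
c = pool.acquire(now=0.0); pool.release(c, now=0.0)
print(pool.acquire(now=5.0) is c)    # True: reused within the keep-alive window
```

With pool_keep_alive=10, a connection released and re-requested within 10 seconds is reused; after 10 seconds it is closed and a fresh one is opened.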
automated_deployment_engine
The automated_deployment_engine probe provides powerful functions to deploy robots using XML distribution, as well as behind-the-scenes
functionality for probe and package distribution in Admin Console.
Note: Admin Console distribution was handled by the distsrv probe prior to UIM Server 8.0. Distsrv continues to provide distribution in
Infrastructure Manager.
configure your own customized metrics, and store these metrics in AWS CloudWatch for viewing or monitoring purposes. These
metrics, which AWS does not generate, are called custom metrics. The AWS Monitoring probe lets you configure the custom metrics for
QoS generation.
AWS Simple Queue Service (SQS): This AWS service lets you transmit data to other services using message queues. The AWS
Monitoring probe lets you configure the message queue properties for QoS generation.
AWS Simple Notification Service (SNS): This AWS service lets you manage the notification messages that a publisher sends and a
subscriber receives through a communication channel. The AWS Monitoring probe monitors the communication channel and generates
QoS data based on the status of the notifications.
AWS Elastic Load Balancing (ELB): This AWS service lets you route the traffic that comes from various applications across multiple
available EC2 instances. The AWS Monitoring probe monitors the ELB layer at group level and generates QoS data based on the status
of the ELB layer.
AWS Auto Scaling: This AWS service lets you accumulate different EC2 instances in a group. You can create an auto scaling group
according to the usage of the EC2 instances in various applications. The AWS Monitoring probe monitors the instance status at group
level.
Important! Amazon charges the AWS account which the probe uses to monitor the AWS services. You must consider this fact while
configuring the probe for monitoring various AWS services.
More information:
AWS Monitoring (aws) Release Notes
The AWS Monitoring (aws) probe is configured to create monitoring profiles for accessing AWS resources and fetching data from AWS
CloudWatch. You can also configure health monitors to generate alarms on the basis of the availability of services in various geographical
regions.
The probe also lets you configure the Auto Discovery functionality. If any service instance is added or deleted in the AWS resource, then the
Auto Discovery functionality updates the list of instances in the probe.
The probe fetches data for instances or services and provides you with various monitors for generating QoS. You can also fetch the list of custom
metrics created for a specific service in AWS CloudWatch.
The following diagram outlines the process to configure the aws probe.
Contents
Prerequisites
Create a Profile
Verify the Credentials
Activate a Service
Create a Template
Create Filter Rules
Copy Templates to Another Robot
Using Regular Expressions
Alarm Thresholds
Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see aws (Amazon Web Services
Monitoring) Release Notes.
Create a Profile
Each profile represents one AWS resource. There can be multiple instances of an AWS resource. The following procedure enables you to add a
profile for monitoring the AWS services.
Follow these steps:
1. Click Options (icon) next to the AWS node in the navigation pane.
2. Click Add New Profile.
3. Define a profile name.
4. Activate the profile.
5. Specify the time interval (in seconds) after which the probe collects the data from the AWS cloud for the profile.
6. Select the Alarm Message to be generated when the connection to AWS services fails.
7. Enter the Access Key.
8. Enter the Secret Access Key.
Note: Valid user-credentials, such as Access Key and Secret Access Key are mandatory for creating a profile.
Activate a Service
You can activate the service you want to monitor on the resource group.
Follow these steps:
1. Click the required <Service Name> node.
2. Select Active in the <Service Name> node.
You can activate any of the following services:
AutoScaling
Custom Metrics
ElastiCache
ELB Node
ELB-<Region Name>
RDS
S3
SNS
SQS
The selected service is activated.
Create a Template
You can create and use templates in the probe to configure multiple profiles and services with the same monitor configuration.
Follow these steps:
1. Open the probe configuration interface.
2. Click Template Editor.
The Template Editor - <probeName> <Version> page is displayed.
3. Click the Options (icon) next to the aws probe node.
4. Click Create Template.
5. Specify a name and description for the template.
6. Specify the precedence for the template.
Notes:
A numeric value is set as precedence.
The default value is 0 (highest precedence).
The precedence of a template decreases as the value increases.
Example: 1 has higher precedence than 2 and so on.
The precedence is applied across multiple templates. The scenarios are described as follows:
When the precedence is different for the templates: The precedence works from the lower to the higher value.
Example: If the precedence is set at 0 for one template, and 1 for another template, the template with 0 precedence has
higher priority.
When the precedence is the same for all templates: The precedence works in alphabetical order of template name.
When filters are applied on templates: The precedence works according to the applied filters. If no filter is applied, the
precedence is applied on the available templates.
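The two ordering rules above (lower precedence number first, ties broken alphabetically by name) amount to a simple two-key sort. A sketch of that ordering (template names are made up for illustration):

```python
def apply_order(templates):
    """Order templates: lower precedence number first (0 = highest priority);
    ties broken alphabetically by template name."""
    return sorted(templates, key=lambda t: (t["precedence"], t["name"]))

templates = [
    {"name": "disk",   "precedence": 1},
    {"name": "cpu",    "precedence": 0},
    {"name": "memory", "precedence": 1},
]
print([t["name"] for t in apply_order(templates)])  # ['cpu', 'disk', 'memory']
```

cpu wins on precedence 0; disk and memory tie at 1 and fall back to alphabetical order.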
Note: You can skip steps 8 and 9 if you do not want to configure the applicable monitors in those nodes.
11. Select Include in Template to include the monitor or section in the template.
The monitor or section becomes available for editing.
12. Configure the required monitor or section, as needed.
13. Select the <templateName> node.
14. Select Active to activate the template.
15. Select Save.
The template is created and applied to the probe.
Note: The sections of profiles configured using templates are not available for individual configuration. Clear the Active checkbox
to deactivate the template on the profile to unlock it. You can also exclude the profile using filter rules.
Important! Templates are available for all profiles except the Custom Metrics and Health Services.
Standard (PCRE): [A-Z:\\]
Custom: Matches the uppercase character type of the local disk available on the respective box
Standard (PCRE): [*.\\]
Custom
Standard (PCRE): \d*
Custom
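The Standard (PCRE) patterns above behave like ordinary regular-expression character classes and quantifiers; for example, [A-Z:\\] matches a single uppercase letter, a colon, or a backslash, which covers the pieces of a Windows disk path such as C:\. A quick check of the listed patterns (Python's re engine handles these constructs the same way PCRE does; the sample strings are made up):

```python
import re

# [A-Z:\\] -- a character class: one uppercase letter, ':', or '\'
assert re.match(r"[A-Z:\\]", "C:\\")            # matches the leading 'C' of C:\
# [*.\\]  -- inside a class, '*' and '.' are literal: matches '*', '.', or '\'
assert re.match(r"[*.\\]", ".log")              # matches the leading '.'
# \d*     -- zero or more digits (also matches the empty string)
assert re.match(r"\d*", "8080").group() == "8080"
print("all patterns behave as expected")
```

Note that \d* matches the empty string too, so it never fails to match; use \d+ if at least one digit must be present.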
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Amazon
AWS
Resource
ServiceStatus
EC2
S3
EBS
RDS
SQS
SNS
ElastiCache
ELB
Auto Scaling
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the black arrow next to the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key Menu item.
4. Enter the Key Name in the Add key window and click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all the required subsystem IDs for your probe.
7. Click Apply.
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right-click the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key (...) button.
4. Enter the Key Name and Value.
5. Click OK.
6. Repeat this process for all the required subsystem IDs for your probe.
7. Click Apply.
aws Node
<profile name> Node
EC2 Node
ElastiCache Node
<Instance Name> node
SQS Node
SQS <Region Name> node
AutoScaling Node
Auto Scale-<Region Name> node
<Auto Scale Group Name> node
ELB Node
ELB-<Region Name> node
<ELB Layer Name> node
RDS Node
<Database Name> node
SNS Node
SNS-<Region Name> node
Custom Metrics
<AWS-Service Name> node
S3 Node
Template Editor
<templateName> Node
Auto Filter or <filterName> Node
Include in Template
aws Node
This node lets you view the probe information and configure the logging properties. You can also set the polling interval for the Auto Discovery
functionality and configure the proxy settings.
Note: The aws services nodes are visible in the navigation pane only after you create a monitoring profile. Initially, only the AWS node
and the AWS Service Health node are visible.
Navigation: aws
Set or modify the following values if needed:
aws > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
aws > Probe Setup
This section lets you configure the detail level of the log file. The default value is 3-info.
aws > Auto Discovery
This section lets you set the value of Discovery Interval (minutes). If any instance is added or deleted in the AWS resource, then the Auto
Discovery functionality updates the list of instances in the probe. The Discovery Interval (minutes) specifies the time between each run of
the auto discovery functionality.
aws > Proxy Settings
This section enables you to connect to the AWS cloud through a proxy server on the network. You need proxy server settings when your network
is not an open network.
Enable Proxy: lets you use a proxy server for connecting to the AWS cloud.
IP: defines the IP address of the proxy server.
Port: specifies the port on the proxy server through which the connection is established.
Username: defines the user name for accessing the proxy server.
Note: The AWS probe is certified for use in Squid proxy environment.
Alarm Message: specifies the alarm to be generated when the connection to AWS services fails.
Default: ResourceCritical
Access Key: defines the login credential of the AWS user-account for accessing the AWS resource.
Secret Access Key: specifies the additional login credential of the AWS user-account.
Note: The probe uses the combination of the Access Key and Secret Access Key for accessing the AWS resource.
This node represents the profile that is created to monitor the health and performance of AWS services. Each profile is mapped to an AWS
account. You can check the connection between the probe and the AWS resource through the Verify Credentials button under the Actions
drop-down.
Note: This node is referred to as profile name node in the document and is user-configurable.
The AWS EC2 service of a specific region stores the instance data in AWS CloudWatch. For a specific profile, the probe fetches the data from
AWS CloudWatch.
This node lets you configure the probe for interacting with the EC2 service and collect data about the instances of the AWS resource. The probe
generates QoS based on the instance data that is collected from AWS CloudWatch.
Navigation: AWS > profile name > EC2
Set or modify the following values if needed:
EC2 > EC2 Configurations
This node lets you configure EC2 service properties.
Active: activates the addition of instances of the AWS resource.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Statistics: defines one of the following operations to be performed on the sample values that the probe fetches:
Calculate minimum value.
Calculate maximum value.
Calculate the sum of all the values.
Calculate the average of all the values.
Default: Average
Note: When you change the Statistics value, the QoS graphs on the UMP portal are changed.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
For example, if the Time Duration is specified as 10 minutes and the Period is specified as 2 minutes, then the collected values are divided
into five 2-minute groups.
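The relationship between Time Duration and Period can be sketched numerically: the duration window is cut into duration / period groups, and the chosen Statistics operation is applied to each group. Illustrative arithmetic only, not probe code (the sample values are made up, assuming one sample per minute):

```python
def summarize(samples, period, statistic=lambda g: sum(g) / len(g)):
    """Cut per-minute samples into `period`-minute groups and reduce each group."""
    groups = [samples[i:i + period] for i in range(0, len(samples), period)]
    return [statistic(g) for g in groups]

# Time Duration = 10 minutes of one-sample-per-minute data, Period = 2 minutes
samples = [40, 60, 55, 45, 70, 30, 50, 50, 20, 80]
print(summarize(samples, period=2))          # 5 groups -> [50.0, 50.0, 50.0, 50.0, 50.0]
```

Swapping the default statistic for min, max, or sum mirrors the Statistics choices described for the EC2 node.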
Note: This node is referred to as instance name node in the document and each instance has a unique ID.
Note: This node is referred to as EC2-monitor name node in the document and it represents various EC2 performance counters.
Navigation: AWS > profile name > EC2 > instance name > EC2-monitor name
Set or modify the following values if needed:
monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an EC2 instance are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
Similarly, you can configure the other performance counters that are visible under the CPU, Disk, and Network nodes.
<EBS Volume> node
This node represents the Elastic Block Storage (EBS) which is linked to a specific EC2 instance. The EBS node is visible in the navigation panel
only if you have added a storage block with the EC2 instances, or when you have assigned a storage block to the instances.
Note: This node is referred to as EBS Volume node in the document and it represents an EBS storage volume.
Navigation: AWS > profile name > EC2 > instance name > EBS
Set or modify the following values if needed:
EBS > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an EBS volume are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
ElastiCache Node
The AWS ElastiCache service provides a scalable cache for storing temporary data. The probe generates QoS based on the instance data which
is collected from CloudWatch.
This node lets you configure the probe to fetch ElastiCache instance information.
Navigation: AWS > profile name > ElastiCache
Set or modify the following values if needed:
ElastiCache > ElastiCache Configurations
This node lets you configure the ElastiCache service properties. For field descriptions, refer to the EC2 node section.
<Instance Name> node
This node represents an AWS instance that uses the ElastiCache service. The ElastiCache service supports two types of cache engines:
Remote Dictionary Server or Redis (Currently, ElastiCache supports a single-node Redis cache cluster)
Memcached (Currently, ElastiCache supports a maximum of 20 nodes in a cache cluster)
The instances are displayed in the navigation pane according to the type of cache engine.
Note: This node is known as instance name node in the document and each instance has a unique ID.
Note: This node is referred to as node name node in the document and it represents a node of a Memcached or Redis ElastiCache
instance.
Navigation: AWS > profile name > ElastiCache > instance name > node name
Set or modify the following values if needed:
monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an ElastiCache instance are visible in a tabular form. You can select any one counter in the
table and can configure its properties.
This node lets you configure the performance counters of an ElastiCache instance node.
The performance counters are divided into the following categories:
CPU
Memory
Each category is represented as a node under the node name node.
Note: This node is referred to as ElastiCache-monitor name node in the document and it represents various ElastiCache instance
performance counters.
Navigation: AWS > profile name > ElastiCache > instance name > node name > ElastiCache-monitor name
Set or modify the following values if needed:
ElastiCache-monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an ElastiCache instance node are visible in a tabular form. You can select any one counter
in the table and can configure its properties.
Similarly, you can configure the other performance counters that are visible under the Memory node.
SQS Node
The AWS SQS service lets you send data from an AWS service to any other AWS service. The probe monitors the properties of a queue based
on the data collected from the AWS CloudWatch in a specific region.
Navigation: AWS > profile name > SQS
Set or modify the following values if needed:
SQS > SQS Configurations
This node lets you configure the SQS service properties.
Active: activates the monitoring of the SQS queues.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
SQS <Region Name> node
This node represents the location of the SQS message queue. This node does not contain sections or fields. You can configure the SQS QoS
metrics through the following available regions:
Asia Pacific (Singapore) Region
Asia Pacific (Sydney) Region
Asia Pacific (Tokyo) Region
EU (Frankfurt) Region
EU (Ireland) Region
South America (Sao Paulo) Region
US East (Northern Virginia) Region
US West (Northern California) Region
US West (Oregon) Region
<Queue Name> node
This node lets you configure the QoS metrics of SQS message queues. The probe generates QoS data of the SQS service according to the
values fetched from AWS CloudWatch.
Navigation: AWS > profile name > SQS > SQS-region name > queue name
Set or modify the following values if needed:
queue name > Monitors
This section lets you configure the SQS queue performance counters of a specific region for generating QoS data.
Note: The performance counters of a message queue are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
AutoScaling Node
This node lets you configure the probe for monitoring the QoS of the autoscaling groups. Auto-scaling groups are configured in the AWS
Management Console. Configure this node to view these groups, devices, and metrics in USM.
Note: You can create auto-scaling groups in USM by using the advanced filter attribute AWSAutoScalingGroup. For more information,
see Create and Manage Groups in USM.
This node represents the location of the AutoScaling group. This node does not contain sections or fields. Refer to the SQS-region name
section for the list of available regions.
This node lets you configure the metrics of the AutoScaling group. The AWS probe generates QoS data of the AutoScaling service according to
the values fetched from AWS CloudWatch.
By default, the probe uses a fixed statistics operation on the collected values for only the following AutoScaling metrics:
StatusCheckFailed: Average
StatusCheckFailed_Instance: Average
StatusCheckFailed_System: Average
Note: You can configure the statistics value for all the AutoScaling metrics, except for the above mentioned metrics.
This node is referred to as AutoScaling Group name node in the document and it represents various AutoScaling metrics.
Navigation: AWS > profile name > AutoScaling > AutoScaling-region name > AutoScaling Group name
Set or modify the following values if needed:
AutoScaling name > Monitors
This section lets you configure the AutoScaling metrics of a specific region for generating QoS data.
Note: All AutoScaling monitors are visible in a tabular form. You can select any one monitor in the table and can configure its
properties.
Refer to the queue name topic in the SQS node section for field descriptions.
<Instance Name> node
This node represents the EC2 instances that are included in the AutoScaling group. For more details about the EC2 instances, refer to instance
name in the EC2 node section.
<Monitor Name> node
This node lets you configure the performance counters of the EC2 instances. For more details, refer to monitor name in the EC2 node section.
Note: If you configure the EC2 metrics for generating QoS data through monitor name node, then make sure that the EC2 service is
activated for your AWS account.
ELB Node
The probe monitors the ELB layer that distributes the incoming application data between multiple EC2 instances. In this node, you can view and
configure the metrics of those EC2 instances that are registered with the ELB layer that you are currently monitoring. The probe generates QoS data
based on the inputs received from the EC2 instances at group level.
Navigation: AWS > profile name > ELB
Set or modify the following values if needed:
ELB > Elastic Load Balancing
This node lets you configure the ELB service properties.
Active: activates the monitoring of the ELB layer.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
ELB-<Region Name> node
This node represents the location of the ELB layer. This node does not contain sections or fields. Refer to the SQS-region name section for the list of available regions.
This node lets you configure the metrics of the ELB layer. The probe generates QoS data of the ELB service according to the values fetched from
AWS CloudWatch.
By default, the probe uses the following statistics on the collected values for each ELB metric:
Healthy Host Count: Average
Unhealthy Host Count: Average
Request Count: Sum
Latency: Average
HTTPCode_ELB_4XX: Sum
HTTPCode_ELB_5XX: Sum
HTTPCode_Backend_2XX: Sum
HTTPCode_Backend_3XX: Sum
HTTPCode_Backend_4XX: Sum
HTTPCode_Backend_5XX: Sum
Backend Connection Errors: Sum
Surge Queue Length: Maximum
Spillover Count: Sum
Note: You cannot change the value of the statistics.
The following node is referred to as ELB Layer name node and it represents various ELB metrics.
Navigation: AWS > profile name > ELB > ELB-region name > ELB Layer name
Set or modify the following values if needed:
ELB Layer name > Monitors
This section lets you configure the ELB metrics of a specific region for generating QoS data.
Note: All ELB monitors are visible in a tabular form. You can select any one monitor in the table and can configure its properties.
Note: If you configure the EC2 metrics for generating QoS data through monitor name node, then ensure that the EC2 service is
activated for your AWS account.
RDS Node
The AWS RDS service manages relational databases that are stored on AWS CloudWatch. The probe fetches the data from the CloudWatch and
generates QoS related to an RDS instance.
Navigation: AWS > profile name > RDS
Set or modify the following values if needed:
RDS > RDS Configurations
This node lets you configure the RDS service properties.
Active: activates the addition of database instances of the AWS resource. For field descriptions, refer to the EC2 node section.
<Database Name> node
Navigation: AWS > profile name > RDS > database name
Set or modify the following values if needed:
database name > Monitors
This section lets you configure the performance counters of a relational database instance for generating QoS data.
Note: The performance counters of an RDS database instance are visible in a tabular form. You can select any one counter in
the table and can configure its properties.
Note: This node is referred to as RDS monitor name node in the document and it represents various RDS performance counters.
Navigation: AWS > profile name > RDS > database name > RDS monitor name
Set or modify the following values if needed:
RDS monitor name > Monitors
This section lets you configure the RDS performance counters of a specific instance for generating QoS data.
Note: The performance counters of a relational database are visible in a tabular form. You can select any one counter in the
table and can configure its properties.
Similarly, you can configure the other performance counters that are visible under the CPU, Disk, Memory, and Network nodes.
SNS Node
The probe monitors the SNS topic through which the publisher sends notifications to the subscriber. The probe also generates QoS data based
on the status of the notifications.
Navigation: AWS > profile name > SNS
Set or modify the following values if needed:
SNS > SNS Configurations
This node lets you configure the SNS service properties.
Active: activates the monitoring of SNS topic.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
Note: By default, for an active topic, AWS CloudWatch receives metrics every 5 minutes. Default monitoring and monitoring per minute are not available for AWS SNS.
This node represents the location of the SNS topic. This node does not contain sections or fields. Refer to the SQS-region name section for the list of available regions.
<Topic Name> node
This node lets you configure the performance counters of the publisher notifications. The probe generates QoS data of the SNS service according
to the values fetched from AWS CloudWatch.
By default, the probe uses the following statistics on the collected values for each SNS performance counter:
Number Of Messages Published: Sum
Publish Size: Average
Number Of Notifications Delivered: Sum
Number Of Notifications Failed: Sum
You cannot change the value of the statistics.
This node is referred to as topic name node in the document and it represents various SNS metrics.
Navigation: AWS > profile name > SNS > SNS-region name > topic name
Set or modify the following values if needed:
topic name > Monitors
This section lets you configure the SNS topic performance counters of a specific region for generating QoS data.
Note: The performance counters of an SNS topic are visible in a tabular form. You can select any one counter in the table and
can configure its properties.
In AWS, metrics are segregated into different Namespaces. A Dimension is a variable that categorizes a metric according to its statistics. When
you create custom metrics through the script and store the metrics in AWS CloudWatch, the probe fetches that data from CloudWatch.
This node lets you select a custom metric that is available in an AWS Namespace and then define custom QoS for it. The custom metrics for
different AWS Namespace are visible in the Navigation Pane.
You can configure any of the discovered metrics that are available in an AWS Namespace through the Custom Metrics node except RDS, EC2,
EBS, ElastiCache, SQS, SNS, ELB, and AutoScaling.
Navigation: AWS > profile name > Custom Metric
Set or modify the following values if needed:
Custom Metric > Custom Configurations
This section lets you configure the probe to fetch the list of custom metrics from the AWS CloudWatch and select custom metrics for a
specific Namespace.
Available Service Metrics: specifies the list of available AWS Namespaces that the probe fetches from CloudWatch. Each Namespace
contains various custom metrics. You can move a specific service Namespace from the Available List to the Selected List. The
selected service metrics are visible as nodes in the Navigation Pane.
Note:
You must include Namespace and Dimensions in a custom metric script for the probe to collect the metrics.
For other field descriptions, refer to the EC2 node section.
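The note above requires a custom metric script to set both Namespace and Dimensions before the probe can collect the metric. A minimal sketch of such a script, assuming Python with boto3; the names Custom/MyApp, QueueDepth, and Environment are hypothetical:

```python
from datetime import datetime, timezone

def build_custom_metric(namespace, metric_name, value, unit, dimensions):
    """Build a CloudWatch PutMetricData payload. The probe can only
    discover a custom metric when both Namespace and Dimensions are set."""
    return {
        "Namespace": namespace,
        "MetricData": [{
            "MetricName": metric_name,
            "Dimensions": [{"Name": k, "Value": v}
                           for k, v in dimensions.items()],
            "Timestamp": datetime.now(timezone.utc),
            "Value": value,
            "Unit": unit,
        }],
    }

payload = build_custom_metric("Custom/MyApp", "QueueDepth", 42.0,
                              "Count", {"Environment": "prod"})

# To publish the metric (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_data(**payload)
```

Once the metric is published, the Custom/MyApp Namespace appears in the Available Service Metrics list described above.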
This node lets you view and configure the custom metrics for all AWS services. You can define a custom QoS name, unit, and can let the probe
generate QoS data for the custom metric. This node contains a table that lists the AWS dimensions against each service metric.
Note: This node is referred to as AWS-service name node in the document and is user-configurable.
Navigation: AWS > profile name > Custom Metric > AWS-service name
Set or modify the following values if needed:
AWS-service name > Collected Metrics
This section lets you define custom QoS name for different service metrics that are listed in a tabular form. You can also configure the
probe to generate QoS data for selected metrics.
Note: If you have created the custom metrics in a custom Namespace, then only the custom metrics are visible in the table.
However, if you have created the custom metrics in an existing Namespace, then all the metrics are visible in the table.
S3 Node
The data that is stored in the cloud using the AWS S3 service is segregated into groups that are known as buckets. The probe monitors the time
that is consumed in storing files to, and retrieving files from, a bucket.
This node lets you configure the performance counters for S3 service. The AWS probe generates QoS data related to the time that is consumed
in storing and retrieving files to and from the S3 buckets.
Note:
Set the polling interval (the Interval field in the Add New Profile section of the aws node) according to the size of the file that you
want to store or retrieve. If the polling interval is too short, the probe starts fetching data from the bucket again before the
previous file process completes. For example, if you want to upload a file of size 1 MB, then you can set the polling interval
as 5 minutes.
S3 services are not supported on Factory Template.
Note: The file, for which you want to generate the QoS data, must be present in the AWS probe base folder (/probes/applications/aws).
S3 > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of the S3 service are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
Template Editor
The Template Editor interface is used to create, modify, or delete templates that can be applied to the probe. The editor allows you to define
templates that can be applicable across multiple profiles.
The node structure of the template editor resembles the node structure of the probe with the following differences:
<templateName> Node
The Auto Filter or the <filterName> node allows you to control which profiles or devices are associated with a particular template. You can
specify additional device criteria by using rules. Filters usually contain one or more rules to define the types of devices for the template. You can
add rules to a device filter to create divisions within a group of systems or reduce the set of devices that are monitored by the probe. For
example, you can add a rule to apply a monitoring configuration to all devices with an IPv4 address that contains 1.1.1.1.
Rules
You can specify rules to match specific criteria while applying a template on the probe. The fields in a rule are:
Label: indicates that the filter will be applicable to the name of the node or section.
Condition: selects the condition to match the value with the probe. The conditions are described as follows:
Contains: indicates that the label contains the specified value.
DoesnotContain: indicates that the label does not contain the specified value.
EndsWith: indicates that the label ends with the specified value.
Equals: indicates that the label is exactly the same as the specified value. This is the default selection.
NotEquals: indicates that the label is not the specified value.
Regex: indicates that the label will match the specified regular expression.
Refer to Using Regular Expressions.
Starts With: indicates that the label starts with the specified value.
Value: specifies the value that is to be matched with the probe according to the specified condition.
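The rule conditions above can be sketched as a simple matching function. This is illustrative only, not the probe's actual implementation; the condition names mirror the list above:

```python
import re

def rule_matches(label, condition, value):
    """Evaluate one template rule condition against a node or section label."""
    if condition == "Contains":
        return value in label
    if condition == "DoesnotContain":
        return value not in label
    if condition == "EndsWith":
        return label.endswith(value)
    if condition == "Equals":          # the default condition
        return label == value
    if condition == "NotEquals":
        return label != value
    if condition == "Regex":
        return re.search(value, label) is not None
    if condition == "Starts With":
        return label.startswith(value)
    raise ValueError("unknown condition: " + condition)
```

For example, a Contains rule with the value 1.1.1.1 would match a device label such as "host-1.1.1.1" while an Equals rule would not.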
Include in Template
The Include in Template checkbox is used to include and enable configuration of a monitor or section in the template. This checkbox is available
for static nodes and within filters for dynamic nodes.
This node lets you view the list of AWS services that are available for a specific region. You can configure the AWS probe for generating alarms
for specific AWS services in a region.
Note: This node is known as AWS region in the document as this node represents all the geographical locations where AWS provides
services.
Note: When you select the Publish Alarms check box, the value of the Alarm column in the table changes from Off to On.
The AWS Monitoring probe is configured to create monitoring profiles for accessing AWS resources and fetching data from AWS CloudWatch.
You can also configure health monitors to generate alarms on the basis of the availability of services in various geographical regions.
The probe also lets you configure the Auto Discovery functionality. If any service instance is added or deleted in the AWS resource, then the Auto Discovery functionality updates the list of instances in the probe.
The probe fetches data of instances, or services and provides you with various monitors for generating QoS. You can also configure the probe to
fetch the list of custom metrics that are created for a specific service in the AWS CloudWatch.
Contents
Preconfiguration Requirements
Upgrades and Migrations
NAS Subsystem ID Requirements
How to Configure Alarm Thresholds
Managing Profiles
Create a Profile
Delete a profile
Preconfiguration Requirements
This section contains the preconfiguration requirements for the CA UIM AWS Monitoring probe.
An AWS user account with valid user credentials, such as the Access Key and Secret Access Key.
EC2 Administrative Rights so that the AWS Monitoring probe can access the AWS resource.
Key          Value
2.19.        Amazon
2.19.1.      AWS
2.19.1.1.    Resource
2.19.1.2.    ServiceStatus
2.19.1.3.    EC2
2.19.1.4.    S3
2.19.1.5.    EBS
2.19.1.6.    RDS
2.19.1.9     ElastiCache
2.19.1.8     SNS
2.19.1.7     SQS
2.19.1.11    ELB
2.19.1.12    Auto Scaling
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the black arrow next to the NAS probe, select Raw Configure.
2. Click on the Subsystems folder.
3. Click on the New Key Menu item.
4. Enter the Key Name in the Add key window, click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right click on the NAS probe, select Raw Configure.
2. Click on the Subsystems folder.
3. Click on the New Key... button.
4. Enter the Key Name and Value, Click OK.
5. Repeat this process for all of the required subsystem IDs for your probe.
6. Click Apply.
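For reference, the keys added in the steps above land in the nas configuration roughly as follows. This is a sketch that assumes the standard Nimsoft raw-configure section layout; verify against your own nas.cfg:

```
<subsystems>
   2.19. = Amazon
   2.19.1. = AWS
   2.19.1.3. = EC2
</subsystems>
```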
Managing Profiles
This procedure provides the information to configure a particular section of a profile. Each section within the profile configures the monitoring
properties of the probe.
Follow these steps:
1. Navigate to the section within a profile that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured. The probe is now ready to monitor the log files, web pages, messages from queues, and
output from commands.
Create a Profile
The following procedure enables you to add a profile for monitoring the AWS services. Each profile represents one AWS resource. There can be
multiple instances of an AWS resource.
Follow these steps:
1. Click Options next to the AWS node in the navigation pane.
2. Select Add New Profile.
3. Update the field information and click Submit.
The new monitoring profile is visible under the AWS node in the navigation pane.
The Auto Discovery functionality automatically loads a list of all the available instances.
Delete a profile
You can delete a profile if you do not want the probe to monitor the performance of a specific AWS resource.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Select Delete Profile.
3. Click Save.
The monitoring profile is deleted from the resource.
This article describes the fields and features of the AWS Monitoring probe.
Contents
aws Node
<profile name> Node
EC2 Node
<Instance Name> node
<Monitor Name> node
<EBS Volume> node
ElastiCache Node
<Instance Name> node
<Node Name> node
<ElastiCache-Monitor Name> node
SQS Node
SQS <Region Name> node
<Queue Name> node
Autoscaling Node
Auto Scaling-<Region Name> node
<Auto Scaling Group Name> node
<Instance Name> node
<Monitor Name> node
ELB Node
ELB-<Region Name> node
<ELB Layer Name> node
<Instance Name> node
<Monitor Name> node
RDS Node
<Database Name> node
<RDS Monitor Name> node
SNS Node
SNS-<Region Name> node
<Topic Name> node
Custom Metrics
<AWS-Service Name> node
S3 Node
aws Node
This node lets you view the probe information and configure the logging properties. You can also set the polling interval for the Auto Discovery functionality and configure the proxy settings.
Note: The AWS services nodes are visible in the Navigation Pane only after you create a monitoring profile. Initially, only the AWS node and the AWS Service Health node are visible.
Navigation: aws
Set or modify the following values if needed:
aws > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
aws > Probe Setup
This section lets you configure the detail level of the log file. The default value is 3-info.
aws > Auto Discovery
This section lets you set the value of Discovery Interval (minutes). If any instance is added or deleted in the AWS resource, then the Auto Discovery functionality updates the list of instances in the probe. The Discovery Interval (minutes) specifies the time interval between successive runs of the Auto Discovery functionality.
aws > Proxy Settings
This section enables you to connect to the AWS cloud through a proxy server on the network. You need proxy server settings when your
network is not an open network.
Enable Proxy: lets you use a proxy server for connecting to the AWS cloud.
IP: defines the IP address of the proxy server.
Port: specifies the port on the proxy server through which the connection is established.
Username: defines the user name for accessing the proxy server.
Note: The AWS probe is certified for use in a Squid proxy environment.
Alarm Message: specifies the alarm to be generated when the connection to AWS services fails.
Default: ResourceCritical
Access Key: defines the login credential of the AWS user-account for accessing the AWS resource.
Secret Access Key: specifies the additional login credential of the AWS user-account.
Note: The probe uses the combination of the Access Key and Secret Access Key for accessing the AWS resource.
This node represents the profile which is created to monitor the health and performance of AWS services. Each profile is mapped with an AWS account. You can check the connection between the probe and the AWS resource through the Verify Credentials button under the Actions drop-down.
Note: This node is referred to as profile name node in the document and is user-configurable.
The AWS EC2 service of a specific region stores the instance data in AWS CloudWatch. For a specific profile, the AWS Monitoring probe fetches
the data from AWS CloudWatch.
This node lets you configure the probe for interacting with the EC2 service and collect data about the instances of the AWS resource. The probe
generates QoS based on the instance data which is collected from AWS CloudWatch.
Navigation: AWS > profile name > EC2
Set or modify the following values if needed:
EC2 > EC2 Configurations
This node lets you configure EC2 service properties.
Active: activates the addition of instances of the AWS resource.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Statistics: defines one of the following operations to be performed on the sample values that the probe fetches:
Calculate minimum value.
Calculate maximum value.
Calculate the sum of all the values.
Note: When you change the Statistics value, the QoS graphs on the UMP portal are changed.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
For example, if the Time Duration is specified as 10 minutes and the Period is specified as 2 minutes, then the collected values are divided into five groups of 2 minutes each.
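The relationship between Time Duration and Period can be sketched as the window and granularity of a CloudWatch GetMetricStatistics request. The function and field names below are illustrative; note that CloudWatch expects the period in seconds:

```python
from datetime import datetime, timedelta, timezone

def statistics_window(time_duration_min, period_min, now=None):
    """Translate the probe's Time Duration and Period settings (both in
    minutes) into a CloudWatch query window and datapoint granularity."""
    now = now or datetime.now(timezone.utc)
    return {
        "StartTime": now - timedelta(minutes=time_duration_min),
        "EndTime": now,
        "Period": period_min * 60,   # CloudWatch takes the period in seconds
        # Number of groups the collected values are divided into:
        "ExpectedDatapoints": time_duration_min // period_min,
    }

# The example from the text: a 10-minute window split into 2-minute groups
w = statistics_window(10, 2)
```

With these settings, CloudWatch returns one aggregated datapoint per 2-minute group, five in total.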
<Instance Name> node
This node represents an instance of the AWS resource. An EC2 instance is a virtual machine (VM). If any region subscribes to the EC2 service,
then an instance of EC2 VM is created for that region.
The AWS Monitoring probe monitors the performance counters of the EC2 instances of the AWS resource. All EC2 instances are visible under the
EC2 node.
Note: This node is referred to as instance name node in the document and each instance has a unique ID.
Note: This node is referred to as EC2-monitor name node in the document and it represents various EC2 performance counters.
Navigation: AWS > profile name > EC2 > instance name > EC2-monitor name
Set or modify the following values if needed:
monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an EC2 instance are visible in a tabular form. You can select any one counter in the table
and can configure its properties.
Similarly, you can configure the other performance counters that are visible under the CPU, Disk, and Network nodes.
<EBS Volume> node
This node represents the Elastic Block Storage (EBS) which is linked to a specific EC2 instance. The EBS node is visible in the navigation panel
only if you have added a storage block with the EC2 instances, or when you have assigned a storage block to the instances.
Note: This node is referred to as EBS Volume node in the document and it represents an EBS storage volume.
Navigation: AWS > profile name > EC2 > instance name > EBS
Note: The performance counters of an EBS volume are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
ElastiCache Node
The AWS ElastiCache service provides a scalable cache for storing temporary data. The AWS Monitoring probe generates QoS based on the
instance data which is collected from CloudWatch.
This node lets you configure the probe to fetch ElastiCache instance information.
Navigation: AWS > profile name > ElastiCache
Set or modify the following values if needed:
ElastiCache > ElastiCache Configurations
This node lets you configure the ElastiCache service properties. For field descriptions, refer to the EC2 node section.
<Instance Name> node
This node represents an AWS instance that uses the ElastiCache service. The ElastiCache service supports two types of cache engines:
Remote Dictionary Server or Redis (Currently, ElastiCache supports a single-node Redis cache cluster)
Memcached (Currently, ElastiCache supports a maximum of 20 nodes in a cache cluster)
The instances are displayed in the navigation pane according to the type of cache engine.
Note: This node is known as instance name node in the document and each instance has a unique ID.
Note: This node is referred to as node name node in the document and it represents a node of a Memcached or Redis ElastiCache
instance.
Navigation: AWS > profile name > ElastiCache > instance name > node name
Set or modify the following values if needed:
monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an ElastiCache instance are visible in a tabular form. You can select any one counter in the
table and can configure its properties.
<ElastiCache-Monitor Name> node
This node lets you configure the performance counters of an ElastiCache instance node.
The performance counters are divided into the following categories:
CPU
Memory
Each category is represented as a node under the node name node.
Note: This node is referred to as ElastiCache-monitor name node in the document and it represents various ElastiCache instance
performance counters.
Navigation: AWS > profile name > ElastiCache > instance name > node name > ElastiCache-monitor name
Set or modify the following values if needed:
ElastiCache-monitor name > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of an ElastiCache instance node are visible in a tabular form. You can select any one counter
in the table and can configure its properties.
Similarly, you can configure the other performance counters that are visible under the Memory node.
SQS Node
The AWS SQS service lets you send data from an AWS service to any other AWS service. The AWS Monitoring probe monitors the properties of
a queue based on the data collected from the AWS CloudWatch in a specific region.
Navigation: AWS > profile name > SQS
Set or modify the following values if needed:
SQS > SQS Configurations
This node lets you configure the SQS service properties.
Active: activates the monitoring of SQS queue.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
SQS <Region Name> node
This node represents the location of the SQS message queue. This node does not contain sections or fields. You can configure the SQS QoS
metrics through the following available regions:
Asia Pacific (Singapore) Region
Asia Pacific (Sydney) Region
Asia Pacific (Tokyo) Region
EU (Ireland) Region
South America (Sao Paulo) Region
US East (Northern Virginia) Region
US West (Northern California) Region
US West (Oregon) Region
EU (Frankfurt) Region
<Queue Name> node
This node lets you configure the QoS metrics of SQS message queues. The AWS probe generates QoS data of the SQS service according to the
values fetched from AWS CloudWatch.
By default, the AWS Monitoring probe uses the following statistics on the collected values for each SQS metric:
Number of Messages Sent: Sum
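The "Number of Messages Sent: Sum" monitor corresponds to a CloudWatch query against the AWS/SQS Namespace. A sketch of the request parameters; the queue name "my-queue" is hypothetical:

```python
def sqs_messages_sent_request(queue_name, start, end, period_s):
    """Build the parameters for a CloudWatch GetMetricStatistics call that
    fetches the SQS NumberOfMessagesSent metric for one queue."""
    return {
        "Namespace": "AWS/SQS",
        "MetricName": "NumberOfMessagesSent",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_s,
        "Statistics": ["Sum"],   # the default statistic noted above
    }

# With boto3 and AWS credentials, the datapoints could be fetched as:
# import boto3
# boto3.client("cloudwatch").get_metric_statistics(
#     **sqs_messages_sent_request("my-queue", start, end, 300))
```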
Navigation: AWS > profile name > SQS > SQS-region name > queue name
Set or modify the following values if needed:
queue name > Monitors
This section lets you configure the SQS queue performance counters of a specific region for generating QoS data.
Note: The performance counters of a message queue are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
Autoscaling Node
This node lets you configure the probe for monitoring the status of the auto scaling groups.
Navigation: AWS > profile name > Auto Scaling
Set or modify the following values if needed:
Auto Scaling > Auto Scaling Configurations
This node lets you configure the Auto Scaling service properties.
Active: activates the monitoring of the auto scaling group.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
Auto Scaling-<Region Name> node
This node represents the location of the auto scaling group. This node does not contain sections or fields. Refer to the SQS-region name section for the list of available regions.
<Auto Scaling Group Name> node
This node lets you configure the metrics of the auto scaling group. The AWS probe generates QoS data of the auto scaling service according to
the values fetched from AWS CloudWatch.
By default, the AWS Monitoring probe uses a fixed operation on the collected values for only the following auto scaling metrics:
StatusCheckFailed: Average
StatusCheckFailed_Instance: Average
StatusCheckFailed_System: Average
Note: You can configure the statistics value for all the auto scaling metrics, except for the above mentioned metrics.
This node is referred to as Auto Scaling Group name node in the document and it represents various Auto Scaling metrics.
Navigation: AWS > profile name > Auto Scaling > Auto Scaling-region name > Auto Scaling Group name
Set or modify the following values if needed:
Auto Scaling name > Monitors
This section lets you configure the Auto Scaling metrics of a specific region for generating QoS data.
Note: All Auto Scaling monitors are visible in a tabular form. You can select any one monitor in the table and can configure its
properties.
Refer to the queue name topic in the SQS node section for field descriptions.
<Instance Name> node
This node represents the EC2 instances that are included in the Auto Scaling group. For more details about the EC2 instances, refer to instance
name in the EC2 node section.
<Monitor Name> node
This node lets you configure the performance counters of the EC2 instances. For more details, refer to monitor name in the EC2 node section.
Note: If you configure the EC2 metrics for generating QoS data through monitor name node, then make sure that the EC2 service is
activated for your AWS account.
ELB Node
The AWS Monitoring probe monitors the ELB layer that distributes the incoming application data between multiple EC2 instances. In this node,
you can view and configure the metrics of those EC2 instances that are registered with the ELB layer that you are currently monitoring. The probe
generates QoS data based on the inputs received from the EC2 instances at group level.
Navigation: AWS > profile name > ELB
Set or modify the following values if needed:
ELB > Elastic Load Balancing
This node lets you configure the ELB service properties.
Active: activates the monitoring of the ELB layer.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
ELB-<Region Name> node
This node represents the location of the ELB layer. This node does not contain sections or fields. Refer to the SQS-region name section for the list of available regions.
<ELB Layer Name> node
This node lets you configure the metrics of the ELB layer. The AWS probe generates QoS data of the ELB service according to the values fetched
from AWS CloudWatch.
By default, the AWS Monitoring probe uses the following statistics on the collected values for each ELB metric:
Healthy Host Count: Average
Unhealthy Host Count: Average
Request Count: Sum
Latency: Average
HTTPCode_ELB_4XX: Sum
HTTPCode_ELB_5XX: Sum
HTTPCode_Backend_2XX: Sum
HTTPCode_Backend_3XX: Sum
HTTPCode_Backend_4XX: Sum
HTTPCode_Backend_5XX: Sum
Backend Connection Errors: Sum
Surge Queue Length: Maximum
Spillover Count: Sum
Note: You cannot change the value of the statistics.
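Because these statistics are fixed, they can be transcribed as a simple lookup table for reference. The monitor names are copied verbatim from the list above; this is a sketch, not the probe's internal representation:

```python
# Fixed (non-configurable) default statistic for each ELB monitor.
ELB_DEFAULT_STATISTICS = {
    "Healthy Host Count": "Average",
    "Unhealthy Host Count": "Average",
    "Request Count": "Sum",
    "Latency": "Average",
    "HTTPCode_ELB_4XX": "Sum",
    "HTTPCode_ELB_5XX": "Sum",
    "HTTPCode_Backend_2XX": "Sum",
    "HTTPCode_Backend_3XX": "Sum",
    "HTTPCode_Backend_4XX": "Sum",
    "HTTPCode_Backend_5XX": "Sum",
    "Backend Connection Errors": "Sum",
    "Surge Queue Length": "Maximum",
    "Spillover Count": "Sum",
}
```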
The following node is referred to as ELB Layer name node and it represents various ELB metrics.
Navigation: AWS > profile name > ELB > ELB-region name > ELB Layer name
Set or modify the following values if needed:
ELB Layer name > Monitors
This section lets you configure the ELB metrics of a specific region for generating QoS data.
Note: All ELB monitors are visible in a tabular form. You can select any one monitor in the table and can configure its properties.
Note: If you configure the EC2 metrics for generating QoS data through monitor name node, then make sure that the EC2 service is
activated for your AWS account.
RDS Node
The AWS RDS service manages relational databases that are stored on AWS CloudWatch. The AWS Monitoring probe fetches the data from the
CloudWatch and generates QoS related to an RDS instance.
Navigation: AWS > profile name > RDS
Set or modify the following values if needed:
RDS > RDS Configurations
This node lets you configure the RDS service properties.
Active: activates the monitoring of database instances of the AWS resource. For field descriptions, refer to the EC2 node section.
<Database Name> node
Navigation: AWS > profile name > RDS > database name
Set or modify the following values if needed:
database name > Monitors
This section lets you configure the performance counters of a relational database instance for generating QoS data.
Note: The performance counters of an RDS database instance are visible in a tabular form. You can select any one counter in
the table and can configure its properties.
<RDS Monitor Name> node
This node lets you configure the performance counters of RDS instances. The AWS probe generates QoS data of the RDS service according to
the values fetched from AWS CloudWatch.
The performance counters are divided into the following categories:
CPU
Disk
Memory
Network
Each category is represented as a node under the database name node.
Note: This node is referred to as RDS monitor name node in the document and it represents various RDS performance counters.
Navigation: AWS > profile name > RDS > database name > RDS monitor name
Set or modify the following values if needed:
RDS monitor name > Monitors
This section lets you configure the RDS performance counters of a specific instance for generating QoS data.
Note: The performance counters of a relational database are visible in a tabular form. You can select any one counter in the
table and can configure its properties.
Similarly, you can configure the other performance counters that are visible under the CPU, Disk, Memory, and Network nodes.
SNS Node
The AWS Monitoring probe monitors the SNS topic through which the publisher sends notifications to the subscriber. The probe also generates
QoS data based on the status of the notifications.
Navigation: AWS > profile name > SNS
Set or modify the following values if needed:
SNS > SNS Configurations
This node lets you configure the SNS service properties.
Active: activates the monitoring of SNS topic.
Time Duration: specifies the time duration (in minutes) for collecting sample values from the AWS CloudWatch. The probe starts
collecting the values that were calculated during the time period which is specified here.
Period (minutes): specifies a time interval which is used to divide the collected values into groups.
Note: By default, for an active topic, AWS CloudWatch receives metrics every 5 minutes. Default monitoring and per-minute
monitoring are not available for AWS SNS.
<SNS-Region Name> node
This node represents the location of the SNS topic. This node does not contain any sections or fields. Refer to the SQS-region name section for
the list of available regions.
<Topic Name> node
This node lets you configure the performance counters of the publisher notifications. The AWS probe generates QoS data of the SNS service
according to the values fetched from AWS CloudWatch.
By default, the AWS Monitoring probe uses the following statistics on the collected values for each SNS performance counter:
Number Of Messages Published: Sum
Publish Size: Average
Number Of Notifications Delivered: Sum
Number Of Notifications Failed: Sum
You cannot change the value of the statistics.
This node is referred to as topic name node in the document and it represents various SNS metrics.
Navigation: AWS > profile name > SNS > SNS-region name > topic name
Set or modify the following values if needed:
topic name > Monitors
This section lets you configure the SNS topic performance counters of a specific region for generating QoS data.
Note: The performance counters of an SNS topic are visible in a tabular form. You can select any one counter in the table and
can configure its properties.
Custom Metric Node
In AWS, metrics are segregated into different Namespaces. A Dimension is a variable that categorizes a metric according to its statistics. When
you create custom metrics through a script and store the metrics in AWS CloudWatch, the AWS probe fetches that data from CloudWatch.
This node lets you select a custom metric that is available in an AWS Namespace and then define custom QoS for it. The custom metrics for
different AWS Namespace are visible in the Navigation Pane.
You can configure any of the discovered metrics that are available in an AWS Namespace through the Custom Metrics node except RDS, EC2,
EBS, ElastiCache, SQS, SNS, ELB, and Auto Scaling.
Navigation: AWS > profile name > Custom Metric
Set or modify the following values if needed:
Custom Metric > Custom Configurations
This section lets you configure the probe to fetch the list of custom metrics from the AWS CloudWatch and select custom metrics for a
specific Namespace.
Available Service Metrics: specifies the list of available AWS Namespaces that the probe fetches from CloudWatch. Each Namespace
contains various custom metrics. You can move specific service Namespace from the Available List to the Selected List. The
selected service metrics are visible as nodes in the Navigation Pane.
Note: For other field descriptions, refer to the EC2 node section.
<AWS-Service Name> Node
This node lets you view and configure the custom metrics for all AWS services. You can define a custom QoS name and unit, and can let the probe
generate QoS data for the custom metric. This node contains a table that lists the AWS dimensions against each service metric.
Note: This node is referred to as AWS-service name node in the document and is user-configurable.
Navigation: AWS > profile name > Custom Metric > AWS-service name
Set or modify the following values if needed:
Note: If you have created the custom metrics in a custom Namespace then only custom metrics are visible in the table.
However, if you have created the custom metrics in an existing Namespace then all the metrics are visible in the table.
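As an illustration of where such metrics come from, a custom metric can be pushed to a custom Namespace with the AWS CLI before the probe discovers it (a sketch; the namespace, metric name, dimension, and value are hypothetical):

```shell
# Publish one data point for a custom metric in the hypothetical "MyApp" namespace.
aws cloudwatch put-metric-data \
  --namespace MyApp \
  --metric-name PendingJobs \
  --dimensions QueueName=orders \
  --value 42 \
  --unit Count
```

Once such a metric exists in CloudWatch, the MyApp namespace would appear in the Available Service Metrics list described above.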
S3 Node
The data which is stored in the cloud using the AWS S3 service is segregated into groups that are known as buckets. The AWS probe monitors
the time which is consumed in storing and retrieving files to and from the bucket, respectively.
This node lets you configure the performance counters for S3 service. The AWS probe generates QoS data related to the time that is consumed
in storing and retrieving files to and from the S3 buckets.
Note: Set the polling interval (the Interval field in the Add New Profile section of the aws node) according to the size of the file that you want
to store or retrieve. If the polling interval is too short, the probe starts fetching data from the bucket again before the previous
file operation completes. For example, if you want to upload a file of size 1 MB, you can set the polling interval to 5 minutes.
Note: The file for which you want to generate the QoS data must be present in the AWS probe base folder (/probes/applications/aws).
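What the probe measures is comparable to timing a manual upload and download of the same file with the AWS CLI (a sketch; the bucket name is hypothetical and valid AWS credentials are assumed):

```shell
# Time a store and a retrieve of a sample file; bucket name is a placeholder.
time aws s3 cp sample.file s3://my-monitoring-bucket/sample.file
time aws s3 cp s3://my-monitoring-bucket/sample.file sample.file
```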
S3 > Monitors
This section lets you configure the performance counters for generating QoS.
Note: The performance counters of the S3 service are visible in a tabular form. You can select any one counter in the table and can
configure its properties.
This section enables you to configure the Health Monitoring functionality of the AWS probe. The Health Interval (mins) field lets you set
the time interval, in minutes, during which the probe fetches the health status of the AWS services.
<AWS Region> Node
This node lets you view the list of AWS services that are available for a specific region. You can configure the AWS probe for generating alarms
for specific AWS services in a region.
Note: This node is known as AWS region in the document as this node represents all the geographical locations where AWS provides
services.
Note: When you select the Publish Alarms check box, the value of the Alarm column in the table changes from Off to On.
aws Metrics
This section describes the metrics that can be configured using the Amazon Web Services Monitoring (aws) probe.
Contents
S3 Metrics
Metric Name | Units | Description | Version
QOS_AWS_FILEREADTIME | Seconds | | 2.0
QOS_AWS_FILEWRITETIME | Seconds | | 2.0
EC2 Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_CPU_UTILIZATION | CPU Usage | Percent | The percentage of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance. | 2.0
QOS_AWS_DISK_WRITE_BYTES | Data Written | Bytes | This metric is used to determine the volume of the data the application writes onto the hard disk of the instance. This can be used to determine the speed of the application. | 2.0
QOS_AWS_DISK_READ_BYTES | Data Read | Bytes | This metric is used to determine the volume of the data the application reads from the hard disk of the instance. This can be used to determine the speed of the application. | 2.0
QOS_AWS_DISK_READ_OPS | Reads | Count | Completed read operations from all ephemeral disks available to the instance. This metric identifies the rate at which an application reads a disk. This can be used to determine the speed in which an application reads data from a hard disk. | 2.0
QOS_AWS_DISK_WRITE_OPS | Writes | Count | Completed write operations to all ephemeral disks available to the instance. This metric identifies the rate at which an application writes to a hard disk. This can be used to determine the speed in which an application saves data to a hard disk. | 2.0
QOS_AWS_NETWORK_IN | Total Bytes Received | Bytes | The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance. | 2.0
QOS_AWS_NETWORK_OUT | Total Bytes Sent | Bytes | The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance. | 2.0
EBS Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_VOLUME_READ_BYTES | Total Read | Bytes | | 3.0
QOS_AWS_VOLUME_WRITE_BYTES | Total Written | Bytes | | 3.0
QOS_AWS_VOLUME_READ_OPS | Total Read Operations | Count | | 3.0
QOS_AWS_VOLUME_WRITE_OPS | Total Write Operations | Count | | 3.0
QOS_AWS_VOLUME_TOTAL_READ_TIME | Total Read Time | Seconds | | 3.0
QOS_AWS_VOLUME_TOTAL_WRITE_TIME | Total Write Time | Seconds | | 3.0
QOS_AWS_VOLUME_IDLE_TIME | Total Idle Time | Seconds | | 3.0
QOS_AWS_VOLUME_QUEUE_LENGTH | Queue Length | Count | | 3.0
QOS_AWS_VOLUME_THROUGHPUT_PERCENTAGE | Throughput Percentage | Percentage | | 3.0
QOS_AWS_VOLUME_CONSUMED_READ_WRITE_OPS | Consumed Read Write Operations | Count | | 3.0
RDS Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_RDS_BIN_LOG_DISK_USAGE | Binary Log Size On Disk | Bytes | | 3.0
QOS_AWS_RDS_CPU_UTILIZATION | CPU Utilization | Percent | | 3.0
QOS_AWS_RDS_DATABASE_CONNECTIONS | Database Connections | Count | | 3.0
QOS_AWS_RDS_DISK_QUEUE_DEPTH | Outstanding IOs in queue | Count | | 3.0
QOS_AWS_RDS_FREEABLE_MEMORY | Available Memory | Bytes | | 3.0
QOS_AWS_RDS_FREE_STORAGE_SPACE | Available Storage Space | Bytes | | 3.0
QOS_AWS_RDS_REPLICA_LAG | Read Replica Lag Time | Seconds | | 3.0
QOS_AWS_RDS_SWAP_USAGE | Used Swap Space | Bytes | | 3.0
QOS_AWS_RDS_READ_IOPS | Read Operations Per Second | Count/Second | | 3.0
QOS_AWS_RDS_WRITE_IOPS | Write Operations Per Second | Count/Second | | 3.0
QOS_AWS_RDS_READ_LATENCY | Read Latency | Seconds | | 3.0
QOS_AWS_RDS_WRITE_LATENCY | Write Latency | Seconds | | 3.0
QOS_AWS_RDS_READ_THROUGHPUT | Read Throughput | Bytes/Second | | 3.0
QOS_AWS_RDS_WRITE_THROUGHPUT | Write Throughput | Bytes/Second | | 3.0
QOS_AWS_RDS_NETWORK_RECEIVE_THROUGHPUT | Network Receive Throughput | Bytes | | 3.0
QOS_AWS_RDS_NETOWRK_TRANSMIT_THROUGHPUT | Network Transmit Throughput | Bytes | | 3.0
ElastiCache Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_ELASTICACHE_CPU_UTILIZATION | CPU Utilization | Percent | | 3.0
QOS_AWS_ELASTICACHE_FREEABLE_MEMORY | | Bytes | | 3.0
Memcached Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_ELASTICACHE_MEMCACHED_UNUSED_MEMORY | Unused Memory For Cache | Bytes | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_CURRENT_ITEMS | Number of Items | Count | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_EVICTIONS | Total Non-Expired Evicted Items | Count | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_RECLAIMED | Total Expired Items Evicted | Count | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_GET_HITS | Total Cache Hits Requests | Count | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_GET_MISSES | Total Cache Miss Requests | Count | | 3.0
QOS_AWS_ELASTICACHE_MEMCACHED_BYTES_USED_FOR_CACHE_ITEMS | Total Bytes Used for Cache Items | Bytes | | 3.0
Redis Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_ELASTICACHE_REDIS_CURRENT_CONNECTIONS | Number of Connections | Count | | 3.0
QOS_AWS_ELASTICCACHE_REDIS_BYTES_USED_FOR_CACHE | Total Bytes Allocated for Cache | Bytes | | 3.0
SQS Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_SQS_APPROXIMATE_NUMBER_OF_MESSAGES_DELAYED | Number Of Messages Delayed | Count | | 3.5
QOS_AWS_SQS_APPROXIMATE_NUMBER_OF_MESSAGES_NOT_VISIBLE | Number Of Messages Not Visible | Count | | 3.5
QOS_AWS_SQS_APPROXIMATE_NUMBER_OF_MESSAGES_VISIBLE | Number Of Messages Visible | Count | | 3.5
QOS_AWS_SQS_NUMBER_OF_EMPTY_RECEIVES | Number Of Empty Receives | Count | | 3.5
QOS_AWS_SQS_NUMBER_OF_MESSAGES_DELETED | Number Of Messages Deleted | Count | | 3.5
QOS_AWS_SQS_NUMBER_OF_MESSAGES_RECEIVED | Number Of Messages Received | Count | | 3.5
QOS_AWS_SQS_NUMBER_OF_MESSAGES_SENT | Number Of Messages Sent | Count | | 3.5
QOS_AWS_SQS_SENT_MESSAGE_SIZE | Sent Message Size | Bytes | | 3.5
SNS Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_SNS_NUMBER_OF_MESSAGES_PUBLISHED | Number Of Messages Published | Count | | 3.5
QOS_AWS_SNS_PUBLISH_SIZE | Publish Size | Bytes | | 3.5
QOS_AWS_SNS_NUMBER_OF_NOTIFICATIONS_DELIVERED | Number Of Notifications Delivered | Count | | 3.5
QOS_AWS_SNS_NUMBER_OF_NOTIFICATIONS_FAILED | Number Of Notifications Failed | Count | | 3.5
ELB Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_ELB_HEALTHY_HOST_COUNT | Healthy Host Count | Count | | 3.5
QOS_AWS_ELB_UNHEALTHY_HOST_COUNT | Unhealthy Host Count | Count | | 3.5
QOS_AWS_ELB_REQUEST_COUNT | Request Count | Count | | 3.5
QOS_AWS_ELB_LATENCY | Latency | Seconds | The time elapsed after the request leaves the load balancer until the response is received. | 3.5
QOS_AWS_ELB_HTTP_CODE_ELB_4XX | Http Code ELB 4XX | Count | | 3.5
QOS_AWS_ELB_HTTP_CODE_ELB_5XX | Http Code ELB 5XX | Count | | 3.5
QOS_AWS_ELB_HTTP_CODE_BACKEND_2XX | Http Code Backend 2XX | Count | | 3.5
QOS_AWS_ELB_HTTP_CODE_BACKEND_3XX | Http Code Backend 3XX | Count | | 3.5
QOS_AWS_ELB_HTTP_CODE_BACKEND_4XX | Http Code Backend 4XX | Count | | 3.5
QOS_AWS_ELB_HTTP_CODE_BACKEND_5XX | Http Code Backend 5XX | Count | | 3.5
QOS_AWS_ELB_BACKEND_CONNECTION_ERROR | Backend Connection Error | Count | | 3.5
QOS_AWS_ELB_SURGE_QUEUE_LENGTH | Surge Queue Length | Count | | 3.5
QOS_AWS_ELB_SPILLOVER_COUNT | Spillover Count | Count | | 3.5
Auto Scaling Metrics
QoS Name | Metric Name | Units | Description | Version
QOS_AWS_AUTO_SCALING_CPU_UTILIZATION | CPU Usage | Percentage | | 3.5
QOS_AWS_AUTO_SCALING_DISK_READ_OPS | Reads | Count | | 3.5
QOS_AWS_AUTO_SCALING_DISK_WRITE_OPS | Writes | Count | | 3.5
QOS_AWS_AUTO_SCALING_DISK_READ_BYTES | Data Read | Bytes | | 3.5
QOS_AWS_AUTO_SCALING_DISK_WRITE_BYTES | Data Written | Bytes | | 3.5
QOS_AWS_AUTO_SCALING_NETWORK_IN | Total Bytes Received | Bytes | | 3.5
QOS_AWS_AUTO_SCALING_NETWORK_OUT | Total Bytes Sent | Bytes | | 3.5
QOS_AWS_AUTO_SCALING_STATUSCHECK | Status Check | Count | | 3.5
QOS_AWS_AUTO_SCALING_STATUSCHECK_INSTANCE | Instance Status Check | Count | | 3.5
QOS_AWS_AUTO_SCALING_STATUSCHECK_SYSTEM | System Status Check | Count | | 3.5
Microsoft Azure is a cloud computing platform that enables you to build, deploy, scale, and manage Web applications and services through a global network of Microsoft-managed datacenters.
The Microsoft Azure Monitoring probe remotely monitors the health and performance of Azure infrastructure and services. The probe enables you
to connect to Microsoft Azure using certificates and discover Azure resources to be monitored. The probe fetches all the service data from
different geographical locations and lets you create profiles that monitor your cloud services including virtual machines (VMs), websites and
storage. The probe lets you configure various monitoring parameters for each of these services. For example, you can check the health status of
data services and VMs, number of requests made to the storage service, CPU utilization, and so on. Based on the configured parameters, the
probe generates Quality of Service (QoS) metrics. Refer to azure Metrics to understand the monitoring capabilities of the probe.
You can also configure dynamic and static threshold values for the QoS metrics to receive alarms.
To use the azure probe, you must have:
Azure Subscription: To manage your Azure services, you must purchase one or more Azure subscriptions. A subscription defines how
many cloud resources (hosted services and storage accounts) you are entitled to create or use and how these resources are billed.
These subscriptions are created at the Azure Account Center. Each cloud service belongs to a subscription.
Note: An Azure account can have multiple subscriptions.
Management Certificate: Whenever you deploy a website, create a new storage account or manage any other service, these operations
pass through the Windows Azure Management API. The Azure Management Portal calls the Management API to perform that action for
you. These operation requests must be signed by an X.509 certificate to ensure that only authorized operations are performed. These
certificates are called Management Certificates and are used to permit access to resources in your Azure subscription. The Management
certificate is saved as a .cer file.
One certificate can be linked to one or more subscriptions. A subscription can have multiple certificates and a certificate can even be
shared across multiple subscriptions regardless of who owns the subscription.
Note: To use the azure probe, you must upload a Management certificate to your Azure account in the Management portal and
associate Subscription Ids from your Azure account to this certificate.
More Information
azure (Microsoft Azure Monitoring) Release Notes
Contents
Prerequisites
Create a Profile
Apply Monitoring through Templates
Alarm Thresholds
Stop Receiving Alarms for Deleted Components
Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see azure (Microsoft Azure
Monitoring) Release Notes.
The following are the prerequisites for the Microsoft Azure Monitoring probe.
Before using the azure probe, you must:
1. Have an Azure user-account with valid user-credentials
2. Have a valid Azure subscription
3. Create a JKS Key Store File. A keystore is a database of cryptographic keys, X.509 certificate chains, and trusted certificates.
4. Create an Azure Management Certificate from that Key Store File.
5. Upload the Azure Management Certificate to your account in the Azure Management Portal. This certificate is used for Azure
Management API authentication.
6. Upload the Key Store File to the azure probe.
Running this command creates a key store called WindowsAzureKeyStore.jks and sets the password to access this as India@123.
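The keytool command referenced in this step is not reproduced above; a typical invocation that produces such a key store looks like the following (a sketch; the alias, key size, validity, and distinguished name are illustrative, not values from this document):

```shell
# Create a JKS key store containing a self-signed key pair for Azure management.
keytool -genkeypair -alias azureprobe -keyalg RSA -keysize 2048 \
  -validity 3650 -dname "CN=AzureProbe" \
  -keystore WindowsAzureKeyStore.jks -storepass India@123
```

This assumes keytool.exe from the JDK bin folder (see the prerequisite note) is on the path.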
2. Run the following command to export a certificate from this key store.
Running this command creates the WindowsAzure.cer file in the D: drive of your system.
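A typical export command for this step looks like the following (a sketch; the alias must match the one used when the key store was created):

```shell
# Export the management certificate from the key store to D:\WindowsAzure.cer.
keytool -export -alias azureprobe \
  -keystore WindowsAzureKeyStore.jks -storepass India@123 \
  -file D:\WindowsAzure.cer
```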
Upload the Azure Management Certificate to your Azure Account in the Azure Portal
Follow these steps:
1. Open the Microsoft Azure home page in your web browser.
2. Log in with your Microsoft Azure account credentials.
3. Navigate to Settings from the left navigation pane.
4. Click the Management Certificates tab.
5. Click Upload.
6. Browse and select your certificate file.
7. Click OK.
The certificate file is uploaded and appears in the Microsoft Azure Management Portal.
Create a Profile
The following procedure enables you to add a profile for monitoring the Azure services. Each profile represents one Azure subscription.
Follow these steps:
1. Click the Options (icon) next to the Azure node in the navigation pane.
2. Click the Add New Profile option.
3. Set or modify the following values.
Account Name: defines a unique name for the monitoring profile.
Key Store File Path: enables you to locate your key store file by clicking Browse.
Key Store File Password: specifies the key store file password.
Key Store Type: enables you to select the type of the Key Store file that you upload. The available options are:
JKS: The Java KeyStore (JKS) can contain private keys and certificates, but it cannot be used to store secret keys. Because it is a
Java-specific keystore, it cannot be used in other programming languages.
PKCS12: A standard keystore type that can be used in Java and other languages. It usually has a .p12 or .pfx extension. You can
store private keys, secret keys, and certificates in this type.
Subscription Id: specifies the Azure Subscription Id.
You can check the authenticity of the Subscription Id by clicking the Verify Selection button under the Actions drop down.
Active: activates the profile for service monitoring. By default, the profile is active.
Interval (seconds): specifies the time interval (in seconds) after which the probe collects the data from the Azure cloud for the specific
profile.
Alarm Message: specifies the alarm to be generated when the profile is not responding.
4. Click Submit.
The new monitoring profile is displayed as a node below the Azure Data Services Health node in the navigation pane.
5. Navigate to the Profile Name node.
6. Click the Verify Selection option under the Actions drop down to verify the Subscription Id.
7. Click Save to save the profile.
The profile is saved and the VMs, storage, and websites are automatically discovered for the profile.
Note: The profile goes into a pending state while the probe fetches the data for that particular Azure subscription. Reload or reopen the page to see the
tree structure of the Azure VMs, storage, and websites for that subscription.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when the threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Key | Value
2.1.6. | Azure
2.1.6.1. |
2.1.6.2. | VM Role
2.1.6.4. | Website
2.1.6.5. | Storage
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the icon next to the NAS probe, select Raw Configure.
2. Click on the Subsystems folder.
The Template Editor interface allows you to create and apply monitoring templates. Templates reduce the time that you need for manual monitor
configuration and provide consistent monitoring across the devices in your network. You can configure monitoring on many targets with a
well-defined template.
You can customize any template by configuring:
Precedence
Precedence controls the order of template application. The probe applies a template with a precedence of one after a template with a
precedence of two. If there are any overlapping configurations between the two templates, then the settings in the template with a
precedence of one override the settings in the other template. If the precedence numbers are equal, then the templates are applied
in alphabetical order.
Filters
Filters let you control how the probe applies monitors based on attributes of the target device.
Rules
Rules apply to a device filter to create divisions within a group of systems or reduce the set of devices that the probe monitors.
Monitors
Monitors collect quality of service (QoS), event, and alarm data.
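The precedence rule described above can be sketched with ordinary shell commands (the template names are hypothetical; this only illustrates the ordering, it is not probe code):

```shell
# Templates as "precedence:name". A template with a lower precedence number is
# applied later, so on overlapping settings the lower-numbered template wins.
templates="2:baseline
1:region-override"
apply_order=$(printf '%s\n' "$templates" | sort -t: -k1,1nr)
last_applied=$(printf '%s\n' "$apply_order" | tail -n 1 | cut -d: -f2)
echo "Winning template: $last_applied"
```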
This article describes how to apply monitoring with templates for the Microsoft Azure Monitoring (Azure) probe.
Create Template
You can create a new template to configure multiple existing profiles with the same monitor configuration.
Follow these steps:
1. Open the probe configuration interface.
Note: You can skip Step 9 and Step 10 if you do not need to configure the applicable monitors in those nodes.
Note: The rules for setting the precedence value for filters are the same as the rules for setting precedence for templates.
Note: You must activate the template for the probe to apply the monitor configuration. When you change the template state to active,
the probe immediately applies all template configuration, including filters, rules, and monitors.
Pattern | Type | Explanation
[A-Z] | Standard (PCRE) |
\d* | Custom |
Template Editor
Azure Node
Azure Data Services Health Node
<Region Name> Node
<Profile Name> Node
VM Node
Storage Node
Website Node
Template Editor
The Template Editor interface is used to create, modify, or delete templates that can be applied to the probe. The editor allows you to define
templates that can be applicable across multiple profiles. For more information, see v2.1 Azure Apply Monitoring with Templates.
Azure Node
This node lets you view the probe information and configure the logging properties.
Navigation: Azure
Set or modify the following values, as needed:
Azure > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
Azure > Probe Setup
This section lets you configure the detail level of the log file.
Default: 3-info.
Azure > Proxy Settings
This section enables you to connect to the Azure cloud through a proxy server on the network. You need proxy server settings when your network
is not an open network.
Enable Proxy: lets you use a proxy server for connecting to the Azure cloud.
IP: defines the IP address of the proxy server.
Port: specifies the port on the proxy server through which the connection is established.
User: defines the user name for accessing the proxy server.
Note: The User field supports both formats: User Name and Domain Name\User Name.
Note: It is recommended to keep the time interval at 600 seconds or above because the probe takes time to fetch the data from the
API.
Alarm Message: specifies the alarm to be generated when the profile is not responding. For example, the profile does not respond if there
is a connection failure or inventory update failure.
Default: ResourceCritical
Note: This node is known as Region Name Node in the document as it represents all the geographical locations discovered by the
azure probe.
Navigation: Azure > Azure Data Services Health > Region Name
Set or modify the following values, as needed:
Region Name > Data Service Status
This section lets you view the current health status of a data service for the selected location. The value for the health status of a data
service can be 0 (Good), 1 (Warning), 2 (Error), or 3 (Information). You can also configure the QoS for a specific data service for the
selected location.
Note: The data services for the selected location are visible in a tabular format. You can select any one service in the table and can
configure its properties.
Description: indicates a description of the selected service.
Metric Type: identifies a unique ID for alarm and QoS.
Units: indicates the Metric unit of the selected service status.
Publish Data: enables the probe to check the status of the selected service and generate QoS and alarms.
Note: When you select the Publish Data checkbox, the value of the Data column in the table changes from Off to On.
Similarly, you can configure the services for other geographical locations.
Note: This node is known as profile name node in the document and is user-configurable.
VM Node
The azure probe discovers all monitored resources, such as VMs, websites, and storage, that are associated with the specified subscription.
This node represents all the Azure VMs associated with the specified subscription.
Navigation: azure > Profile Name > VM
This node does not contain any fields or sections.
<VM Name> Node
This node lets you configure the performance metrics for the VMs. The azure probe generates QoS data of that VM according to the values
fetched from the Azure Management Portal.
The performance metrics are divided into the following categories:
CPU
Disk
Network
Each category is represented as a node under the VM Name node.
Note: This node is referred to as VM Name node in the document and it represents the VM state and various performance counters for
that VM.
Note: The performance counters of a VM are visible in a tabular form. You can select any one counter in the table and can configure its
properties.
QoS Name: indicates the name of performance metrics.
Publish Data: generates the QoS data for the selected monitor.
Similarly, you can configure the other performance monitors that are visible under the CPU, Disk, and Network nodes.
Storage Node
This node represents all the Azure storage associated with the specified subscription.
Navigation: Azure > Profile Name > Storage
Note: This node is referred to as Storage Name node in the document and it represents the storage state and various performance
counters for that storage.
Navigation: azure > Profile Name > Storage > Storage Name
For field descriptions, refer to the VM Node topic.
For details on how to set the time duration for a storage, refer to the Time Duration Concept in Azure probe for Storage, VMs and Websites topic.
Website Node
This node represents all the Azure websites associated with the specified subscription.
Navigation: azure > Profile Name > Website
This node does not contain any fields or sections.
<Website Name> Node
This node lets you configure the performance metrics for the websites. The probe generates QoS data of that website according to the values
fetched from the Azure Management Portal.
Note: This node is referred to as Website Name node in the document and it represents the various performance metrics for that
Website.
Navigation: Azure > Profile Name > Website > Website Name
Set or modify the following values, if needed:
Website Name > Time Duration Configuration
This section enables you to select the time duration according to which the website data is retrieved.
Set or modify the following values, as needed.
Time Duration: enables you to select the time duration.
Default: 24 Hours
For details on how to set the time duration for a website, refer to the Time Duration Concept in Azure probe for Storage, VMs and Websites topic.
Website Name > Monitors
This section lets you configure the performance metrics for generating QoS.
Note: The performance metrics of a Website are visible in a tabular form. You can select any one metric in the table and can configure
its properties.
QoS Name: indicates the name of performance metric.
Publish Data: generates the QoS data for the selected metric.
Time Duration Concept in Azure probe for Storage, VMs, and Websites
The following table describes the time-duration concept in the azure probe for storage, VMs, and websites. You must set the time duration
while configuring the performance metrics for a storage, VM, or website. For example, for a VM, if the time duration is set to 1 hour,
you get a response as an average of the data collected at every 5-minute interval over the past 1 hour from Azure. Similarly, for the VM, if the time
duration is set to 7 days, you get a response as an average of the data collected at every 1-hour interval over the past 7 days from
Azure, as described in the table below.
Time Duration | 1 Hour | 6 Hours | 24 Hours | 7 Days
Storage | 1 Hour | 1 Hour | 1 Hour |
VM | 5 Minutes | 5 Minutes | 1 Hour | 1 Hour
Websites | 1 Minute | 1 Hour | 1 Hour |
Note: The granularity of data (like 1 minute, 5 minutes) that is written in the table is based on the data provided by Microsoft Azure.
Verify Prerequisites
How to Configure Alarm Thresholds
Managing Profiles
How to create a profile
How to delete a profile
NAS Subsystem ID Requirements
Stop Receiving Alarms for Deleted Components
Verify Prerequisites
This section contains the prerequisites for the Microsoft Azure Monitoring probe.
You need an Azure user account with valid user credentials, a valid Azure subscription, an Azure Management Certificate, and a valid Key Store file. A
keystore is a database of cryptographic keys, X.509 certificate chains, and trusted certificates.
To use the Azure probe, you must upload your Azure Management Certificate to the Azure Management Portal; the certificate is used for
Azure Management API authentication. Then, you must upload the Key Store File to the Azure probe.
Note: Ensure that before you create or upload the certificate: JDK 1.6 or JDK 1.7 is installed on the computer to be used, and you have
the keytool.exe file in the bin folder to create the certificates. The bin folder is located at Program Files -> Java -> JDK version -> bin.
1. Run the following command to create a key store.
Running this command creates a key store called WindowsAzureSJKeyStore.jks and sets Inda@123 as the password to access it.
2. Run the following command to export a certificate from this key store.
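The two steps above can be sketched with keytool as follows. The alias azureprobe, the -dname value, and the validity period are assumptions; the key store name and password come from the text above:

```shell
# Step 1 (sketch): create the key store WindowsAzureSJKeyStore.jks
# with access password Inda@123, as described above.
keytool -genkeypair -alias azureprobe -keyalg RSA -keysize 2048 \
  -dname "CN=AzureProbe" -validity 365 \
  -keystore WindowsAzureSJKeyStore.jks -storepass Inda@123 -keypass Inda@123

# Step 2 (sketch): export a certificate from this key store. Upload the
# resulting .cer file to the Azure Management Portal.
keytool -exportcert -alias azureprobe \
  -keystore WindowsAzureSJKeyStore.jks -storepass Inda@123 \
  -file azureprobe.cer
```

The exported .cer file is the management certificate you upload to the portal; the .jks file is the key store file you upload to the probe.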
Managing Profiles
This procedure provides the information to configure a particular section of a profile. Each section within the profile lets you configure the
properties of the probe for connecting to the Azure cloud and monitoring various Azure services.
Follow these steps:
1. Navigate to the section within a profile that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured.
NAS Subsystem ID Requirements
Key | Value
2.1.6. | Azure
2.1.6.1. |
2.1.6.2. | VM Role
2.1.6.4. | Website
2.1.6.5. | Storage
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the black arrow next to the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key menu item.
4. Enter the Key Name in the Add key window and click Add.
The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right-click the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key... button.
4. Enter the Key Name and Value, and click OK.
5. Repeat this process for all of the required subsystem IDs for your probe.
6. Click Apply.
Note: Ensure that you enter the key names as is, including the period (.) at the end, for correct mapping.
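Applied through Raw Configure, these entries end up in the nas configuration file as plain key-value pairs. A minimal sketch, assuming the section is named subsystems and using the IDs from the table above; the key names keep their trailing periods:

```
<subsystems>
   2.1.6. = Azure
   2.1.6.2. = VM Role
   2.1.6.4. = Website
   2.1.6.5. = Storage
</subsystems>
```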
Stop Receiving Alarms for Deleted Components
Modify the detached configuration parameter of the probe to stop receiving false alerts.
Follow these steps:
1. Open the Raw Configure GUI of the probe.
2. Add the show_detached_configuration key under the setup section and set the key value to Yes.
3. Click Apply to close the Raw Configure GUI; the probe restarts and the changes are applied.
The navigation pane of the probe GUI shows a new node, Detached Configuration. This node contains the components that are deleted from the Azure subscription but are still selected for monitoring in the probe.
4. Click the Options icon next to the component for which you want to stop receiving alerts and select Delete.
5. Click Save.
The probe stops sending false alerts for the deleted components after you complete the above steps.
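After step 2 above, the setup section of the probe configuration file carries the new key. A minimal sketch of the fragment, with existing setup keys omitted:

```
<setup>
   show_detached_configuration = yes
</setup>
```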
Azure Node
Azure Data Services Health Node
<Region Name> Node
<Profile Name> Node
Subscriptions Node
<Subscription Id> Node
VM Node
Storage Node
Website Node
Time Duration Concept in Azure probe for storage, VMs and websites
Azure Node
This node lets you view the probe information and configure the logging properties.
Navigation: Azure
Set or modify the following values as required:
Azure > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
Azure > Probe Setup
This section lets you configure the detail level of the log file. The default value is 3-info.
Azure > Proxy Settings
This section enables you to connect to the Azure cloud through a proxy server on the network. You need proxy server settings when your network
is not an open network.
Enable Proxy: lets you use a proxy server for connecting to the Azure cloud.
IP: defines the IP address of the proxy server.
Port: specifies the port on the proxy server through which the connection is established.
User: defines the user name for accessing the proxy server.
Note: The User field supports both formats, User Name and Domain Name\User Name.
The probe generates QoS data according to the metric values generated by these VMs, storage accounts, and websites.
Note: You are required to add the subscription Id to the profile for which monitoring is required.
Azure Data Services Health Node
The Azure probe scans the Azure cloud and fetches the global health status data of the various services listed at http://azure.microsoft.com/en-gb/status/. This node also enables you to set the polling interval for the Auto Discovery of new locations along with their services.
Navigation: Azure > Azure Data Services Health Node
Set or modify the following values as required:
Azure Data Services Health > Data Services Monitor
This section lets you set the polling interval after which the probe fetches the global health status data of various services.
Interval (minutes): enables you to set the polling interval.
Default: 5
<Region Name> Node
This node lets you view the list of Azure Data Services locations that are available when the Azure probe scans the Azure cloud. You can
configure the Azure probe for generating alarms for the selected location.
Note: This node is known as Region Name Node in the document as it represents all the geographical locations discovered by the
Azure probe.
Navigation: Azure > Azure Data Services Health > Region Name
Set or modify the following values as required:
Region Name > Data Service Status
This section lets you view the current health status of a data service for the selected location. The value for the health status of a data service can be 0 (Good), 1 (Warning), 2 (Error), or 3 (Information). You can also configure the QoS for a specific data service for the selected location.
Note: The data services for the selected location are visible in a tabular format. You can select any one service in the table and configure its properties.
Description: indicates a description of the selected service.
Metric Type: identifies a unique ID for alarm and QoS.
Units: indicates the Metric unit of the selected service status.
Publish Data: enables the probe to check the status of the selected service and generate QoS and alarms.
Note: When you select the Publish Data checkbox, the value of the Data column in the table changes from Off to On.
Similarly, you can configure the services of the other geographical locations.
<Profile Name> Node
This node represents the profile created to monitor an Azure account. For each profile, you are required to add the subscription Id for which
monitoring is required. After you configure your account/resource information including certificates and subscriptions, the Azure probe examines
the current Azure subscription and imports all detected instances of VM's, websites and storage into a tree structure. In this way, the probe
creates a hierarchy of all the Azure subscriptions added by you in the profile.
Note: This node is known as profile name node in the document and is user-configurable.
Subscriptions Node
This node lets you view all the subscriptions for the selected profile.
Navigation: Azure > Profile Name > Subscriptions
This node has no sections or fields.
<Subscription Id> Node
This node represents a subscription of the Azure account. The probe discovers all instances of VMs, databases, storage, and websites for each subscription.
Navigation: Azure > Profile Name Node > Subscriptions > Subscription Id
This node has no sections or fields.
VM Node
The Azure probe enables you to discover all monitored resources, such as VMs, websites, and storage, associated with the specified subscription.
This node represents all the Azure VMs associated with the specified subscription.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > VM
This node does not contain any fields or sections.
<VM Name> Node
This node lets you configure the performance metrics for the VMs. The Microsoft Azure Monitoring probe generates QoS data for that VM according to the values fetched from the Azure Management Portal.
The performance metrics are divided into the following categories:
CPU
Disk
Network
Each category is represented as a node under the VM Name node.
Note: This node is referred to as VM Name node in the document and it represents the VM state and various performance counters for
that VM.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > VM > VM Name
Set or modify the following values as required:
VM Name > Time Duration Configuration
This section enables you to select the time duration according to which the VM data is retrieved.
Set or modify the following values as required.
Time Duration: enables you to select the time duration according to which the VM data is retrieved.
Default: 24 Hours
For details on how to set the time duration for a VM, refer to the Time Duration Concept in Azure probe for storage, VMs and websites topic.
VM Name > Monitors
This section lets you configure the performance metrics for generating QoS for the Power State for the VM.
Note: The performance counters of a VM are visible in a tabular form. You can select any one counter in the table and configure its properties.
QoS Name: indicates the name of the performance metric.
Publish Data: generates the QoS data for the selected monitor.
Similarly, you can configure the other performance monitors that are visible under the CPU, Disk, and Network nodes.
Storage Node
This node represents all the Azure storage associated with the specified subscription.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > Storage
This node does not contain any fields or sections.
<Storage Name> Node
This node lets you configure the performance metrics for the Azure Storage. The Microsoft Azure Monitoring probe generates QoS data for that
Storage according to the values fetched from the Azure Management Portal.
The performance metrics are divided into the following categories:
blob
queue
table
Each category is represented as a node under the Storage Name node.
Note: This node is referred to as Storage Name node in the document and it represents the Storage state and various performance
counters for that Storage.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > Storage > Storage Name
For field descriptions, refer to the VM Node topic.
For details on how to set the time duration for a storage account, refer to the Time Duration Concept in Azure probe for storage, VMs and websites topic.
Website Node
This node represents all the Azure websites associated with the specified subscription.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > Website
This node does not contain any fields or sections.
<Website Name> Node
This node lets you configure the performance metrics for the websites. The Microsoft Azure Monitoring probe generates QoS data of that website
according to the values fetched from the Azure Management Portal.
Note: This node is referred to as Website Name node in the document and it represents the various performance metrics for that
Website.
Navigation: Azure > Profile Name > Subscriptions > Subscription Id > Website > Website Name
Set or modify the following values as required:
Website Name > Time Duration Configuration
This section enables you to select the time duration according to which the website data is retrieved.
Set or modify the following values as required.
Time Duration: enables you to select the time duration.
Default: 24 Hours
For details on how to set the time duration for a website, refer to the Time Duration Concept in Azure probe for storage, VMs and websites topic.
Website Name > Monitors
This section lets you configure the performance metrics for generating QoS.
Note: The performance metrics of a Website are visible in a tabular form. You can select any one metric in the table and configure its properties.
QoS Name: indicates the name of the performance metric.
Publish Data: generates the QoS data for the selected metric.
Time Duration Concept in Azure probe for storage, VMs and websites
The following table describes the time-duration concept in the Azure probe for storage, VMs, and websites. You are required to set the time duration while configuring the performance metrics for a storage account, VM, or website. For example, if the time duration for a VM is set to 1 hour, you get the response as an average of the data collected at 5-minute intervals over the past 1 hour from Azure. Similarly, if the time duration for the VM is set to 7 days, you get the response as an average of the data collected at 1-hour intervals over the past 7 days from Azure, as described in the table below.
Time Duration | Storage | VM | Websites
1 Hour | 1 Hour | 5 Minutes | 1 Minute
6 Hours | 1 Hour | 5 Minutes | 1 Hour
24 Hours | 1 Hour | 1 Hour | 1 Hour
7 Days | 1 Hour | 1 Hour | 1 Hour
Note: The granularity of data (like 1 minute, 5 minutes) that is written in the table is based on the data provided by Microsoft Azure.
azure Metrics
The following table describes the checkpoint metrics that can be configured using the Microsoft Azure Monitoring (azure) probe.
Contents
QoS Metrics
QoS data for the Azure data services
QoS data for the Azure VM
QoS data for the Azure storage service
QoS data for the Azure website service
QoS Metrics
The following tables describe the checkpoint metrics that can be configured using the azure probe.
QoS data for the Azure data services
Metric | Name | Units | Description | Version
QOS_AZURE_DATASERVICE_STATUS | Status | State | The current availability and health status of the data services. The values are as follows: 0-Good, 1-Warning, 2-Error, 3-Information | v2.0
QoS data for the Azure VM
Metric | Name | Units | Description | Version
QOS_AZURE_VM_STATE | VM Power State | State | The current availability and health status of the VMs. The values are as follows: 0-Started, 1-Starting, 2-Stopping, 3-Stopped, 4-Unknown | v2.0
QOS_AZURE_VM_CPU_UTILIZATION | CPU Usage | Percentage | | v2.0
QOS_AZURE_VM_DISK_READ | Disk Read | Bytes/sec | | v2.0
QOS_AZURE_VM_DISK_WRITE | Disk Write | Bytes/sec | | v2.0
QOS_AZURE_VM_NETWORK_IN | Total Bytes Received | Bytes | The number of bytes received on all network interfaces by the VM. This metric identifies the volume of incoming network traffic on a single VM. | v2.0
QOS_AZURE_VM_NETWORK_OUT | Total Bytes Sent | Bytes | The number of bytes sent out on all network interfaces by the VM. This metric identifies the volume of outgoing network traffic on a single VM. | v2.0
QoS data for the Azure storage service
Metric | Name | Units | Description | Version
QOS_AZURE_STORAGE_TOTAL_REQUESTS | Total Requests | Count | | v2.0
QOS_AZURE_STORAGE_SUCCESS_COUNT | Success Count | Count | | v2.0
QOS_AZURE_STORAGE_SUCCESS_PERCENTAGE | Success | Percentage | | v2.0
QOS_AZURE_STORAGE_AVAILABILITY | Availability | | | v2.0
QOS_AZURE_STORAGE_STATE | Storage State | State | | v2.0
QoS data for the Azure website service
Metric | Name | Units | Description | Version
QOS_AZURE_WEBSITE_AVG_MEMORY_WORKING_SET | Average Memory Working Set | Bytes | | v2.0
QOS_AZURE_WEBSITE_AVG_RESPONSE_TIME | Average Response Time | Bytes | | v2.0
QOS_AZURE_WEBSITE_CPU_TIME | CPU Time | Microsecond | | v2.0
QOS_AZURE_WEBSITE_DATA_IN | Bytes Received | Bytes | | v2.0
QOS_AZURE_WEBSITE_DATA_OUT | Bytes Sent | Bytes | | v2.0
QOS_AZURE_WEBSITE_HTTP_ERRORS | HTTP Errors | Count | | v2.0
QOS_AZURE_WEBSITE_HTTP_CLIENT_ERRORS | HTTP Client Errors | Count | | v2.0
QOS_AZURE_WEBSITE_HTTP_REDIRECTS | HTTP Redirects | Count | | v2.0
QOS_AZURE_WEBSITE_HTTP_SERVER_ERRORS | HTTP Server Errors | Count | | v2.0
QOS_AZURE_WEBSITE_HTTP_SUCCESSES | HTTP Successes | Count | | v2.0
QOS_AZURE_WEBSITE_HTTP_STATE | Website State | State | | v2.0
Note: The QoS metrics of probe version 1.0 are not supported by probe version 2.0.
baseline_engine
The baseline_engine probe samples the Quality of Service (QoS) data on the message bus and, at the top of each hour, calculates baseline data points for monitoring probes. The qos_processor probe sends the resulting data points to the UIM database.
The first baseline approximation is available after the interval has concluded, and is improved with succeeding baseline data points from
corresponding intervals gathered over a four-week to 12-week period.
More Information:
Baseline Engine (baseline_engine) Release Notes
baseline_engine Deployment
The baseline_engine probe is designed to be deployed with minimal configuration. Once active, it provides baselines for all QoS metrics being
gathered across the domain.
ppm, baseline_engine, and prediction_engine must all be deployed and running on hub robots if you want to configure dynamic, static, or Time To
Threshold alarm and threshold settings for monitoring probes. Make sure the nas and alarm_enrichment probes are deployed to the hub robots
where ppm is running if you want to configure Time Over Threshold alarm and threshold settings for monitoring probes.
Note: When running ppm v2.38 or later, be sure to select the Compute Baseline check box on the probe GUI in Admin Console to
allow baseline_engine to provide baselines for QoS metrics.
Starting with CA UIM Server installer v8.0, the baseline_engine probe is included with the installer. It is automatically installed on the primary hub
as part of the installation process. Once active probes have the Compute Baseline option selected, baseline_engine provides baselines for all
QoS metrics gathered across the domain. Review the baseline_engine Release Notes for additional information about deploying baseline_engine
to secondary hubs.
Note: The qos_processor probe must be running on the primary Hub to store the QOS_BASELINE messages in the UIM database.
QOS_BASELINE messages (where the subject ID is QOS_BASELINE) must be forwarded from the baseline_engine probe on the secondary hub(s) to the qos_processor probe running on the primary hub. To enable this message forwarding, create a new hub queue or amend existing hub queues to forward the new QOS_BASELINE messages to the primary hub, just as is done for QOS_MESSAGE and QOS_DEFINITION messages.
To amend or augment existing hub queues, edit your existing post (or attach) queues and add the "QOS_BASELINE" subject. Alternatively,
create a new post (or attach) queue with just the "QOS_BASELINE" subject.
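A minimal sketch of such a post queue as it appears in the hub configuration file; the queue name qos_baseline is an assumption, and key names can vary slightly by hub version:

```
<queues>
   <qos_baseline>
      active = yes
      type = post
      subject = QOS_BASELINE
   </qos_baseline>
</queues>
```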
Verify QOS_BASELINE Messages are Detected After Installing ppm on a Secondary Hub
With CA UIM 8.2 and later, ppm must be deployed on every hub that directly, or through a subordinate robot, hosts a probe that will be configured
using Probe Provisioning. When you deploy ppm, baseline_engine and prediction_engine are automatically deployed to the same subordinate hub. Both must be deployed and running if you want to configure dynamic or Time To Threshold alarm and threshold settings for monitoring probes. Make sure the nas and alarm_enrichment probes are deployed to the same subordinate hub where ppm
is running if you want to configure Time Over Threshold alarm and threshold settings for monitoring probes.
With most UIM environments, secondary (or subordinate) hubs already have queues configured to pass QOS_MESSAGE and QOS_DEFINITION
messages to the data_engine probe on the primary hub. Verify that QOS_BASELINE messages were also added to the Subject field (on the Edit
Queue screen) so these messages are also passed to the data_engine probe on the primary hub. Otherwise, baseline data points for QoS
metrics are not generated.
Review the following example for instructions on adding QOS_BASELINE to the Subject field so these messages are passed to data_engine.
Example: Configure an Existing Queue
This example shows how to configure an existing hub queue in Infrastructure Manager.
Follow These Steps:
1. In Infrastructure Manager, double-click the hub probe to open its GUI.
2. In the Hub configuration GUI, under the Queues tab, select the queue that forwards messages with QOS_MESSAGE and QOS_DEFINITION subjects.
3. In the Subject field on the Edit Queue screen, add QOS_BASELINE.
4. Click OK and then Yes when prompted to enable changes. The hub refreshes its configuration (performs a soft restart).
More Information:
See Configuring Queues and Tunnels for details about setting up queues.
See baseline_engine Release Notes for details about which versions of baseline_engine support various versions of the ppm probe.
This article explains the commands you use to set baselines or thresholds for probes, and the procedure to configure the amount of time, in
weeks, to retain the baseline data.
Create Baselines and Thresholds for Probes Without the Web-based GUI
Use the following commands to set baselines or thresholds for probes.
Set up Baselines
To set up baselines for probes without using the web-based configuration, use the following command:
Set up Thresholds
To set up thresholds for probes without the web-based configuration use the following command:
probe (required): The path to the baseline_engine probe, for example /domain/hub/robot/baseline_engine.
threshType (required): The type of threshold, which is either static or dynamic.
Static: No alarms are sent until sufficient alarms meeting the time requirements have exceeded the threshold.
Dynamic: A dynamic threshold is calculated on variance from the calculated static baseline with no averaging. Variances can be set to one of the following algorithms:
id: The metric ID of the QoS for which thresholds are being defined.
type (required): The algorithm used to calculate the variance from the calculated baseline. Options are:
Scalar: A set value past the calculated baseline.
Percentage: A set percentage past the baseline.
Standard Deviation: A set standard deviation past the baseline.
o: One of the following operators:
L: less than
LE: less than or equal to
G: greater than
GE: greater than or equal to
EQ: is equal
NE: is not equal
operator1 <operator>: the information alarm threshold operator; operator1 takes precedence over <operator>.
operator2 <operator>: the warning alarm threshold operator; operator2 takes precedence over <operator>.
operator3 <operator>: the minor alarm threshold operator; operator3 takes precedence over <operator>.
operator4 <operator>: the major alarm threshold operator; operator4 takes precedence over <operator>.
operator5 <operator>: the critical alarm threshold operator; operator5 takes precedence over <operator>.
Note: If you specify one operator, you must specify all operators.
Note: The alarm threshold values are generated in the format 50.0 (for 50%). To generate an alarm, you must specify at least
one level alarm threshold value.
subsysId (required): The subsystem ID of the QoS for which the thresholds are being defined. Only one subsystem ID can be specified
using the subsysId option.
threshID (Optional): Unique ID which distinguishes between multiple thresholds of the same threshType and id (metric ID).
delete: Remove the threshold identified by the id (metric ID), threshType, and threshID.
customAlarmMessage: A custom alarm message generated as the alarm message when a threshold is breached. Variables include:
${baseline} - The baseline calculated for a QoS metric if the Compute Baseline and Dynamic Alarm options are selected for the metric.
Baselines are not calculated for static messages, so this value will always be zero for static alarms.
${level} - The numerical critical level of the alarm. Valid values are: 1 (critical), 2 (major), 3 (minor), 4 (warning), or 5 (information)
${operator} - The operator (>, >=, <, <=, ==, or !=) for the critical level of the alarm.
${qos_name} - The name of the QoS metric.
${source} - The source of the QoS metric that generated an alarm.
${target} - The target of the QoS metric that generated an alarm
${threshold} - Specifies the threshold upon which an alarm is generated.
${value} - Specifies the value contained in the generated QoS metric.
EXAMPLE: -customAlarmMessage ${qos_name} is at ${value}
customClearAlarmMessage: A custom alarm message generated when the alarm and the source of the alarm are returned to a normal
state. Variables include:
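As an illustration of how these options combine on one command line, a hypothetical invocation is sketched below. The utility name <threshold-utility> and all of the values shown are placeholders; only the option names come from the descriptions above:

```
<threshold-utility> -probe /domain/hub/robot/baseline_engine \
  -threshType dynamic -id 4.1 -type Percentage \
  -operator1 GE -level1 50.0 \
  -subsysId 1.1. \
  -customAlarmMessage "${qos_name} is at ${value}"
```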
Note: The baseline retention period can be set to a minimum of three weeks, or a maximum of twelve weeks. If you set the retention
period to an unsupported value, the baseline_engine probe will use the nearest supported value. For example, setting the retention
period to fifteen will result in an actual retention period of twelve weeks.
Important! If projectBaselines is set to True and you have already configured baselines, do not change this setting back to False. This causes the system to have two baselines for a period of one week in the future.
Important! Do not delete the projectionEnabled file stored in the root baseline_engine directory. Otherwise, your baseline
timestamps will be affected.
retentionPeriod
Sets the amount of time (in weeks) for the baseline to retain monitoring data. The range is between 3 and 12 weeks. The default is 4
weeks.
messagelimitlog
Sets the log level for the message limit sub-process.
useprevioushour
At startup, use the average for the previous hour.
alarmcheckers
Sets the number of threads used to check alarms.
Create Baselines and Thresholds for Probes Without the Web-based GUI
Use the following commands to set baselines or thresholds for probes.
To set up baselines for probes without using the web-based configuration, use the following command:
Note: If you specify one operator, you must specify all operators.
Note: The alarm threshold values are generated in the format 50.0 (for 50%). To generate an alarm, you must specify at least
one level alarm threshold value.
subsysId: The subsystem ID of the QoS for which the thresholds are being defined. Only one subsystem ID can be specified using the
subsysId option.
threshID (Optional): Unique ID which distinguishes between multiple thresholds of the same threshType and id (metric ID).
delete: Remove the threshold identified by the id (metric ID), threshType, and threshID.
customAlarmMessage: A custom alarm message generated as the alarm message when a threshold is breached. Variables include:
${baseline} - The baseline calculated for a QoS metric if the Compute Baseline and Dynamic Alarm options are selected for the metric.
Baselines are not calculated for static messages, so this value will always be zero for static alarms.
${level} - The numerical critical level of the alarm. Valid values are: 1 (critical), 2 (major), 3 (minor), 4 (warning), or 5 (information)
${operator} - The operator (>, >=, <, <=, ==, or !=) for the critical level of the alarm.
${qos_name} - The name of the QoS metric.
${source} - The source of the QoS metric that generated an alarm.
${target} - The target of the QoS metric that generated an alarm
${threshold} - Specifies the threshold upon which an alarm is generated.
${value} - Specifies the value contained in the generated QoS metric.
EXAMPLE: -customAlarmMessage ${qos_name} is at ${value}
customClearAlarmMessage: A custom alarm message generated when the alarm and the source of the alarm are returned to a normal
state. Variables include:
Note: The baseline retention period can be set to a minimum of three weeks, or a maximum of twelve weeks. If you set the retention
period to an unsupported value, the baseline_engine probe will use the nearest supported value. For example, setting the retention
period to fifteen will result in an actual retention period of twelve weeks.
projectBaselines
By default, this key-value is set to True and baselines are projected one week in the future to ensure that dynamic threshold alarms are consistent with the baseline. When this setting is True, the baseline_engine adjusts the timestamps of the baselines to one week in the future and sends duplicate baselines with unadjusted timestamps. This allows dynamic threshold configurations to be evaluated immediately without having to wait one week. The duplicate baselines are only generated for one week. If you do not want to use this projection behavior, change this setting to False before the baseline_engine calculates any baselines.
Important! If projectBaselines is set to True and you have already configured baselines, do not change this setting back to False. This causes the system to have two baselines for a period of one week in the future.
Important! Do not delete the projectionEnabled file stored in the root baseline_engine directory. Otherwise, your baseline
timestamps will be affected.
retentionPeriod
Sets the amount of time (in weeks) for the baseline to retain monitoring data. The range is between 3 and 12 weeks. The default is 4
weeks.
messagelimitlog
Sets the log level for the message limit sub-process.
Create Baselines and Thresholds for Probes Without the Web-based GUI
Change the Baseline Retention Period
Create Baselines and Thresholds for Probes Without the Web-based GUI
Use the following commands to set baselines or thresholds for probes.
To set up baselines for probes without using the web-based configuration, use the following command:
Note: The baseline retention period can be set to a minimum of three weeks, or a maximum of twelve weeks. If you set the retention
period to an unsupported value, the baseline_engine probe will use the nearest supported value. For example, setting the retention
period to fifteen will result in an actual retention period of twelve weeks.
projectBaselines
By default, this key-value is set to True and baselines are projected one week in the future to ensure that dynamic threshold alarms are
consistent with the baseline. When this setting is True, the baseline_engine adjusts the timestamps of the baselines to one week in the
future and sends duplicate baselines with unadjusted timestamps. This allows dynamic threshold configurations to be evaluated
immediately without having to wait one week. The duplicate baselines are only generated for one week. If you do not want to use this
projection behavior, change this setting to False before the baseline_engine calculates any baselines.
Important! If projectBaselines is set to True and you have already configured baselines, do not change this setting back to False. This causes the system to have two baselines for a period of one week in the future.
Important! Do not delete the projectionEnabled file stored in the root baseline_engine directory. Otherwise, your baseline
timestamps will be affected.
retentionPeriod
Sets the amount of time (in weeks) for the baseline to retain monitoring data. The range is between 3 and 12 weeks. The default is 4
weeks.
messagestorelog
Sets the log level for the messagestore sub-process
demarshalpdslog
Sets the log level for the process that disassembles PDS messages from the bus
metricrunner
Sets the log level for the metricrunner process
metricfactory
Sets the log level for operation of the metricfactory (limited to whether or not the metricfactory successfully started)
metriccaluculator
Sets the logging level for the metric calculator, which executes scripts that perform the baseline calculations.
predictiveAlarmSubject
Sets the alarm subject for a Time To Threshold alarm. The baseline_engine probe uses this subject setting to assist in routing predictive
alarms. If this setting is alarm, the baseline_engine probe sends predictive alarms properly. If this parameter is set to a value other than
alarm, the baseline_engine will be unable to properly route predictive alarms to an administrator.
Create Baselines and Thresholds for Probes Without the Web-based GUI
Change the Baseline Retention Period
Computing Baselines
Create Baselines and Thresholds for Probes Without the Web-based GUI
Use the following commands to set baselines or thresholds for probes.
To set up baselines for probes without using the web-based configuration, use the following command:
Dynamic: A dynamic threshold is calculated on variance from the calculated static baseline with no averaging. Variances can be set to
one of the following algorithms:
Scalar: A set value past the calculated baseline.
Percentage: A set percentage past the baseline.
Standard Deviation: A set standard deviation past the baseline.
o: One of the following operators:
L: less than
G: greater than
level1: Sets the level1 information alarm threshold value.
level2: Sets the level2 warning alarm threshold value.
level3: Sets the level3 minor alarm threshold value.
level4: Sets the level4 major alarm threshold value.
level5: Sets the level5 critical alarm threshold value.
subsysId: The subsystem ID of the QoS for which the thresholds are being defined. Only one subsystem ID can be specified using the
subsysId option.
queue: Indicates to send configurations over the BASELINE_CONFIG queue instead of using callbacks.
Note: The baseline retention period can be set to a minimum of three weeks, or a maximum of twelve weeks. If you set the retention
period to an unsupported value, the baseline_engine probe will use the nearest supported value. For example, setting the retention
period to fifteen will result in an actual retention period of twelve weeks.
Computing Baselines
Configuring baselines and thresholds for probes is a two-step process.
1. Configure baselines and thresholds for probes without the web-based GUI, or by using Infrastructure Manager.
2. Indicate that you want a baseline computed for a QoS monitoring probe by selecting the Compute Baseline and Publish Data options in
the probe's GUI. See Set Thresholds for more information.
Navigation: Setup
The setup folder contains the following configurable key-values:
logfile
Defines the log file name
loglevel
Sets the overall root log level (from 0 (minimum) to 5 (maximum))
scriptloglevel
Sets the log level for messages tracking operation of the scripts that perform the baseline calculations
messagelimitlog
Sets the log level for messages that show whether the maxmetrics limit is exceeded. This is essentially a binary setting: level=1
denotes "off" and level>=2 is equivalent to "on." If level=2, an error message is logged to the main log when maxmetrics is exceeded.
To see the metrics that are not processed/baselined, view the skipped_message.log file.
performance
Switches a lightweight performance monitor process on (true) or off (false). The performance monitor checks the rate of queuing and
calculation execution every minute and logs this information to performance.log.
isNative
Sets the ability to override the default baseline calculation interval for individual probes.
projection
For new installations of the baseline_engine v2.3 probe, this key-value is set to True by default and baselines are projected one week in
the future to ensure that dynamic threshold alarms are consistent with the baseline. When this setting is True, the baseline_engine
adjusts the timestamps of the baselines to one week in the future and sends duplicate baselines with unadjusted timestamps. This allows
dynamic threshold configurations to be evaluated immediately without having to wait one week. The duplicate baselines are only
generated for one week. If you do not want to use this projection behavior, change this setting to False before the baseline_engine
calculates any baselines.
For baseline_engine v2.2 and earlier, the projection key is set to False by default.
Note: If you install baseline_engine v2.3 on a hub where a previous version of baseline_engine was running, this key value defaults to
False.
Important! If the projection key is set to True and you have already configured baselines, do not change this setting back to False.
This causes the system to have two baselines for a period of one week in the future.
Important! Do not delete the projectionEnabled file stored in the root baseline_engine directory. Otherwise, your baseline timestamps
will be affected.
retentionPeriod
Sets the amount of time (in weeks) for the baseline to retain monitoring data. The range is between 3 and 12 weeks. The default is 4
weeks.
messagestorelog
Sets the log level for the messagestore sub-process
demarshalpdslog
Sets the log level for the process that disassembles PDS messages from the bus
metricrunner
Sets the log level for the metricrunner process
metricfactory
Sets the log level for operation of the metricfactory (limited to whether or not the metricfactory successfully started)
metriccalculator
Sets the logging level for the metric calculator, which executes scripts that perform the baseline calculations.
predictiveAlarmSubject
Sets the alarm subject for a Time To Threshold alarm. The baseline_engine probe uses this subject setting to assist in routing predictive
alarms. If this setting is alarm, the baseline_engine probe sends predictive alarms properly. If this parameter is set to a value other than
alarm, the baseline_engine will be unable to properly route predictive alarms to an administrator.
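The behavior of the projection key described above can be sketched as follows; the message structure and function name are illustrative assumptions, not the probe's internals:

```python
# Sketch of projection: each calculated baseline point is published with
# its timestamp shifted one week into the future, plus a duplicate with
# the unadjusted timestamp so dynamic thresholds can be evaluated
# immediately. Structure is illustrative only.

from datetime import datetime, timedelta

ONE_WEEK = timedelta(weeks=1)

def project_baseline(point):
    """point: dict with 'timestamp' and 'value'. Returns the pair of
    messages that would be published when projection is enabled."""
    projected = {"timestamp": point["timestamp"] + ONE_WEEK,
                 "value": point["value"]}
    duplicate = dict(point)  # unadjusted copy, sent for the first week
    return projected, duplicate

p, d = project_baseline({"timestamp": datetime(2024, 1, 1), "value": 42.0})
print(p["timestamp"])  # 2024-01-08 00:00:00
```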
Number of Metrics      Memory Allocation      Disk Allocation
Up to 10,000           1 GB                   100 MB
10,000 to 50,000       1.5 GB                 200 MB
50,000 to 100,000      2 GB                   400 MB
You can estimate the current memory usage for the baseline_engine probe by checking the performance log. Find the entry JVM Memory
currently allocated: and add 40% to this figure. If required, you can change the memory allocation using the java_mem_max setting in The Startup
Folder.
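The sizing rule above reduces to simple arithmetic; estimate_memory_mb is a hypothetical helper for illustration:

```python
# Take the "JVM Memory currently allocated" figure from the performance
# log and add 40% to estimate the memory the probe needs.

def estimate_memory_mb(currently_allocated_mb):
    return currently_allocated_mb * 1.4

# A probe currently allocating 1000 MB would be sized at 1400 MB
print(estimate_memory_mb(1000))  # 1400.0
```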
billing
The billing probe collects usage data from all the usage_metering probes in the environment and performs the calculations required to generate a
billing report.
After a calculation completes, the probe creates a billable items report, which is stored locally and in the database. If the webgtw probe is
enabled, webgtw automatically transfers the report to CA Sales. You also can export a report that contains summary and detail information.
This article provides an introduction to:
More information:
For information on the automated billing process, see the Set Up Automated Usage Metering and Billing article in the Managing section of the CA Unified Infrastructure Management wiki space.
See billing Release Notes
Billing Subscriptions
Prior to billing v8.0, each customer was assigned a billing subscription file that needed to reside with the billing probe. Once the subscription was
installed, a billing calculation could be performed to generate billing reports. With billing v8.0 and later, subscription files are included with the
probe and do not need to be installed separately.
Billing Reports
After a calculation completes, a billable items report containing summary and detail information can be exported from the Calculations tab. The
report is exported in HTML format.
The generated HTML report can be loaded into Excel to review the resulting data in a spreadsheet. With the data loaded in Excel, advanced
features such as filtering and pivot tables can be used to view the sections of the data of particular interest.
CA Sales Operations requires that customers generate and send their billing reports each month. Billing reports also can be generated on any
specific time interval for internal purposes.
billing Prerequisites
For the billing probe to work properly, the following prerequisites must be met:
All usage_metering probes in the environment must be the same release version as the billing probe.
An instance of usage_metering must reside on the same robot as the billing probe and must be configured as the Primary instance type.
See the usage_metering documentation for clarification on primary and secondary instances.
The probes must be deployed to a robot that is version 5.70 or higher and that has java v1.6.2 or higher installed on the system.
The subscription must be successfully configured in order for the billing probe to perform a calculation.
Note: The billing probe and the primary instance of usage_metering must be installed on the same robot. They are usually
deployed to the primary hub. For illustrations and additional information, see the section on deployment scenarios in the usage
1.
See the billing Release Notes for compatibility. When applicable, the usage_metering and billing probes installed to the environment
should be the same versions. If the same versions do not exist in the web archive, then the latest version of usage_metering should be
installed.
Configure a Subscription
The Subscription tab allows you to configure your subscription file.
NOTE: If this file currently exists in a probes/service/billing/ directory, then copy it to another location (such as your desktop)
and select it from this non-billing probe location. The configuration will not succeed if you select the file from the billing probe
directory. Click the OK button.
3. Click Import. You know the import succeeded when:
A message states that the probe has been successfully configured.
The Subscription ID field is populated with an encrypted string of characters that uniquely identifies a billing report for a specific
customer.
Check Billing Status
Click Export Subscription File to export the subscription to a read-only Excel file for viewing purposes. No changes can be made and stored
back into the billing probe.
Calculate Billing
To create and view billing calculations:
1. Navigate to Calculations in the billing probe configuration GUI.
2. Under Billing Calculation, enter a four-digit year and numerical month (1 for January, 2 for February, etc.) for the billing report of
interest.
3. Click Run. The billing calculation is performed in the background. Progress can be viewed in the LogViewer, which lets you view billing.log.
4. The Results section shows the first and last day of the report, as well as the active subscription. Click Refresh to recalculate.
To export an HTML calculation report containing summary and detail information:
Click Export. This report must be submitted to CA Sales Operations on a monthly basis.
To add these additional columns, modify the billing probe's config file setup section. The setup section follows this format:
<setup>
...
http_timeout_in_millis = 15000
nimrequest_timeout_in_mins = 15
report_generate_full_details_flag = false
</setup>
To configure additional reporting detail through Raw Configure, follow these steps:
1. Shift + right-click the billing probe and select Raw Configure.
2. Navigate to the setup/report_generate_full_details_flag section.
3. Edit the following key-value pair:
setup/report_generate_full_details_flag=true
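The Raw Configure edit above amounts to flipping one key in the setup section. A minimal sketch, assuming the simple key = value layout shown earlier and a hypothetical set_full_details helper:

```python
# Flip report_generate_full_details_flag to true in a setup section.
# Assumes the "key = value" layout shown in the documentation; this is
# an illustration of the edit, not a supported configuration tool.

def set_full_details(config_text):
    lines = []
    for line in config_text.splitlines():
        key = line.split("=")[0].strip()
        if key == "report_generate_full_details_flag":
            line = "report_generate_full_details_flag = true"
        lines.append(line)
    return "\n".join(lines)

setup = """<setup>
http_timeout_in_millis = 15000
nimrequest_timeout_in_mins = 15
report_generate_full_details_flag = false
</setup>"""
print("= true" in set_full_details(setup))  # True
```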
capman_da AC Configuration
This article describes how to configure the capman_da probe.
Verify Prerequisites
Configure the capman_da Probe
(Optional) Collect Metrics
(Optional) vmax Probe QoS Mappings
(Optional) Configure the Queue List
Verify Prerequisites
Verify that the required hardware and software is available, and any installation consideration is met before you configure the probe. For more
information, see the capman_da Release Notes.
iv. thrift_truststores
The complete path to the trust store that contains the certificate used for SSL communication.
v. thrift_trustore_passwords
The encrypted password string for the trust store that contains the certificate used for SSL communication. You must use the CCC
EncryptPassword utility to generate the encrypted password string.
vi. thrift_client_timeouts
The timeout value, in milliseconds, for connecting to the thrift server over secure communication.
b. data_output_setup
i. cleanup_archive_interval_in_days
The number of days that an archived file is saved before it is deleted from the ARCHIVE folder. Only files with a date prior to the
number of days specified are deleted. The default interval is set at three days. A minimum of one day is required.
ii. configuration_data_interval_time
The time when the resource configuration data is gathered from the source probes. The default time is set to 11:00 p.m.
iii. output_folder
A target folder (for example, the DM Data directory) that is automatically generated by the system and contains configuration and
metric data. This folder includes system-generated DATA and ARCHIVE sub-folders that contain probe-specific folders. Once the
data files are published successfully to the Data Manager, they are automatically moved to the ARCHIVE folder. The default folder
path is C:/CapMan_DA_Data/.
c. nis_api_connection
i. host
The host on which the nisapi_wasp package is deployed.
ii. port
The port for accessing the Web service. By default, the port is set to 8080.
3. Select the startup section. Verify or make changes to the options for JVM memory, language, and locale. The default is set at -Xms32m
-Xmx512m -Duser.language=en -Duser.country=US.
4. Expand the advanced_setup section to verify or make changes to the following parameters:
a. Select advanced_setup.
i. log_level
The log_level parameter maps to INFO, and the configurable range is 0 to 5.
ii. number_of_configuration_data_threads
Sets the size of the thread pool that processes the configuration data at the scheduled time. Set the size according to the number
of cores and processors. The acceptable range is 10 to 50. If you set the value to less than 10, the system uses 10. If you set the
value higher than 50, the system uses 50.
iii. inventory_data_collection
If set to true, inventory data is collected.
iv. performance_data_collection
If set to true, performance data is collected.
v. performance_data_multithreading
If set to true, multi-thread processing of performance data is enabled.
vi. maximum_csv_file_size_in_mb
Sets the maximum CSV file size before it rolls over to the next file.
b. Select data_queue_configuration. The probe uses a data queue to temporarily store the relevant QoS messages before they can be
processed. The following properties are related to configuring this queue.
i. data_queue_batch_size
The number of records that are pulled from the data queue for bulk processing. The default batch size is 1,000 records. A
minimum of 100 records is required.
ii. data_queue_capacity
The maximum number of elements held in the queue before new elements are dropped. The default queue size is 1,000,000 elements. A
minimum of 10,000 elements is required.
c. Select process_sleep_configuration.
i. cleanup_archive_thread_sleep_time_in_minutes
The amount of time, in minutes, the cleanup thread waits between each time it checks for old archive files to be deleted. The
default time is set at 30 minutes. A minimum of one minute is required.
d. Select thrift_upload_configuration
i. enable_thrift_upload
If set to true, data is uploaded to the Data Manager.
ii. execution_interval_in_minutes
Sets when the probe restarts the upload after the previous upload completes. The minimum time that can be set is one minute.
iii. first_start_delay_in_minutes
Sets when the probe starts the upload process for the first time. The default is 60 minutes. The minimum time is one minute.
The following settings are recommended:
In large environments (over 8,000 QoS messages/second), configure 120 minutes.
In medium environments (4,000 to 8,000 QoS messages/second), configure 60 minutes.
In small environments (1,000 to 4,000 QoS messages/second), configure 30 minutes.
iv. maximum_number_of_threads
Sets the maximum number of threads for thrift data upload. The acceptable range is 1 to 50 threads.
In large environments (over 8,000 QoS messages/second), the recommended setting is 10 threads.
e. Select inventory_data_configuration.
i. dynamic_multi_threading
If set to true, dynamic thread pooling is used for inventory data processing.
ii. fixed_thread_pool_size
Sets the thread pool size to use if dynamic_multi_threading is disabled. Acceptable values are 1 to 50 threads.
iii. maximum_dynamic_thread_pool_size
Sets the maximum number of threads in a thread pool if dynamic_multi_threading is enabled.
The following settings are recommended:
In large environments (over 8,000 QoS messages/second), configure 50 threads.
In medium environments (4,000 to 8,000 QoS messages/second), configure 30 threads.
In small environments (1,000 to 4,000 QoS messages/second), leave the default value of 20 threads.
f. nis_api_configuration
i. enable_metric_api_call
If set to true, the system makes a call to the metric API endpoint of the nisapi_wasp web service.
ii. loss_of_data_at_threshold
Manages data when the threshold is reached. If set to true, data collection stops when the threshold is met to prevent the probe
from running out of memory.
iii. maximum_number_of_metrics_per_metric_api_call
Sets the maximum number of metrics used in a single call to the nisapi_wasp API endpoint. The acceptable range is 1 to 1000
metrics.
iv. maximum_number_of_metrics_per_metric_definition_call
Sets the maximum number of metrics used in a single call to the nisapi_wasp API definition. The acceptable range is 1 to 1000
metrics.
v. metric_api_data_queue_threshold_percentage
Sets the threshold of the data queue size for metric API data calls. The acceptable range is 10 to 100 percent.
vi. metric_api_maximum_thread_pool_size
Sets the maximum number of possible threads used for metric API endpoint calls. The acceptable range is 1 to 50 threads.
vii. metric_definition_api_maximum_thread_pool_size
Sets the maximum number of possible threads used for metric definition endpoint calls. The acceptable range is 1 to 50 threads.
The following settings are recommended:
In large environments (over 8,000 QoS messages/second), configure 50 threads.
In medium environments (4,000 to 8,000 QoS messages/second), configure 25 threads.
In small environments (1,000 to 4,000 QoS messages/second), leave the default value of 10 threads.
viii. nis_api_client_object_pool_size
Sets the number of nisapi_wasp clients created in an object pool. The acceptable range is 1 to 60 clients.
The following settings are recommended:
In large environments (over 8,000 QoS messages/second), configure 60 clients.
In medium environments (4,000 to 8,000 QoS messages/second), configure 35 clients.
In small environments (1,000 to 4,000 QoS messages/second), leave the default value of 20 clients.
ix. use_collection_xp_endpoint
If set to true, the collection XP endpoint is used instead of the collection endpoint.
5. Select the generic_metric_mapping section to list the default metrics mappings that are used by CA Data Manager to collect the
metrics.
At the vmax metric level, the parameter matching_metric_type is set to 5:1. The additional_match key lists the out-of-box QoS mapping
parameters.
Since the matching_metric_type parameter is set to 5:1, you must enable the vmax probe monitors using the QoS mapping parameters.
If you disable the capman_da probe, it is recommended that you disable the queue to avoid accumulating messages.
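The archive handling described above (files are published, moved to the ARCHIVE folder, and deleted after cleanup_archive_interval_in_days) can be sketched as a single cleanup pass; the function and its parameters are illustrative, not the capman_da implementation:

```python
# Sketch of the cleanup cycle: a thread wakes periodically (every
# cleanup_archive_thread_sleep_time_in_minutes) and deletes ARCHIVE
# files older than cleanup_archive_interval_in_days. Illustrative only.

import os
import time

def cleanup_archive(archive_dir, interval_in_days=3, now=None):
    """Delete files whose modification time is before the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max(1, interval_in_days) * 86400  # minimum of one day
    removed = []
    for name in os.listdir(archive_dir):
        path = os.path.join(archive_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```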
More information:
casdgtw (CA ServiceDesk Gateway) Release Notes
Contents
casdgtw Node
Field Mapping Node
Configure a Node
Add Field Mapping Details
Delete Field Mapping Details
Advanced Configuration Settings
Enable Offline Management
Configure HTTPS CA Service Desk URL
Configure the subscribe_alarm_closure Key
Configure the subscribe_alarm_updates Key
The CA ServiceDesk Gateway probe is configured by defining the URL of the CASD application with the user account details for generating
incidents by the probe. You can also specify the UIM user to whom the alarm is assigned for creating the incident. The configuration details
specify the field mapping details for storing relevant alarm information in the incident.
casdgtw Node
The casdgtw node contains sections for enabling the communication between the probe and the CASD application.
This section contains configuration details specific to the CA ServiceDesk Gateway probe.
Navigation: casdgtw
Set or modify the following values as required:
casdgtw > Probe Information
This section provides information about the probe name, probe version, start time, and the probe vendor.
casdgtw > Server Configuration
This section lets you configure the URL of the CASD application and user credentials for the authorization purpose.
Server URL: defines the WSDL URL of the CA Service Desk application for retrieving the description of the Web Service. This Web
Service exposes methods for performing necessary operations on the CA Service Desk application. For example, http://<IP/Instance
Name>/axis/services/USD_R11_Webservice?WSDL
Username: defines the user name for logging in to the CA Service Desk application.
Password: defines the password for logging in to the CA Service Desk application.
Note: Use the Test option from the Actions drop-down list for verifying the connectivity between the probe and the CASD
application.
casdgtw > General Configuration
This section lets you configure the compatibility and connection settings between the probe and the CASD application.
Log Level: sets the level of details to be included in the log file.
Default: 0 - Fatal
NAS Address: defines the address of the local Alarm Server (NAS) in the /<Domain>/<Hub>/<Robot>/nas format where the probe is
deployed. The address is case-sensitive.
Service Desk User: defines the username of the UIM user. Whenever you assign an alarm to the given user, the probe initiates the
request to generate a new incident.
Service Desk Version: specifies the version of the CA Service Desk application to which the probe is connecting.
Default: v12.1
Check Interval (Minutes): defines the time interval (in minutes) after which the probe checks for closed incidents in the CA Service
Desk application for clearing the corresponding alarms. The recommended value is 5 minutes.
Default: 30
Date Format: specifies the Date and Time format code for storing the time value. This field ensures that the probe and the CA Service
Desk application are using the same Date and Time format.
Default: MM/dd/yyyy HH:mm:ss
Timezone: specifies the time zone code for storing the time value. The time zone must be same as the time zone of the CA Service
Desk application. This field ensures that the probe and the CA Service Desk application are using the same time zone.
Default: Asia/Calcutta
Incident Id Custom Field: specifies the custom field of the alarm (custom_1 to custom_5) to save the incident id of the corresponding
incident.
Default: custom_1
Owning System: specifies the way of updating the incident and alarm status. Select one of the following options:
Nimsoft Monitor: closes the CASD incident, when the alarm is acknowledged.
Service Desk: acknowledges the alarm when the CASD incident is closed.
Both: performs the bidirectional updates.
Default: Service Desk
Closed Ticket Status: specifies the CASD incident status, which is set when the corresponding alarm is acknowledged.
Enable Incident Activity Logging: updates the activity log (creating, updating, and closing the incident) in the CASD incident that is
based on the corresponding alarm in UIM.
Default: Not selected
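The effect of the Owning System options can be sketched as two event handlers; the handler names and callbacks are hypothetical, and only the direction of each update reflects the description above:

```python
# Which direction each Owning System option synchronizes status.
# Illustrative handlers, not the casdgtw probe's API.

def on_alarm_acknowledged(owning_system, close_incident):
    # Nimsoft Monitor (or Both): acknowledging the alarm closes the incident
    if owning_system in ("Nimsoft Monitor", "Both"):
        close_incident()

def on_incident_closed(owning_system, acknowledge_alarm):
    # Service Desk (or Both): closing the incident acknowledges the alarm
    if owning_system in ("Service Desk", "Both"):
        acknowledge_alarm()
```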
casdgtw > Configuration Item Status
The Configuration Item Status section lets you exclude configuration item statuses from displaying in a CASD incident. This list of
configuration item statuses represents retired or blocked configuration items. If the CASD incident has a status that is selected in the
Configuration Item list, that status is not displayed in the incident.
Note: The list of available Configuration Item Status is populated only after configuring the server details.
Note: The options available in this field depend on the Service Desk Version field of the General Configuration section.
Clear: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Clear.
Information: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Information.
Warning: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Warning.
Minor: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Minor.
Major: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Major.
Critical: specifies the Priority, Severity, or Urgency of the incident when the alarm severity is Critical.
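A severity mapping like the one configured here can be pictured as a simple lookup from alarm severity to the incident field value; the mapped values below are placeholders for illustration, not probe defaults:

```python
# Placeholder mapping of UIM alarm severity to the incident's
# Priority/Severity/Urgency value. The values are examples only.

SEVERITY_MAP = {
    "Clear":       "5",
    "Information": "4",
    "Warning":     "3",
    "Minor":       "3",
    "Major":       "2",
    "Critical":    "1",
}

def incident_field_for(alarm_severity):
    return SEVERITY_MAP[alarm_severity]

print(incident_field_for("Critical"))  # 1
```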
Set or modify the following values that are based on your requirement:
Field Mapping > Field Mapping
This section contains a Mapping table displaying the list of mapped fields and their associated values. The CA ServiceDesk Gateway
lets you map fields for three different scenarios:
On Alarm Create
On Alarm Update
On Alarm Close
Use the Delete button of the mapping table for removing the mapping details.
Note: Use the Options icon next to the Field Mapping node for adding the mapping details.
Configure a Node
This procedure provides the information to configure a particular section within a node.
Each section within the node lets you configure the properties of the probe. These properties are used for generating an incident in the CASD
application based on the UIM alarm.
Follow these steps:
1. Navigate to the section within a node that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured.
The probe is now ready for generating incidents in the CASD application.
Note: The service desk field list appears only if valid credentials are provided in the Server Configuration section of the casdgtw node.
4. Specify Alarm Field or define a Default Value of the selected Service Desk Field for the following three scenarios:
Alarm Create
Alarm Update
Alarm Close
5. Click Submit.
The mapping details are saved and displayed in the Mapping table of the Field Mapping node.
Note: You can map a field (an Alarm field or a Service Desk field) again for updating the field mapping details.
2. Use the Raw Configure option and navigate to the Setup section.
3. Define the certificate path in the certificatePath key.
Note: The probe supports only an absolute path that includes the certificate file name.
casdgtw Metrics
The CA ServiceDesk Gateway (casdgtw) probe does not generate any QoS. Therefore, there are no probe checkpoint metrics to be configured
for this probe.
More Information:
cassandra_monitor (Cassandra Monitoring) Release Notes
cassandra_monitor AC Configuration
Each Cassandra Monitoring probe is a local standalone probe. Any configuration changes must be made on all instances of the probe where the
change is relevant.
The following figure provides an overview of the process you must follow to configure a working probe.
Verify Prerequisites
Verify that the required software is available before you configure the probe.
Follow these steps:
1. Review the cassandra_monitor (Cassandra Monitoring) Release Notes for dependencies, requirements, and deployment information.
2. Install the probe on each Cassandra node that you want to monitor.
The probe configuration has a default resource that is defined with an identifier of CASSANDRA. You can change the default resource settings
according to your monitoring needs.
Follow these steps:
1. In the probe configuration GUI, select a resource node in the navigation pane. The resource settings appear in the details pane.
2. Make the appropriate changes.
The available settings are:
Identifier - The identifier of the Cassandra resource.
Active - Indicates if monitoring is on for the resource. The probe collects data at the specified interval when this option is active.
Interval (secs) - The frequency of data collection in seconds.
Alarm Message - The alarm message to be sent if the resource does not respond.
Collect System Metrics - Indicates if the probe collects system-level metrics. System metrics include metrics about the system, not
Cassandra processes. For example, system metrics collect QoS data on CPU usage and storage.
Publish to BDIM - Indicates if the probe sends data to Big Data Infrastructure Management.
3. Click Save at the top of the page. A Success message appears when the configuration change is complete.
Configure Monitoring
The CASSANDRA resource is associated with various subcomponents of the probe. You can add and configure monitors for these
subcomponents through the navigation pane. Click the node in the navigation pane to see the monitors that are associated with the probe. You
can configure the QoS measurements that you want to collect data for, and any alarms or events you want by modifying the appropriate fields.
Follow these steps:
1. Go to cassandra_monitor > host name > CASSANDRA in the navigation pane.
2. Click a node in the resource tree. If necessary, expand one or more nodes to view the monitors for the components included in the
resource.
The available monitors appear in a table on the details pane.
3. Select the monitor that you want to modify in the table.
4. Configure the monitor settings in the fields below the table.
5. Click Save at the top of the page.
A Success message appears when the configuration change is complete.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
cassandra_monitor Node
Verify the probe setup after you install the probe.
Navigation: cassandra_monitor
Probe Information
This section displays read-only information about the probe.
Probe Setup
This section allows you to change the probe log level.
Hostname Node
The hostname node is a container for the probe inventory. This node is only a container; no configuration information is displayed in the details
pane for this node.
Navigation: cassandra_monitor > hostname
Profile Node
The profile node typically includes a cluster node and a system node. You can modify some of the resource settings and can verify the resource
information.
Navigation: cassandra_monitor > hostname > CASSANDRA
Resource Setup
Fields to know:
Identifier
The identifier for the Cassandra resource profile
Active
Indicates if monitoring is active for the resource.
Interval (secs)
The frequency of data collection in seconds.
Alarm Message
The alarm message to be sent if the resource does not respond.
Cluster Node
A Cassandra cluster is defined in the Cassandra YAML file or another Cassandra cluster configuration facility. Each cluster is identified by a
hostname and contains a hierarchy of the various Cassandra subcomponents that represent the cluster. All of the cluster components contain
monitors that you can configure to collect QoS measurements. The probe can monitor the following cluster components:
Binary - This node contains binary-related metrics. For example, the number of dropped messages for binary operation.
Gossip - This node contains gossip-related metrics. For example, the number of pending tasks for gossip operation.
Incremental Backup - This node does not contain any metrics.
Keyspaces - This node is a container for the keyspace components.
NodeMetrics - This node contains most of the Cassandra-specific metrics.
Thrift - This node contains thrift-related metrics. For example, the number of thrift clients.
Navigation: cassandra_monitor > hostname > profile name > cluster name
Click the cluster node to view the components in the cluster. Expand the cluster node and navigate through the hierarchy as needed. To
configure monitors, select each node that you want to monitor. In the details pane, select the monitors that you want to configure in the table and
modify the appropriate settings. Each monitor allows you to specify a QoS measurement, conditions to be associated with the monitor, and what
actions to take when the conditions are met, such as raising an Alarm.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
A read-only field that describes the monitor.
Metric Type
Identifies a unique Id of the QoS.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value that is measured is used.
Delta Value (Current - Previous) -- The delta value that is calculated from the current and the previous measured sample is used.
Delta Per Second -- The delta value that is calculated from the measured samples within a second is used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. Select Publish Alarms to enable this
setting.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message pool.
Configure dynamic alarm thresholds following the instructions that are found in cassandra_monitor AC Configuration.
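The interaction of the Value Definition, Operator, and Threshold settings described above can be sketched in code. The following Python snippet is only an illustration of the documented behavior, not probe code; the function names and sample format are invented for this example:

```python
# Hypothetical sketch of how a monitor's Value Definition, Operator, and
# Threshold settings combine to decide whether an alarm condition exists.

def derive_value(samples, definition, count=4):
    """samples: list of (timestamp_seconds, measured_value), newest last."""
    if definition == "current":            # Current Value
        return samples[-1][1]
    if definition == "delta":              # Delta Value (Current - Previous)
        return samples[-1][1] - samples[-2][1]
    if definition == "delta_per_second":   # Delta Per Second
        (t0, v0), (t1, v1) = samples[-2], samples[-1]
        return (v1 - v0) / (t1 - t0)
    if definition == "average":            # Average Value Last n Samples
        recent = [v for _, v in samples[-count:]]
        return sum(recent) / len(recent)
    raise ValueError("unknown value definition: " + definition)

# The Operator compares the derived value against the Threshold.
OPERATORS = {
    ">=": lambda v, t: v >= t,
    "<=": lambda v, t: v <= t,
    "=": lambda v, t: v == t,
}

def alarm_condition(samples, definition, operator, threshold, count=4):
    return OPERATORS[operator](derive_value(samples, definition, count), threshold)
```

For example, with samples measured at 80, 85, 95, and 97, a ">= 90" threshold on the current value is an alarm condition, while the four-sample average (89.25) is not.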
System Node
The system node and its subcomponents contain many monitors for system-level QoS measurements. The probe can monitor the following
system components and services:
CPU
Memory
Network - This node is a container for all network interfaces on the system and TCP statistics. Network information is collected by searching
the system for all network interfaces. If the system contains multiple NICs, for example, you might see values such as eth1 and eth2.
StorageVolumes - This node is a container for all mounted filesystems on the system. Storage volume information is collected by
searching for mount points. If a user has /etc mounted from a separate partition, for example, you see /etc in the StorageVolumes
list. Network drives can also appear in the StorageVolumes container.
Navigation: cassandra_monitor > hostname > CASSANDRA > System
Click the system node to view the subcomponents. Expand the system node and navigate through the hierarchy as needed. To configure
monitors, select each node that you want to monitor. In the details pane, select the monitors that you want to configure in the table and modify
the appropriate settings. Each monitor allows you to specify a QoS measurement, conditions to be associated with the monitor, and what actions
to take when the conditions are met, such as raising an Alarm.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
A read-only field that describes the monitor.
Metric Type
Identifies a unique Id of the QoS.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value that is measured is used.
Delta Value (Current - Previous) -- The delta value that is calculated from the current and the previous measured sample is used.
Delta Per Second -- The delta value that is calculated from the samples that are measured within a second is used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. Select Publish Alarms to enable this
setting.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message pool.
Configure dynamic alarm thresholds following the instructions that are found in cassandra_monitor AC Configuration.
cassandra_monitor Metrics
The following tables list the metrics that you can collect with the probe. These metrics are configured in the probe configuration GUI.
Contents
Cluster Node
Monitor
Units
Description
Version Added
CassandraVersion
Number
v1.0
LiveNodes
Count
v1.0
DeadNodes
Count
v1.0
Monitor
Units
Description
Version Added
ProcessId
Float
v1.0
NonJVMHeapMemoryUsed
Megabytes
Non-Java virtual machine (JVM) heap memory that is used by the process
v1.0
NonJVMHeapMemoryCommitted
Megabytes
v1.0
JVMHeapMemoryUsed
Megabytes
v1.0
JVMHeapMemoryCommitted
Megabytes
v1.0
JVMHeapMemoryUsedPercent
Percent
v1.0
JVMHeapMemoryMax
Megabytes
v1.0
JVMThreadActiveCount
Count
v1.0
JVMThreadPeakCount
Count
v1.0
JVMThreadTotalStartedCount
Count
v1.0
JVMThreadDaemonCount
Count
v1.0
JVMTotalGcCount
Count
v1.0
JVMTotalGcTime
Milliseconds
v1.0
MemoryMajorFaults
Count
v1.0
MemoryMinorFaults
Count
Count of the page faults that did not cause disk I/O requests
v1.0
ResidentMemory
Bytes
v1.0
MemoryUsage
Bytes
v1.0
SharedMemory
Bytes
v1.0
MemoryPageFaults
Count
v1.0
CpuUsage
Percent
v1.0
CpuKernelTime
Milliseconds
v1.0
CpuTotalTime
Milliseconds
v1.0
CpuUserTime
Milliseconds
v1.0
CpuKernelTimePercent
Percent
v1.0
CpuUserTimePercent
Percent
v1.0
FileDescriptorsOpen
Count
v1.0
Binary Node
Monitor
Units
Description
Version Added
BinaryDroppedMessages
Count
v1.0
Gossip Node
Monitor
Units
Description
Version Added
GossipPendingTasks
Count
v1.0
Keyspace Nodes
Monitor
Units
Description
Version Added
CFTotalDiskSpaceUsed
Bytes
v1.0
CFMemTableDataSize
Bytes
Amount of space that is used for the memtable of the column family
v1.0
CFLiveStableCount
Count
Number of live sstables that are used for the column family
v1.0
CFBloomFilterDiskSpaceUsed
Bytes
Amount of space that is used for the bloom filter of the column family
v1.0
CFLIveDiskSpaceUsed
Bytes
Amount of live disk space that is used for the column family
v1.0
CFTotalWriteLatency
Microseconds
v1.0
CFRecentWriteLatency
Microseconds
Total amount of latency for the recent write in the column family
v1.0
CFRecentReadLatency
Microseconds
Total amount of latency for the recent read in the column family
v1.0
CFWriteCount
Count
v1.0
CFReadCount
Count
v1.0
CFPendingTasks
Count
v1.0
CFKeyspaceCacheHitRace
Number
v1.0
CFBloomFilterFalsePositives
Count
v1.0
CFBloomFilterRecentFalseRatio
Number
v1.0
NodeMetrics Node
Monitor
Units
Description
Version Added
NodeMsgQueueBinaryDroppedMessage
Count
v1.0
NodeMsgQueueCounterMutationDroppedMessage
Count
v1.0
NodeMsgQueueMutationDroppedMessage
Count
v1.0
NodeMsgQueuePageRangeDroppedMessage
Count
v1.0
NodeMsgQueueRangeSliceDroppedMessage
Count
v1.0
NodeMsgQueueReadDroppedMessage
Count
v1.0
NodeMsgQueueReadRepairDroppedMessage
Count
v1.0
NodeMsgQueueRequestResponseDroppedMessage
Count
v1.0
NodeMsgQueueTraceDroppedMessage
Count
v1.0
NodePendingTasksHintedHandOff
Count
v1.0
NodePendingTasksPendingRangeCalculator
Count
v1.0
NodePendingTasksAntiEntropyStage
Count
v1.0
NodePendingTasksValidatorExecutor
Count
v1.0
NodePendingTasksCacheCleanup
Count
v1.0
NodePendingTasksReadRepairStage
Count
v1.0
NodePendingTasksReadStage
Count
v1.0
NodePendingTasksRequestResponseStage
Count
v1.0
NodePendingTasksMutationStage
Count
v1.0
NodePendingTasksGossipStage
Count
v1.0
NodePendingTasksMigrationStage
Count
v1.0
NodePendingTasksInternalResponseStage
Count
v1.0
NodePendingTasksMiscStage
Count
v1.0
NodePendingTasksWrites
Count
v1.0
NodePendingTasksReads
Count
v1.0
NodePendingTasksCompactionExecutor
Count
v1.0
NodeOpenFileDescriptorCount
Count
v1.0
NodeMaxFileDescriptorCount
Count
v1.0
NodeSwapSpaceTotal
Bytes
v1.0
NodeSwapSpaceFree
Bytes
v1.0
NodePhysicalMemFree
Bytes
v1.0
NodePhysicalMemTotal
Bytes
v1.0
NodeVirtualMemCommitted
Bytes
v1.0
NodeTotalDiskSpaceUsed
Bytes
v1.0
NodeTotalMemTableSize
Bytes
v1.0
NodeRecentReadLatencyAverage
Microseconds
v1.0
NodeRecentWriteLatencyAverage
Microseconds
v1.0
NodeTotalWriteCount
Count
v1.0
NodeTotalReadCount
Count
v1.0
RangeOperationCount
Count
v1.0
TotalReadLatencyMicros
Microseconds
v1.0
TotalRangeLatencyMicros
Microseconds
v1.0
NodeOperation
RoleOperationMode
v1.0
STARTING = 0
NORMAL = 1
CLIENT = 2
JOINING = 3
LEAVING = 4
DECOMMISSIONED = 5
MOVING = 6
DRAINING = 7
DRAINED = 8
RELOCATING = 9
UNKNOWN = 10
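The state codes listed above map NodeOperation's numeric QoS value to an operational mode. A small lookup, shown here as an illustrative Python snippet rather than probe code, makes the mapping explicit:

```python
# Operational modes reported by the NodeOperation monitor, per the list above.
NODE_OPERATION_MODES = {
    0: "STARTING", 1: "NORMAL", 2: "CLIENT", 3: "JOINING", 4: "LEAVING",
    5: "DECOMMISSIONED", 6: "MOVING", 7: "DRAINING", 8: "DRAINED",
    9: "RELOCATING", 10: "UNKNOWN",
}

def node_operation_name(code):
    # Any code outside the documented range is treated as UNKNOWN.
    return NODE_OPERATION_MODES.get(code, "UNKNOWN")
```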
Thrift Node
Monitor
Units
Description
Version Added
ThriftClientCount
Count
v1.0
CPU Node
Monitor
Units
Description
Version Added
CPUIdleTime
Percent
v1.0
CPUIRQTime
Percent
v1.0
CPUNicePriority
Percent
v1.0
CPUSoftIrq
Percent
v1.0
CPUStolenTime
Percent
v1.0
CPUSystemLevel
Percent
v1.0
CPUUserLevel
Percent
v1.0
CPUIOWaitTime
Percent
v1.0
Memory Node
Monitor
Units
Description
Version Added
SystemMemoryFree
Bytes
v1.0
SystemMemoryUsed
Bytes
v1.0
SystemMemoryUsedPercent
Percent
v1.0
SystemSwapMemoryFree
Bytes
v1.0
SystemSwapMemoryUsed
Bytes
v1.0
TotalUsedSwapMemoryPercent
Percent
v1.0
Monitor
Units
Description
Version Added
NetIntfRxPacketsDropped
Count
v1.0
NetIntfRXPacketWithErrors
Count
v1.0
NetIntfRXPacketWithFrameErrors
Count
v1.0
NetIntfRXPacketsWithOverrunErrors
Count
v1.0
NetIntfPacketsRecieved
Count
v1.0
NetInterfaceSpeed
Bytes/Second
v1.0
NetIntfBytesTransmitted
Bytes
v1.0
NetIntfTXPacketsWithCarrierErrors
Count
v1.0
NetIntfTXPacketCollisions
Count
v1.0
NetIntfTXPacketsDropped
Count
v1.0
NetIntfTXPacketsErrors
Count
v1.0
NetIntfTXPackets
Count
v1.0
NetIntfBytesRecieved
Bytes
v1.0
NetIntfTXPacketsWithOverrunErrors
Count
v1.0
Monitor
Units
Description
Version Added
TCPPassiveOpens
Count
v1.0
TCPFailedConnectionAttempts
Count
v1.0
TCPRetransmittedSegments
Count
v1.0
TCPActiveOpens
Count
v1.0
TCPConnectionResets
Count
v1.0
TCPCurentlyEstablishedConnections
Count
v1.0
TCPResetsSent
Count
v1.0
TCPSegementsSent
Count
v1.0
TCPSegementsRecievedInErrors
Count
v1.0
TCPSegementsRecieved
Count
v1.0
StorageVolumes Nodes
Monitor
Units
Description
Version Added
DiskServiceTime
Milliseconds
Average service time in milliseconds for I/O requests that were issued to the device
v1.0
DiskWrites
Bytes
v1.0
DiskReads
Bytes
v1.0
DiskUsage
Percent
v1.0
DiskWritesCount
Count
v1.0
DiskReadsCount
Count
v1.0
FileSystemUsage
Kilobytes
v1.0
FileSystemFree
Kilobytes
v1.0
FileSystemCapacity
Kilobytes
v1.0
Right-clicking in the pane opens a pop-up menu, giving you the following possibilities:
New Host
Available only when a group is selected.
Opens the profile dialog, enabling you to define a new host to be monitored.
New Group
Available only when a group is selected.
Opens the profile dialog, enabling you to define a new group. Use the group folders to place the hosts in logical groups.
New
Opens the profile dialog, enabling you to define a new host to be monitored.
Edit
Available only when a host is selected.
Opens the profile dialog for the selected host, enabling you to modify the properties for the host.
Rename
Lets you rename the selected group or host. Note that you are not allowed to rename the group Default.
Delete
Lets you delete the selected host or group. Note that you are not allowed to delete the group Default.
Refresh
Refreshes the view to reflect the current values of the objects listed in the right window pane.
Note: When you attempt to refresh a host that does not respond, the checkpoint description fields in the right pane appear
in red letters.
Reload
Retrieves updated configuration information from the selected agent.
Information
Available only when a host is selected.
Opens an informational window, containing system and configuration information about the selected host.
Note that the Enable Monitoring option in the properties dialog (opened by right-clicking the checkpoint and selecting Edit) for
the checkpoint must be ticked before this option works.
Deactivate
Deactivates the selected checkpoint (if activated), and the probe will stop monitoring the checkpoint.
Monitor
Opens the monitor window for the selected checkpoint, showing the values recorded since the probe was started.
Note: The horizontal red line in the graph indicates the alarm threshold (in this case 90 %) defined for the checkpoint.
When you click and hold the left mouse button inside the graph, a red vertical line appears. If you continue to hold the left mouse button
down and move the cursor, you can read the exact value at different points in the graph. The value is displayed in the upper part of the
graph in the format: <Day> <Time> <Value>.
Right-clicking inside the monitor window lets you select the backlog (the time range shown in the monitor window). In addition, the
right-click menu lets you select the option Show Average. This adds a horizontal blue line in the graph, representing the average sample
value.
Note the status bar at the bottom of the monitor window. The following information can be found:
The number of samples since the probe was started.
The minimum value measured.
The average value measured.
The maximum value measured.
The backlog
This is the time range shown in the monitor window. The backlog can be set to 6, 12, 24 or 48 hours by right-clicking inside the
graph. Note that the graph cannot show a time range that is greater than the selected Sample Period setting in the Setup section of
the GUI.
View Summary
This option is enabled only when CallManager is selected in the left pane.
Opens the Call Manager Summary report window (see the section ) for the host.
The tool buttons
The configuration tool also contains a row of tool buttons:
Clicking the General Setup button opens the General Setup dialog.
Field
Description
General
Log-level
SNMP Community String
SNMP request timeout
Select the timeout value for the SNMP requests. Use the default value, or select another value (in seconds) using the slider.
Show SNMP Error
In the case of an SNMP error, a general error message appears on the screen.
If this option is checked, you will get a more detailed error message, describing the specific error that occurred.
Advanced
Check Interval
Select the Check interval from the drop-down list. Valid choices are:
1 minute
5 minutes
15 minutes
30 minutes
1 hour
Check interval is the time between each cycle in which the probe collects the monitor data and writes it to the database.
Sample period
Select the Sample period from the drop-down list. Valid choices are:
6 hours
12 hours
24 hours
48 hours
Sample period is the time range within which the monitored values are picked when calculating the average value (used when
setting the alarm threshold). A Sample period of 24 hours means that the samples stored within the last 24 hours will be used.
Max number of threads
Specifies the maximum number of profiles the probe can run simultaneously. The valid range is 0 - 100.
You may show the database status for the monitored hosts by clicking the Show Database Status tool button. The database status for the
monitored hosts will be displayed in the right pane.
The list shows the first and the last date QoS data has been recorded and written to the database, and also the number of records in the period.
The probe includes a set of predefined checkpoints available on most hosts running the CCM software, but you may also define your own objects
to be monitored. When you click this button, previously specified user objects are listed in the right pane.
Right-click in the right pane and select New to open the User Object dialog, enabling you to define a new object. This lets you monitor
objects that are not part of the Cisco Monitor standard objects but are available from the Windows performance system.
When you click the Apply button in the probe GUI, the new user object is added to the list of user-defined objects and also under the User
Objects sub-node under each host listed in the left pane. After the probe restarts, it can take a couple of minutes before the probe can
show the measured values for the new user-defined object.
Field
Description
Login Profile
Select one of the defined hosts from the drop-down list as a Login Profile.
Performance Object Selector
Object Name
Select the name of the new object from the drop-down list. These are objects that are not a part of the Cisco
Monitor standard objects.
Counter Name
Select the counter name from the drop-down list (e.g. Disk Transfers/sec. if object LogicalDisk is selected).
Instance Name
Select the instance name from the drop-down list (e.g. disk drive D: if object LogicalDisk and counter Disk
Transfers/sec is selected).
Object Preferences
Description
Unit
Specify the unit of the QoS value, such as pages/sec, calls, connections, %.
You may create a new group folder by clicking the New Group Folder tool button and giving the new folder a name.
Creating a new host profile
You may create a new host profile by selecting the group it should belong to and click the New Host Profile tool button.
A dialog box appears and prompts you for the hostname or IP address of the Cisco device to monitor and some additional parameters. See
the section .
Launching the Message Pool Manager
The Message Pool Manager can be opened by clicking the Message Pool button in the Tool bar.
The alarm messages for each alarm situation are stored in the Message Pool. Using the Message Pool Manager, you can customize the alarm
text, and you may also create your own messages.
Note that variable expansion in the message text is supported. If you type a $ in the Alarm text field, a dialog pops up, offering a set of
variables to choose from.
Showing the phone table
You can open the phone table for the selected host by clicking the Show Phone Table button in the Tool bar.
Opens the Phone Table window containing all telephony devices handled by the selected host.
Showing the CTI device table
You can open the CTI device table for the selected host by clicking the Show CTI Device Table button in the Tool bar.
This table lists all CTI devices found on the selected host. CTI (Computer-Telephony Integration) is the integration of telephony services with
computers, such as IVR (Interactive Voice Response) applications.
View Call Summary
You can open the CallManager Summary Report for a host by clicking the View Call Summary button in the Tool bar.
Clicking this button hides checkpoints that are not available at the selected host from the list.
Clicking the button again will make the unavailable checkpoints appear in the list again.
The CallManager Summary Report window opens. The window contains a graph showing all calls for the period you select from the drop-down
lists.
Calls completed appear in red, and calls attempted in blue. This information is fetched from the summary database.
When you click and hold the left mouse button and move the cursor, you can read the exact value at different points in the graph. The
value is displayed in the upper part of the graph in the format: <Day> <Time> <Attempted> <Completed>.
Field
Description
Description
This field allows you to create your own description of the monitoring object (checkpoint). If desired, replace the original
description text with a description of your own choice.
Monitoring object
This is the name of the monitoring object (checkpoint) and cannot be changed.
Value definition
Decides which value is to be compared with the Threshold value defined below. The options are:
Current value
Average value
Delta value (current-previous)
Current value
Uses the last measured value to be compared with the Threshold value defined below.
Average value
Computes the average of the values measured in the time interval selected from the drop-down list.
You may select one of the predefined intervals from the drop-down list or you may type your own value in the field.
Delta value
(current-previous)
The delta value (current - previous). This means that the delta value calculated from the current and the previous
measured sample will be used.
Enable Monitoring
Operator
Select from the drop-down list the operator to be used when setting the alarm threshold for the measured value.
Example:
>= 90 means an alarm condition if the measured value (current sample or average value - see above) is 90 or above.
= 90 means an alarm condition if the measured value (current sample or average value - see above) is exactly 90.
Threshold Value
The alarm threshold value. An alarm is raised if this value is breached (see the value definition and operator above).
Note: For some checkpoints, the threshold is not a value, but a condition. The threshold must then be selected as a fixed
condition/status from a drop-down list. The Unit field (see below) does not apply in that case and is hidden.
Unit
Message Token
Select the alarm message to be issued if the specified threshold value is breached. These messages are kept in the
message pool. The messages can be modified in the Message Pool Manager.
Publish Quality of
Service (QoS)
Select this option if you want QoS messages to be issued on the checkpoint.
A dialog box appears and prompts you for the hostname or IP address of the Cisco device to monitor and some additional parameters.
Field
Description
Hostname or IP address
Active
Use this option to activate or deactivate monitoring of the checkpoints on the host. Note that you may also
activate/deactivate monitoring of the various checkpoints individually.
Group
Select from the drop-down list which group folder you want to put the host. Use the group folders to place the hosts in
logical groups.
Alarm Message
The alarm message to be issued if the host doesn't respond. Using the Message pool, you can edit this message or add
other messages.
Description
Windows Authentication
User name and password
The login properties (a valid user name and password with administrator privileges) on the monitored host.
Domain
If the monitored host is on another domain, use a domain user name and password (see above) and specify the domain
name here.
If the monitored host is on the same domain as the computer hosting the probe, you may use a local user name and
password on the monitored host and leave this field blank.
Test
SNMP Authentication
Community/password
Select the SNMP community string to be used from the drop-down list (these community strings must have been created
on the monitored device).
Test
cdm AC Configuration
This article describes the configuration concepts and procedures to set up the cdm probe. You can configure the probe to monitor CPU, disk or
memory of the system on which the probe is deployed.
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Contents
Verify Prerequisites
Configure Disk Monitoring
Configure CPU Monitoring
Configure Memory Monitoring
Configure Network Disk State Monitoring on Windows
Configure Network Disk State Monitoring on Linux and Solaris
Alarm Thresholds
Edit the Configuration File Using Raw Configure
Using Regular Expressions
Verify Prerequisites
Verify that required hardware and software is available and any installation consideration is met before you configure the probe. For more
information, see cdm (CPU, Disk, Memory Performance Monitoring) Release Notes.
Note: If the monitored environment also includes cluster disks, these disks are also included in the diskname node with the
same alarm configurations as local disks. However, for such environments, a Cluster section is displayed in the cdm node
that enables you to view and modify alarm and QoS sources for the cluster resources.
3. Expand the Disk Usage node. Enable the alarms (options are MB or Percentage) and QoS metrics in the Alarm Thresholds and Disk
Usage sections.
4. Navigate to the Disk Configuration section under the Disks node. Specify the monitoring interval, in minutes, when the probe requests
the disk data.
5. Save the configuration.
The selected disk is now being monitored.
The cdm probe allows you to monitor the availability state and usage of network disks on Windows robots.
Follow these steps:
1. Click the Options icon next to the Disks node.
2. Click Add New Share to add a network disk.
The Add New Share window opens.
3. Specify the path of the network resource and access credentials.
4. Select the Enable Folder Availability Monitoring checkbox to enable alarms for the availability state of the network disk.
You can also skip this step and enable the alarms after you create the shared network disk in the probe.
5. Click Submit.
A success message is displayed if the probe is able to connect to the network disk using the specified credentials.
6. Click the shared disk name node of the network disk.
7. Select the Enable Space Monitoring checkbox to enable the alarms for the disk usage metrics of the selected network disk.
8. Save the configuration to start generating the alarms.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Ids of a shared drive and a local drive identical. Navigate to the setup folder and set this key
to No to enable this feature.
Default: Yes
This feature was introduced for the following two reasons:
When a file system is mounted on Linux through cifs and the cdm probe is freshly deployed, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Ids and Metric Ids.
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to yes to enable this feature.
Default: No
This key has been introduced in Fixed default section under disk for Windows, Linux and AIX platforms.
sliced_cpu_interval: avoids the delay caused by the probe while executing commands (such as SAR) that cause internal alarms.
Navigate to cpu folder and specify a value (in seconds).
sliced_memory_interval: avoids the delay caused by the probe for collecting data from commands (such as VMSTAT) that cause internal
alarms. Navigate to memory folder and specify a value (in seconds).
The following example explains the use of sliced_memory_interval key.
Configure the interval on the Setup tab of Memory as 5 minutes. If you configure the value of the sliced_memory_interval key as 10 seconds,
the probe collects data through the VMSTAT command for 290 seconds (that is, the 5-minute interval minus 10 seconds). Thus, the probe does
not generate internal alarms and seamlessly generates QoS. If the problem still persists, you can increase the sliced_memory_interval by a
few more seconds.
The sliced_cpu_interval and sliced_memory_interval keys have been introduced for AIX platform in the Raw Configuration section.
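The arithmetic from the sliced_memory_interval example can be verified with a quick sketch (plain Python, using the values from the example above):

```python
# Effective VMSTAT collection window when sliced_memory_interval is set:
# the configured memory interval minus the sliced interval.
memory_interval_secs = 5 * 60        # interval configured on the Memory Setup tab
sliced_memory_interval = 10          # raw-configure key value, in seconds

collection_window = memory_interval_secs - sliced_memory_interval
print(collection_window)             # 290 seconds, matching the example
```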
Note: On UNIX platforms, use the regular expression "/\" to exclude the root directory (/) from monitoring.
A regular expression (regex for short) is a special text string that describes a search pattern. Constructing regular expressions and pattern
matching requires meta characters. The probe supports Perl Compatible Regular Expressions (PCRE), which are enclosed within forward slashes (/).
For example, the expression /[0-9A-C]/ matches any character in the range 0 to 9 or A to C in the target string.
You can also use simple text with some wildcard operators for matching the target string. For example, the *test* expression matches the text
test in the target string.
The following table describes some examples of regex and pattern matching for the cdm probe.
Regular
expression
Type of regular
expression
Explanation
[A-Z]
Standard (PCRE)
[A-Z:\\]
Custom
Matches with the Uppercase character type of the local disk available on the respective
box
Standard (PCRE)
[*.\\]
Custom
Standard (PCRE)
\d*
Custom
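The two matching styles in the table above (standard PCRE enclosed in slashes versus custom wildcard text) can be sketched as follows. This Python snippet is a hypothetical illustration of the documented behavior, not probe code; the helper name is invented:

```python
import re

def matches(pattern, target):
    """Match a /PCRE/ pattern or a simple *wildcard* expression against target."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        # Standard style: strip the enclosing slashes and use the regex as-is.
        return re.search(pattern[1:-1], target) is not None
    # Custom style: treat * as "any characters" and escape everything else.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.search(regex, target) is not None
```

For example, matches("/[0-9A-C]/", "disk B") is true because B falls inside the character class, and matches("*test*", "my test string") is true because the wildcard expression finds the text test.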
cdm node
<hostname> node
Disks node
<diskname> node
Disk Usage node
Disk Usage Change node
<diskname> Inode Usage node
<shareddiskname> node
Memory node
Memory Paging node
Physical Memory node
Swap Memory node
Network node
Processor node
Individual CPU node
Total CPU node
Iostat node (Linux, Solaris, and AIX)
Device Iostat node
cdm node
Navigation: cdm
This node lets you view the probe information, configure the logging properties and set data management values.
Set or modify the following values, as needed:
cdm > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cdm > General Configuration
This section provides general configuration details.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0 - Fatal
Log size (KB): specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Default: 100 KB
Send alarm on each sample: If selected, the probe generates an alarm on each sample where there is a threshold breach. If not
selected, the probe waits for the number of samples (specified in Samples in the cdm > Disk Configuration, cdm > Memory or cdm >
Processor configuration screens) before sending the alarm. The sample count is cleared on de-activation of the probe.
Send short name for QoS source: If selected, sends only the host name. If not selected, sends the full host name with domain.
Allow QoS source as target: A number of QoS messages, by default, use the host name as their target. If selected, the target name is
changed to be the same as the QoS source name.
Monitor iostat (Linux, AIX and Solaris only): Enables the iostat monitoring of the host system devices.
Count Buffer-Cache as Used Memory (Linux, Solaris, AIX and HP-UX only): Counts the buffer and cache memory as used memory
while monitoring the physical and system memory utilization. If not selected, the buffer and cache memory is counted as free memory.
Calculate Load Average Per Processor: For all Unix systems, the system load measures the computational work that the system is
performing. This means that if your system has a load of 4, four running processes are either using or waiting for the CPU. Load average
refers to the average of the computer's load over several periods of time. This option enables you to calculate the load average per
processor and is available for Linux, Solaris, AIX and HP-UX platforms.
Default: Selected
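The per-processor calculation described above is the system load divided by the number of processors; a brief sketch (hypothetical Python, not probe code):

```python
import os

def load_average_per_processor(load_avg, cpu_count=None):
    # A load of 4 on a 4-CPU system means roughly one runnable process per CPU.
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return load_avg / cpu_count

print(load_average_per_processor(4.0, cpu_count=4))  # 1.0
```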
cdm > Cluster
This section is only visible when monitoring clustered environments and displays the cluster resources associated to the monitored system.
The following fields are displayed for each resource:
Virtual Group: displays the resource group of the cluster where the host system of the robot is a node.
Cluster Name: displays the name of the cluster.
Cluster IP: displays the IP address of the cluster. This is used as the default source for alarm and QoS.
Alarm Source: defines the source of the alarms to be generated by the probe for cluster resources.
QoS Source: defines the source of the QoS to be monitored by the probe for cluster resources.
Note: The Alarm Source and QoS source fields can have the following values:
<cluster ip>
<cluster name>
<cluster name>.<group name>
The default value for both fields is <cluster ip>.
<hostname> node
Navigation: cdm > host name
This section lets you configure computer uptime, QoS data and system reboot alarms.
Disks node
Navigation: cdm > Disks
The Disks node lets you configure the global monitoring metrics and default attribute values for each individual disk. The node also includes the
shared drives of the host system. For example, cifs is a shared Windows disk that is mounted on the Linux environment, and gfs is a
shared disk of a clustered environment.
cdm > Disks > Disk Configuration
This section lets you configure the time interval and number of samples for fetching metric values from the system. These properties
apply to all the monitored disks of the system.
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data.
Default: 15
Samples: specifies how many samples the probe is keeping in memory for calculating average and threshold values.
Default: 4
Note: If the Send alarm on each sample option is not selected, the probe waits for the specified number of samples and then sends the
alarm. Even if you set the sample value to 0, the QoS for disks is generated based on the default sample value.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the disk utilization reported is the average for the last 25 minutes.
Default: 1
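The sampling behavior can be sketched as a window of recent readings (the readings below are made up):

```python
from collections import deque

class SampleWindow:
    """Keep the last `samples` readings and average them for threshold
    and QoS calculations. A sketch of the documented behavior: with
    Interval = 5 minutes and Samples = 5, the reported utilization is
    the average of the last 25 minutes of data."""
    def __init__(self, samples):
        self.window = deque(maxlen=samples)

    def add(self, value):
        self.window.append(value)

    def average(self):
        return sum(self.window) / len(self.window)

w = SampleWindow(samples=5)
for reading in (60, 70, 80, 90, 100):   # one reading per 5-minute interval
    w.add(reading)
print(w.average())  # 80.0
```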
Ignore Filesystems: defines the file system to be excluded from monitoring. For example, specifying the regular expression C:\\ in this
field excludes the Disk C of the system from monitoring and also stops displaying the disk in navigation pane.
Note: On UNIX platforms, use the regular expression "/\" to exclude the root directory (/) from monitoring.
Timeout: specifies the time limit in seconds for the probe to collect disk-related data. This option is useful when a disk failure or crash
leaves a stale file system, and prevents the probe from hanging. Default: 10
Filesystem Type Filter: specifies the type of the file system to be monitored as a regular expression. For example, specifying ext* in this
field enables monitoring of only file systems such as ext3 or ext4.
Note: The first three fields are common to Memory and Processor configuration sections.
Note: The disk read, write, and total throughput monitoring is supported only on the Windows, Linux, and AIX platforms.
Important! You can only add shared disks to be monitored in Windows robots.
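The two regular-expression filters above (Ignore Filesystems and Filesystem Type Filter) can be illustrated with a small sketch; the disk list and patterns below are hypothetical:

```python
import re

# Hypothetical settings mirroring the fields described above.
ignore_filesystems = re.compile(r"C:\\")   # exclude Disk C on Windows
fstype_filter = re.compile(r"ext.*")       # monitor only ext* file systems

disks = [
    ("C:\\", "ntfs"),
    ("/", "ext4"),
    ("/data", "xfs"),
]

# A disk is monitored only if it is not ignored and its type matches.
monitored = [
    (mount, fstype)
    for mount, fstype in disks
    if not ignore_filesystems.search(mount) and fstype_filter.match(fstype)
]
print(monitored)  # [('/', 'ext4')]
```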
<diskname> node
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual local or cluster disk.
Disk Missing: configures QoS for disk availability status and generates an alarm when the probe fails to connect to the disk.
Disk Size: configures QoS for disk size and generates an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
Note: The alarms are generated for free disk space and QoS are generated for disk usage.
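The high-then-low evaluation order described above can be sketched as follows (the threshold values are hypothetical; both are free-space values in Mbytes, with the error threshold the lower of the two):

```python
def free_space_alarm(free_mb, high_mb=500, low_mb=1000):
    """Sketch of the documented evaluation order (hypothetical values).

    An alarm is sent when free space falls below a threshold. The high
    (error) threshold is evaluated first; only if it is not breached is
    the low (warning) threshold evaluated.
    """
    if free_mb < high_mb:
        return "error"    # high threshold breached
    if free_mb < low_mb:
        return "warning"  # low threshold breached
    return None

print(free_space_alarm(400))   # error
print(free_space_alarm(800))   # warning
print(free_space_alarm(2000))  # None
```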
<shareddiskname> node
Navigation: cdm > host name > Disks > shared disk name
A shared network disk is added under the Disks node in the navigation pane. You can select the shared disk and update user name, password,
and disk availability monitoring properties.
Enable Space Monitoring: This section allows you to enable network disk usage monitoring for the profile by selecting the Enable Space
Monitoring checkbox.
Network Connection: This section allows you to view or edit the user credentials for the shared network disk specified in the Add New Share
window while creating a network disk monitoring profile.
Shared Folder Availability: This section allows you to specify or edit the thresholds and alarms for the availability state of the network disk.
Note: The Disk Usage and the Disk Usage Change nodes for the <shareddiskname> node are the same as defined for the
<diskname> node.
Memory node
Navigation: cdm > Memory > Memory Configuration
At the Memory level, set or modify the following global memory attributes based on your requirements.
The fields are common to all three probe configuration sections (Disks, Memory, Processor).
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data.
Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values.
Default: 5
If you did not select the Send alarm on each sample check box in the Probe Configuration details pane, the probe waits for the number of
samples (specified in this field) before sending the alarm. Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, If the interval is
set to 5 minutes and number of samples is set to 5, the memory utilization reported will be the average for the last 25 minutes.
Default: 1
Set QoS Target as 'Memory': sets the QoS target to Memory.
Default: Not selected
Network node
Navigation: cdm > Network
This node lets you monitor the outbound and inbound traffic of your system Network Interface Card (NIC). The NIC monitoring lets you analyze
the network bandwidth that is being utilized which can impact the overall network performance. For example, your NIC capacity is 100 MBPS and
aggregated traffic is more than 90 MBPS then it can slow down the data transfer rate. This monitoring helps you take preventive actions before
the network goes down. For example, upgrade your NIC or install more NICs and implement the load-balancing solution.
This node lets you monitor the following network metrics:
Inbound Traffic: Monitors the traffic coming from LAN or a public network to the monitored system in bytes per second.
Outbound Traffic: Monitors the traffic going from the monitored system to LAN or a public network in bytes per second.
Aggregated Traffic: Monitors both inbound and outbound traffic in bytes per second.
Important! The probe monitors only the physical NICs of the system and sums up the metric values when multiple NICs are installed on the
monitored system.
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
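As an illustration of how such rates are derived from NIC byte counters (the counter values and interface names are made up, and the probe's internal method may differ):

```python
def traffic_bytes_per_second(prev_counters, curr_counters, interval_s):
    """Derive inbound, outbound, and aggregated traffic in bytes/second
    from two snapshots of per-NIC byte counters. Only physical NICs are
    summed, mirroring the note above."""
    rx = sum(c["rx"] for c in curr_counters.values()) - \
         sum(c["rx"] for c in prev_counters.values())
    tx = sum(c["tx"] for c in curr_counters.values()) - \
         sum(c["tx"] for c in prev_counters.values())
    inbound = rx / interval_s
    outbound = tx / interval_s
    return inbound, outbound, inbound + outbound

prev = {"eth0": {"rx": 1_000_000, "tx": 500_000},
        "eth1": {"rx": 2_000_000, "tx": 800_000}}
curr = {"eth0": {"rx": 1_600_000, "tx": 700_000},
        "eth1": {"rx": 2_400_000, "tx": 900_000}}
print(traffic_bytes_per_second(prev, curr, 10))  # (100000.0, 30000.0, 130000.0)
```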
Processor node
Navigation: cdm > Processor
The Processor node lets you configure processor-related metrics and their corresponding time interval for fetching the monitoring data. The probe
lets you configure the number of samples and returns the average of the computed values. All calculations are based on the number of CPU ticks
returned, for example, the values that /proc/stat returns on Linux. The probe adds the column values (user, nice, system, idle, and iowait) to
calculate the total CPU ticks. In a multi-CPU environment, the column values for all CPUs are added.
Similarly, the delta values are calculated by comparing the total CPU tick values of the last and current intervals. Then, the percentage values are
calculated for each column based on the total CPU ticks value. The QoS for the total CPU value is the sum of CPU System, CPU User, and (if
configured) CPU Wait.
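The tick arithmetic above can be sketched as follows (Python with made-up /proc/stat samples; the probe itself is not written in Python):

```python
def cpu_percentages(prev_line, curr_line):
    """Compute per-column CPU percentages from two /proc/stat 'cpu' lines:
    sum user, nice, system, idle, and iowait for the total ticks, take
    deltas between intervals, then express each column as a percentage
    of the delta total."""
    fields = ("user", "nice", "system", "idle", "iowait")
    prev = dict(zip(fields, map(int, prev_line.split()[1:6])))
    curr = dict(zip(fields, map(int, curr_line.split()[1:6])))
    delta = {k: curr[k] - prev[k] for k in fields}
    total = sum(delta.values())
    pct = {k: 100.0 * v / total for k, v in delta.items()}
    # QoS total CPU = system + user (+ wait, if configured)
    pct["total"] = pct["system"] + pct["user"] + pct["iowait"]
    return pct

prev = "cpu  100 0 50 800 50"
curr = "cpu  200 0 100 1600 100"
pct = cpu_percentages(prev, curr)
print(round(pct["total"], 1))  # 20.0
```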
Configure the following fields:
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data.
Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values.
Default: 5
Note: If you did not select the Send alarm on each sample checkbox, under the cdm node -> General Configuration section, the
probe waits for the number of samples (specified in this field) before sending the alarm. Do not specify 0 (Zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, If the interval is
set to 5 minutes and number of samples is set to 5, the CPU utilization reported will be the average for the last 25 minutes.
Set QoS Target as 'Total': Select this checkbox if you want the QoS target to be set to Total.
Default: Not selected
Include CPU Wait in CPU Usage: includes the CPU Wait in the CPU Usage calculation.
Number of CPUs: displays the number of CPUs. This is a read-only field.
Maximum Queue Length: indicates the maximum number of items in the queue before an alarm is sent.
Alarm Message: sends the alarm message when the queue has been exceeded.
Top CPU consuming processes in alarm: defines the total number of top CPU consuming processes. Consider the following points
while using this option:
This alarm is generated when the defined total CPU usage is breached. The new alarms generate the process information in the
following format:
[processname[pid]-cpu%]; [processname[pid]-cpu%]
The actual CPU value in the alarm may not always match the total percentage of all the top CPU consuming processes shown in the
alarm message. It may vary because Total CPU Usage is calculated on the basis of samples, while the top CPU consuming processes
shown in the alarm are based on raw data fetched at a given point in time.
On non-Windows platforms, the probe uses the ps command to retrieve the top CPU consuming processes.
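The alarm format shown above can be illustrated with a short sketch (the process names, PIDs, and CPU values are made up):

```python
def format_top_processes(procs):
    """Build the alarm suffix in the documented format:
    [processname[pid]-cpu%]; [processname[pid]-cpu%]
    procs is a list of (name, pid, cpu_percent) tuples, for example as
    parsed from `ps` output on non-Windows platforms."""
    top = sorted(procs, key=lambda p: p[2], reverse=True)
    return "; ".join(f"[{name}[{pid}]-{cpu}%]" for name, pid, cpu in top)

sample = [("java", 4321, 35.2), ("mysqld", 1234, 22.1)]
print(format_top_processes(sample))  # [java[4321]-35.2%]; [mysqld[1234]-22.1%]
```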
Notes:
The CpuErrorProcesses and CpuWarningProcesses messages are available only on fresh installation of the probe version
5.6. If you upgrade the probe from previous versions, you need to select these messages from Alarm messages drop-down
list in Total CPU Usage section.
The variable '$processes' is added from probe version 5.6 and later.
The Individual CPU node lets you configure metrics for monitoring each CPU node of the host system. You can configure appropriate settings for
the following sections:
CPU Usage Difference - lets you monitor the difference in percent of CPU usage between two successive intervals.
Individual CPU Idle - lets you generate QoS data on the amount of time when CPU is not busy. In other words, CPU is running the
System Idle Process.
Individual CPU System - lets you generate QoS data on the amount of time during which CPU executed processes in kernel mode.
Individual CPU Usage - lets you generate QoS data for monitoring CPU usage in percent as compared to the CPU capacity.
Individual CPU User - lets you generate QoS data on the amount of time during which CPU executed processes in user mode.
Individual CPU Wait - lets you generate QoS data on the amount of time during which CPU is waiting for I/O process to complete.
Maximum CPU Usage - lets you generate alarm when the CPU usage percent breaches the maximum usage limit.
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample value of the
devices.
Set or modify the following values, as needed:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is used with the iostat command for fetching the iostat data for that duration. This
value must be less than the Interval (minutes) field value.
Default: 10
Ignore Iostat Devices: defines the iostat devices to be excluded from monitoring. For example, specifying the regular expression
sda|sdb in this field excludes the sda and sdb iostat devices from monitoring.
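A sketch of why the second sample matters: the first iostat report covers statistics since boot, so the probe reads the second one. The column layout and figures below are simplified, made-up iostat -x style output:

```python
def second_sample_utilization(iostat_output, device):
    """Return the %util value from the second iostat report for a device.

    The first iostat report covers statistics since boot, so the probe
    uses the second sample. The layout below is a simplified stand-in
    for real `iostat -x` output."""
    reports = []
    for block in iostat_output.split("Device")[1:]:
        for line in block.strip().splitlines()[1:]:
            cols = line.split()
            if cols and cols[0] == device:
                reports.append(float(cols[-1]))   # %util is the last column
    return reports[1]  # second sample

sample_output = """\
Device r/s w/s %util
sda 1.0 2.0 3.5

Device r/s w/s %util
sda 5.0 6.0 42.0
"""
print(second_sample_utilization(sample_output, "sda"))  # 42.0
```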
The available iostat monitors depend on the operating system (Linux, Solaris, or AIX). On Linux, the following iostat monitors are available:
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
The probe detects the underlying OS and filters the list of monitors. This section lets you enable the iostat monitoring for the device. This option is
disabled, by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
cdm IM Configuration
This article describes the configuration concepts and procedures to set up the cdm probe. You can configure the probe to monitor CPU, disk or
memory of the system on which the probe is deployed.
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Important! You must use cdm version 5.61 with cluster version 3.33 to view the cluster disks on the cdm Infrastructure Manager (IM).
Contents
Verify Prerequisites
Configure Disk Monitoring
Configure CPU Monitoring
Configure Memory Monitoring
Probe Defaults
How to Copy Probe Configuration Parameters
Edit the Configuration File Using Raw Configure
Using Regular Expressions
Verify Prerequisites
Verify that the required hardware and software are available and that all installation considerations are met before you configure the probe. For more
information, see cdm (CPU, Disk, Memory Performance Monitoring) Release Notes.
4. Click the Control Properties tab under the Setup tab. Specify the monitoring interval, in minutes, when the probe requests the disk data.
5. Save the configuration.
The selected disk is now being monitored.
Note: You can add network disks to be monitored using Windows robots using the New Share option in the Disk Usage section of the
Status tab. You can monitor the availability state and usage of network disks using any robot where the probe is deployed by clicking the
Enable Space Monitoring option in the Disk Usage section of the Status tab.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as per requirement.
Note: When you perform this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and
the target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on this computer will try to monitor a D: partition (which is missing) and report an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears and the configuration of the probe is completed after the distribution process is finished.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Id of shared drive and local drive identical. Navigate to the setup folder and set this key as No
to enable this feature.
Default: Yes
A regular expression (regex for short) is a special text string for describing a search pattern. Constructing regular expressions and pattern matching
requires meta characters. The probe supports Perl Compatible Regular Expressions (PCRE), which are enclosed within forward slashes (/). For
example, the expression /[0-9A-C]/ matches any character in the range 0 to 9 or A to C in the target string.
You can also use simple text with some wildcard operators for matching the target string. For example, the *test* expression matches the text
test in the target string.
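A sketch of both matching styles described above (the wildcard-to-regex translation is an illustrative assumption, not the probe's actual implementation):

```python
import re

def cdm_pattern_matches(pattern, text):
    """Treat /.../ as a PCRE-style regular expression, and anything else
    as simple text where * is a wildcard (an assumption for illustration)."""
    if pattern.startswith("/") and pattern.endswith("/"):
        return re.search(pattern[1:-1], text) is not None
    # translate simple wildcards into a regex
    return re.fullmatch(re.escape(pattern).replace(r"\*", ".*"), text) is not None

print(cdm_pattern_matches("/[0-9A-C]/", "disk7"))   # True  (7 is in 0-9)
print(cdm_pattern_matches("*test*", "my test run")) # True
print(cdm_pattern_matches("*test*", "production"))  # False
```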
The following table describes some examples of regex and pattern matching for the cdm probe.
Regular expression | Type of regular expression | Explanation
[A-Z] | Standard (PCRE) |
[A-Z:\\] | Custom | Matches the uppercase drive letter of a local disk available on the respective box
[*.\\] | Custom |
\d* | Custom |
The CPU, Disk and Memory Monitor (cdm) probe configuration interface displays a screen with tabs for configuring this probe. This probe can be
set up in three types of environments: single computer, multi-CPU and cluster.
Contents:
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
Edit Alarm or QoS Source
Status Tab
Disk Usage
Disk Usage Modification
New Share Properties
UNIX platforms
Edit Disk Properties
Delete a Disk
Modify Default Disk Parameters
Enable Space Monitoring
The Multi CPU Tab
Advanced Tab
Custom Tab
New CPU Profile
New Disk Profile
New Memory Profile
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are three tabs within this tab: General, Control Properties, and
Message Definitions. A fourth tab, the Cluster tab, appears if the probe is running within a clustered environment.
General Tab
Log level
Specifies the level of details that are written to the log file.
Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when debugging.
Log size
Specifies the size of the probe's log file where probe-internal log messages are written. Upon reaching this size, the contents of the file
are cleared.
Default: 100 KB
Send alarm on each sample
If selected, the probe generates an alarm on each sample. If not selected, the probe waits for the number of samples (specified in the
Samples field of the Control Properties tab) before sending the alarm. This check box is selected by default.
For example, if the value in the Interval field is set to 1 minute and the value in the Samples field is set to 2 under the Control
Properties tab:
Option is unchecked: the first alarm is generated after 2 minutes, and subsequent alarms are generated at 1-minute intervals.
Option is checked: the first alarm is generated after 1 minute, and subsequent alarms are generated at 1-minute intervals.
Note: The sample collected at the start of the probe is considered as the first sample. The sample count is cleared on de-activation of
the probe. For more details about the samples, see the Control Properties tab.
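The timing in this example can be expressed as a tiny sketch:

```python
def first_alarm_minute(interval_min, samples, alarm_on_each_sample):
    """When is the first alarm possible? With the option checked the probe
    can alarm on every sample; unchecked, it waits until `samples`
    samples have been collected (Interval = 1 minute, Samples = 2 in the
    example above)."""
    return interval_min if alarm_on_each_sample else interval_min * samples

print(first_alarm_minute(1, 2, alarm_on_each_sample=True))   # 1
print(first_alarm_minute(1, 2, alarm_on_each_sample=False))  # 2
```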
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields displayed in the above dialog are divided into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
Note: Even if you set the sample value as 0, the QoS for disk are generated based on the default sample value.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
The Message Definitions tab enables you to customize the messages that are sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
Message Pool
This section lists all messages with their associated message ID. You can right-click in the message pool window to create a new
message and edit/delete an existing message.
Active Messages
This section contains tabs to allow you to associate messages with the thresholds. You can drag the alarm message from the message
pool and drop it into the threshold field. The available tabs are explained below:
CPU
High (error) and Low (warning) threshold for total CPU usage.
High (error) threshold for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
CPU difference threshold (alarms are sent when the difference in CPU usage between different CPUs in multi-CPU systems breaches
the threshold).
Disk
To modify the thresholds for disks, double click the disk-entries under the Status tab.
Memory
Depends on what memory view is selected in the memory usage graph, where you may toggle among three views (see the Status
tab).
High (error) and Low (warning) threshold for pagefile usage and paging activity
Physical memory
Swap memory
Computer
Allows you to select the alarm message to be issued if the computer is rebooted.
Default: The time when the computer was rebooted.
Other
You can select the alarm message to be sent if the probe is not able to fetch data.
Default: Contains information about the error condition.
Cluster Tab
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as a part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (see the Edit Alarm or QoS Source section), you can set
the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The disk usage section displays the details of all disks installed on the system and the disk usage details such as file system type, amount of free
space and total disk usage. You can monitor each disk individually, with individual threshold values, messages and severity levels.
Note:
The probe uses the mount entries as in /proc/mounts file in Linux to display the file system type of devices that are remounted
to a different location.
When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
You can modify the monitoring properties of disk by right-clicking on a monitored disk in the list.
Use the New Share option to modify the disk usage properties.
You can specify the network disk or folder to be monitored by the cdm probe. The network location is specified in the Share field using the format
\\computer\share. In addition, specify the user name and password to be used when testing the availability of the share, and the Message ID to be
sent if a share is determined to be unavailable. You can use a domain user if the machine is a member of a domain.
Select the Folder Availability Quality of Service Message option to send QoS messages on availability of the shared folder.
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of an NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (for example, stale NFS handles). By default, NFS file systems are monitored for availability only.
Edit Disk Properties
The disk usage configuration GUI displays tabs for each section of the disk configuration, which are explained below:
Disk usage and Thresholds tab
The page displays the amount of total, used, and free disk space for the file system.
You can configure the following threshold settings:
Monitor disk using either Mbytes or %.
High threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be
sent. When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated first and
if it is not exceeded, then the low threshold is evaluated.
Low threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated only if the
high threshold has not been exceeded.
You can configure the Quality of Service message, which can have information about the disk usage in Mbytes, % or both depending on
your selections.
Inode Usage and Thresholds tab
This tab is only available for UNIX systems; otherwise it remains disabled. The tab indicates the amount of total, used, and free inodes on the file
system.
You can configure the following threshold settings:
Monitor disk using either inodes or %.
High threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent.
Low threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be issued.
You can configure the Quality of Service message, which can have information about the disk usage in inodes, % or both depending on your
selections.
Disk Usage Change and Thresholds tab
This tab lets you specify the alarm conditions for alarms to be sent when changes in disk usage occur.
Disk usage change calculation
You can select one of the following:
Change summarized over all samples. The change in disk usage is the difference between the latest sample and the first sample in
the "samples window". The number of samples the cdm probe will keep in memory for threshold comparison is set as Samples on
the Setup > Control Properties tab.
Note: There may be some discrepancy between the values in QoS and values in alarms when the Change summarized over all
samples option is selected. This is because the QoS are generated on every interval and Alarms are generated based on the selection
of the option Change summarized over all samples.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
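The two calculation modes can be sketched as follows (the usage values in Mbytes are made up):

```python
def change_summarized(samples_window):
    """Change summarized over all samples: the difference between the
    latest sample and the first sample in the samples window."""
    return samples_window[-1] - samples_window[0]

def change_between_each(samples_window):
    """Change between each sample: per-interval deltas, calculated after
    each sample is collected."""
    return [b - a for a, b in zip(samples_window, samples_window[1:])]

window = [1000, 1004, 1003, 1010]   # disk usage in Mbytes
print(change_summarized(window))    # 10
print(change_between_each(window))  # [4, -1, 7]
```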
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option, a confirmation dialog appears. Click
Yes to delete the disk from the list.
Modify Default Disk Parameters
Use the Modify Default Disk Parameters option to change fixed disk properties.
If you modify the default settings, then every disk that you add from that point forward will have the new settings as the default disk properties.
Enable Space Monitoring
The Enable space monitoring option appears only for the shared drive/folder (using the New Share... option) being monitored by the cdm probe.
To enable/disable space monitoring of the Windows share/mounted drive/NFS file system, right-click a monitored Windows share/mounted drive/
NFS file system in the list and select the enable/disable space monitoring option.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
The Advanced tab enables you to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot,
and paging measurements.
The values of the Load Average metrics are calculated per processor. These are dependent on the field Calculate Load
Average per Processor under the Setup -> General tab.
Sampling is not applicable for the three Load Average Metrics.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Note: If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI are different than CDM
version 3.72. However, if CDM version 3.70 or earlier already has created QoS entries in the database for kilobytes per second
(Kb/s) and/or pages per second (Pg/s), these entries will be kept and updated with QoS data from the newer CDM version (3.72
and higher).
only one processor queue. The processor queue length is a measurement of the last observed value, and it is not an average of any kind.
Alarm messages are generated according to the specified threshold value.
Default: 4.
Notes:
If running on a multi-CPU system, the queued processes will be shared on the number of processors. For example, if running
on a system with four processors and using the default Max Queue Length value (4), alarm messages will be generated if the
number of queued processes exceeds 16.
To enable the QoS metric QOS_PROC_QUEUE_LEN per CPU, add the key system_load_per_cpu with the value Yes under the CPU section through the Raw Configure option. When this key is set to Yes, the probe calculates the system load on Linux, Solaris and AIX as Load/Number of CPUs.
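The two calculations described in these notes can be sketched as follows. The helper names are hypothetical and assume the documented defaults:

```python
def effective_queue_threshold(max_queue_length, num_cpus):
    """On multi-CPU systems queued processes are shared across processors,
    so an alarm is generated only when the number of queued processes
    exceeds max_queue_length * num_cpus (e.g. 4 * 4 = 16)."""
    return max_queue_length * num_cpus


def system_load_per_cpu(load, num_cpus):
    """With system_load_per_cpu = yes, the probe divides the raw load by
    the CPU count on Linux, Solaris and AIX."""
    return load / num_cpus
```

With the default Max Queue Length of 4 on a four-processor system, the effective alarm threshold is 16 queued processes.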
lparstat -i command

Total Capacity = (maxVirtualCPU / maxCapacity) * 100

cpuStats->fUser   = (double)((cpuStats->fUser   * cpuStats->fEntCap) / TotCapacity);
cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap) / TotCapacity);
cpuStats->fWait   = (double)((cpuStats->fWait   * cpuStats->fEntCap) / TotCapacity);
cpuStats->fIdle   = (double)((cpuStats->fIdle   * cpuStats->fEntCap) / TotCapacity);
Top CPU consuming processes in alarm: defines the total number of top CPU consuming processes. Consider the following points
while using this option:
This alarm is generated when the defined total CPU usage is breached. The new alarms generate the process information in the
following format:
[processname[pid]-cpu%]; [processname[pid]-cpu%]
The actual CPU value in the alarm may not always match the total percentage of all the top CPU consuming processes shown in
the alarm message. It may vary because Total CPU Usage is calculated on the basis of samples, whereas the top CPU consuming
processes displayed in the alarm are fetched from raw data at a given point in time.
On non-Windows platforms, the probe uses the ps command to retrieve the top CPU consuming processes.
Notes:
The CpuErrorProcesses and CpuWarningProcesses messages are available only on a fresh installation of probe
version 5.6. If you upgrade the probe from a previous version to 5.6, you must drag and drop these messages from
the Message Pool list under the Setup > Message definitions tab.
The variable '$processes' is added from probe version 5.6 and later.
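The alarm detail format shown above can be illustrated with a small sketch. The helper name and tuple layout are assumptions for illustration, not probe internals:

```python
def format_top_processes(procs):
    """Render the documented alarm detail format:
    [processname[pid]-cpu%]; [processname[pid]-cpu%]

    procs: list of (process_name, pid, cpu_percent) tuples.
    """
    return "; ".join(f"[{name}[{pid}]-{cpu}%]" for name, pid, cpu in procs)
```

For example, two processes consuming 42.5% and 13.1% CPU would render as `[java[1234]-42.5%]; [mysqld[77]-13.1%]`.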
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be
measured in one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab immediately changes to show the
selected unit, but the values in the graph do not change until the next sample is measured.
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile are sent using suppression keys unique to the profile, so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
This alarm is generated when the defined total CPU usage is breached. The new alarms generate the process information in the
following format:
[processname[pid]-cpu%]; [processname[pid]-cpu%]
The actual CPU value in the alarm may not always match the total percentage of all the top CPU consuming processes shown in the
alarm message. It may vary because Total CPU Usage is calculated on the basis of samples, whereas the top CPU consuming
processes displayed in the alarm are fetched from raw data at a given point in time.
On non-Windows platforms, the probe uses the ps command to retrieve the top CPU consuming processes.
Notes:
The CpuErrorProcesses and CpuWarningProcesses messages are available only on a fresh installation of probe
version 5.6. If you upgrade the probe from a previous version to 5.6, you must drag and drop these messages from the
Message Pool list under the Setup > Message definitions tab.
The variable '$processes' is added from probe version 5.6 and later.
High and Low: activates the alarm generation in case high, or low threshold values of selected checkpoint are breached.
New Disk Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk, which are Disk free space, Inodes free, Space usage
change.
For more information on these checkpoints, refer to the Control Properties Tab section.
Note: You are required to enable NFS drives from the Status tab to see custom NFS inode alarms.
cdm Troubleshooting
This article contains the troubleshooting points for the cdm probe.
Contents
# uname -a
# mount
# df -k (on systems that support the -k option)
AIX
# /usr/bin/vmstat
# /usr/sbin/sar -P ALL
# /usr/bin/uptime
HP-UX
No commands need to be executed on this platform.
LINUX
# cat /proc/stat
# cat /proc/vmstat (if applicable)
# cat /proc/meminfo
# cat /proc/loadavg
Solaris
# /usr/bin/mpstat 60 (note: runs until stopped with Ctrl-C; get at least two iterations)
# /usr/bin/uptime
Tru64
To compare swap information between CDM and the swap utility, you can take the blocks value that swap reports and run it through the formula: (blocks * 512) / (1024 *
1024) = total_swap MB. This is the same number of MB the CDM probe uses in its calculations.
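The conversion formula above can be sketched as follows (the helper name is illustrative):

```python
def swap_blocks_to_mb(blocks, block_size=512):
    """Convert the block count reported by the swap utility into MB,
    matching the documented formula: (blocks * 512) / (1024 * 1024)."""
    return (blocks * block_size) / (1024 * 1024)
```

For example, 2,097,152 blocks of 512 bytes correspond to 1024 MB of swap.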
TOP, on the other hand, gathers information about anonymous pages in the VM, which is quicker and easier to gather but does not represent a true
picture of the amount of swap space available and used. The reason is that anonymous pages also take into account physical memory that is
potentially available for use as swap space. Thus, the TOP utility reports more total swap space, since it also factors in physical memory that is not
currently in use.
The cdm probe and TOP gather physical memory information in similar ways, so the differences in available physical memory should be
insignificant.
Since the cdm probe does not differentiate between available swap and physical memory (after all, it is only when you run out of both
resources that things stop working on the system), the accumulated numbers are used. The accumulated numbers for TOP will be off, since the
free portions of physical memory are counted twice in many instances.
Thus, the cdm probe does not provide the data in the same format as TOP; instead, it aims to give a clear picture of the combined memory/swap
usage on the system.
bash-3.2# lparstat -i
Maximum Virtual CPUs : 4
Maximum Capacity     : 4.00

03/08/13
%usr   physc   %entc
  99    0.25   249.3
 100    0.25   250.5
 100    0.25   250.0
 100    0.25   249.9
 100    1.00   999.4
If you select the option Cpu Stats against entitled capacity, it is calculated as (%usr x %entc) / Total Capacity.
Similarly, you can calculate entitled capacity for system and idle CPU utilization.
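As a rough illustration of the calculation above, the following sketch uses the values from the sample lparstat output. The function names are hypothetical:

```python
def total_capacity(max_virtual_cpus, max_capacity):
    """Total Capacity = (maxVirtualCPU / maxCapacity) * 100."""
    return (max_virtual_cpus / max_capacity) * 100.0


def cpu_against_entitled_capacity(cpu_percent, entc_percent, tot_capacity):
    """CPU statistic scaled against entitled capacity, e.g. for user CPU:
    (%usr x %entc) / Total Capacity."""
    return (cpu_percent * entc_percent) / tot_capacity
```

With Maximum Virtual CPUs = 4 and Maximum Capacity = 4.00, Total Capacity is 100; a sample row with %usr = 99 and %entc = 249.3 then scales to roughly 246.8.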
cdm Metrics
This article describes the metrics that can be configured using the CPU, Disk, and Memory Performance (cdm) probe.
QoS Metrics
CPU
Disk
Memory
Miscellaneous
Network
Iostat Monitors: Linux Platform
Iostat Monitors: Solaris Platform
Iostat Monitors: AIX Platform
Alert Metrics Default Settings
Disk Usage and Thresholds (Disk Error)
Disk Usage Change and Thresholds (Delta Error)
Inode Usage and Thresholds
Iostat
QoS Metrics
The following tables list all the QoS metrics generated by the cdm probe.
CPU
Monitor Name | QoS Name | Units | Description | Version
Individual CPU Idle | QOS_CPU_MULTI_USAGE | Percent | The percentage of time when an individual CPU of the system was idle. | 4.7
Individual CPU System | QOS_CPU_MULTI_USAGE | Percent | The percentage of time when an individual CPU of the system was executing the kernel or operating system. | 4.7
Individual CPU Usage | QOS_CPU_MULTI_USAGE | Percent | The percentage of time for which an individual CPU of the system was used. | 4.7
Individual CPU User | QOS_CPU_MULTI_USAGE | Percent | The percentage of time for which an individual CPU of the system was executing in user mode. | 4.7
Individual CPU Wait | QOS_CPU_MULTI_USAGE | Percent | The percentage of time for which an individual CPU of the system was waiting for I/O. | 4.7
Total CPU Idle | QOS_CPU_USAGE | Percent | The percentage of time when all CPUs of the system were idle. | 4.7
Total CPU System | QOS_CPU_USAGE | Percent | The sum of CPU time when all CPUs of the system were executing the kernel or operating system. | 4.7
Total CPU Usage | QOS_CPU_USAGE | Percent | The percentage of time for which all CPUs of the system were used. | 4.7
Total CPU User | QOS_CPU_USAGE | Percent | The percentage of time for which all CPUs of the system were executing in user mode. | 4.7
Total CPU Wait | QOS_CPU_USAGE | Percent | The percentage of time for which all CPUs of the system were waiting for I/O. | 4.7

All of the Total CPU metrics are calculated from the QOS_CPU_USAGE monitor.
Disk
Monitor Name | QoS Name | Units | Version
Disk Usage Change | QOS_DISK_DELTA | Megabytes | 4.7
Disk Usage (MB) | QOS_DISK_USAGE | Megabytes | 4.7
Disk Usage (%) | QOS_DISK_USAGE_PERC | Percent | 4.7
Disk Read Throughput | QOS_DISK_READ_THROUGHPUT | Bytes/Second | 5.1
Disk Write Throughput | QOS_DISK_WRITE_THROUGHPUT | Bytes/Second | 5.1
Disk Total Throughput | QOS_DISK_TOTAL_THROUGHPUT | Bytes/Second | 5.1
Disk Size | QOS_DISK_TOTAL_SIZE | Gigabytes | 5.1
Disk Available | QOS_DISK_AVAILABLE | Available | 5.1
Inode Usage | QOS_INODE_USAGE | Inodes | 4.7
Inode Usage (%) | QOS_INODE_USAGE_PERC | Percent | 4.7
Memory
Monitor Name | QoS Name | Units | Version
Memory Usage (MB) | QOS_MEMORY_USAGE | Megabytes | 4.7
Memory Paging (KB/s) | QOS_MEMORY_PAGING | Kilobytes/Second | 4.7
Memory Paging (Pg/s) | QOS_MEMORY_PAGING_PGPS | Pages/Second | 4.7
Memory Usage (%) | QOS_MEMORY_PERC_USAGE | Percent | 4.7
Physical Memory (MB) | QOS_MEMORY_PHYSICAL | Megabytes | 5.4
Physical Memory (%) | QOS_MEMORY_PHYSICAL_PERC | Percent | 4.7
Swap Memory (MB) | QOS_MEMORY_SWAP | Megabytes | 4.7
Swap Memory (%) | QOS_MEMORY_SWAP_PERC | Percent | 4.7
System Memory Utilization (%) | QOS_MEMORY_SYS_UTIL | Percent | 5.1
User Memory Utilization (%) | QOS_MEMORY_USR_UTIL | Percent | 5.1
Miscellaneous
Monitor Name | QoS Name | Units | Description | Version
Load Average (1 min) | QOS_LOAD_AVERAGE_1MIN | Count | Note: This metric is supported only on the Linux, Solaris, AIX and HP-UX platforms. | 5.5
Load Average (5 min) | QOS_LOAD_AVERAGE_5MIN | Count | Note: This metric is supported only on the Linux, Solaris, AIX and HP-UX platforms. | 5.5
Load Average (15 min) | QOS_LOAD_AVERAGE_15MIN | Count | Note: This metric is supported only on the Linux, Solaris, AIX and HP-UX platforms. | 5.5
System Load | QOS_PROC_QUEUE_LEN | Processes | - | 4.7
Folder Available | QOS_SHARED_FOLDER | Available | - | 4.7
Computer Uptime Hourly | QOS_COMPUTER_UPTIME | Seconds | Information detailing how long the system has been on since its last restart. | 4.7
Network
Monitor Name | QoS Name | Units | Version
Network Inbound Traffic | QOS_NETWORK_INBOUND_TRAFFIC | Bytes/Second | 5.1
Network Outbound Traffic | QOS_NETWORK_OUTBOUND_TRAFFIC | Bytes/Second | 5.1
Network Aggregated Traffic | QOS_NETWORK_AGGREGATED_TRAFFIC | Bytes/Second | 5.1

Note: These metrics are supported only on the Windows, Linux and AIX platforms.
Iostat Monitors: Linux Platform

QoS Name | Units | Description | Version
QOS_IOSTAT_RRQM_S | ReadReqMerged/Sec | Total iostat read requests merged per second that were queued to the device | 5.1
QOS_IOSTAT_WRQM_S | WriteReqMerged/Sec | Total iostat write requests merged per second that were queued to the device | 5.1
QOS_IOSTAT_RS | Reads/Sec | The number of read requests that were issued to the device per second | 5.1
QOS_IOSTAT_WS | Writes/Sec | The number of write requests that were issued to the device per second | 5.1
QOS_IOSTAT_SEC_RS | SectorReads/Sec | - | 5.1
QOS_IOSTAT_SEC_WS | SectorWrites/Sec | - | 5.1
QOS_IOSTAT_AR_SZ | Sectors | The average size (in sectors) of the requests that were issued to the device | 5.1
QOS_IOSTAT_AQ_SZ | QueueLength | The average queue length of the requests that were issued to the device | 5.1
QOS_IOSTAT_AWAIT | Milliseconds | The average time for I/O requests issued to the device, including the time spent by the requests in queue and the time spent servicing them | 5.1
QOS_IOSTAT_SVCT | Milliseconds | The average service time for I/O requests that were issued to the device | 5.1
QOS_IOSTAT_PU (Iostat Utilization Percentage) | Percent | - | 5.1
QOS_IOSTAT_KRS | Kilobytes/Sec | - | 5.2
QOS_IOSTAT_KWS | Kilobytes/Sec | - | 5.2
Iostat Monitors: Solaris Platform

QoS Name | Units | Version
QOS_IOSTAT_RS | Reads/Sec | 5.1
QOS_IOSTAT_WS | Writes/Sec | 5.1
QOS_IOSTAT_KRS | Kilobytes/Sec | 5.1
QOS_IOSTAT_KWS | Kilobytes/Sec | 5.1
QOS_IOSTAT_QLEN | QueueLength | 5.1
QOS_IOSTAT_ACT | Transactions | 5.1
QOS_IOSTAT_SVCT | Milliseconds | 5.1
QOS_IOSTAT_PCTW | Percent | 5.1
QOS_IOSTAT_PCTB | Percent | 5.1
Iostat Monitors: AIX Platform

QoS Name | Units | Description | Version
QOS_IOSTAT_PCTA | Percent | - | 5.1
QOS_IOSTAT_KBPS | Kilobytes/Sec | - | 5.1
QOS_IOSTAT_TPS | - | - | 5.1
QOS_IOSTAT_KR | Kilobytes | The amount of data read per second from the device in kilobytes | 5.1
QOS_IOSTAT_KW | Kilobytes | - | 5.1
QOS_IOSTAT_RS | Reads/Sec | - | 5.2
QOS_IOSTAT_WS | Writes/Sec | - | 5.2
QOS_IOSTAT_KRS | Kilobytes/Sec | - | 5.2
QOS_IOSTAT_KWS | Kilobytes/Sec | - | 5.2
The following tables list the Alert Metrics Default Settings, by type, for the cdm probe.
CPU

Alert Metric | Warning Threshold | Warning Severity | Error Threshold | Error Severity
CPU Usage | 75% | Warning | 90% | Major
- | 50% | Warning | 90% | Major
- | 85% | - | 95% | -
- | 60% | - | 85% | -
- | 150 KB/sec | Warning | - | Major

Disk Usage and Thresholds (Disk Error)

Alert Metric | Warning Threshold | Warning Severity | Error Threshold | Error Severity
Disk usage (%) | 20% | - | 10% | Major
Disk usage (Mb) | 200 | - | - | -

Disk Usage Change and Thresholds (Delta Error)

Alert Metric | Warning Threshold | Error Threshold
- | 20 | 10
- | 20 | 10

Inode Usage and Thresholds

Alert Metric | Warning Threshold | Error Threshold
Inode Free | 20 | 10
- | 10 | -

MultiCPU

Alert Metric | Threshold | Description
Maximum | 90 | MultiCPU CPU usage of single cpu
Difference | 50 | MultiCPU Difference in CPU usage between CPUs

Iostat

Alert Metric | Warning Threshold | Warning Severity | Error Threshold | Error Severity
IostatError | - | - | 90 | Major
This article describes the configuration concepts and procedures to set up the cdm probe. You can configure the probe to monitor CPU, disk or
memory of the system on which the probe is deployed.
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Contents
Verify Prerequisites
Configure Disk Monitoring
Configure CPU Monitoring
Configure Memory Monitoring
Configure Network Disk State Monitoring on Windows
Configure Network Disk State Monitoring on Linux and Solaris
Alarm Thresholds
Edit the Configuration File Using Raw Configure
Using Regular Expressions
Verify Prerequisites
Verify that required hardware and software is available and any installation consideration is met before you configure the probe. For more
information, see cdm (CPU, Disk, Memory Performance Monitoring) Release Notes.
Note: If the monitored environment also includes cluster disks, these disks are also included in the diskname node with the
same alarms configurations as local disks. However, for such environments, a Cluster section is displayed in the cdm node
that enables you to view and modify alarm and QoS sources for the cluster resources.
3. Expand the Disk Usage node. Enable the alarms (options are MB or Percentage) and QoS metrics in the Alarm Thresholds and Disk
Usage sections.
4. Navigate to the Disk Configuration section under the Disks node. Specify the monitoring interval, in minutes, when the probe requests
the disk data.
5. Save the configuration.
The selected disk is now being monitored.
2. Expand the Memory node and enable the alarms and QoS for Memory Metrics from the Alarm Thresholds section.
3. Specify the time interval in minutes during which the probe requests for data from the Memory Configuration section.
4. Save the configuration.
The system memory is now being monitored.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe version installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Id of a shared drive and a local drive identical. Navigate to the setup folder and set this key as No to enable this feature.
Default: Yes
This feature is introduced for the following two reasons:
When a file system is mounted on Linux through cifs and the cdm probe is freshly deployed, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restart, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Id and Metric Id.
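The ignore_device key shown earlier holds a regular expression; file systems whose device name matches it are excluded from monitoring. A quick sketch using the documented example pattern (the helper name and sample device names are illustrative):

```python
import re

# Pattern taken from the ignore_device example above.
ignore_device = re.compile(r"autofosmount.*|.*:V.*")


def is_ignored(device_name):
    # re.match anchors at the start of the string, mirroring how a
    # device-name filter would typically be checked.
    return ignore_device.match(device_name) is not None
```

Devices such as `autofosmount/vol1` or `nas01:Vol7` match the pattern and would be skipped, while `/dev/sda1` would still be monitored.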
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to yes to enable this feature.
Default: No
This key has been introduced in Fixed default section under disk for Windows, Linux and AIX platforms.
sliced_cpu_interval: avoids the delay caused by the probe while executing commands (such as SAR) that cause internal alarms.
Navigate to cpu folder and specify a value (in seconds).
sliced_memory_interval: avoids the delay caused by the probe for collecting data from commands (such as VMSTAT) that cause internal
alarms. Navigate to memory folder and specify a value (in seconds).
The following example explains the use of the sliced_memory_interval key.
Configure the interval from the Setup tab of Memory as 5 minutes. If you configure the value of the sliced_memory_interval key as 10 seconds,
the probe collects data through the VMSTAT command for 290 seconds (that is, the 5-minute interval minus 10 seconds). Thus, the probe does
not generate internal alarms and seamlessly generates QoS. If the problem still persists, you can increase the sliced_memory_interval by a few
more seconds.
The sliced_cpu_interval and sliced_memory_interval keys have been introduced for AIX platform in the Raw Configuration section.
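The interval arithmetic in the example above can be sketched as follows (the function name is illustrative):

```python
def vmstat_collection_seconds(interval_minutes, sliced_interval_seconds):
    """The probe shortens the vmstat collection window by the sliced
    interval: a 5-minute interval with a 10-second
    sliced_memory_interval yields 290 seconds of collection."""
    return interval_minutes * 60 - sliced_interval_seconds
```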
Standard (PCRE) | Custom | Explanation
[A-Z] | [A-Z:\\] | Matches with the uppercase character type of the local disk available on the respective box
- | [*.\\] | -
\d* | - | -
cdm node
<hostname> node
Disks node
<diskname> node
Disk Usage node
Disk Usage Change node
<diskname> Inode Usage node
<shareddiskname> node
Memory node
Memory Paging node
Physical Memory node
Swap Memory node
Network node
Processor node
Individual CPU node
Total CPU node
Iostat node (Linux, Solaris, and AIX)
Device Iostat node
cdm node
Navigation: cdm
This node lets you view the probe information, configure the logging properties and set data management values.
Set or modify the following values, as needed:
cdm > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cdm > General Configuration
This section provides general configuration details.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0 - Fatal
Log size (KB): specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Default: 100 KB
Send alarm on each sample: If selected, the probe generates an alarm on each sample where there is a threshold breach. If not
selected, the probe waits for the number of samples (specified in Samples in the cdm > Disk Configuration, cdm > Memory or cdm >
Processor configuration screens) before sending the alarm. The sample count is cleared on de-activation of the probe.
Send short name for QoS source: If selected, sends only the host name. If not selected, sends the full host name with domain.
Allow QoS source as target: A number of QoS messages, by default, use the host name as their target. If selected, the target name is
changed to be the same as the QoS source name.
Monitor iostat (Linux and Solaris only): Enables the iostat monitoring of the host system devices.
Count Buffer-Cache as Used Memory (Linux, Solaris, AIX and HP-UX only): Counts the buffer and cache memory as used memory
while monitoring the physical and system memory utilization. If not selected, the buffer and cache memory is counted as free memory.
Calculate Load Average Per Processor: For all Unix systems, the system load measures the computational work that the system is
performing. This means that if your system has a load of 4, four running processes are either using or waiting for the CPU. Load average
refers to the average of the computers load over several periods of time. This option enables you to calculate the load average per
processor and is available for Linux, Solaris, AIX and HP-UX platforms.
Default: Selected
cdm > Cluster
This section is only visible when monitoring clustered environments and displays the cluster resources associated to the monitored system.
The following fields are displayed for each resource:
Virtual Group: displays the resource group of the cluster where the host system of the robot is a node.
Cluster Name: displays the name of the cluster.
Cluster IP: displays the IP address of the cluster. This is used as the default source for alarm and QoS.
Alarm Source: defines the source of the alarms to be generated by the probe for cluster resources.
QoS Source: defines the source of the QoS to be monitored by the probe for cluster resources.
Note: The Alarm Source and QoS source fields can have the following values:
<cluster ip>
<cluster name>
<cluster name>.<group name>
The default value for both the fields is <cluster ip>
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the disk utilization is the average for the last 25 minutes.
Default: 1
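The averaging window described for the QoS Interval works out as follows (a trivial sketch with an illustrative function name):

```python
def qos_window_minutes(interval_minutes, samples):
    """The reported value averages over interval * samples minutes:
    a 5-minute interval with 5 samples covers the last 25 minutes."""
    return interval_minutes * samples
```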
Ignore Filesystems: defines the file system to be excluded from monitoring. For example, specifying the regular expression C:\\ in this
field excludes the Disk C of the system from monitoring and also stops displaying the disk in navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk fails or crashes
and leaves a stale file system, and avoids a hang situation for the probe. Default: 10
Filesystem Type Filter: specifies the type of the file system to be monitored as a regular expression. For example, specifying ext* in this
field enables monitoring of only file systems such as ext4 or ext5.
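Because the Filesystem Type Filter is treated as a regular expression, a sketch of how a filter such as "ext*" selects file system types (the helper name is illustrative; note that as a regex, "ext*" means "ex" followed by zero or more "t" characters, and prefix matching then also accepts names such as "ext4"):

```python
import re


def filesystem_matches(filter_regex, fs_type):
    # re.match anchors at the start of the string, so the filter acts as
    # a prefix pattern on the file system type name.
    return re.match(filter_regex, fs_type) is not None
```

For example, `filesystem_matches("ext*", "ext4")` is true, while `xfs` is filtered out.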
Note: The first three fields are common to Memory and Processor configuration sections.
Note: The disk read throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk write throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk total throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Important! You can only add shared disks to be monitored in Windows robots.
<diskname> node
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual local or cluster disk.
Disk Missing: configures QoS for disk availability status and generates an alarm when the probe fails to connect with the disk.
Disk Size: configures QoS for disk size and generates an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
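The two-level evaluation described above can be sketched as follows. This is a hypothetical illustration; the function and parameter names are not probe internals:

```python
def disk_free_alarm(free_mb, high_threshold_mb, low_threshold_mb):
    """Sketch of the documented evaluation order: the high threshold is
    evaluated first; the low threshold is evaluated only when the high
    threshold has not been breached."""
    if free_mb < high_threshold_mb:
        return "high threshold breached"
    if free_mb < low_threshold_mb:
        return "low threshold breached"
    return None
```

With a high threshold of 100 MB and a low threshold of 200 MB, 50 MB free triggers the high-threshold alarm, 150 MB free triggers the low-threshold alarm, and 500 MB free triggers nothing.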
Disk Usage node
Navigation: cdm > host name > Disks > disk name > Disk Usage
This node lets you configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm
thresholds, disk usage (%) and disk usage (MB).
Note: The alarms are generated for free disk space and QoS are generated for disk usage.
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
This node lets you configure thresholds and alarm messages sent with changes in disk usage for each monitored disk.
Change Calculation: indicates how you want to calculate the disk change. Select from the drop-down menu either of the following:
Summarized over all samples: The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample: The change in disk usage is calculated after each sample is collected.
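The two change-calculation modes can be sketched as follows (an illustrative helper, assuming usage readings ordered oldest to newest):

```python
def usage_change(samples, summarized=True):
    """samples: disk-usage readings, oldest first, newest last.

    "Summarized over all samples": newest minus the first sample in the
    samples window. "Between each sample": newest minus the previous
    sample.
    """
    if summarized:
        return samples[-1] - samples[0]
    return samples[-1] - samples[-2]
```

For readings of 100, 110, and 130 MB, the summarized change is 30 MB while the between-each-sample change is 20 MB.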
<diskname> Inode Usage node
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
<shareddiskname> node
Navigation: cdm > host name > Disks > shared disk name
A shared network disk is added under the Disks node in the navigation pane. You can select the shared disk and update user name, password,
and disk availability monitoring properties.
Enable Space Monitoring
This section allows you to enable network disk usage monitoring for the profile by selecting the Enable Space Monitoring checkbox.
Network Connection
This section allows you to view or edit the user credentials for the shared network disk specified in the Add New Share window while creating a
network disk monitoring profile.
Shared Folder Availability
This section allows you to specify or edit the thresholds and alarms for the availability state of the network disk.
Note: The Disk Usage and the Disk Usage Change nodes for the <shareddiskname> node are the same as defined for the
<diskname> node.
Memory node
Default: 1
Set QoS Target as 'Memory': sets the QoS target to Memory.
Default: Not selected
Memory Paging node
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can individually configure alarm and memory paging thresholds for alarm sent with changes in memory paging for each monitored disk. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory node
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
Swap memory is a reserved space on the hard drive which the system uses when the physical memory (RAM) is full. However, swap
memory is not a replacement for physical memory due to its lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similar to the swap -l command of Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512) / (1024 * 1024).
Network node
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
Processor node
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: sends an alarm message when the CPU usage gets below this value. The value is in percent of CPU usage.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
The probe executes the iostat command for fetching the iostat monitors value. The QoS values are obtained from the second sample value of the
devices.
Set or modify the following values, as needed:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds, which is used with iostat command for fetching the iostat data for that time duration. This
value must be less than Interval (minutes) field value.
Default: 10
Ignore Iostat Devices: defines the iostat devices to be excluded from monitoring. For example, specifying the regular expression
sda|sdb in this field excludes the sda and sdb iostat devices from monitoring.
Device Iostat node
Iostat Monitors
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable the iostat monitoring for the device. This option is
disabled, by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatWarning
You can configure the other monitors in the same way, because each monitor contains the same set of fields.
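The high/low evaluation order described above can be sketched as follows; the function name and the assumption that a breach means the measured value meets or exceeds the threshold are illustrative, not the probe's actual implementation:

```python
def evaluate(value, high=90, low=None):
    """Sketch of the documented threshold logic: the high threshold is
    checked first; the low threshold is considered only if the high one
    has not been breached. Messages mirror the documented defaults."""
    if high is not None and value >= high:
        return "IostatError"    # high threshold breach -> error alarm
    if low is not None and value >= low:
        return "IostatWarning"  # low threshold breach -> warning alarm
    return None                 # no alarm

print(evaluate(95, high=90, low=80))  # → IostatError
print(evaluate(85, high=90, low=80))  # → IostatWarning
print(evaluate(50, high=90, low=80))  # → None
```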
Contents
Verify Prerequisites
Configure Disk Monitoring
Configure CPU Monitoring
Configure Memory Monitoring
Probe Defaults
How to Copy Probe Configuration Parameters
Options Configured Using Raw Configure
Using Regular Expressions
Verify Prerequisites
Verify that the required hardware and software are available and that all installation considerations are met before you configure the probe. For more
information, see cdm (CPU, Disk, Memory Performance Monitoring) Release Notes.
Note: You can add network disks to be monitored on Windows robots using the New Share option in the Disk Usage section of the
Status tab. You can monitor the availability state and usage of network disks on any robot where the probe is deployed by clicking the
Enable Space Monitoring option in the Disk Usage section of the Status tab.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as required.
Note: When you perform this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and
the target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on this computer will try to monitor a D: partition (which is missing) and report an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears, and the probe configuration is completed after the distribution process is finished.
The value should be a regular expression that matches all disks and/or file systems that you want the probe to ignore. Here is an
example that ignores auto-mounted disks that are recognized on each "disk interval":
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info
Default: Yes
This key is available in the Raw Configuration section to make the Device Id of a shared drive and a local drive identical. Set this
key to No to enable this feature.
This feature is introduced for the following two reasons:
1. When a file system is mounted on Linux through cifs, then on a fresh deployment of the cdm probe, the Device Id and Metric Id for QoS
and alarms of the respective mounted file system are missing.
2. On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size
Default: No
This key is available in the fixed_default section under disk for the Windows, Linux, and AIX platforms.
sliced_cpu_interval
sliced_memory_interval
Sometimes, the probe can take more time than expected to execute commands (such as vmstat and sar), causing internal alarms.
Configuring a time in seconds using the sliced_cpu_interval and sliced_memory_interval keys in the Raw Configuration section
enables you to avoid that delay.
The following example explains the use of sliced_memory_interval key.
Configure the interval from the Setup tab of Memory as 5 min. If you configure the value of the sliced_memory_interval key as 10 seconds,
the probe collects data through the vmstat command for (Interval (5 min) - 10 seconds), that is, for 290 seconds. Thus, the probe does not
generate internal alarms and seamlessly generates QoS. If the problem still persists, you can increase the sliced_memory_interval by a few
more seconds.
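The 290-second figure follows directly from the values in the example:

```python
# Worked example of the sliced_memory_interval calculation described above.
interval_minutes = 5          # Memory "Interval" from the Setup tab
sliced_interval_seconds = 10  # sliced_memory_interval raw-configure key

# Data collection window = Interval minus the sliced interval.
collection_window = interval_minutes * 60 - sliced_interval_seconds
print(collection_window)  # → 290
```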
These keys have been introduced for AIX platform in the Raw Configuration section.
Regular expression examples:

[A-Z]
Type of regular expression: Standard (PCRE)

[A-Z:\\]
Type of regular expression: Custom
Explanation: Matches the uppercase character of the local disk available on the respective box

[*.\\]
Type of regular expression: Custom

\d*
Type of regular expression: Standard (PCRE)
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
Edit Alarm or QoS Source
Status Tab
Disk Usage
Disk Usage Modification
New Share Properties
UNIX platforms
Edit Disk Properties
Delete a Disk
Modify Default Disk Parameters
Enable Space Monitoring
The Multi CPU Tab
Advanced Tab
Custom Tab
New CPU Profile
New Disk Profile
New Memory Profile
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are three tabs within this tab: General, Control Properties, and
Message Definitions. A fourth tab, the Cluster tab, appears if the probe is running within a clustered environment.
General Tab
Option is Unchecked: the first alarm is generated after 2 minutes and subsequent alarms are generated at 1-minute intervals.
Option is Checked: the first alarm is generated after 1 minute and subsequent alarms are generated at 1-minute intervals.
Note: The sample collected at the start of the probe is considered as the first sample. The sample count is cleared on de-activation of
the probe. For more details about the samples, see the Control Properties tab.
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields are divided into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
Note: Even if you set the samples value to 0, the QoS for disks is generated based on the default samples value.
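The sampling behavior can be sketched as a rolling window; the window size and sample values below are illustrative, not the probe's actual code:

```python
from collections import deque

# Sketch of the documented sampling behavior: the probe keeps the last N
# samples and compares the average of that window against the thresholds.
samples = deque(maxlen=5)  # "Samples" setting from Control Properties

def add_sample(value):
    """Store a new sample and return the current window average."""
    samples.append(value)
    return sum(samples) / len(samples)

for v in [70, 80, 90, 100, 95]:
    average = add_sample(v)
print(round(average, 1))  # → 87.0
```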
Set QoS target as 'Memory'
If selected, the QoS target for memory and paging is set as Memory.
The following SQL scripts demonstrate how to update old data in the database when the QoS target is changed to "Memory":
To see the rows to be changed or updated:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update the table for the new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE
'System%' AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
Here, Target is the new QoS target to be set and Source is the QoS source for which the target needs to be changed. Both of these
can be configured by the user.
The Message Definitions tab enables you to customize the messages that are sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
The fields are explained below:
Message Pool
This section lists all messages with their associated message ID. You can right-click in the message pool window to create a new
message and edit/delete an existing message.
Active Messages
This section contains tabs to allow you to associate messages with the thresholds. You can drag the alarm message from the message
pool and drop it into the threshold field. The available tabs are explained below:
CPU
High (error) and Low (warning) threshold for total CPU usage.
High (error) threshold for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
CPU difference threshold (alarms are sent when the difference in CPU usage between different CPUs in multi-CPU systems breaches
the threshold).
Disk
To modify the thresholds for disks, double click the disk-entries under the Status tab.
Memory
Depends on what memory view is selected in the memory usage graph, where you may toggle among three views (see the Status
tab).
High (error) and Low (warning) threshold for pagefile usage and paging activity
Physical memory
Swap memory (Unix systems)
Computer
Allows you to select the alarm message to be issued if the computer is rebooted.
Default: The time when the computer was rebooted.
Other
You can select the alarm message to be sent if the probe is not able to fetch data.
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (refer to the Edit Alarm or QoS Source section), you can
set the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
You can edit the alarm source or QoS source.
Follow these steps:
1. Double-click a virtual group entry.
2. On the Group Sources dialog, select the Alarm source and QoS source.
3. Click OK.
Note: QoS messages can also be sent on Disk usage (both in % and MB), and availability for shared disks (also disk usage on NFS file
systems if the Enable space monitoring option is set for the file system as described in the section Setup > Cluster). These options can
be selected when defining the threshold values for these options under the Status tab.
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The fields are explained below:
Graphs
The graphs display actual samples in purple, averages in blue, error threshold (if configured) in red, and warning threshold (if configured) in
yellow.
CPU usage: graph of the CPU usage.
Memory usage: three separate graphs (% of total available memory, physical, and virtual memory). Use the buttons M, S, and P on the
top right corner of the graph to toggle through the three graphs.
% of available memory: in % of total available memory
Physical memory: in % of available physical memory (RAM).
Swap memory: on UNIX systems, this value refers to the % of available swap space.
Note: Typing <Ctrl>+S on your keyboard will save the current view for this graph, and this view will be shown the next time you open
the probe GUI.
The disk usage section displays the details of all disks installed on the system and the disk usage details such as file system type, amount of free
space and total disk usage. You can monitor each disk individually, with individual threshold values, messages and severity levels.
Note: When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of a NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
Edit Disk Properties
Use the Edit option to modify the disk usage properties.
The disk usage configuration GUI displays tabs for each section of the disk configuration, which are explained below:
Disk usage and Thresholds tab
The page displays the amount of total, used, and free disk space for the file system.
You can configure the following threshold settings:
Monitor disk using either Mbytes or %.
High threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be
sent. When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated first and
if it is not exceeded, then the low threshold is evaluated.
Low threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated only if the
high threshold has not been exceeded.
You can configure the Quality of Service message, which can have information about the disk usage in Mbytes, % or both depending on
your selections.
Inode Usage and Thresholds tab
This tab is only available for UNIX systems; otherwise it remains disabled. The tab indicates the amount of total, used, and free inodes on the file
system.
You can configure the following threshold settings:
Monitor disk using either inodes or %.
High threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent.
Low threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be issued.
You can configure the Quality of Service message, which can have information about the disk usage in inodes, % or both depending on your
selections.
Disk Usage Change and Thresholds tab
This tab lets you specify the alarm conditions for alarms to be sent when changes in disk usage occur.
Disk usage change calculation
You can select one of the following:
Change summarized over all samples. The change in disk usage is the difference between the latest sample and the first sample in
the "samples window". The number of samples the cdm probe will keep in memory for threshold comparison is set as Samples on
the Setup > Control Properties tab.
Note: There may be some discrepancy between the values in QoS and values in alarms when the Change summarized over all
samples option is selected. This is because the QoS are generated on every interval and Alarms are generated based on the selection
of the option Change summarized over all samples.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
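The two calculation modes can be sketched as follows (the usage values in Mbytes are illustrative):

```python
# Sketch of the two documented disk-usage-change calculations over a
# samples window (oldest -> newest usage samples, in Mbytes).
window = [100, 102, 101, 106]

# "Change summarized over all samples": latest sample minus first sample
# in the samples window.
summarized = window[-1] - window[0]

# "Change between each sample": delta computed after each new sample.
per_sample = [b - a for a, b in zip(window, window[1:])]

print(summarized)  # → 6
print(per_sample)  # → [2, -1, 5]
```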
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option a confirmation dialog appears. Click Y
es to delete the disk from the list.
Modify Default Disk Parameters
Use the Modify Default Disk Parameters option to change fixed disk properties.
If you modify the default settings, then every disk that you add from that point forward will have the new settings as its default disk properties.
Enable Space Monitoring
The Enable space monitoring option appears only for the shared drive/folder (using the New Share... option) being monitored by the cdm probe.
To enable/disable space monitoring of the Windows share/mounted drive/NFS file system, right-click a monitored Windows share/mounted drive/
NFS file system in the list and select the enable/disable space monitoring option.
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
The Advanced tab enables you to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot,
and paging measurements.
The fields are explained below:
Quality of Service Messages
Select any of the following settings to send the QoS messages as per the time intervals defined under the Control properties tab.
Processor Queue Length (For Windows)/System Load (Processor Queue Length)(For AIX, SGI, Linux and Solaris)
Measures the number of queued processes waiting for CPU time, divided by the number of processors on the system.
Load Average 1 min
Specifies the average system load over the last one minute.
Default: Not Selected
Load Average 5 min
Specifies the average system load over the last five minutes.
Default: Not Selected
Load Average 15 min
Specifies the average system load over the last fifteen minutes.
Default: Not Selected
Notes:
The values of the Load Average metrics are calculated per processor. These are dependent on the field Calculate Load
Average per Processor under the Setup -> General tab.
Sampling is not applicable for the three Load Average Metrics.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Note: If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI are different from those in CDM
version 3.72. However, if CDM version 3.70 or earlier has already created QoS entries in the database for kilobytes per second
(Kb/s) and/or pages per second (Pg/s), these entries will be kept and updated with QoS data from the newer CDM version (3.72
and higher).
lparstat -i command
TotCapacity = (maxVirtualCPU / maxCapacity) * 100;
cpuStats->fUser = (double)((cpuStats->fUser *
cpuStats->fEntCap)/TotCapacity);
cpuStats->fSystem = (double)((cpuStats->fSystem *
cpuStats->fEntCap)/TotCapacity);
cpuStats->fWait = (double)((cpuStats->fWait *
cpuStats->fEntCap)/TotCapacity);
cpuStats->fIdle = (double)((cpuStats->fIdle *
cpuStats->fEntCap)/TotCapacity);
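A short recomputation of the normalization above, with hypothetical lparstat -i input values (the variable names and numbers are illustrative, not probe internals):

```python
# Hypothetical values as lparstat -i might report them.
max_virtual_cpu = 4.0     # maximum virtual CPUs
max_capacity = 2.0        # maximum capacity
entitled_capacity = 1.5   # entitled capacity (EntCap)
raw_user_pct = 60.0       # raw CPU user % before normalization

# TotCapacity = (maxVirtualCPU / maxCapacity) * 100
total_capacity = (max_virtual_cpu / max_capacity) * 100

# CPU user = (raw user * EntCap) / TotCapacity
cpu_user = (raw_user_pct * entitled_capacity) / total_capacity
print(total_capacity)  # → 200.0
print(cpu_user)        # → 0.45
```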
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be
measured in one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
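Assuming a 4096-byte page size (an assumption; the note above points out that page size varies between operating systems), the two units relate as follows:

```python
# Sketch converting pages/second into Kbytes/second. The 4096-byte page
# size is an assumption for illustration only.
page_size_bytes = 4096
paging_pages_per_sec = 12

paging_kb_per_sec = paging_pages_per_sec * page_size_bytes / 1024
print(paging_kb_per_sec)  # → 48.0
```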
Note: When changing the paging selection, the header of the Paging graph on the Status tab immediately changes to show the
selected unit, but the values in the graph do not change until the next sample is measured.
Note: CDM and the TOP utility gather and report swap information differently.
CDM gathers swap information in a similar way as the Solaris utility swap -l does, but using pages instead of blocks. To compare the swap
information between CDM and the swap utility, take the blocks that swap reports and run them through the formula: (blocks * 512) / (1024 *
1024) = total_swap Mb. This is the same number of MB the CDM probe uses in its calculations.
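A worked example of the formula, with an illustrative block count:

```python
# Worked example of the documented formula:
# (blocks * 512) / (1024 * 1024) = total swap in MB.
blocks = 4194304  # illustrative block count as swap -l might report

total_swap_mb = (blocks * 512) / (1024 * 1024)
print(total_swap_mb)  # → 2048.0
```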
TOP, on the other hand, gathers information about anonymous pages in the VM, which is quicker and easier to gather but does not represent a
true picture of the amount of swap space available and used. The reason is that anonymous pages also take into account physical memory
that is potentially available for use as swap space. Thus, the TOP utility will report more total swap space since it is also factoring in
physical memory not in use at this time.
CDM and TOP gather physical memory information in similar ways, so the differences in available physical memory should be insignificant.
Since CDM does not differentiate between available swap and physical memory (after all, it is only when you run out of both the resources
that things stop working on the system), the accumulated numbers are used. The accumulated numbers for TOP will be off, since the free
portions of physical memory will be counted twice in many instances. While we could easily represent the data in the same format that TOP
does, we feel it does not give a correct picture of the memory/swap usage on the system.
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile will be sent using suppression keys unique to the profile so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created you can
select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New Disk Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
The fields are explained below:
Name: defines the disk profile name.
Description: defines the description of the disk profile.
Regular Expression for Mount Point: defines a regular expression through which you can monitor your Custom Local Disk (for
Windows platform) and Custom Local and NFS (for Linux, Solaris and AIX platforms).
Note: On selecting this option, the drop-down menu Mount point and the field Remote Disk are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk: Disk free space, Inodes free, and Space usage
change.
For more information on these checkpoints, refer to the Control Properties Tab section.
Note: You are required to enable NFS drives from the Status tab to see custom NFS inode alarms.
You can configure the probe to monitor local disks as well as shared disks (cluster). When monitoring shared disks (such as NFS mounts) over
low-performance or over-utilized lines, you may experience slow response times. If quota is turned on for a disk on a Windows system, the size
reported is the total size, and the free disk space is calculated after quota.
Note: This probe can be configured using the probe configuration interface or by copying configuration parameters from another cdm
probe.
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
This section describes the minimum configuration settings required to configure the cdm probe for local disk monitoring.
Follow these steps:
1. Open the cdm probe configuration interface.
2. Select the disk you want to monitor from the list of all available disks which are displayed under the Disks node.
3. Enable the alarms (either in Mb or in percentage) and QoS from the Alarm Thresholds and Disk Usage sections under the Disk Usage
node.
4. Specify the time interval in minutes at which the probe requests data, in the Disk Configuration section under the Disks node.
5. Save the configuration.
The selected disk is now being monitored.
Note: If the monitored environment also includes cluster disks, these disks are also included in the diskname node with the same
alarm configurations as local disks. However, for such environments, a Cluster section is displayed in the cdm node where you can
view and modify alarm and QoS sources for the cluster resources.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
Note: To configure the cdm probe on robots deployed in other environments such as Linux or Solaris, proceed directly to step 6, since a
shared network disk profile can only be created on Windows robots.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Id of a shared drive and a local drive identical. Navigate to the setup folder and set this key to No
to enable this feature.
Default: Yes
This feature is introduced for the following two reasons:
When a file system is mounted on Linux through cifs and the cdm probe is deployed fresh, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to yes to enable this feature.
Default: No
This key is available in the fixed_default section under disk for the Windows, Linux, and AIX platforms.
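Assembled from the keys above, a raw-configure fragment might look roughly like this sketch; the exact section layout can differ between probe versions, so treat it as illustrative rather than a definitive configuration:

```
<setup>
   allow_remote_disk_info = no
</setup>
<disk>
   <fixed_default>
      qos_disk_total_size = yes
   </fixed_default>
</disk>
```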
cdm node
<hostname> node
Disks node
<diskname> node
Disk Usage node
Disk Usage Change node
<diskname> Inode Usage node
<shareddiskname> node
Memory node
Memory Paging node
Physical Memory node
Swap Memory node
Total Memory node
Network node
Processor node
Individual CPU node
Total CPU node
Iostat node (Linux, Solaris, and AIX)
Device Iostat node
cdm node
Navigation: cdm
This node lets you view the probe information, configure the logging properties and set data management values.
Set or modify the following values as required:
cdm > Probe Information
This section provides the basic probe information and is read-only.
cdm > General Configuration
This section provides general configuration details.
Log Level: Sets the amount of detail that is logged to the log file. Default: 0 - Fatal
Log size (KB): Sets the maximum size of the log. When using the up and down arrows, the value increases or decreases by 5. Default:
100 KB
Send alarm on each sample: If selected, the probe generates an alarm on each sample where there is a threshold breach. If not
selected, the probe waits for the number of samples (specified in Samples in the cdm > Disk Configuration, cdm > Memory or cdm >
Processor configuration screens) before sending the alarm. The sample count is cleared on de-activation of the probe.
Send short name for QoS source: If selected, sends only the host name. If not selected, sends the full host name with domain.
Allow QoS source as target: A number of QoS messages, by default, use the host name as their target. If selected, the target name is
changed to be the same as the QoS source name.
Monitor iostat (Linux and Solaris only): Enables the iostat monitoring of the host system devices.
Count Buffer-Cache as Used Memory (Linux, Solaris, AIX and HP-UX only): Counts the buffer and cache memory as used memory
while monitoring the physical and system memory utilization. If not selected, the buffer and cache memory is counted as free memory.
cdm > Cluster
This section is only visible when monitoring clustered environments and displays the cluster resources associated with the monitored system.
The following fields are displayed for each resource:
Virtual Group: displays the resource group of the cluster where the host system of the robot is a node.
Cluster Name: displays the name of the cluster.
Cluster IP: displays the IP address of the cluster. This is used as the default source for alarm and QoS.
Alarm Source: defines the source of the alarms to be generated by the probe for cluster resources.
QoS Source: defines the source of the QoS to be monitored by the probe for cluster resources.
Note: The Alarm Source and QoS source fields can have the following values:
<cluster ip>
<cluster name>
<cluster name>.<group name>
The default value for both the fields is <cluster ip>
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization is the average for the last 25 minutes. Default: 1
Ignore Filesystems: defines the file systems to be excluded from monitoring. For example, specifying the regular expression C:\\ in this
field excludes disk C of the system from monitoring and also stops displaying the disk in the navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk fails or crashes
leaving a stale file system, and avoids a hang situation for the probe. Default: 10
Filesystem Type Filter: specifies the type of the file systems to be monitored as a regular expression. For example, specifying ext* in this
field enables monitoring of only file systems such as ext3 or ext4.
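A sketch of how such a filter pattern behaves, using Python's re module with hypothetical file system types (re.match anchors the pattern at the start of the string, which mirrors a type filter):

```python
import re

# Pattern from the documented example: monitor only ext-style file systems.
fs_filter = re.compile(r"ext*")

# Hypothetical file system types found on the host.
fs_types = ["ext3", "ext4", "xfs", "nfs"]

# Only types matching the filter at the start of the name are monitored.
monitored = [t for t in fs_types if fs_filter.match(t)]
print(monitored)  # → ['ext3', 'ext4']
```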
Note: The first three fields are common to Memory and Processor configuration sections.
Note: The disk read throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk write throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk total throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Default: //
User: specifies the username for the network disk.
Password: specifies the password corresponding to the username for the network disk.
Alarm Message: selects the alarm message to be generated if network disk is not available.
Enable Folder Availability Monitoring: enables the availability state monitoring of the network disk while creating the profile.
The shared network disk is displayed as the <shareddiskname> node.
Important! You can only add shared disks to be monitored on Windows robots.
<diskname> node
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual local or cluster disk.
Disk Missing: configures QoS for disk availability status and generates an alarm when the probe fails to connect with the disk.
Disk Size: configures QoS for disk size and generates an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
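The evaluation order described above can be sketched as follows. The threshold values and function name are illustrative; note that for free-space alarms the high (error) threshold corresponds to the smaller amount of free megabytes:

```python
def disk_free_alarm(free_mb, high_mb=2, low_mb=5):
    """Return the alarm level for the free space on a disk, or None.

    The high (error) threshold is evaluated first; the low (warning)
    threshold is evaluated only if the high threshold was not
    breached. An alarm fires when free space drops below a threshold.
    """
    if free_mb < high_mb:
        return "error"    # free space below the high threshold
    if free_mb < low_mb:
        return "warning"  # free space below the low threshold
    return None

print(disk_free_alarm(1))   # error
print(disk_free_alarm(3))   # warning
print(disk_free_alarm(10))  # None
```
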
Disk Usage node
Navigation: cdm > host name > Disks > disk name > Disk Usage
This node lets you configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm
thresholds, disk usage (%) and disk usage (MB).
Note: The alarms are generated for free disk space and QoS are generated for disk usage.
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
This node lets you configure thresholds and alarm messages sent with changes in disk usage for each monitored disk.
Change Calculation: indicates how you want to calculate the disk change. Select from the drop-down menu either of the following:
Summarized over all samples: The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample: The change in disk usage is calculated after each sample is collected.
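The two change-calculation modes can be sketched as follows. The sample values are illustrative:

```python
# samples: oldest -> newest disk-usage measurements in MB (illustrative)
samples = [1000, 1012, 1010, 1035, 1050]

# "Summarized over all samples": the difference between the latest
# sample and the first sample in the samples window.
summarized_change = samples[-1] - samples[0]

# "Between each sample": the change computed after each new sample.
per_sample_changes = [b - a for a, b in zip(samples, samples[1:])]

print(summarized_change)   # 50
print(per_sample_changes)  # [12, -2, 25, 15]
```
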
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
<shareddiskname> node
Navigation: cdm > host name > Disks > shared disk name
A shared network disk is added under the Disks node in the navigation pane. You can select the shared disk and update user name, password,
and disk availability monitoring properties.
Enable Space Monitoring
This section allows you to enable network disk usage monitoring for the profile by selecting the Enable Space Monitoring checkbox.
Network Connection
This section allows you to view or edit the user credentials for the shared network disk specified in the Add New Share window while creating a
network disk monitoring profile.
Shared Folder Availability
This section allows you to specify or edit the thresholds and alarms for the availability state of the network disk.
Note: The Disk Usage and the Disk Usage Change nodes for the <shareddiskname> node are the same as defined for the
<diskname> node.
Memory node
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can individually configure alarm and memory paging thresholds for alarms sent with changes in memory paging. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory node
When the memory utilization is high, the system slows down and the application response time increases. The increased response time of critical
business applications can adversely impact user interaction. Monitoring the system memory helps you diagnose the issue, for example, by
identifying and closing unwanted applications. You can also consider a system upgrade when the memory utilization is consistently high.
This node lets you monitor the following memory metrics:
Physical Memory
System Memory
User Memory
Note: The system and user memory monitoring is supported only on the Windows, Linux and AIX platforms.
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
Swap memory is a reserved space on the hard drive, which is used by the system when the physical memory (RAM) is full. However, swap
memory is not a replacement for physical memory due to its lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similarly to the swap -l command of Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512)/ (1024*1024).
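The formula above can be expressed as a short sketch; the block count is illustrative, and the 512-byte block size is the unit reported by Solaris swap -l:

```python
def swap_mb_from_blocks(blocks, block_size=512):
    """Convert the block count reported by `swap -l` (512-byte blocks
    on Solaris) to the MB value reported by the probe."""
    return (blocks * block_size) / (1024 * 1024)

# A swap device of 4194304 blocks corresponds to 2048 MB (2 GB).
print(swap_mb_from_blocks(4194304))  # 2048.0
```
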
Total Memory node
Navigation: cdm > Memory > Total Memory > Memory Usage (%)
Network node
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
Processor node
calculated for each column based on the total CPU ticks value. The QoS for total CPU value is the sum of CPU System, CPU User, and (if
configured) CPU Wait.
Configure the following fields:
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values. If you did not select
the Send alarm on each sample checkbox in the Probe Configuration pane, the probe waits for the number of samples (specified in this
field) before sending the alarm. Do not specify 0 (zero) in this field. Default: 5
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization reported is the average for the last 25 minutes. Default: 5
Set QoS Target as 'Total': select this checkbox if you want the QoS target to be set to Total.
Include CPU Wait in CPU Usage: includes the CPU Wait in the CPU Usage calculation.
Number of CPUs: displays the number of CPUs. This is a read-only field.
Maximum Queue Length: indicates the maximum number of items in the queue before an alarm is sent.
Alarm Message: sends the alarm message when the queue has been exceeded.
Individual CPU node
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: sends an alarm message when the CPU usage gets below this value. The value is in percent of CPU usage.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
Iostat node (Linux, Solaris, and AIX)
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample values of the
devices.
Set or modify the following values as required:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is used with the iostat command to fetch the iostat data for that duration. This
value must be less than the Interval (minutes) field value. Default: 10
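The reason the second sample is used is that the first iostat report contains averages since boot rather than current activity. A simplified parser can sketch this; the report format and field names below are illustrative, not the probe's actual implementation:

```python
# Two iostat reports separated by a blank line (illustrative shape).
# The first report holds averages since boot, so the probe takes its
# QoS values from the second report.
raw_reports = """\
Device  tps  kB_read/s  kB_wrtn/s
sda     4.1       60.2       33.0

Device  tps  kB_read/s  kB_wrtn/s
sda     9.7      120.4       51.8
""".split("\n\n")

second_report = raw_reports[1].splitlines()
header = second_report[0].split()
values = second_report[1].split()

# Map metric names to the values of the second sample.
metrics = dict(zip(header[1:], values[1:]))
print(metrics["tps"])  # 9.7
```
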
Device Iostat node
Iostat Monitors
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable the iostat monitoring for the device. This option is
disabled, by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value for comparing the actual value. Default: 90
Alarm Message: specifies the alarm message to be sent when the threshold value is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value for comparing the actual value. Default: 90
Alarm Message: specifies the alarm message to be sent when the threshold value is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
Note: You can add network disks to be monitored on Windows robots using the New Share option in the Disk Usage section of the
Status tab. You can monitor the availability state and usage of network disks on any robot where the probe is deployed by clicking the
Enable Space Monitoring option in the Disk Usage section of the Status tab.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as per requirement.
Note: When performing this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and the
target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on this computer tries to monitor the (missing) D: partition and reports an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
5. The distribution progress window appears and the configuration of the probe is completed after the distribution process is finished.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info
Default: Yes
This key has been introduced in the Raw Configuration section to make the Device Id of a shared drive and a local drive identical. Set
this key to No to enable this feature.
This feature is introduced for the following two reasons:
a. When a file system is mounted on Linux through CIFS, then on a fresh deployment of the cdm probe, the Device Id and Metric Id for QoS
and alarms of the respective mounted file system are missing.
b. On restarting, the probe is unable to mark CIFS drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
Type of regular expression examples:
[A-Z] (Standard (PCRE)) / [A-Z:\\] (Custom): Matches with the uppercase character type of the local disk available on the respective box.
[*.\\] (Standard (PCRE) / Custom)
\d* (Standard (PCRE) / Custom)
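As a quick illustration of a standard (PCRE-style) pattern from the examples above, [A-Z] matches a single uppercase drive letter; the drive-letter strings below are examples only:

```python
import re

# Standard (PCRE-style) expression: a single uppercase drive letter.
drive_letter = re.compile(r"[A-Z]")

print(bool(drive_letter.fullmatch("C")))   # True
print(bool(drive_letter.fullmatch("c")))   # False -- lowercase
print(bool(drive_letter.fullmatch("C:")))  # False -- ':' needs [A-Z:\\]
```
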
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
Edit Alarm or QoS Source
Status Tab
Disk Usage
Disk Usage Modification
New Share Properties
UNIX platforms
Edit Disk Properties
Delete a Disk
Modify Default Disk Parameters
Enable Space Monitoring
The Multi CPU Tab
Advanced Tab
Custom Tab
New CPU Profile
New Disk Profile
New Memory Profile
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are tabs within this tab that you can use to specify General, Control
Properties and Message Definitions. A fourth tab, the Cluster tab, displays if the probe is running within a clustered environment.
General Tab
Important! If the Set QoS source to robot name option is set in the controller, you also get the robot name as the target.
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields are separated into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
Note: Even if you set the sample value to 0, the QoS for disks is generated based on the default sample value.
Note: This option is available for non-Windows platforms only, such as Linux.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
The Message Definitions tab offers functionality to customize the messages sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
The fields are explained below:
Message Pool
This section lists all messages with their associated message ID. You can right-click in the message pool window to create new message and
edit/delete an existing message.
Active Messages
This section contains tabs to allow you to associate messages with the thresholds. You can drag the alarm message from the message pool and
drop it into the threshold field. The available tabs are explained below:
CPU
High (error) and Low (warning) threshold for total CPU usage.
High (error) threshold for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the threshold).
CPU difference threshold (alarms are sent when the difference in CPU usage between different CPUs in multi-CPU systems breaches
the threshold).
Disk
The thresholds for disks can be modified by double-clicking the disk-entries under the Status tab.
Memory
Depends on what memory view is selected in the memory usage graph, where you may toggle among three views (see the Status tab).
Memory usage
High (error) and Low (warning) threshold for pagefile usage and paging activity
Physical memory
Swap memory (Unix systems)
Computer
Allows you to select the alarm message to be issued if the computer is rebooted.
Default: The time when the computer was rebooted.
Other
You can select the alarm message to be sent if the probe is not able to fetch data.
Default: Contains information about the error condition.
Cluster Tab
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (refer to the Edit Alarm or QoS Source section), you
can set the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
You can edit the alarm source or QoS source.
Follow these steps:
1. Double-click a virtual group entry.
2. On the Group Sources dialog, select the Alarm source and QoS source.
3. Click OK.
Note: QoS messages can also be sent on Disk usage (both in % and MB), and availability for shared disks (also disk usage on NFS file
systems if the Enable space monitoring option is set for the file system as described in the section Setup > Cluster). These options can
be selected when defining the threshold values for these options under the Status tab.
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The fields are explained below:
Graphs
The graphs display actual samples in purple, averages in blue, error threshold (if configured) in red, and warning threshold (if configured) in
yellow.
CPU usage: graph of the CPU usage.
Memory usage: three separate graphs (% of total available memory, physical, and virtual memory). Use the buttons M, S, and P on the top right
corner of the graph to toggle through the three graphs.
The disk usage section displays the details of all disks installed on the system and the disk usage details such as file system type, amount of free
space and total disk usage. You can monitor each disk individually, with individual threshold values, messages and severity levels.
Note: When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of a NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option a confirmation dialog appears. Click Y
es to delete the disk from the list.
Modify Default Disk Parameters
Use the Modify Default Disk Parameters option to change fixed disk properties.
If you modify the default settings, then every disk that you add from that point forward has the new settings as the default disk properties.
Enable Space Monitoring
The Enable space monitoring option appears only for the shared drive/folder (using the New Share... option) being monitored by the cdm probe.
To enable/disable space monitoring of the Windows share/mounted drive/NFS file system, right-click a monitored Windows share/mounted drive/
NFS file system in the list and select the enable/disable space monitoring option.
The Multi CPU Tab
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
Use the Advanced tab to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot, and
paging measurements.
The fields are explained below:
Quality of Service Messages
Selecting any of the following settings enables QoS messages to be sent as per the time intervals defined under Control properties tab.
Processor Queue Length (Windows only)
Measures the number of queued processes, divided by the number of processors, that are waiting for CPU time on a Windows system. For
AIX, SGI, Linux, and Solaris, this QoS message refers to the System Load.
Computer uptime (hourly)
Measures the computer uptime in seconds every hour.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Note: If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI are different than CDM
version 3.72. However, if CDM version 3.70 or earlier already has created QoS entries in the database for kilobytes per second
(Kb/s) and/or pages per second (Pg/s), these entries will be kept and updated with QoS data from the newer CDM version (3.72
and higher).
Notes:
If running on a multi-CPU system, the queued processes are shared across the number of processors. For example, if running
on a system with four processors and using the default Max Queue Length value (4), alarm messages are generated if the
number of queued processes exceeds 16.
To enable the QoS metric QOS_PROC_QUEUE_LEN per CPU, add a key system_load_per_cpu with the value Yes
under the CPU section through the Raw Configure option. The probe calculates the system load on Linux, Solaris,
and AIX as Load/Number of CPUs if this key is set to Yes.
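The per-processor scaling described in the notes can be sketched as follows; the function name and sample values are illustrative:

```python
def queue_alarm(queued_processes, num_cpus, max_queue_len=4):
    """Alarm when queued processes per processor exceed the limit.

    On a 4-CPU system with the default Max Queue Length of 4, an
    alarm fires once more than 16 processes are queued.
    """
    return (queued_processes / num_cpus) > max_queue_len

print(queue_alarm(16, 4))  # False -- exactly at the limit
print(queue_alarm(17, 4))  # True
```
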
lparstat -i command
TotCapacity = (maxVirtualCPU / maxCapacity) * 100;
cpuStats->fUser = (double)((cpuStats->fUser * cpuStats->fEntCap) / TotCapacity);
cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap) / TotCapacity);
cpuStats->fWait = (double)((cpuStats->fWait * cpuStats->fEntCap) / TotCapacity);
cpuStats->fIdle = (double)((cpuStats->fIdle * cpuStats->fEntCap) / TotCapacity);
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be measured in
one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab will immediately change to show the
selected unit, but the values in the graph will not change until the next sample is measured.
TOP, on the other hand, gathers information about anonymous pages in the VM, which is quicker and easier to gather but does not represent a
true picture of the amount of swap space available and used. The reason is that anonymous pages also take into account physical memory
that is potentially available for use as swap space. Thus, the TOP utility reports more total swap space since it is also factoring in
physical memory not in use at this time.
CDM and TOP gather physical memory information in similar ways, so the differences in available physical memory should be insignificant.
Since CDM does not differentiate between available swap and physical memory (after all, it is only when you run out of both the resources
that things stop working on the system), the accumulated numbers are used. The accumulated numbers for TOP will be off, since the free
portions of physical memory will be counted twice in many instances. While we could easily represent the data in the same format that TOP
does, we feel it does not give a correct picture of the memory/swap usage on the system.
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile are sent using suppression keys unique to the profile so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
The fields are explained below:
Name: defines the disk profile name.
Description: defines the description of the disk profile.
Regular Expression for Mount Point: defines a regular expression through which you can monitor your Custom Local Disk (for
Windows platform) and Custom Local and NFS (for Linux, Solaris and AIX platforms).
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk, which are Disk free space, Inodes free, Space usage
change.
For more information on these checkpoints, refer to the Control Properties Tab section.
Note: You are required to enable NFS drives from the Status tab to see custom NFS inode alarms.
You can configure the probe to monitor local disks as well as shared disks (cluster). When monitoring shared disks (such as NFS mounts) over
low-performance or over-utilized lines, you may experience slow response times. If quota is turned on for a disk on a Windows system, the size
reported is the total size, and the free disk space is calculated after quota.
Note: This probe can be configured using the probe configuration interface or by copying configuration parameters from another cdm probe.
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Note: If the monitored environment also includes cluster disks, these disks are also included in the diskname node with the same
alarms configurations as local disks. However, for such environments, a Cluster section is displayed in the cdm node where you can
view and modify alarm and QoS sources for the cluster resources.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings to
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: To configure the cdm probe on robots deployed in other environments such as Linux or Solaris, proceed directly to step 6 since a
shared network disk profile can only be created on Windows robots.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Id of a shared drive and a local drive identical. Navigate to the setup folder and set this key to No
to enable this feature.
Default: Yes
This feature is introduced for the following two reasons:
When a file system is mounted on Linux through CIFS and the cdm probe is deployed fresh, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restarting, the probe is unable to mark CIFS drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to Yes to enable this feature.
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
cdm node
<hostname> node
Disks node
<diskname> node
Disk Usage node
Disk Usage Change node
<diskname> Inode Usage node
<shareddiskname> node
Memory node
Memory Paging node
Physical Memory node
Swap Memory node
Total Memory node
Network node
Processor node
Individual CPU node
Total CPU node
Iostat node (Linux, Solaris, and AIX)
Device Iostat node
cdm node
Navigation: cdm
This node lets you view the probe information, configure the logging properties and set data management values.
Set or modify the following values as required:
cdm > Probe Information
This section provides the basic probe information and is read-only.
cdm > General Configuration
This section provides general configuration details.
Log Level: Sets the amount of detail that is logged to the log file. Default: 0 - Fatal
Log size (KB): Sets the maximum size of the log. When using the up and down arrows, the value increases or decreases by 5. Default:
100 KB
Send alarm on each sample: If selected, the probe generates an alarm on each sample where there is a threshold breach. If not
selected, the probe waits for the number of samples (specified in Samples in the cdm > Disk Configuration, cdm > Memory or cdm >
Processor configuration screens) before sending the alarm. The sample count is cleared on de-activation of the probe.
Send short name for QoS source: If selected, sends only the host name. If not selected, sends the full host name with domain.
Allow QoS source as target: A number of QoS messages, by default, use the host name as their target. If selected, the target name is
changed to be the same as the QoS source name.
Monitor iostat (Linux and Solaris only): Enables the iostat monitoring of the host system devices.
Count Buffer-Cache as Used Memory (Linux and Solaris only): Counts the buffer and cache memory as used memory while
monitoring the physical and system memory utilization. If not selected, the buffer and cache memory is counted as free memory.
cdm > Cluster
This section is only visible when monitoring clustered environments and displays the cluster resources associated with the monitored system.
The following fields are displayed for each resource:
Virtual Group: displays the resource group of the cluster where the host system of the robot is a node.
Cluster Name: displays the name of the cluster.
Cluster IP: displays the IP address of the cluster. This is used as the default source for alarm and QoS.
Alarm Source: defines the source of the alarms to be generated by the probe for cluster resources.
QoS Source: defines the source of the QoS to be monitored by the probe for cluster resources.
Note: The Alarm Source and QoS source fields can have the following values:
<cluster ip>
<cluster name>
<cluster name>.<group name>
The default value for both the fields is <cluster ip>.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization is the average for the last 25 minutes. Default: 1
Ignore Filesystems: defines the file system to be excluded from monitoring. For example, specifying the regular expression C:\\ in this
field excludes the C drive of the system from monitoring and also stops displaying the disk in the navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk fails or crashes
and leaves a stale file system, and avoids a hang situation for the probe. Default: 10
Filesystem Type Filter: specifies the type of the file system to be monitored as a regular expression. For example, specifying ext* in this
field enables monitoring of only file systems such as ext3 or ext4.
Note: The first three fields are common to Memory and Processor configuration sections.
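The filtering behavior of the Ignore Filesystems and Filesystem Type Filter fields can be sketched in Python. The function name and argument names below are illustrative only; they are not part of the probe:

```python
import re

def filesystem_monitored(fs_type, type_filter=None, fs_name="", ignore_pattern=None):
    """Hypothetical sketch: a file system is monitored only when its name is
    not matched by the Ignore Filesystems expression and its type is matched
    by the Filesystem Type Filter expression."""
    if ignore_pattern and re.match(ignore_pattern, fs_name):
        return False  # excluded by Ignore Filesystems
    if type_filter and not re.match(type_filter, fs_type):
        return False  # type not covered by Filesystem Type Filter
    return True

# "ext.*" matches ext3/ext4; the expression C:\\ excludes the C drive
print(filesystem_monitored("ext4", type_filter="ext.*"))   # True
print(filesystem_monitored("ntfs", type_filter="ext.*"))   # False
print(filesystem_monitored("ntfs", fs_name="C:\\", ignore_pattern="C:\\\\"))  # False
```

Note that in PCRE terms a filter such as ext* matches "ex" followed by zero or more "t"; ext.* is the stricter spelling used in the sketch.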
Enable High Threshold: lets you define a threshold for generating a higher severity alarm.
Threshold: defines the high threshold value.
Alarm Message: specifies the alarm message that is sent when the high threshold value is breached. Similarly, you can configure the low
threshold value, where the alarm severity is lower.
Publishing Data in MB: measures the QoS for Disk Usage MBytes.
Publishing Data in Percent: measures the QoS for Disk Usage in percentage.
cdm > Disks > Inode Usage Defaults (UNIX only)
This section lets you configure default alarms and inode usage by number of files (count) and percent. You can also configure high and low
threshold values as in the Disk Usage Defaults section.
cdm > Disks > Disk Usage Change Defaults
This section lets you configure default thresholds and alarms for changes in disk usage. You can also configure high and low threshold values as
in the Disk Usage Defaults section.
Type of Change: specifies the type of change you want to monitor: increasing, decreasing, or both.
Change Calculation: specifies the way of calculating the disk change. Select one of the following values:
Summarized over all samples: calculates the difference between the first and last sample values. The number of samples is specified in
the Disk Configuration section.
Between each sample: calculates the change in disk usage by comparing the values of two successive intervals.
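The two change-calculation modes can be illustrated with a short Python sketch (the function and sample values are hypothetical, not probe code):

```python
def disk_usage_change(samples, mode):
    """Illustrative sketch of the two Change Calculation options."""
    if mode == "summarized":
        # difference between the first and last sample in the window
        return samples[-1] - samples[0]
    if mode == "between":
        # change between each pair of successive samples
        return [b - a for a, b in zip(samples, samples[1:])]
    raise ValueError(mode)

used_mb = [1000, 1010, 1008, 1030]  # hypothetical disk usage samples in MB
print(disk_usage_change(used_mb, "summarized"))  # 30
print(disk_usage_change(used_mb, "between"))     # [10, -2, 22]
```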
cdm > Disks > Disk Read (B/S)
This section lets you activate the monitoring of disk read throughput and generate QoS at scheduled interval. You can also configure low and high
thresholds for generating alarms.
Note: The disk read, write, and total throughput monitoring is supported only on the Windows, Linux, and AIX platforms.
Enable Folder Availability Monitoring: enables the availability state monitoring of the network disk while creating the profile.
The shared network disk is displayed as the <shareddiskname> node.
Important! You can only add shared disks to be monitored on Windows robots.
<diskname> node
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual local or cluster disk.
Disk Missing: configures QoS for disk availability status and generates an alarm when the probe fails to connect to the disk.
Disk Size: configures QoS for disk size and generates an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
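The evaluation order of the two thresholds can be sketched as follows; the function, severities, and values are illustrative assumptions, not the probe's implementation:

```python
def free_space_alarm(free_mb, high_mb, low_mb):
    """Hypothetical sketch: the high threshold is evaluated first; the low
    threshold is checked only if the high threshold has not been breached.
    Returns the severity of the alarm to send, or None."""
    if free_mb < high_mb:   # high threshold evaluated first
        return "error"
    if free_mb < low_mb:    # low threshold only if high not breached
        return "warning"
    return None

# e.g. high threshold at 100 MB free, low threshold at 500 MB free
print(free_space_alarm(80, 100, 500))   # error
print(free_space_alarm(300, 100, 500))  # warning
print(free_space_alarm(900, 100, 500))  # None
```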
Disk Usage node
Navigation: cdm > host name > Disks > disk name > Disk Usage
This node lets you configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm
thresholds, disk usage (%) and disk usage (MB).
Note: The alarms are generated for free disk space and QoS are generated for disk usage.
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
This node lets you configure thresholds and alarm messages sent with changes in disk usage for each monitored disk.
Change Calculation: indicates how you want to calculate the disk change. Select from the drop-down menu either of the following:
Summarized over all samples: The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample: The change in disk usage is calculated after each sample is collected.
<diskname> Inode Usage node
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
<shareddiskname> node
Navigation: cdm > host name > Disks > shared disk name
A shared network disk is added under the Disks node in the navigation pane. You can select the shared disk and update user name, password,
and disk availability monitoring properties.
Enable Space Monitoring
This section allows you to enable network disk usage monitoring for the profile by selecting the Enable Space Monitoring checkbox.
Network Connection
This section allows you to view or edit the user credentials for the shared network disk specified in the Add New Share window while creating a
network disk monitoring profile.
Shared Folder Availability
This section allows you to specify or edit the thresholds and alarms for the availability state of the network disk.
Note: The Disk Usage and the Disk Usage Change nodes for the <shareddiskname> node are the same as defined for the
<diskname> node.
Memory node
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can individually configure alarm and memory paging thresholds for alarms sent with changes in memory paging. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory node
System Memory
User Memory
Note: The system and user memory monitoring is supported only on the Windows, Linux and AIX platforms.
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
Swap memory is reserved space on the hard drive that is used by the system when the physical memory (RAM) is full. However, swap
memory is not a replacement for physical memory due to its lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similar to the swap -l command of Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512)/ (1024*1024).
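The formula above can be checked with a one-line conversion; the function name and the sample block count are illustrative:

```python
def swap_mb_from_blocks(blocks, block_size=512):
    """Convert 512-byte blocks reported by `swap -l` into MB, per the
    formula in the text: (blocks * 512) / (1024 * 1024)."""
    return (blocks * block_size) / (1024 * 1024)

# e.g. 4194304 blocks of 512 bytes correspond to 2048 MB of swap
print(swap_mb_from_blocks(4194304))  # 2048.0
```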
Total Memory node
Navigation: cdm > Memory > Total Memory > Memory Usage (%)
Network node
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
Processor node
Samples: if you do not select the Send alarm on each sample checkbox in the Probe Configuration pane, the probe waits for the number of
samples (specified in this field) before sending the alarm. Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization reported is the average for the last 25 minutes. Default: 5
Set QoS Target as 'Total': select this checkbox if you want the QoS target to be set to Total.
Include CPU Wait in CPU Usage: includes the CPU Wait in the CPU Usage calculation.
Number of CPUs: displays the number of CPUs. This is a read-only field.
Maximum Queue Length: indicates the maximum number of items in the queue before an alarm is sent.
Alarm Message: sends the alarm message when the queue has been exceeded.
Individual CPU node
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: sends an alarm message when the CPU usage gets below this value. The value is in percent of the CPU usage.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
Iostat node (Linux, Solaris, and AIX)
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample value of the
devices.
Set or modify the following values as required:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is used with the iostat command for fetching the iostat data for that duration. This
value must be less than the Interval (minutes) field value. Default: 10
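Using the second sample matters because the first iostat report covers averages since boot. A minimal Python sketch of picking the second sample is shown below; the column names and sample text are illustrative, not actual probe output:

```python
# Hypothetical iostat-like output with two samples separated by a blank line
sample_output = """Device: tps kB_read/s kB_wrtn/s
sda 5.1 120.0 80.0

Device: tps kB_read/s kB_wrtn/s
sda 2.3 40.0 10.0
"""

def second_sample(text):
    """Sketch: keep only the last sample block and parse its device rows."""
    blocks = [b for b in text.strip().split("\n\n") if b]
    rows = blocks[-1].splitlines()[1:]  # drop the header row
    return {r.split()[0]: [float(v) for v in r.split()[1:]] for r in rows}

print(second_sample(sample_output))  # {'sda': [2.3, 40.0, 10.0]}
```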
Device Iostat node
Iostat Monitors
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable the iostat monitoring for the device. This option is
disabled by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
Note: You can add network disks to be monitored using Windows robots using the New Share option in the Disk Usage section of the
Status tab. You can monitor availability state and usage of network disks using any robot where the probe is deployed by clicking the
Enable Space Monitoring option in the Disk Usage section of the Status tab.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as per requirement.
Note: When performing this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and the
target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on this computer will try to monitor a D: partition (which is missing) and report an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears and the configuration of the probe is completed after the distribution process is finished.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info
Default: Yes
This key has been introduced in the Raw Configuration section to make the Device Id of shared drive and local drive identical. You
are required to set this key as No to enable this feature.
This feature is introduced because of the following two reasons:
a. When a file system is mounted on Linux through cifs, then on a fresh deployment of the cdm probe, the Device Id and Metric Id for QoS
and alarms of the respective mounted file system are missing.
b. On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Ids and Metric Ids.
qos_disk_total_size
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
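A sketch of how these keys might look in the raw configuration file, based on the sections named above (the exact layout of your cdm.cfg may differ):

```
<setup>
   allow_remote_disk_info = no
</setup>
<disk>
   <fixed_default>
      qos_disk_total_size = yes
   </fixed_default>
</disk>
```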
Expression | Type of regular expression | Explanation
[A-Z] | Standard (PCRE) | Matches with the Uppercase character type of the local disk available on the respective box
[A-Z:\\] | Custom | Matches with the Uppercase character type of the local disk available on the respective box
[*.\\] | Standard (PCRE) / Custom |
\d* | Standard (PCRE) / Custom |
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
The Setup tab is used to configure general preferences for the probe. There are tabs within this tab that you can use to specify General, Control
Properties and Message Definitions. A fourth tab, the Cluster tab, displays if the probe is running within a clustered environment.
General Tab
For more details about the samples, see the Control Properties tab.
Important! If the Set QoS source to robot name option is set in the controller, the robot name is also used as the target.
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields displayed in the above dialog are separated into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
Note: Even if you set the sample value as 0, the QoS for disk is generated based on the default sample value.
Note: This option is available for non-Windows platforms only, such as Linux.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
To update table for new target:
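The target filter used in the queries above can be reproduced against a toy table with Python's built-in sqlite3 module. The real s_qos_data table lives in the UIM database and has more columns; the reduced schema here is an assumption for illustration:

```python
import sqlite3

# Toy in-memory stand-in for dbo.s_qos_data (reduced, hypothetical schema)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE s_qos_data (probe TEXT, qos TEXT, target TEXT)")
con.executemany(
    "INSERT INTO s_qos_data VALUES ('cdm', 'qos_cpu_usage', ?)",
    [("user",), ("system",), ("wait",), ("idle",), ("myhost",)],
)

# Same filter as the first query: keep only targets outside the per-state set
rows = con.execute(
    "SELECT target FROM s_qos_data WHERE probe LIKE 'cdm' "
    "AND qos LIKE 'qos_cpu_usage' "
    "AND target NOT IN ('user','system','wait','idle')"
).fetchall()
print(rows)  # [('myhost',)]
```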
The Message Definitions tab offers functionality to customize the messages sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as a part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (see the Edit Alarm or QoS Source section), you can set
the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
You can edit the alarm source or QoS source.
Follow these steps:
1. Double-click a virtual group entry.
2. On the Group Sources dialog, select the Alarm source and QoS source.
3. Click OK.
Note: QoS messages can also be sent on Disk usage (both in % and MB), and availability for shared disks (also disk usage on NFS file
systems if the Enable space monitoring option is set for the file system as described in the section Setup > Cluster). These options can
be selected when defining the threshold values for these options under the Status tab.
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The disk usage section displays the details of all disks installed on the system and the disk usage details such as file system type, amount of free
space and total disk usage. You can monitor each disk individually, with individual threshold values, messages and severity levels.
Note: When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
You can modify the monitoring properties of disk by right-clicking on a monitored disk in the list.
Use the New Share option to modify the disk usage properties.
You can specify the network disk or folder to be monitored by the cdm probe. The network location is specified in the Share field using the format
\\computer\share. In addition, specify the user name and password to be used when testing the availability of the share, and the Message ID to be
sent if a share is determined to be unavailable. You can use the domain user if the machine is a member of a domain.
Select the Folder Availability Quality of Service Message option to send QoS messages on availability of the shared folder.
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of a NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
Edit Disk Properties
Use the Edit option to modify the disk usage properties.
The disk usage configuration GUI displays tabs for each section of the disk configuration, which are explained below:
Disk usage and Thresholds tab
The page displays the amount of total, used, and free disk space for the file system.
You can configure the following threshold settings:
Monitor disk using either Mbytes or %.
High threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be
sent. When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated first and
if it is not exceeded, then the low threshold is evaluated.
Low threshold for the disk. If you select this option, set the value (based on either Mbytes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated only if the
high threshold has not been exceeded.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option a confirmation dialog appears. Click
Yes to delete the disk from the list.
Modify Default Disk Parameters
Use the Modify Default Disk Parameters option to change fixed disk properties.
If you modify the default settings, then every disk that you add from that point forward will have the new settings as the default disk properties.
Enable Space Monitoring
The Enable space monitoring option appears only for the shared drive/folder (using the New Share... option) being monitored by the cdm probe.
To enable/disable space monitoring of the Windows share/mounted drive/NFS file system, right-click a monitored Windows share/mounted drive/
NFS file system in the list and select the enable/disable space monitoring option.
The Multi CPU Tab
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
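The Difference threshold can be sketched as a spread check across per-CPU usage values; the function name and sample data are illustrative assumptions:

```python
def cpu_difference_breached(cpu_usages, diff_threshold):
    """Hypothetical sketch: alarm when the spread between the busiest and
    least busy CPU exceeds the difference threshold (both in %)."""
    return (max(cpu_usages) - min(cpu_usages)) > diff_threshold

# one CPU far busier than the rest breaches a 25% difference threshold
print(cpu_difference_breached([90, 35, 40, 38], 25))  # True
print(cpu_difference_breached([52, 48, 50, 49], 25))  # False
```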
Advanced Tab
Use the Advanced tab to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot, and
paging measurements.
Notes:
If running on a multi-CPU system, the queued processes will be shared across the number of processors. For example, if running
on a system with four processors and using the default Max Queue Length value (4), alarm messages will be generated if the
number of queued processes exceeds 16.
To enable the QoS metric QOS_PROC_QUEUE_LEN per CPU, you are required to add a key system_load_per_cpu with the
value Yes under the CPU section through the Raw Configure option. The probe calculates the system load on Linux, Solaris,
and AIX as Load/Number of CPUs if this key is set to Yes.
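The notes above can be sketched in a few lines of Python; the function and its arguments are hypothetical, not probe code:

```python
def queue_alarm(queued, max_queue_len, num_cpus, per_cpu=True):
    """Hypothetical sketch: with per-CPU accounting (system_load_per_cpu =
    yes), the load is divided by the number of CPUs before comparison, so a
    Max Queue Length of 4 on 4 processors effectively alarms above 16."""
    if per_cpu:
        return (queued / num_cpus) > max_queue_len
    return queued > max_queue_len

# default Max Queue Length 4 on a 4-processor system: alarm above 16
print(queue_alarm(17, 4, 4))  # True
print(queue_alarm(16, 4, 4))  # False
```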
lparstat -i command
Total Capacity = (maxVirtualCPU / maxCapacity) * 100
CPU User = (CPU User * EntCap) / Total Capacity
CPU System = (CPU System * EntCap) / Total Capacity
CPU Wait = (CPU Wait * EntCap) / Total Capacity
CPU Idle = (CPU Idle * EntCap) / Total Capacity
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be measured in
one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab will immediately change to show the
selected unit, but the values in the graph will not change until the next sample is measured.
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile will be sent using suppression keys unique to the profile so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk, which are Disk free space, Inodes free, and Space
usage change.
For more information on these checkpoints, refer to the Control Properties Tab section.
Note: You are required to enable NFS drives from the Status tab to see custom NFS inode alarms.
Configuration Overview
How to set up disk monitoring
How to set up CPU monitoring
How to set up memory monitoring
Alarm Thresholds
Edit the Configuration File Using Raw Configure
Configuration Overview
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info: makes the Device Id of shared drive and local drive identical. Navigate to the setup folder and set this key as No
to enable this feature.
Default: Yes
This feature is introduced because of the following two reasons:
When a file system is mounted on Linux through cifs and the cdm probe is deployed fresh, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Ids and Metric Ids.
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to yes to enable this feature.
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
<hostname> node
Disks
<diskname> node
Disk Usage node
Disk Usage Change node
<diskname> Inode Usage node
Add a Shared Disk for Monitoring
Memory node
Memory Paging node
Physical Memory node
Swap Memory node
Total Memory node
Network node
Processor node
Individual CPU node
Total CPU node
Iostat node (Linux, Solaris, and AIX)
Device Iostat node
cdm node
Navigation: cdm
This node lets you view the probe information, configure the logging properties and set data management values.
Set or modify the following values as required:
cdm > Probe Information
This section provides the basic probe information and is read-only.
cdm > General Configuration
This section provides general configuration details.
Log Level: Sets the amount of detail that is logged to the log file. Default: 0 - Fatal
Log size (KB): Sets the maximum size of the log. When using the up and down arrows, the value increases or decreases by 5. Default:
100 KB
Send alarm on each sample: If selected, the probe generates an alarm on each sample where there is a threshold breach. If not
selected, the probe waits for the number of samples (specified in Samples in the cdm > Disk Configuration, cdm > Memory or cdm >
Processor configuration screens) before sending the alarm. The sample count is cleared on de-activation of the probe.
Send short name for QoS source: If selected, sends only the host name. If not selected, sends the full host name with domain.
Allow QoS source as target: A number of QoS messages, by default, use the host name as their target. If selected, the target name is
changed to be the same as the QoS source name.
Monitor iostat (Linux and Solaris only): Enables the iostat monitoring of the host system devices.
Count Buffer-Cache as Used Memory (Linux and Solaris only): Counts the buffer and cache memory as used memory while
monitoring the physical and system memory utilization. If not selected, the buffer and cache memory is counted as free memory.
cdm > Messages
This section provides a listing of alarm messages issued by the cdm probe and is read-only. Selecting a row displays additional alarm message
attributes below the main list, including Message Token, Subsystem, and I18N Token.
<hostname> node
Disks
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization is the average for the last 25 minutes. Default: 1
Ignore Filesystems: defines the filesystems to be excluded from monitoring. For example, specifying the regular expression C:\\ in this
field excludes the C drive of the system from monitoring and also stops displaying the disk in the navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk failure or
crash leaves a stale file system, and it prevents the probe from hanging. Default: 10
Note: The first three fields are common to Memory and Processor configuration sections.
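As an illustration of the Ignore Filesystems behavior, a minimal sketch (the filesystem list is hypothetical, not part of the probe) of how a regular expression such as C:\\ filters a disk out of monitoring:

```python
import re

# The pattern mirrors the example above: the regular expression C:\\
# matches the literal mount point "C:\".
ignore_pattern = re.compile(r"C:\\")

# Hypothetical list of discovered filesystems:
filesystems = ["C:\\", "D:\\", "/", "/var"]

# Keep only the filesystems that do not match the ignore expression;
# an excluded disk disappears from monitoring and the navigation pane.
monitored = [fs for fs in filesystems if not ignore_pattern.search(fs)]
print(monitored)
```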
cdm > Disks > Disk Read (B/S)
This section lets you activate the monitoring of disk read throughput and generate QoS at scheduled interval. You can also configure low and high
thresholds for generating alarms.
Note: The disk read throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk write throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Note: The disk total throughput monitoring is supported only on the Windows, Linux and AIX platforms.
<diskname> node
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual disk.
Disk Missing: configure QoS for disk availability status and generate an alarm when the probe fails to connect to the disk.
Disk Size: configure QoS for disk size and generate an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
Disk Usage node
Navigation: cdm > host name > Disks > disk name > Disk Usage
This node lets you configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm
thresholds, disk usage (%) and disk usage (MB).
Note: The alarms are generated for free disk space and QoS is generated for disk usage.
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
This node lets you configure thresholds and alarm messages sent with changes in disk usage for each monitored disk.
Change Calculation: indicates how you want to calculate the disk change. Select either of the following from the drop-down menu:
Summarized over all samples: The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample: The change in disk usage is calculated after each sample is collected.
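The two calculation modes can be sketched as follows (the sample values are hypothetical; the list plays the role of the samples window configured at the Disk Configuration level):

```python
def change_summarized(samples):
    """Difference between the latest sample and the first sample in the window."""
    return samples[-1] - samples[0]

def change_between(samples):
    """Difference between the two most recent samples."""
    return samples[-1] - samples[-2]

# Hypothetical disk-usage samples in MB, oldest first:
window = [100, 102, 101, 105]
print(change_summarized(window))  # 5
print(change_between(window))     # 4
```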
<diskname> Inode Usage node
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
Thresholds and alarms attributes are the same as listed in Disk Usage Change Defaults.
Add a Shared Disk for Monitoring
As a system administrator, you want to monitor the availability and usage of a shared disk. Disk availability ensures that the disk is accessible to
authorized users and applications. You want to get an alarm when the disk is not available and QoS data on disk usage. The CPU, Disk, and Memory
Performance probe lets you add a shared disk or folder, which you can configure for monitoring to generate QoS data and alarms as you do for
a local disk.
Follow these steps:
1. Click the Options icon next to the Disks node in the navigation pane.
2. Select Add New Share.
3. Configure the following fields in the Add New Share dialog:
Share: defines the network path of the shared disk or folder. The network location is specified in the \\computer\share format.
User: defines the user name for authenticating the probe to have appropriate access to the shared disk or folder. Define the user
name in <domain name>\<user name> format when the shared disk is available on a domain.
Password: defines the password for authenticating the user.
Alarm Message: specifies the alarm message when the disk is not available. Default: ConnectionError
Enable Folder Availability Monitoring: activates the QoS data on shared disk availability. Default: Not selected
4. Click Submit.
The shared disk is added under the Disks node in the navigation pane. You can select the shared disk and update user name, password, and
disk availability monitoring properties.
Memory node
Samples: specifies how many samples the probe keeps in memory for calculating average and threshold values. If the Send alarm on
each sample option is not selected, the probe waits for the number of samples (specified in this field) before sending the alarm. Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the reported CPU utilization is the average for the last 25 minutes. Default: 1
Set QoS Target as 'Memory': sets the QoS target to Memory. Default: Not selected
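The averaging window implied by these fields is simply the interval multiplied by the number of samples; a minimal sketch of the example above:

```python
def qos_window_minutes(interval_minutes, samples):
    # The reported QoS value is the average over the last
    # interval_minutes * samples minutes.
    return interval_minutes * samples

# The example above: a 5-minute interval with 5 samples.
print(qos_window_minutes(5, 5))  # 25
```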
Memory Paging node
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can configure alarm and memory paging thresholds for alarms sent with changes in memory paging. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory node
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
Swap memory is reserved space on the hard drive that the system uses when the physical memory (RAM) is full. However, swap
memory is not a replacement for physical memory because of its lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similarly to the swap -l command on Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512) / (1024 * 1024)
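The formula can be checked with a short sketch (the block count is a hypothetical swap -l value):

```python
def swap_mb_from_blocks(blocks):
    """Convert 512-byte blocks reported by `swap -l` to megabytes,
    matching: Swap Memory (MB) = (blocks * 512) / (1024 * 1024)."""
    return (blocks * 512) / (1024 * 1024)

# Hypothetical: 2,097,152 blocks of 512 bytes = 1 GB of swap space.
print(swap_mb_from_blocks(2097152))  # 1024.0
```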
Total Memory node
Navigation: cdm > Memory > Total Memory > Memory Usage (%)
Network node
Important! The probe monitors only the physical NICs of the system and sums up the metric values when multiple NICs are installed on the
monitored system.
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
Processor node
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: sends an alarm message when the CPU usage gets below this value. The value is in percent of CPU usage.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample value of the
devices.
Set or modify the following values as required:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is used with the iostat command for fetching the iostat data for that duration. This
value must be less than the Interval (minutes) field value. Default: 10
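The constraint between these two fields can be expressed as a simple check (the helper is hypothetical, not part of the probe):

```python
def validate_iostat_settings(interval_minutes, sample_seconds):
    # The iostat sampling duration (in seconds) must fit inside the
    # probe interval (in minutes).
    return sample_seconds < interval_minutes * 60

print(validate_iostat_settings(5, 10))  # True (the defaults)
print(validate_iostat_settings(1, 90))  # False
```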
Device Iostat node
Iostat Monitors
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable the iostat monitoring for the device. This option is
disabled, by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold value is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
Configuration Overview
How to set up disk monitoring
How to set up CPU monitoring
How to set up memory monitoring
Probe Defaults
How to Copy Probe Configuration Parameters
Configuration Overview
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run or restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as required.
Note: When performing this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and the
target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on that computer tries to monitor the (missing) D: partition and reports an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears, and the probe configuration is complete after the distribution process finishes.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info
Default: Yes
This key has been introduced in the Raw Configuration section to make the Device Ids of shared drives and local drives identical. Set
this key to No to enable this feature.
This feature was introduced for the following two reasons:
a. When a file system is mounted on Linux through CIFS, then, on a fresh deployment of the cdm probe, the Device Id and Metric Id for QoS
and alarms of the mounted file system are missing.
b. On restarting, the probe is unable to mark CIFS drives as network drives and hence generates wrong Device Ids and Metric Ids.
qos_disk_total_size
Default: No
This key has been introduced in the fixed_default section under disk for the Windows, Linux, and AIX platforms.
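A sketch of how these raw configuration keys might look in the cdm configuration file. The fixed_default placement follows the description above; the placement of allow_remote_disk_info under a setup section is an assumption, so treat this layout as illustrative, not authoritative:

```
<setup>
   allow_remote_disk_info = no
</setup>
<disk>
   <fixed_default>
      qos_disk_total_size = yes
   </fixed_default>
</disk>
```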
The following examples show the supported types of regular expressions:

Regular expression    Type                 Explanation
[A-Z]                 Standard (PCRE)
[A-Z:\\]              Custom               Matches the uppercase drive letters of the local disks available on the respective box
[*.\\]                Custom
\d*                   Custom
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are tabs within this tab that you can use to specify general and
control properties and message definitions. A fourth tab, the Cluster tab, displays if the probe is running within a clustered environment.
General Tab
The fields are explained below:
Log level
Sets the level of detail written to the log file. Log as little as possible during normal operation, to minimize disk consumption.
Log size
Sets the size of the probe's log file where probe-internal log messages are written. Upon reaching this size, the contents of the file are cleared.
The default size is 100 KB.
Send alarm on each sample
If selected, the probe generates an alarm on each sample. If not selected, the probe waits for the number of samples (specified in the samples
field of the Control properties tab) before sending the alarm. This check box is selected by default.
For example, if the Interval value is set to 1 minute and the number of samples is set to 2:
Option unchecked: the first alarm is generated in 2 minutes, and subsequent alarms are generated at 1-minute intervals.
Option checked: the first alarm is generated in 1 minute, and subsequent alarms are generated at 1-minute intervals.
Note: The sample collected at the start of the probe is considered the first sample. The sample count is cleared on deactivation of
the probe. For more details about the samples, see the Control Properties tab.
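The timing described above can be sketched as follows (interval in minutes; the helper is hypothetical):

```python
def minutes_to_first_alarm(interval, samples, alarm_on_each_sample):
    """Time from probe start until the first alarm.

    Unchecked: the probe waits for the full samples window (interval * samples).
    Checked: an alarm can fire on the very first sample (one interval).
    """
    return interval if alarm_on_each_sample else interval * samples

print(minutes_to_first_alarm(1, 2, False))  # 2
print(minutes_to_first_alarm(1, 2, True))   # 1
```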
Important! If the Set QoS source to robot name option is set in the controller, the robot name is also used as the target.
Note: Even if you set the sample value to 0, the QoS for disks is generated based on the default sample value.
Note: This option is available for non-Windows platforms only, such as Linux.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
Cluster Tab
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (see the Edit Alarm or QoS Source section), you can set the
alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The fields are explained below:
Graphs
The graphs display actual samples in purple, averages in blue, error threshold (if configured) in red, and warning threshold (if configured) in
yellow.
CPU usage: graph of the CPU usage.
Memory usage: three separate graphs (% of total available memory, physical, and virtual memory). Use the buttons M, S, and P on the top right
corner of the graph to toggle through the three graphs.
% of available memory: in % of total available memory
Physical memory: in % of available physical memory (RAM).
Swap memory: on UNIX systems, this value refers to the % of available swap space.
Note: Typing <Ctrl>+S on your keyboard saves the current view for this graph, and this view is shown the next time you open
the probe GUI.
Note: When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
New Share
Edit
Delete
Modify Default Disk Parameters
Enable Space Monitoring
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of a NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
High threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent.
Low threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be issued.
You can configure the Quality of Service message, which can have information about the disk usage in inodes, % or both depending on your
selections.
Disk Usage Change and Thresholds tab
This tab lets you specify the alarm conditions for alarms to be sent when changes in disk usage occur.
Disk usage change calculation
You can select one of the following:
Change summarized over all samples. The change in disk usage is the difference between the latest sample and the first sample in
the "samples window". The number of samples the cdm probe will keep in memory for threshold comparison is set as Samples on
the Setup > Control Properties tab.
Note: There may be some discrepancy between the values in QoS and the values in alarms when the Change summarized over all
samples option is selected. This is because QoS is generated on every interval while alarms are generated based on the selection
of the Change summarized over all samples option.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option, a confirmation dialog appears. Click
Yes to delete the disk from the list.
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage. Alarms are sent when one of the CPUs in a multi-CPU system breaches the
threshold.
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
Use the Advanced tab to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot, and
paging measurements.
The fields are explained below:
Quality of Service Messages
Selecting any of the following settings enables QoS messages to be sent as per the time intervals defined under Control properties tab.
Processor Queue Length (Windows only)
Measures the number of queued processes, divided by the number of processors, waiting for time on the CPU for Windows system. For
AIX, SGI, Linux and Solaris, this QoS message refers to System Load.
Computer uptime (hourly)
Measures the computer uptime in seconds every hour.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Note: If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI are different than CDM
version 3.72. However, if CDM version 3.70 or earlier already has created QoS entries in the database for kilobytes per second
(Kb/s) and/or pages per second (Pg/s), these entries will be kept and updated with QoS data from the newer CDM version (3.72
and higher).
Notes:
If running on a multi-CPU system, the queued processes are divided across the number of processors. For example, if running
on a system with four processors and using the default Max Queue Length value (4), alarm messages are generated if the
number of queued processes exceeds 16.
To enable the QoS metric QOS_PROC_QUEUE_LEN per CPU, add a key system_load_per_cpu with the value Yes under the
CPU section through the Raw Configure option. The probe calculates the system load on Linux, Solaris, and AIX as
Load/Number of CPUs if this key is set to Yes.
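The effective alarm threshold therefore scales with the processor count, as this minimal sketch shows:

```python
def effective_queue_threshold(max_queue_length, num_processors):
    # Queued processes are divided across processors, so the alarm fires
    # only when the total queue exceeds max_queue_length * num_processors.
    return max_queue_length * num_processors

# Default Max Queue Length of 4 on a four-processor system:
print(effective_queue_threshold(4, 4))  # 16
```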
The probe obtains these values from the lparstat -i command and scales the CPU statistics by the entitled capacity:
Total Capacity = (maxVirtualCPU / maxCapacity) * 100;
CPU User = (CPU user * EntCap) / TotCapacity;
cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap)/TotCapacity);
cpuStats->fWait = (double)((cpuStats->fWait * cpuStats->fEntCap)/TotCapacity);
cpuStats->fIdle = (double)((cpuStats->fIdle * cpuStats->fEntCap)/TotCapacity);
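Assuming the scaling above, a sketch in Python (the names and sample values are hypothetical; the probe itself implements this in native code):

```python
def scale_cpu_stats(stats, ent_cap, max_virtual_cpu, max_capacity):
    """Scale raw CPU percentages by entitled capacity, mirroring:
    TotCapacity = (maxVirtualCPU / maxCapacity) * 100
    scaled      = (raw * EntCap) / TotCapacity
    """
    tot_capacity = (max_virtual_cpu / max_capacity) * 100.0
    return {name: (value * ent_cap) / tot_capacity
            for name, value in stats.items()}

# Hypothetical LPAR: at most 4 virtual CPUs, maximum capacity 2, entitlement 1.0
raw = {"user": 50.0, "system": 25.0, "wait": 5.0, "idle": 20.0}
scaled = scale_cpu_stats(raw, ent_cap=1.0, max_virtual_cpu=4, max_capacity=2)
print(scaled)
```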
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be measured in
one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab will immediately change to show the
selected unit, but the values in the graph will not change until the next sample is measured.
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile are sent using suppression keys unique to the profile so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
The fields are explained below:
Name: defines the CPU profile name.
Description: defines the CPU profile description.
Alarm On: specifies that the alarm threshold is considered as the average of defined threshold values, or the individual threshold values.
High and Low: activates the alarm generation in case high, or low threshold values of selected checkpoint are breached.
New Disk Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
The fields are explained below:
Name: defines the disk profile name.
Description: defines the description of the disk profile.
Regular Expression for Mount Point: defines a regular expression through which you can monitor your Custom Local Disk (for
Windows platform) and Custom Local and NFS (for Linux, Solaris and AIX platforms).
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk: Disk free space, Inodes free, and Space usage
change.
For more information on these checkpoints, refer to the Control Properties Tab section.
Note: You are required to enable NFS drives from the Status tab to see custom NFS inode alarms.
Overview
Using the Admin Console to Access the cdm Configuration GUI
cdm
Probe Information
General Configuration
Messages
Configuring Computer Uptime and System Reboot QoS Data
Disks
Disk Configuration
Disk Missing Defaults
Disk Usage Defaults
Inode Usage Defaults (UNIX only)
Disk Usage Change Defaults
Disk Read (B/S)
Disk Write (B/S)
Disk Read and Write (B/S)
<diskname> Configuration
Disk Usage Configuration
Disk Usage Change Configuration
<diskname> Inode Usage Configuration
Add a Shared Disk for Monitoring
Memory
Memory Paging Configuration
Physical Memory Configuration
Swap Memory Configuration
Total Memory Configuration
Network
Processor
Individual CPU Configuration
Total CPU Configuration
Iostat (Linux, Solaris, and AIX)
Device Iostat Configuration
Edit the Configuration File Using Raw Configure
Overview
The following outline provides an overview of the configuration nodes available for the cdm probe:
cdm
    <hostname>: Computer Uptime and System Reboot alarms for the cdm host system
        Disks
            <diskname1>: Disk Usage
            <diskname2>
        Memory
            Memory Paging
            Physical Memory
            Swap Memory
            Total Memory
        Processor
            Individual CPU
            Total CPU
cdm
Navigation: cdm
This section lets you view probe and QoS information, change the log level, and set data management values.
Probe Information
This section provides the basic probe information and is read-only.
Publish Data: Publishes QoS data on computer uptime; unchecked by default. All other fields are read-only.
cdm > <probe hostname> System Reboot
Publish Alarms: Generates an alarm when the system reboots; unchecked by default
Alarm Message for Detected Reboot: Choose the desired alarm message from the pull-down menu.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Disks
Navigation: cdm > Disks
The Disks node lets you configure the global monitoring metrics and default attribute values for each individual disk. The Disks node also includes
the shared drives of the host system, for example, CIFS (a shared Windows disk mounted in a Linux environment) and GFS (a shared disk of a
clustered environment).
Disk Configuration
This section lets you configure the time interval and number of samples for fetching metric values from the system. These properties are
applicable for all the monitoring disks of the system.
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 15
Samples: specifies how many samples the probe keeps in memory for calculating average and threshold values. Default: 4
Note: If the Send alarm on each sample option is not selected, the probe waits for the number of samples before sending the alarm.
Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization is the average for the last 25 minutes. Default: 1
Ignore Filesystems: defines the filesystems to be excluded from monitoring. For example, specifying the regular expression *C:* in this
field excludes the C drive of the system from monitoring and also stops displaying the disk in the navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk failure or
crash leaves a stale file system, and it prevents the probe from hanging. Default: 10
Note: The first three fields are common to Memory and Processor configuration sections.
This section lets you configure alarms when a specific disk is missing (not mounted or available) for monitoring.
Disk Usage Defaults
This section lets you configure default thresholds and alarm messages for disk usage in MB and percent.
Publishing Alarm Based on: specifies the usage measurement units. Select either percent or Mbytes.
Enable High Threshold: lets you define a threshold for generating a higher severity alarm.
Threshold: defines the high threshold value.
Alarm Message: specifies the alarm message when the high threshold value is breached. Similarly, you can configure the low threshold
value, where the alarm severity is lower.
Publishing Data in MB: measures the QoS for Disk Usage MBytes.
Publishing Data in Percent: measures the QoS for Disk Usage in percentage.
Inode Usage Defaults (UNIX only)
This section lets you configure default thresholds and alarms for inode usage by number of files (count) and percent. You can also configure high and low
threshold values as in the Disk Usage Defaults section.
Disk Usage Change Defaults
This section lets you configure default thresholds and alarms for changes in disk usage. You can also configure high and low threshold values as
in the Disk Usage Defaults section.
Type of Change: specifies the type of change you want to monitor: increasing, decreasing, or both.
Change Calculation: specifies how the change in disk usage is calculated. Select one of the following values:
Summarized over all samples: calculates the difference between the first and last sample values. The number of samples is specified in
the Disk Configuration section.
Between each sample: calculates the change in disk usage by comparing the values of two successive intervals.
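The two change-calculation modes can be illustrated with a short sketch (the disk-usage sample values are invented for illustration):

```python
# Hypothetical disk-usage samples in MB, one per interval.
usage_samples = [100, 104, 103, 110]

# Summarized over all samples: difference between the first and last
# sample in the samples window.
summarized_change = usage_samples[-1] - usage_samples[0]

# Between each sample: change computed for each pair of successive samples.
per_sample_changes = [b - a for a, b in zip(usage_samples, usage_samples[1:])]
```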
Disk Read (B/S)
This section lets you activate the monitoring of disk read throughput and generate QoS at scheduled interval. You can also configure low and high
thresholds for generating alarms. See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To
Threshold alarms.
Note: The disk read throughput monitoring is supported only on the Windows, Linux and AIX platforms.
This section lets you activate the monitoring of disk write throughput and generate QoS at scheduled interval. You can also configure low and high
thresholds for generating alarms. See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To
Threshold alarms.
Note: The disk write throughput monitoring is supported only on the Windows, Linux and AIX platforms.
This section lets you activate the monitoring of total throughput of the disk and generate QoS at scheduled interval. You can also configure low
and high thresholds for generating alarms. See Set Thresholds for more details.
Note: The disk total throughput monitoring is supported only on the Windows, Linux and AIX platforms.
<diskname> Configuration
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual disk. See Set Thresholds for more details
about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Disk Missing: lets you configure QoS for disk availability status and generate an alarm when the probe fails to connect with the disk.
Disk Size: lets you configure QoS for disk size and generate an alarm when the probe fails to calculate the disk size.
Note: The configuration of disk size alarms and QoS are supported only on the Windows, Linux and AIX platforms.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
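The evaluation order described above (high threshold first, low threshold only when the high one is not breached) can be sketched as follows; the function name, return values, and thresholds are illustrative, not the probe's API:

```python
def evaluate_free_space(free_mb, high_threshold_mb, low_threshold_mb):
    """Return the alarm level for the given free disk space, or None.

    The high threshold is evaluated first; the low threshold is checked
    only if the high threshold has not been breached. For free-space
    monitoring the high (error) threshold is the smaller value.
    """
    if free_mb < high_threshold_mb:
        return "error"      # high-severity alarm
    if free_mb < low_threshold_mb:
        return "warning"    # low-severity alarm
    return None

# Example: error below 100 MB free, warning below 500 MB free.
```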
Disk Usage Configuration
Navigation: cdm > host name > Disks > disk name > Disk Usage
You can configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm thresholds, disk
usage (%) and disk usage (MB). See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold
alarms.
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
You can individually configure thresholds and alarm messages sent with changes in disk usage for each monitored disk. See Set Thresholds for
more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Change Calculation: indicates how you want to calculate the disk change. Select from the drop-down menu either of the following:
Summarized over all samples - The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample - The change in disk usage is calculated after each sample is collected
<diskname> Inode Usage Configuration
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
Thresholds and alarms attributes are the same as listed in Disk Usage Change Defaults.
Memory
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can individually configure alarm and memory paging thresholds for alarm sent with changes in memory paging for each monitored disk. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory Configuration
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Swap Memory Configuration
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
A swap memory is a reserved space on hard drive which is used by the system when the physical memory (RAM) is full. However, the swap
memory is not a replacement of the physical memory due to lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similar to the swap -l command of Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512)/ (1024*1024).
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
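The documented formula can be checked with a short computation (the block count is an invented example):

```python
def swap_blocks_to_mb(blocks):
    """Convert the 512-byte blocks reported by `swap -l` to Mbytes,
    using the formula given above."""
    return (blocks * 512) / (1024 * 1024)

# Example: 2,097,152 blocks of 512 bytes = 1 GiB = 1024 MB of swap.
swap_mb = swap_blocks_to_mb(2097152)
```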
Total Memory Configuration
Navigation: cdm > Memory > Total Memory > Memory Usage (%)
Network
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
Processor
Navigation: cdm > Processor
The Processor node lets you configure processor-related metrics and their corresponding time interval for fetching the monitoring data. The probe
lets you configure the number of samples and returns the average of computed values. All calculations are based on the number of CPU ticks
returned, for example, by /proc/stat on Linux. The probe adds the column values (user, nice, system, idle, and iowait) for
calculating the total CPU ticks. In a multi-CPU environment, the totals for all CPU column values are added.
Similarly, the delta values are calculated by comparing the total CPU tick values of last and current interval. Then, the percentage values are
calculated for each column based on the total CPU ticks value. The QoS for total CPU value is the sum of CPU System, CPU User, and (if
configured) CPU Wait.
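The tick-based calculation described above can be sketched as follows (a simplified illustration; the field names and tick values are assumptions):

```python
def cpu_percentages(prev_ticks, cur_ticks):
    """Compute per-column CPU percentages from two tick snapshots.

    Mirrors the documented approach: take the delta of each column
    between intervals and express it as a percent of the total delta.
    """
    deltas = {k: cur_ticks[k] - prev_ticks[k] for k in cur_ticks}
    total = sum(deltas.values())
    return {k: 100.0 * v / total for k, v in deltas.items()}

# Invented /proc/stat-style tick totals for two successive intervals.
prev = {"user": 1000, "nice": 0, "system": 500, "idle": 8000, "iowait": 100}
cur = {"user": 1200, "nice": 0, "system": 600, "idle": 8600, "iowait": 200}

pct = cpu_percentages(prev, cur)
# Total CPU = CPU User + CPU System (+ CPU Wait, if configured).
total_cpu = pct["user"] + pct["system"] + pct["iowait"]
```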
Configure the following fields:
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values. Default: 5 If you did
not select the Send alarm on each sample checkbox, in the Probe Configuration pane, the probe waits for the number of samples
(specified in this field) before sending the alarm. Do not specify 0 (Zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the reported CPU utilization will be the average for the last 25 minutes.
Set QoS Target as 'Total': select this checkbox if you want the QoS target to be set to Total. Default: 5
Include CPU Wait in CPU Usage: includes the CPU Wait in the CPU Usage calculation.
Number of CPUs: displays the number of CPUs. This is a read-only field.
Maximum Queue Length: indicates the maximum number of items in the queue before an alarm is sent.
Alarm Message: sends the alarm message when the queue has been exceeded.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Individual CPU Configuration
Individual CPU System - lets you generate QoS data on the amount of time during which CPU executed processes in kernel mode.
Individual CPU Usage - lets you generate QoS data for monitoring CPU usage in percent as compared to the CPU capacity.
Individual CPU User - lets you generate QoS data on the amount of time during which the CPU executed processes in user mode.
Individual CPU Wait - lets you generate QoS data on the amount of time during which CPU is waiting for I/O process to complete.
Maximum CPU Usage - lets you generate alarm when the CPU usage percent breaches the maximum usage limit.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Total CPU Configuration
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: specifies the value, in percent of CPU usage; an alarm message is sent when the CPU usage gets below this value.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample value of the
devices.
Set or modify the following values as required:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is passed to the iostat command for fetching the iostat data for that duration. This
value must be less than the Interval (minutes) field value. Default: 10
Device Iostat Configuration
Iostat Monitors
Linux
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable iostat monitoring for the device. This option is
disabled by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value against which the actual value is compared. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
For example, the following options can be configured from Raw Configuration:
ignore_device: ignores specified disks. Navigate to the <disk> section and edit this key as follows:
ignore_device = /<regular expression>/
ignore_filesystem: ignores specified file systems. Navigate to the <file> section and edit this key as follows:
ignore_filesystem = /<regular expression>/
The value must be a regular expression that matches all disks, file systems, or both that the probe must ignore. Here is an example that ignores
auto-mounted disks that are recognized on each "disk interval":
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
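The example pattern above can be verified against device names with any regular-expression tool; here is a short check (the device names are invented):

```python
import re

# The example pattern from the documentation.
pattern = re.compile(r"autofosmount.*|.*:V.*")

ignored = bool(pattern.match("autofosmount01"))    # auto-mounted disk
ignored_share = bool(pattern.match("server:Vol1"))
kept = bool(pattern.match("/dev/sda1"))            # regular disk, not ignored
```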
allow_remote_disk_info: makes the Device Id of a shared drive and a local drive identical. Navigate to the setup folder and set this key to No
to enable this feature.
Default: Yes
This feature was introduced for the following two reasons:
When a file system is mounted on Linux through cifs and the cdm probe is freshly deployed, the Device Id and Metric Id for QoS and
alarms for the respective mounted file system are missing.
On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size: indicates the total size of the disk. Navigate to disk > fixed_default and set this key to yes to enable this feature.
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
Configuration Overview
How to set up disk monitoring
How to set up CPU monitoring
How to set up memory monitoring
Probe Defaults
Probe Configuration Interface Installation for cdm
Probe Configuration
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
Edit Alarm or QoS Source
Status Tab
Disk Usage Modification
New Share Properties
Edit Disk Properties
Delete a Disk
Modify Default Disk Parameters
Enable Space Monitoring
Configuration Overview
The following diagram outlines the process to configure the probe to monitor CPU, disks and memory.
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as per requirement.
Probe Configuration
The CPU, Disk and Memory Monitor (cdm) probe configuration interface displays a screen with tabs for configuring sections of this probe. This
probe can be set up in three types of environments: single computer, multi-CPU and cluster.
There are five main tabs:
Setup
Status
Multi CPU
Advanced
Custom
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are tabs within this tab that you can use to specify general, control
properties and message definitions. A fourth tab, the Cluster tab, displays if the probe is running within a clustered environment.
General Tab
Important! If the Set QoS source to robot name option is set in the controller, you also get the robot name as the target.
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields displayed in the above dialog are separated into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
QoS Interval (Multiple of 'Interval')
Allows you to specify the time limit in minutes between sending of QoS data. For example, if the interval is set to 5 minutes and the number of
samples is set to 5, the CPU utilization will be the average for the last 25 minutes. This field is common for all three sections.
Ignore Filesystems
Defines the filesystem to be excluded from monitoring. This field is specific to the Disk Properties section. For example, specifying the regular
expression C:\\ in this field excludes the C drive of the system from monitoring. A red symbol is displayed next to a disk drive that is excluded
from monitoring in the Disk usage section of the Status tab.
Timeout
Specifies the time limit for the probe to collect CPU, disk, and memory related data. This option is useful when a disk fails or crashes and the file
system becomes stale, because it prevents the probe from hanging. A default timeout of 5 seconds was previously used when getting disk
statistics. However, when the system is under high CPU load, a 5-second timeout is not always sufficient. The recommended timeout is
10 seconds, and it should be increased in situations such as high CPU load.
Note: This option is available for non-Windows platforms only, such as Linux.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
To update table for new target:
The Message Definitions tab offers functionality to customize the messages sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
Find the following field description:
Message Pool
This section lists all messages with their associated message ID. You can right-click in the message pool window to create a new message or to
edit/delete an existing message.
Active Messages
This section contains tabs to allow you to associate messages with the thresholds. You can drag the alarm message from the message pool and
drop it into the threshold field. The available tabs are explained below:
CPU
High (error) and Low (warning) threshold for total CPU usage.
High (error) threshold for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the threshold).
CPU difference threshold (alarms are sent when the difference in CPU usage between different CPUs in multi-CPU systems breaches
the threshold).
Disk
The thresholds for disks can be modified by double-clicking the disk-entries under the Status tab.
Memory
Depends on what memory view is selected in the memory usage graph, where you may toggle among three views (see the Status tab).
Memory usage
High (error) and Low (warning) threshold for pagefile usage and paging activity
Physical memory
Swap memory (Unix systems)
Computer
Allows you to select the alarm message to be issued if the computer is rebooted.
Default: The time when the computer was rebooted.
Other
You can select the alarm message to be sent if the probe is not able to fetch data.
Default: Contains information about the error condition.
Cluster Tab
The Cluster tab is displayed only when the cdm probe is hosted in clustered environment and it is configured as a part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (refer to the Edit Alarm or QoS Source section), you can
set the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
You can edit the alarm source or QoS source.
Follow these steps:
1. Double-click a virtual group entry.
2. On the Group Sources dialog, select the Alarm source and QoS source.
3. Click OK.
Note: QoS messages can also be sent on Disk usage (both in % and MB), and availability for shared disks (also disk usage on NFS file
systems if the Enable space monitoring option is set for the file system as described in the section Setup > Cluster). These options can
be selected when defining the threshold values for these options under the Status tab.
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The fields displayed in the above dialog are explained below:
Graphs
The graphs display actual samples in purple, averages in blue, error threshold (if configured) in red, and warning threshold (if configured) in
yellow.
CPU usage: graph of the CPU usage.
Memory usage: three separate graphs (% of total available memory, physical, and virtual memory). Use the buttons M, S, and P on the top right
corner of the graph to toggle through the three graphs.
% of available memory: in % of total available memory
Physical memory: in % of available physical memory (RAM).
Swap memory: on UNIX systems, this value refers to the % of available swap space.
Note: Typing <Ctrl>+S on your keyboard will save the current view for this graph, and this view will be shown the next time you open
the probe GUI.
Note: When using NFS mounts in the cdm probe, be aware that the server where the mount point is pointing will appear in the
discovery in USM.
You can modify the monitoring properties of disk by right-clicking on a monitored disk in the list.
You have the following options, depending on the disk type:
New Share
Edit
Delete
Modify Default Disk Parameters
Enable Space Monitoring
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of a Windows share/mounted drive, right-click a monitored Windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of an NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
When the amount of free space gets below this value, the specified alarm message will be sent. This threshold is evaluated only if the
high threshold has not been exceeded.
You can configure the Quality of Service message, which can have information about the disk usage in Mbytes, % or both depending on
your selections.
Inode Usage and Thresholds tab
This tab is only available for UNIX systems; otherwise it remains disabled. The tab indicates the amount of total, used, and free inodes on the file
system.
You can configure the following threshold settings:
Monitor disk using either inodes or %.
High threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent.
Low threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be issued.
You can configure the Quality of Service message, which can have information about the disk usage in inodes, % or both depending on your
selections.
Disk Usage Change and Thresholds tab
This tab lets you specify the alarm conditions for alarms to be sent when changes in disk usage occur.
Disk usage change calculation
You can select one of the following:
Change summarized over all samples. The change in disk usage is the difference between the latest sample and the first sample in
the "samples window". The number of samples the cdm probe will keep in memory for threshold comparison is set as Samples on the
Setup > Control Properties tab.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete a disk from being monitored by the cdm probe. When you use the Delete option, a confirmation dialog appears. Click
Yes to delete the disk from the list.
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
Use the Advanced tab to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot, and
paging measurements.
The fields displayed in the above dialog are explained below:
Quality of Service Messages
Selecting any of the following settings enables QoS messages to be sent as per the time intervals defined under Control properties tab.
Processor Queue Length (Windows only)
Measures the number of queued processes waiting for time on the CPU, divided by the number of processors, on Windows systems. For
AIX, SGI, Linux, and Solaris, this QoS message refers to System Load.
Computer uptime (hourly)
Measures the computer uptime in seconds every hour.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Notes:
If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI are different than CDM version
3.72. However, if CDM version 3.70 or earlier already has created QoS entries in the database for kilobytes per second (Kb/s)
and/or pages per second (Pg/s), these entries will be kept and updated with QoS data from the newer CDM version (3.72 and
higher).
To enable the QoS metric QOS_PROC_QUEUE_LEN per CPU, add the key system_load_per_cpu with the value Yes
under the CPU section through the Raw Configure option. The probe calculates the system load on Linux, Solaris,
and AIX as Load/Number of CPUs if this key is set to Yes.
Note: If running on a multi-CPU system, the queued processes will be shared on the number of processors. For example, if running on
a system with four processors and using the default Max Queue Length value (4), alarm messages will be generated if the number of
queued processes exceeds 16.
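The note above amounts to a simple calculation, sketched below (the function and variable names are illustrative, not the probe's internals):

```python
processors = 4
max_queue_length = 4  # default Max Queue Length

def queue_alarm(queued_processes):
    """Alarm when the queued processes per processor exceed the threshold,
    as described in the note above."""
    return (queued_processes / processors) > max_queue_length

# With four processors and the default threshold of 4, more than
# 16 queued processes are needed before an alarm is generated.
```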
lparstat -i command:
Total Capacity = (maxVirtualCPU / maxCapacity) * 100;
cpuStats->fUser = (double)((cpuStats->fUser * cpuStats->fEntCap) / TotCapacity);
cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap) / TotCapacity);
cpuStats->fWait = (double)((cpuStats->fWait * cpuStats->fEntCap) / TotCapacity);
cpuStats->fIdle = (double)((cpuStats->fIdle * cpuStats->fEntCap) / TotCapacity);
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be measured in
one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab will immediately change to show the
selected unit, but the values in the graph will not change until the next sample is measured.
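The relation between the two paging units is a multiplication by the page size, which varies between operating systems; the sketch below assumes 4 KB pages:

```python
page_size_kb = 4  # assumption: 4096-byte pages; the size is OS dependent

def pages_to_kb_per_sec(pages_per_sec):
    """Convert a paging rate from pages/second to Kbytes/second."""
    return pages_per_sec * page_size_kb

kb_rate = pages_to_kb_per_sec(100)  # 100 Pg/s with 4 KB pages
```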
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile are sent using suppression keys unique to the profile, so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
The fields are explained below:
Name: defines the disk profile name.
Description: defines the description of the disk profile.
Regular Expression for Mount Point: defines a regular expression through which you can monitor your Custom Local Disk (for
Windows platform) and Custom Local and NFS (for Linux, Solaris and AIX platforms).
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk, which are Disk free space, Inodes free, Space
usage change.
For more information on these checkpoints, refer to the Control Properties Tab section.
ignore_device = /autofosmount.*|.*:V.*/
Note: Ensure that you add these two keys manually and then set up the respective configuration.
allow_remote_disk_info
Default: Yes
This key has been introduced in the Raw Configuration section to make the Device Id of a shared drive and a local drive identical. You
must set this key to No to enable this feature.
This feature was introduced for the following two reasons:
1. When a file system is mounted on Linux through cifs, then on fresh deployment of the cdm probe, the Device Id and Metric Id for QoS
and alarms of the respective mounted file system are missing.
2. On restarting, the probe is unable to mark cifs drives as network drives and hence generates wrong Device Id and Metric Id values.
qos_disk_total_size
Default: No
This key has been introduced in the Fixed default section under disk for the Windows, Linux, and AIX platforms.
Note: When performing this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and the
target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on that computer tries to monitor the (missing) D: partition and reports an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears, and the configuration of the probe is completed after the distribution process finishes.
Regular expression    Type of regular expression    Explanation
[A-Z]                 Standard (PCRE)
[A-Z:\\]              Custom                        Matches the uppercase character type of the local disk available on the respective box
[*.\\]                Custom
\d*                   Custom
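A regular expression of the Standard (PCRE) type can be tried out with any PCRE-compatible engine. A minimal sketch in Python (the mount-point list and the exact pattern are hypothetical) of how such a pattern selects drives for monitoring:

```python
import re

# Hypothetical mount points / drive letters reported by the system.
mounts = ["C:\\", "D:\\", "/", "/home", "/mnt/nfs1"]

# A Standard (PCRE) pattern in the spirit of the examples above:
# match a single uppercase drive letter followed by ":\".
pattern = re.compile(r"^[A-Z]:\\$")

# Only mount points matching the expression are selected for monitoring.
monitored = [m for m in mounts if pattern.match(m)]
```

Patterns of the Custom type follow the probe's own matching rules, so this sketch only illustrates the Standard (PCRE) case.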
Overview
Using the Admin Console to Access the cdm Configuration GUI
cdm
Probe Information
General Configuration
Messages
Configuring Computer Uptime and System Reboot QoS Data
Disks
Disk Configuration
Disk Missing Defaults
Disk Usage Defaults
Inode Usage Defaults (UNIX only)
Disk Usage Change Defaults
Disk Read (B/S)
Disk Write (B/S)
Disk Read and Write (B/S)
<diskname> Configuration
Disk Usage Configuration
Disk Usage Change Configuration
<diskname> Inode Usage Configuration
Add a Shared Disk for Monitoring
Memory
Memory Paging Configuration
Physical Memory Configuration
Swap Memory Configuration
Total Memory Configuration
Network
Processor
Individual CPU Configuration
Total CPU Configuration
Iostat (Linux, Solaris, and AIX)
Device Iostat Configuration
Overview
The following table provides an overview of the configuration settings available for the cdm probe.
Node                 Subnode                             Available settings
cdm > <hostname>                                         Computer Uptime and System Reboot alarms for the cdm host system
                     Disks: <diskname1>, <diskname2>     Disk Usage
                     Memory                              Memory Paging, Physical Memory, Swap Memory, Total Memory
                     Processor                           Individual CPU, Total CPU
cdm
Navigation: cdm
This section lets you view probe and QoS information, change the log level, and set data management values.
Probe Information
This section provides a listing of alarm messages issued by the cdm probe and is read-only. Selecting a row displays additional alarm message
attributes below the main list, including Message Token, Subsystem, and I18N Token.
Disks
This section lets you configure the time interval and number of samples for fetching metric values from the system. These properties are
applicable to all the monitored disks of the system.
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 15
Samples: specifies how many samples the probe keeps in memory for calculating average and threshold values. Default: 4
Note: If the Send alarm on each sample option is not selected, the probe waits for the specified number of samples before sending the alarm.
Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the CPU utilization is the average for the last 25 minutes. Default: 1
Ignore Filesystems: defines the filesystems to be excluded from monitoring. For example, specifying the regular expression *C:* in this
field excludes the C drive of the system from monitoring and also stops displaying the disk in the navigation pane.
Timeout: specifies the time limit in seconds for the probe to collect the disk-related data. This option is useful when a disk fails or crashes
and the file system becomes stale, and it avoids a hang situation for the probe. Default: 10
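The interval and sample settings determine the averaging window for each reported value. A sketch of that arithmetic, using the example values from the text (the readings themselves are hypothetical):

```python
interval_minutes = 5  # how often the probe takes a sample
samples = 5           # samples kept in memory

# Each reported value is the average over the whole sample window.
window_minutes = interval_minutes * samples

# Hypothetical CPU-usage samples (%) collected at each interval.
readings = [62.0, 58.5, 61.0, 64.5, 59.0]
average = sum(readings) / len(readings)
```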
Note: The first three fields are common to Memory and Processor configuration sections.
Disk Missing Defaults
This section lets you configure alarms when a specific disk is missing (not mounted or available) for monitoring.
Disk Usage Defaults
This section lets you configure default thresholds and alarm messages for disk usage in MB and percent.
Publishing Alarm Based on: specifies the usage measurement units. Select either percent or Mbytes.
Enable High Threshold: lets you define a threshold for generating a higher severity alarm.
Threshold: defines the high threshold value.
Alarm Message: specifies the alarm message that is sent when the high threshold value is breached. Similarly, you can configure the low
threshold value, where the alarm severity is lower.
Publishing Data in MB: measures the QoS for Disk Usage MBytes.
Publishing Data in Percent: measures the QoS for Disk Usage in percentage.
Inode Usage Defaults (UNIX only)
This section lets you configure default alarms and inode usage by number of files (count) and percent. You can also configure high and low
threshold values as in the Disk Usage Defaults section.
Disk Usage Change Defaults
This section lets you configure default thresholds and alarms for changes in disk usage. You can also configure high and low threshold values as
in the Disk Usage Defaults section.
Type of Change: specifies the type of change you want to monitor: increasing, decreasing, or both.
Change Calculation: specifies the way of calculating the disk change. Select one of the following values:
Summarized over all samples: calculates the difference between the first and last sample values. The number of samples is specified in
the Disk Configuration section.
Between each sample: calculates the change in disk usage by comparing the values of two successive intervals.
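The two change-calculation modes can be sketched as follows; this is a minimal illustration, not probe code, and the sample values are hypothetical:

```python
def change_summarized(samples):
    """Difference between the last and the first sample in the window."""
    return samples[-1] - samples[0]

def change_between_each(samples):
    """Change between each pair of successive samples."""
    return [b - a for a, b in zip(samples, samples[1:])]

# Hypothetical disk-usage samples in MB.
usage_mb = [1000, 1010, 1008, 1030]
```

With Type of Change set to increasing, decreasing, or both, the probe would alarm on positive deltas, negative deltas, or either.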
Disk Read (B/S)
This section lets you activate the monitoring of disk read throughput and generate QoS at scheduled intervals. You can also configure low and high
thresholds for generating alarms. See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To
Threshold alarms.
Note: The disk read throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Disk Write (B/S)
This section lets you activate the monitoring of disk write throughput and generate QoS at scheduled intervals. You can also configure low and high
thresholds for generating alarms. See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To
Threshold alarms.
Note: The disk write throughput monitoring is supported only on the Windows, Linux and AIX platforms.
Disk Read and Write (B/S)
This section lets you activate the monitoring of the total throughput of the disk and generate QoS at scheduled intervals. You can also configure low
and high thresholds for generating alarms. See Set Thresholds for more details.
Note: The disk total throughput monitoring is supported only on the Windows, Linux and AIX platforms.
<diskname> Configuration
Navigation: cdm > host name > Disks > disk name
The disk name node lets you configure alarms and QoS for disk availability and size for an individual disk. See Set Thresholds for more details
about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Disk Missing: configures QoS for the disk availability status and generates an alarm when the probe fails to connect to the disk.
Disk Size: configures QoS for the disk size and generates an alarm when the probe fails to calculate the disk size.
The following attributes are common to many probe configuration fields in the cdm user interface. Here they pertain to disk usage, elsewhere they
pertain to memory or CPU usage, depending on context.
Enable High Threshold: enables the high threshold for disk usage change. This threshold is evaluated first and if it is not exceeded,
then the low threshold is evaluated.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the high threshold.
Enable Low Threshold: enables the low threshold for disk usage change. This threshold is evaluated only if the high threshold has not
been breached.
Threshold: indicates the value in Mbytes of the free space on the disk. When disk free space gets below this value, an alarm message is
sent.
Alarm Message: sends the alarm message when the free space on the disk is below the low threshold.
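The evaluation order described above (the high threshold is checked first, and the low threshold only if the high one was not breached) can be sketched as follows; the threshold values are hypothetical:

```python
def free_space_alarm(free_mb, high_threshold_mb, low_threshold_mb):
    """Return the alarm severity for the given free space, or None.

    The high (error) threshold is evaluated first; the low (warning)
    threshold is evaluated only if the high one was not breached.
    """
    if free_mb < high_threshold_mb:
        return "error"    # high-severity alarm message
    if free_mb < low_threshold_mb:
        return "warning"  # low-severity alarm message
    return None           # no alarm

# Hypothetical thresholds: error below 500 MB free, warning below 1000 MB.
```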
Disk Usage Configuration
Navigation: cdm > host name > Disks > disk name > Disk Usage
You can configure disk usage individually for each monitored disk (diskname1, diskname2, etc). You can set attributes for alarm thresholds, disk
usage (%) and disk usage (MB). See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold
alarms.
Disk Usage Change Configuration
Navigation: cdm > host name > Disks > disk name > Disk Usage Change
You can individually configure thresholds and alarm messages sent with changes in disk usage for each monitored disk. See Set Thresholds for
more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Change Calculation: indicates how you want to calculate the disk change. Select either of the following from the drop-down menu:
Summarized over all samples - The change in disk usage is the difference between the latest sample and the first sample in the
"samples window," which is configured at the Disk Configuration level.
Between each sample - The change in disk usage is calculated after each sample is collected.
<diskname> Inode Usage Configuration
Navigation: cdm > Disks > disk name > Inode Usage > Alarm Thresholds
You can individually configure inode usage for each monitored disk on a Unix host.
Inode Usage Alarm Based on Threshold for: indicates the usage measurement units. Select either percent or count.
Thresholds and alarms attributes are the same as listed in Disk Usage Change Defaults.
Memory
Navigation: cdm > Memory > Memory Configuration
At the Memory level, set or modify the following global memory attributes based on your requirements.
The fields are common to all three probe configuration sections (Disks, Memory, Processor).
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values. Default: 5. If you did
not select the Send alarm on each sample check box in the Probe Configuration details pane, the probe waits for the number of samples
(specified in this field) before sending the alarm. Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the reported CPU utilization is the average for the last 25 minutes. Default: 1
Set QoS Target as 'Memory': sets the QoS target to Memory. Default: Not selected
Navigation: cdm > Memory > Memory Paging > Alarm Thresholds
You can configure thresholds for alarms sent with changes in memory paging. See
Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Physical Memory Configuration
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Swap Memory Configuration
Navigation: cdm > Memory > Swap Memory > Swap Memory (%)
Swap memory is a reserved space on the hard drive that the system uses when the physical memory (RAM) is full. However, swap
memory is not a replacement for physical memory because of its lower data access rate.
The CPU, Disk, and Memory Monitoring probe calculates the swap memory similarly to the swap -l command of Solaris. However, the probe uses
pages instead of blocks. You can compare the swap memory information of the probe and the swap -l command by using the following formula:
Swap Memory (calculated by probe) in MB = (Blocks returned by the swap -l command * 512) / (1024 * 1024)
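The documented conversion can be checked with a short helper; the block count used here is hypothetical:

```python
def swap_mb_from_blocks(blocks):
    """Convert 512-byte blocks reported by `swap -l` to MB,
    following the formula documented above."""
    return (blocks * 512) / (1024 * 1024)

# 2,097,152 blocks of 512 bytes correspond to 1024 MB of swap.
one_gb = swap_mb_from_blocks(2097152)
```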
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Total Memory Configuration
Navigation: cdm > Memory > Total Memory > Memory Usage (%)
Network
Navigation: cdm > Network
This node lets you monitor the outbound and inbound traffic of your system Network Interface Card (NIC). The NIC monitoring lets you analyze
the network bandwidth that is being utilized, which can impact the overall network performance. For example, if your NIC capacity is 100 Mbps and
the aggregated traffic is more than 90 Mbps, the data transfer rate can slow down. This monitoring helps you take preventive actions before
the network goes down, for example, upgrading your NIC, or installing more NICs and implementing a load-balancing solution.
This node lets you monitor the following network metrics:
Inbound Traffic: Monitors the traffic coming from LAN or a public network to the monitored system in bytes per second.
Outbound Traffic: Monitors the traffic going from the monitored system to LAN or a public network in bytes per second.
Aggregated Traffic: Monitors both inbound and outbound traffic in bytes per second.
Important! The probe monitors only the physical NICs of the system and sums up the metric values when multiple NICs are installed on the
monitored system.
Note: These network metrics are supported only on the Windows, Linux and AIX platforms.
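The traffic metrics are rates derived from interface byte counters. A minimal sketch, with hypothetical counter values, of how the bytes-per-second figures and the aggregated value relate:

```python
def traffic_bps(prev_bytes, curr_bytes, interval_seconds):
    """Bytes per second from two successive counter readings."""
    return (curr_bytes - prev_bytes) / interval_seconds

# Hypothetical interface counters sampled 60 seconds apart.
inbound_bps = traffic_bps(1_000_000, 7_000_000, 60)
outbound_bps = traffic_bps(500_000, 2_900_000, 60)

# Aggregated traffic covers both directions.
aggregated_bps = inbound_bps + outbound_bps
```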
Processor
Navigation: cdm > Processor
The Processor node lets you configure processor-related metrics and their corresponding time interval for fetching the monitoring data. The probe
lets you configure the number of samples and returns the average of the computed values. All calculations are based on the number of CPU ticks
returned; for example, those that /proc/stat reports on Linux. The probe adds the column values (user, nice, system, idle, and iowait) to
calculate the total CPU ticks. In a multi-CPU environment, the column values for all CPUs are added.
Similarly, the delta values are calculated by comparing the total CPU tick values of the last and current intervals. Then, the percentage values are
calculated for each column based on the total CPU ticks value. The QoS for the total CPU value is the sum of CPU System, CPU User, and (if
configured) CPU Wait.
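The tick arithmetic described above can be sketched with two hypothetical snapshots of /proc/stat-style counters:

```python
def cpu_percentages(prev, curr):
    """Compute per-state CPU percentages from two tick snapshots.

    prev/curr map each /proc/stat column (user, nice, system, idle,
    iowait) to its cumulative tick count; percentages are taken over
    the delta in total ticks between the two snapshots.
    """
    delta = {k: curr[k] - prev[k] for k in prev}
    total = sum(delta.values())
    return {k: 100.0 * v / total for k, v in delta.items()}

# Hypothetical snapshots taken one interval apart.
prev = {"user": 1000, "nice": 10, "system": 500, "idle": 8000, "iowait": 90}
curr = {"user": 1400, "nice": 10, "system": 700, "idle": 9200, "iowait": 290}

pct = cpu_percentages(prev, curr)

# Total CPU = CPU User + CPU System + CPU Wait (when CPU Wait is included).
total_cpu = pct["user"] + pct["system"] + pct["iowait"]
```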
Configure the following fields:
Interval (minutes): specifies the time in minutes for how often the probe retrieves sample data. Default: 5
Samples: specifies how many samples the probe should keep in memory to calculate average and threshold values. Default: 5. If you did
not select the Send alarm on each sample checkbox in the Probe Configuration pane, the probe waits for the number of samples
(specified in this field) before sending the alarm. Do not specify 0 (zero) in this field.
QoS Interval (Multiple of 'Interval'): specifies the time in minutes for how often the probe calculates QoS. For example, if the interval is
set to 5 minutes and the number of samples is set to 5, the reported CPU utilization is the average for the last 25 minutes. Default: 5
Set QoS Target as 'Total': select this checkbox if you want the QoS target to be set to Total.
Include CPU Wait in CPU Usage: includes the CPU Wait in the CPU Usage calculation.
Number of CPUs: displays the number of CPUs. This is a read-only field.
Maximum Queue Length: indicates the maximum number of items in the queue before an alarm is sent.
Alarm Message: sends the alarm message when the queue has been exceeded.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
Individual CPU Configuration
Navigation: cdm > Processor > Total CPU > Total CPU Idle
This section lets you configure thresholds to send alarm messages when the CPU usage gets below the configured thresholds. Some of the
configuration fields are:
Enable High Threshold: sets the high threshold for CPU usage. This threshold is evaluated first and if it is not exceeded, then the low
threshold is evaluated.
Threshold: defines the value, in percent of CPU usage; an alarm message is sent when the CPU usage gets below this value.
Alarm Message: sends the alarm message when the CPU usage is below the high threshold.
Enable Low Threshold: sets the low threshold for CPU usage. This threshold is evaluated only if the high threshold has not been
exceeded.
See Set Thresholds for more details about setting dynamic, static, Time Over Threshold, and Time To Threshold alarms.
The probe executes the iostat command to fetch the iostat monitor values. The QoS values are obtained from the second sample values of the
devices.
Set or modify the following values as required:
Interval (minutes): defines the time interval for fetching the sample values from the device. Default: 5
Sample: defines the time interval in seconds that is used with the iostat command to fetch the iostat data for that duration. This
value must be less than the Interval (minutes) field value. Default: 10
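On most platforms the first report of `iostat <interval> 2` shows statistics since boot, which is why the QoS values come from the second sample. A sketch, assuming a simplified iostat-style output (device name and figures hypothetical):

```python
SAMPLE_OUTPUT = """\
Device tps kB_read/s kB_wrtn/s
sda 12.0 340.0 120.0

Device tps kB_read/s kB_wrtn/s
sda 48.5 10.0 900.0
"""

def second_sample(output):
    # Split the output into per-report blocks; the first report holds
    # averages since boot, so the QoS values come from the last block.
    blocks = [b for b in output.strip().split("\n\n") if b.startswith("Device")]
    header, *rows = blocks[-1].splitlines()
    return dict(zip(header.split()[1:], map(float, rows[0].split()[1:])))

stats = second_sample(SAMPLE_OUTPUT)
```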
Device Iostat Configuration
Iostat Monitors
Iostat Average Queue Length
Iostat Average Request Size
Iostat Average Service Time (Linux)
Iostat Average Wait Time (active, by default)
Iostat Read Requests Merged Per Second
Iostat Reads Per Second
Iostat Sector Reads Per Second
Iostat Sector Writes Per Second
Iostat Utilization Percentage (active, by default)
Iostat Write Requests Merged Per Second
Iostat Writes Per Second
Solaris
AIX
The probe detects the underlying OS and filters the list of monitors. This section lets you enable iostat monitoring for the device. This option is
disabled by default.
This section represents the actual monitor name of the device for configuration.
QoS Name: identifies the QoS name of the monitor.
Units: identifies a unit of the monitor. For example, % and Mbytes.
Publish Data: publishes the QoS data of the monitor.
Enable High Threshold: lets you configure the high threshold parameters. Default: Disabled
Threshold: defines the threshold value that the actual value is compared against. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatError
Enable Low Threshold: lets you configure the low threshold parameters. Typically, the low threshold generates a warning alarm and the
high threshold generates an error alarm. Default: Disabled
Threshold: defines the threshold value that the actual value is compared against. Default: 90
Alarm Message: specifies the alarm message that is sent when the threshold is breached. Default: IostatWarning
Similarly, you can configure other monitors because each monitor contains the same set of fields.
Probe Defaults
Probe Configuration Interface Installation for cdm
Probe Configuration
Setup Tab
General Tab
Control Properties Tab
Message Definitions Tab
Cluster Tab
Probe Defaults
You can use the sample configuration file to configure a probe with default monitoring values.
Follow these steps:
1. Navigate to the Program Files\Nimsoft\Probes\System\<probe_name> folder.
2. Make the desired configuration in the <probe_name>.cfg file.
3. Run/restart the probe in Infrastructure Manager to initialize the configuration.
You can now use the newly added default monitoring values, such as templates, in the left pane as per requirement.
Probe Configuration
The CPU, Disk and Memory Monitor (cdm) probe configuration interface displays a screen with tabs for configuring sections of this probe. This
probe can be set up in three types of environments: single computer, multi-CPU and cluster.
There are five main tabs:
Setup
Status
Multi CPU
Advanced
Custom
Setup Tab
The Setup tab is used to configure general preferences for the probe. There are tabs within this tab that you can use to specify general, control
properties and message definitions. A fourth tab, the Cluster tab, displays if the probe is running within a clustered environment.
General Tab
Important! If the Set QoS source to robot name option is set in the controller, you also get the robot name as the target.
The Control Properties tab defines the time limit after which the probe asks for data and the number of samples the probe should store to
calculate the values used to determine the threshold breaches.
The fields are separated into the following three sections:
Disk properties
CPU properties
Memory & Paging properties
The field description of each section is given below:
Interval
Specify the time limit in minutes between probe requests for data. This field is common for all three sections.
Samples
Allows you to specify how many samples the probe should store for calculating values used to determine threshold breaches. This field is
common for all three sections.
QoS Interval (Multiple of 'Interval')
Allows you to specify the time limit in minutes between sending of QoS data. For example, If the interval is set to 5 minutes and number of
samples is set to 5, the CPU utilization will be the average for the last 25 minutes. This field is common for all three sections.
Ignore Filesystems
Defines the filesystem to be excluded from monitoring. This field is specific to Disk properties section only. For example, *C:* will not monitor the
disk usage matching the given regular expression *C:* (or Disk C).
Timeout
Specifies the time limit in seconds for the probe to collect CPU, disk, and memory related data. This option is useful when a disk fails or crashes
and the file system becomes stale, and it avoids a hang situation for the probe. A default timeout of 5 seconds was previously used to avoid
hangs while getting disk statistics, but when the system is under high CPU load, 5 seconds is not always enough. The recommended timeout is
10 seconds, and it should be increased in situations such as high CPU load.
Note: This option is available for non-Windows platforms only, such as Linux.
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE 'qos_cpu_usage'
AND target NOT IN('user','system','wait','idle')
To update table for new target:
SELECT * FROM dbo.s_qos_data WHERE probe LIKE 'cdm' AND qos LIKE
'qos_cpu_multi_usage' AND (target NOT LIKE 'User%' AND target NOT LIKE 'System%'
AND target NOT LIKE 'Wait%' AND target NOT LIKE 'Idle%')
The Message Definitions tab offers functionality to customize the messages sent whenever a threshold is breached. A message is defined as a
text string with a severity level. Each message has a token that identifies the associated alarm condition.
The fields are explained below:
Message Pool
This section lists all messages with their associated message ID. You can right-click in the message pool window to create new message and
edit/delete an existing message.
Active Messages
This section contains tabs to allow you to associate messages with the thresholds. You can drag the alarm message from the message pool and
drop it into the threshold field. The available tabs are explained below:
CPU
High (error) and Low (warning) threshold for total CPU usage.
High (error) threshold for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the threshold).
CPU difference threshold (alarms are sent when the difference in CPU usage between different CPUs in multi-CPU systems breaches
the threshold).
Disk
The thresholds for disks can be modified by double-clicking the disk-entries under the Status tab.
Memory
Depends on what memory view is selected in the memory usage graph, where you may toggle among three views (see the Status tab).
Memory usage
High (error) and Low (warning) threshold for pagefile usage and paging activity
Physical memory
Swap memory (Unix systems)
Computer
Allows you to select the alarm message to be issued if the computer is rebooted.
Default: The time when the computer was rebooted.
Other
You can select the alarm message to be sent if the probe is not able to fetch data.
Default: Contains information about the error condition.
Cluster Tab
The Cluster tab is displayed only when the cdm probe is hosted in a clustered environment and is configured as a part of a cluster.
It displays a list of detected virtual groups belonging to the cluster. By editing the entries (see the Edit Alarm or QoS Source section), you can set
the alarm source and QoS source to be used for disks belonging to that virtual group.
The available options for alarm source and QoS source are:
<cluster ip>
<cluster name>
<cluster name>.<group name>
Edit Alarm or QoS Source
You can edit the alarm source or QoS source.
Follow these steps:
1. Double-click a virtual group entry.
2. On the Group Sources dialog, select the Alarm source and QoS source.
3. Click OK.
Note: QoS messages can also be sent on Disk usage (both in % and MB), and availability for shared disks (also disk usage on NFS file
systems if the Enable space monitoring option is set for the file system as described in the section Setup > Cluster). These options can
be selected when defining the threshold values for these options under the Status tab.
Status Tab
The Status tab sets up high and low thresholds for the CPU, memory and paging activity for the selected file system. It is also the default tab of
the cdm probe GUI.
The fields are explained below:
Graphs
The graphs display actual samples in purple, averages in blue, error threshold (if configured) in red, and warning threshold (if configured) in
yellow.
CPU usage: graph of the CPU usage.
Memory usage: three separate graphs (% of total available memory, physical, and virtual memory). Use the buttons M, S, and P on the top right
corner of the graph to toggle through the three graphs.
% of available memory: in % of total available memory
Physical memory: in % of available physical memory (RAM).
Swap memory: on UNIX systems, this value refers to the % of available swap space.
Note: Typing <Ctrl>+S on your keyboard will save the current view for this graph, and this view will be shown the next time you open
the probe GUI.
You can modify the monitoring properties of a disk by right-clicking a monitored disk in the list.
Note: For UNIX platforms, this option is used to monitor NFS file systems.
To enable or disable space monitoring of the Windows share/mounted drive, right-click a monitored windows share/mounted drive in the list and
select the enable/disable space monitoring option.
Note: The shares are tested from the service context, and the cdm probe just checks that it is possible to mount the share.
UNIX platforms
To enable/disable space monitoring of the file system, right-click a monitored NFS file system in the list and select the enable/disable space
monitoring option. Enabling space monitoring of a NFS file system may cause problems for the cdm probe if the communication with the NFS
server is disrupted (e.g. stale NFS handles). By default, the NFS file systems are monitored for availability only.
You can configure the following threshold settings:
Monitor disk using either inodes or %.
High threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be sent.
Low threshold for the disk. If you select this option, set the value (based on either inodes or %) and select the alarm message to be sent.
When the amount of free space gets below this value, the specified alarm message will be issued.
You can configure the Quality of Service message, which can have information about the disk usage in inodes, % or both depending on your
selections.
Disk Usage Change and Thresholds tab
This tab lets you specify the alarm conditions for alarms to be sent when changes in disk usage occur.
Disk usage change calculation
You can select one of the following:
Change summarized over all samples. The change in disk usage is the difference between the latest sample and the first sample in
the "samples window". The number of samples the cdm probe will keep in memory for threshold comparison is set as Samples on the
Setup > Control Properties tab.
Change between each sample. The change in disk usage will be calculated after each sample is collected.
Threshold settings
This section allows you to define the alarm conditions:
Type of change. You can select whether alarms should be issued on increase, decrease or both increase and decrease in disk usage.
High threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be sent. The default value is 2 Mbytes.
Low threshold for the disk. If you select this option, set the value in Mbytes and select the alarm message to be sent. When the
amount of free space gets below this value, the specified alarm message will be issued. The default value is 2 Mbytes.
QoS
You can send QoS messages on disk usage change in Mbytes.
Delete a Disk
Use this option to delete the disk from being monitored by the cdm probe. When you use the Delete option, a confirmation dialog appears. Click
Yes to delete the disk from the list.
Use the Multi CPU option to display the alarm threshold and the CPU usage for the different CPUs in a multi-CPU configuration. You can specify
the maximum threshold, CPU difference threshold and processors to display.
Note: This tab is only visible when the cdm probe is running on a multi-CPU computer.
A multi-core processor (multi-CPU) is a single computing component with two or more independent actual processors (called "cores"), which are
the units that read and execute program instructions. A multi-core processor implements multiprocessing in a single physical package.
This tab contains a graph displaying the alarm threshold and the CPU usage for each processor in a multi-CPU configuration.
The thresholds and options available in the above dialog are explained below:
Maximum
High (error) threshold (in %) for individual CPU usage (alarms are sent when one of the CPUs in multi-CPU systems breaches the
threshold).
Difference
CPU difference threshold (in %). Alarms are sent when the difference in CPU usage among the CPUs in a multi-CPU system
breaches the threshold.
Select processors to view
Select the processor(s) to view in the graph. By default, all available processors are shown.
Click Update to refresh the graph with the most current sample values.
Advanced Tab
Use the Advanced tab to customize the QoS messages, for example an alarm on processor queue length, an alarm on detected reboot, and
paging measurements.
The fields are explained below:
Quality of Service Messages
Selecting any of the following settings enables QoS messages to be sent as per the time intervals defined under Control properties tab.
Processor Queue Length (Windows only)
Measures the number of queued processes waiting for time on the CPU, divided by the number of processors, for Windows systems. For
AIX, SGI, Linux, and Solaris, this QoS message refers to the system load.
Computer uptime (hourly)
Measures the computer uptime in seconds every hour.
Memory Usage
Measures the amount of total available memory (physical + virtual memory) used in Mbytes.
Memory in %
Measures the amount of total available memory (physical + virtual memory) used in %.
Memory Paging in Kb/s
Measures the amount of memory that has been sent to or read from virtual memory in Kbytes/second.
Memory Paging in Pg/s
Measures the amount of memory that has been sent to or read from virtual memory in pages per second.
Note: If you have been running CDM version 3.70 or earlier, the QoS settings in the cdm probe GUI differ from those in CDM
version 3.72. However, if CDM version 3.70 or earlier has already created QoS entries in the database for kilobytes per second
(Kb/s) and/or pages per second (Pg/s), these entries are kept and updated with QoS data from the newer CDM version (3.72
and higher).
Swap Memory in %
Measures the space on the disk used for the swap file in %.
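As a rough illustration, the memory checkpoints above reduce to simple calculations. The following is a hedged sketch, assuming "total available memory" means physical plus virtual (swap) memory; the probe's internal accounting may differ.

```python
# Hedged sketch of the memory checkpoints described above.
# Assumes total available memory = physical + virtual memory.

def memory_usage_mb(used_bytes: int) -> float:
    """Memory Usage checkpoint: used memory expressed in Mbytes."""
    return used_bytes / (1024 * 1024)

def memory_usage_percent(used_bytes: int, physical_bytes: int, virtual_bytes: int) -> float:
    """Memory in % checkpoint: used memory as a share of physical + virtual."""
    return used_bytes / (physical_bytes + virtual_bytes) * 100

# Example: 2 GB used out of 4 GB physical + 4 GB virtual memory.
gib = 1024 ** 3
assert memory_usage_mb(2 * gib) == 2048.0
assert memory_usage_percent(2 * gib, 4 * gib, 4 * gib) == 25.0
```

The function names and the exact accounting are illustrative, not taken from the probe's source.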
CPU Usage
This section is divided into two tabs: Total CPU and Individual CPU. These measurements are all in %.
Note: The Individual CPU tab remains disabled in a single CPU configuration.
Note: If running on a multi-CPU system, the queued processes are divided by the number of processors. For example, if running on
a system with four processors and using the default Max Queue Length value (4), alarm messages are generated if the number of
queued processes exceeds 16.
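The scaling described in the note above can be sketched as follows. This is a hedged illustration; the function names are hypothetical, not probe internals.

```python
# Sketch of how the processor-queue alarm threshold scales on
# multi-CPU systems: the configured Max Queue Length is per processor.

def effective_queue_limit(max_queue_length: int, num_processors: int) -> int:
    """Alarm fires when the number of queued processes exceeds this value."""
    return max_queue_length * num_processors

def should_alarm(queued: int, max_queue_length: int, num_processors: int) -> bool:
    return queued > effective_queue_limit(max_queue_length, num_processors)

# Default Max Queue Length of 4 on a four-processor system:
assert effective_queue_limit(4, 4) == 16
assert should_alarm(17, 4, 4) is True
assert should_alarm(16, 4, 4) is False
```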
On AIX LPARs, the probe derives CPU usage from the output of the lparstat -i command, scaling each raw value by the entitled capacity:
TotCapacity = (maxVirtualCPU / maxCapacity) * 100;
cpuStats->fUser = (double)((cpuStats->fUser * cpuStats->fEntCap)/TotCapacity);
cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap)/TotCapacity);
cpuStats->fWait = (double)((cpuStats->fWait * cpuStats->fEntCap)/TotCapacity);
cpuStats->fIdle = (double)((cpuStats->fIdle * cpuStats->fEntCap)/TotCapacity);
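The formulas above scale each raw CPU percentage by the LPAR's entitled capacity. A hedged Python sketch of the same arithmetic (field names mirror the document; the actual probe internals may differ):

```python
# Hedged sketch of the entitled-capacity scaling shown above for AIX LPARs.
# Values such as ent_cap, maxVirtualCPU, and maxCapacity come from `lparstat -i`.

def scale_lpar_cpu(stats: dict, max_virtual_cpu: float, max_capacity: float) -> dict:
    # Total capacity as a percentage of the partition's maximum capacity.
    tot_capacity = (max_virtual_cpu / max_capacity) * 100.0
    ent_cap = stats["ent_cap"]  # entitled capacity of the partition
    return {
        key: (stats[key] * ent_cap) / tot_capacity
        for key in ("user", "system", "wait", "idle")
    }

# Example: 4 virtual CPUs, maximum capacity 2, entitled capacity 1.5.
stats = {"user": 40.0, "system": 10.0, "wait": 5.0, "idle": 45.0, "ent_cap": 1.5}
scaled = scale_lpar_cpu(stats, max_virtual_cpu=4, max_capacity=2)
```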
Paging measured in
Paging can be measured in Kilobytes per second or pages per second.
Paging is the amount of memory which has been sent to or read from virtual memory. This option lets you select the paging to be measured in
one of the following units:
Kilobytes per second (KB/s)
Pages per second (Pg/s). Note that the size of the pages may vary between different operating systems.
Note: When changing the paging selection, the header of the Paging graph on the Status tab will immediately change to show the
selected unit, but the values in the graph will not change until the next sample is measured.
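Because the page size varies between operating systems, converting between the two units above requires knowing the platform's page size. A hedged sketch of the conversion (4096 bytes is a common page size, but not universal):

```python
# Sketch converting a paging rate from pages per second to kilobytes
# per second. The page size is OS-dependent; 4096 bytes is assumed here.

def pages_to_kb_per_sec(pages_per_sec: float, page_size_bytes: int = 4096) -> float:
    return pages_per_sec * page_size_bytes / 1024

# 100 pages/s with 4 KB pages is 400 KB/s:
assert pages_to_kb_per_sec(100) == 400.0
```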
For NFS file systems, you can choose to send a QoS message on Disk availability. To do this, right-click the file system on the Status tab and select
Edit. Select the Disk Available Quality of Service in the properties dialog and click OK. See Edit Disk Properties for more details.
Memory usage on Solaris systems
There seems to be some confusion about the memory usage the cdm probe reports on Solaris systems. Most often, the issue is that cdm
does not provide the same numbers that the popular TOP utility does. The main reason for this is that TOP and CDM gather swap
information differently.
CDM gathers swap information in a similar way to the Solaris utility swap -l, but uses pages instead of blocks. To compare the swap
information between CDM and the swap utility, take the blocks that swap reports and run them through the formula: (blocks * 512) / (1024 *
1024) = total swap in MB. This is the same number of MB the CDM probe uses in its calculations.
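The conversion above can be checked directly: swap -l reports 512-byte blocks, and dividing by 1024 * 1024 yields MB.

```python
# Sketch of the conversion described above: `swap -l` reports 512-byte
# blocks; CDM works with the equivalent size in MB.

def swap_blocks_to_mb(blocks: int) -> float:
    return (blocks * 512) / (1024 * 1024)

# 4,194,304 blocks of 512 bytes equal 2048 MB (2 GB) of swap:
assert swap_blocks_to_mb(4194304) == 2048.0
```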
TOP, on the other hand, gathers information about anonymous pages in the VM, which is quicker and easier to gather but does not represent a
true picture of the amount of swap space available and used. The reason is that anonymous pages also take into account physical memory
that is potentially available for use as swap space. Thus, the TOP utility reports more total swap space, since it is also factoring in
physical memory not currently in use.
CDM and TOP gather physical memory information in similar ways, so the differences in available physical memory should be insignificant.
Since CDM does not differentiate between available swap and physical memory (after all, it is only when you run out of both resources
that things stop working on the system), the accumulated numbers are used. The accumulated numbers for TOP will be off, since the free
portions of physical memory are counted twice in many instances. While we could easily represent the data in the same format that TOP
does, we feel it does not give a correct picture of the memory/swap usage on the system.
Custom Tab
The Custom tab displays a list of all currently defined custom profiles. Custom profiles are used to get additional thresholds and alarms for
checkpoints that are available in the probe. All the alarm situations are available, except for those available for multi-CPU and cluster disks. A
custom profile allows you to fine-tune monitoring of resources for alarming purposes.
The alarms for each custom profile are sent using suppression keys unique to the profile, so that you can get multiple alarms for what is
basically the same alarm situation (for instance, a breach of the memory usage threshold).
You can right-click inside the above dialog to create new custom profiles to monitor the CPU, disk or memory. Once a custom profile is created
you can select one or more custom profiles to edit, delete or activate/deactivate as and when required.
New CPU Profile
You can create a custom profile for a local disk, shared disk, or for a disk available on a network.
The fields are explained below:
Name: defines the disk profile name.
Description: defines the description of the disk profile.
Regular Expression for Mount Point: defines a regular expression through which you can monitor your Custom Local Disk (on the
Windows platform) and Custom Local and NFS (on the Linux, Solaris, and AIX platforms).
Note: On selecting this option, the Mount point drop-down menu and the Remote Disk field are disabled, which means that
monitoring is enabled either through the regular expression or through the drop-down menu.
Active: activates the alarm generation if the disk is unavailable or not mounted.
Allow space monitoring: lets you configure three new checkpoints to monitor the disk, which are Disk free space, Inodes free, Space
usage change.
For more information on these checkpoints, refer to the Control Properties Tab section.
To access the raw configuration pages hold the Shift key and right click the cdm probe in Infrastructure Manager. Then select the Raw Configure
option from the right-click menu. The raw configuration allows you to edit the configuration file or edit the data file.
Below are some useful options that can be set, using the Raw Configuration tool:
ignore_device
ignore_filesystem
Note: Ensure that you add these two keys manually and then set up the respective configuration.
To ignore certain disks and/or file systems, you can edit one of these two keys in the <disk> section:
ignore_device = /<regular expression>/
ignore_filesystem = /<regular expression>/
The value should be a regular expression that would match all disks and/or filesystems that you want the probe to ignore. Here is an example to
ignore Auto-mounted disks that are recognized on each "disk interval":
ignore_device = /autofosmount.*|.*:V.*/
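The expression above can be tested before deploying it. The sketch below uses Python's re module for illustration (the probe has its own regex handling, and the device names here are made up):

```python
import re

# Testing the ignore_device expression from the example above.
# Device names are illustrative, not taken from a real system.
pattern = re.compile(r"autofosmount.*|.*:V.*")

assert pattern.match("autofosmount01") is not None   # auto-mounted disk
assert pattern.match("server:Volume1") is not None   # matches .*:V.*
assert pattern.match("/dev/sda1") is None            # ordinary device, kept
```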
The following key has been introduced in the Raw Configuration section to make the device ID of a shared drive and a local drive identical. Set
this key to No to enable this feature.
allow_remote_disk_info
Default: Yes
Note: When performing this operation with the cdm probe, you must ensure that the disk partitions are the same on the source and the
target computers.
For example, if the source computer has a C: and a D: partition, and you copy the cdm probe configuration to a cdm probe on a computer with
only a C: partition, the cdm probe on this computer will try to monitor a D: partition (which is missing) and report an error.
Follow these steps:
1. Log on to the robot where your configured cdm probe resides.
2. Select the cdm probe to be copied from the probe list in the Infrastructure Manager and drag and drop the probe into the Archive.
3. Click Rename and enter a unique package name for the copy of the cdm probe archive. For example, rename the package to cdm_master.
The distribution progress window appears, and the configuration of the probe is completed after the distribution process is finished.
EMC Celerra storage systems support environments which include mainframes, desktops, servers, applications, cloud, and business services.
The EMC Celerra Monitoring (celerra) probe monitors the performance and availability of EMC Celerra storage systems. The celerra probe uses
ssh (a secure shell) in combination with the Celerra command interface provided by EMC. The probe connects to a Celerra Network Server node
to collect and store data and information from the monitored system. You can define alarms that are generated when the specified thresholds are
breached.
More information:
celerra (EMC Celerra Monitoring) Release Notes
This article describes the configuration concepts and procedures for setting up the celerra probe. The following figure provides an overview of the
process you must follow to configure a working probe.
Delete Resource
You can delete an existing resource when you no longer need it.
Follow these steps:
1. Click the Options icon beside the Profile-resource name node that you want to delete.
2. Click the Delete Resource option.
The resource is deleted.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
celerra Node
Profile-<Resource Name> Node
<Resource Name> Node
Storage Node
<Monitor Name> Node
The EMC Celerra Monitoring probe can handle all common monitoring and data collection tasks for EMC Celerra Storage Systems.
You can define alarms to be raised and propagated when the specified thresholds are breached.
The following components are monitored:
Data Movers
Storage Systems and Processors
File Systems
Note: The celerra probe generates an alert if the File System is deleted or renamed. You must manually reconfigure the monitor if the File System
is renamed.
Networks
celerra Node
This node contains configuration details specific to the EMC Celerra Monitoring probe. In this node, you can view the probe information and can
configure the general setup properties of the EMC Celerra Monitoring probe. You can view a list of all alarm messages that are available in the
EMC Celerra Monitoring probe. You can also add a resource in the EMC Celerra Monitoring probe.
Navigation: celerra
Set or modify the following values that are based on your requirement:
celerra > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
celerra > General Setup
This section allows you to configure the general setup properties of the EMC Celerra Monitoring probe.
Log Level: Specifies the level of details that are written to the log file.
Default: 2 - Warn
Enable GUI Autorefresh (60 sec): Allows the GUI to refresh automatically every 60 seconds. The field then reflects the most recent
measured values from the checkpoints and the status of the nodes in the tree structure.
Default: Not selected
Java Home: Defines a folder containing Java home.
celerra > Message pool
This section displays a list of all alarm messages that are available in the EMC Celerra Monitoring probe.
Identification Name: Identifies the name of the message.
Token: Identifies the token which is used for internationalization.
Error Severity: Indicates the severity level that is assigned to the alarm messages.
Error Alarm Text: Identifies the alarm message text that is issued on the error alarm.
Clear Alarm Text (OK): Identifies the alarm message text that is issued on the clear alarm.
Subsystem String/Id: Identifies the subsystem ID of the alarm that the watcher generates.
celerra > Options Icon > Add New Resource
This section allows you to add a resource in the EMC Celerra Monitoring probe.
Control Station IP: Defines the IP address of the Celerra Network Control Station.
Port: Specifies the ssh port that is used to connect to the Celerra Network Control Station.
Default: 0
Source: Overrides the default QoS source with the provided value. The default value for the QoS source is the robotname where the
probe is deployed.
Important! It is recommended not to change the Source field after the initial configuration. If you change the QoS
source later, multiple graphs display in the Unified Service Management (USM) Metrics view (one for every QoS source
value). It is also recommended to use the same source for both alarms and QoS.
The Profile-resource name node allows you to configure the host information of EMC Celerra Monitoring probe.
Navigation: celerra > profile-resource name
Set or modify the following values that are based on your requirement:
profile-resource name > Celerra Host Information
This section allows you to configure the host information of the EMC Celerra Monitoring probe.
<Resource Name> Node
The resource name node represents the resource name of the profile that is used to monitor the EMC Celerra Monitoring probe.
Note: This node is user-configurable and depends on the resource name of the profile. Hence, this node is referred to as the resource
name throughout this document.
Storage Node
The Storage node allows you to configure the resource that has been added. This node also lets you add and delete the monitors to be
measured.
The Storage node contains two types of monitors:
Control Station
Storage
Navigation: celerra > Profile-resource name > resource name > Storage
Set or modify the following values that are based on your requirement:
Storage > Resource Configuration
This section allows you to set up the resource configuration of the EMC Celerra Monitoring probe.
<Monitor Name> Node
The monitor name node allows you to select one or more monitors from a list of all available monitors.
Navigation: celerra > Profile-resource name > resource name > Storage > monitor name
Set or modify the following values that are based on your requirement:
monitor name > Monitors List
This section allows you to select one or more monitors from a list of available monitors.
The Resource dialog appears. Define a resource that represents a Celerra system.
Field: Description

Resource Name
Source Name
Control Station IP
Port: The ssh port used to connect to the Celerra Network Control Station. Unless specifically created otherwise when the Control Station was set up, this is port 22.
Reserved: Enter any character in this field. It is unused in the first release but may be activated in future releases.
Username: The username created at the time the Celerra Network Control Station was created. It is usually "nasadmin".
Password
Group: Here you can select which group you want the resource to belong to. Normally you just have the Default group.
Alarm message: Select the alarm message to be sent if the resource does not respond. Note that you can edit the message or define your own using the Message Pool Manager.
Check interval: Defines how often the probe checks the values of the monitors.
Login Scope: The login scope for the Celerra system user. It is recommended to use Global or Local users for monitoring the Celerra system, in order to remove any dependency on an LDAP server.
Note that you may also add monitors to be measured using templates.
When you select the All Monitors node, all monitors currently being measured are listed in the right pane. Note that you can also select/deselect
monitors here.
Using Templates
Templates are useful tools for defining monitors to be measured on the various elements of a Celerra system:
You may create templates and define a set of monitors belonging to that template. These templates can be applied to a folder or element by
dragging and dropping the template on the node in the tree where you want to measure the monitors defined for the template. You may also drop
a template on a resource in the tree structure, and the template will be applied to all elements for the resource.
Creating a template
Right-click the Templates node in the left window pane and select New.
Note that you may also edit an existing template by selecting one of the templates defined (found by expanding the Templates node
and selecting Edit).
The Template Properties dialog appears, letting you specify a Name and a Description for the new template.
You may now edit the properties for the monitors under the template, as described earlier in this section.
Applying a template
Drag and drop the template on the element where you want to monitor the checkpoints defined for the template. Note that you may also drop the
template on a folder containing multiple elements.
Note: Adding many monitors/templates to the Auto Configurations node may result in a very large number of Auto Monitors
(see below).
Auto Monitors
This node lists Auto Monitors created for previously unmonitored devices, based on the contents added to the Auto Configuration node.
The Auto Monitors will only be created for devices that are currently NOT monitored.
Adding a template to the Auto Configurations node
You can add a template (see the Using Templates section to learn more about templates) by selecting the Templates node in the left pane. All
available templates are then listed in the right pane. Add a template to the Auto Configurations node by dragging the template and dropping it
on the Auto Configurations node.
Click the Auto Configurations node and verify that the template was successfully added. Note that you must also click the Apply button and
restart the probe to activate the Auto Configuration feature.
Right-clicking in the list and selecting Edit lets you edit the properties for the template or monitor. See the relevant section for detailed information.
Right-clicking in the list and selecting Delete lets you delete the template or monitor from the list.
Field: Description

Name: The name of the monitor. The name is inserted into this field when the monitor is created, but you are allowed to modify it.
Key
Description: A description of the monitor. The description is inserted into this field when the monitor is created, but you are allowed to modify it.
Value Definition: This drop-down list lets you select which value to be used, both for alarming and QoS. You have the following options:
The current value, meaning that the most current value measured will be used.
The delta value (current - previous), meaning that the delta value calculated from the current and the previous measured sample will be used.
Delta per second, meaning that the delta value calculated from the samples measured within a second will be used.
The average value of the last and current sample: (current + previous) / 2.
Active
Enable Monitoring
Alarms
Operator: Select from the drop-down list the operator to be used when setting the alarm threshold for the measured value. Example: => 90 means alarm condition if the measured value is above 90; = 90 means alarm condition if the measured value is exactly 90.
Threshold: The alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Unit
Message Token: Select the alarm message to be issued if the specified threshold value is breached. These messages are kept in the message pool. The messages can be modified in the Message Pool Manager.
Advanced
Publish Quality of Service: Select this option if you want QoS messages to be issued on the monitor.
QoS Name
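The four value-definition options described above can be sketched as small functions. This is a hedged illustration assuming evenly spaced samples; the names are hypothetical, not probe internals.

```python
# Sketch of the four value-definition options: current, delta,
# delta per second, and average of the last and current sample.

def current(curr: float, prev: float, interval_s: float) -> float:
    return curr

def delta(curr: float, prev: float, interval_s: float) -> float:
    return curr - prev

def delta_per_second(curr: float, prev: float, interval_s: float) -> float:
    return (curr - prev) / interval_s

def average(curr: float, prev: float, interval_s: float) -> float:
    return (curr + prev) / 2

# With samples 120 and 100 taken 10 seconds apart:
assert delta(120, 100, 10) == 20
assert delta_per_second(120, 100, 10) == 2.0
assert average(120, 100, 10) == 110.0
```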
Groups
You can create new groups by right-clicking a group and selecting New Group (or by clicking the New Group tool button).
Resources
A group contains one or more resources. On this probe you normally just create one resource. This resource is configured as a link to the
Celerra system. It is possible to move a resource from one group to another, using drag and drop.
Under the Resource node you will find:
The Auto Configurations sub node
One or more checkpoints (or templates) can be added to this node, using drag and drop to be used for auto configuring unmonitored
devices. See the section Using Automatic Configurations for further information.
The Auto Monitors sub node
This node lists Auto Monitors, created for previously unmonitored devices, based on the contents added to the Auto Configuration
node.
Templates
This section enables you to create a template and define a set of checkpoints belonging to that template. Right-clicking on any template under the
Templates node lets you add a new template, edit or delete an existing template.
QoS
This node contains the standard QoS definitions included with the probe package. These can be selected when editing the monitoring properties
for a monitor. To define your own QoS definitions, right-click the QoS node and select New.
Right-clicking in the left pane
Right-clicking in the left pane opens a pop-up menu, displaying the following options:
New Resource
Available only when a group or a resource is selected.
Opens the Resource dialog, enabling you to define a new resource to be monitored.
New Group
Available only when a group or a resource is selected.
Creates a new group where you can place resources. The new group will appear in the pane with the name New Group.
Right-click the new group and select Rename to give the group a name of your own choice.
Edit
Available only when a resource is selected.
Lets you edit the properties for the selected resource.
Delete
Lets you delete the selected element (group, resource or QoS definition). Note that the Default group cannot be deleted, but if you
remove all elements from the group, it will not appear the next time you restart the probe.
Rename
Lets you rename the selected element (group or resource). Note that the Default group cannot be renamed.
Reload
Refreshes the window to display the most current measured values for the monitors.
Note: When selecting refresh, you will not get updated values until the next time the probe has polled the Celerra system. This interval
is set by the Check Interval set on the properties dialog for the Resource.
The contents of the right pane depends on what you select in the left pane:
Resources when a group is selected in the left pane.
Monitors when a resource is selected in the left pane.
Note the following icons:
Monitor where no value has been measured yet.
Black: Indicates that the monitor is NOT activated for monitoring. That means that the Enable Monitoring option is not set in the
properties dialog for the monitor.
Green: Indicates that the monitor is activated for monitoring, and the threshold value defined in the properties dialog for the monitor
is not exceeded.
Other colors: Indicates that the monitor is activated for monitoring, and the threshold value defined in the properties dialog for the
monitor is exceeded. The color reflects the message token selected in the properties dialog for the monitor.
Monitor where QoS is enabled but alarms are not.
QoS definitions when the QoS sub-node is selected in the left-pane.
Right-clicking in the pane gives you the following possibilities:
When the QoS definitions are listed in the pane:
Right-clicking in the list opens a small menu, giving you the possibility to Add (New) or Delete a QoS definition.
When the resources are listed in the pane:
Right-clicking in the list opens a small menu, giving you the following options:
New
Opens the Resource dialog, allowing you to define a new resource.
Edit
Opens the Resource dialog for the selected resource, allowing you to modify the properties.
Delete
Deletes the selected resource.
Activate
Activates the selected resource.
Deactivate
Deactivates the selected resource.
When the monitors are listed in the pane:
Right-clicking in the list opens a small menu, giving you the following options:
Edit
Opens the Monitor properties dialog for the selected monitor, allowing you to modify the properties.
Delete
Unavailable in this view; it is greyed out.
Refresh
Refreshes the window to display the most current measured values for the monitors.
Note: When selecting refresh, you will not get updated values until the next time the probe has polled the Celerra system.
This interval is set by the Check Interval set on the properties dialog for the Resource.
Add to Template
Allows for the addition of the specified monitor to any existing template.
Clicking the General Setup button opens the General Setup dialog.
Field: Description

Log-level: Sets the level of details written to the log file. Log as little as possible during normal operation, to minimize disk consumption.
0 = Fatal errors.
1 = Errors.
2 = Warnings.
3 = Information.
4 = Debug info.
5 = Debug information, extremely detailed.
Enable GUI auto-refresh: When this option is selected, the GUI is refreshed every 60 seconds. This reflects the most recent measured values from the checkpoints and the status of the nodes in the tree structure displayed when navigating through the Celerra system under a resource node. If not checked, press F5 to refresh the GUI.
Note: When pressing F5 to refresh, you will not get updated values until the next time the probe has polled the Celerra system. This interval is set by the Check Interval on the properties dialog for the Resource.
Environment: Reflects the folder containing Java home.
celerra Metrics
This section contains the QoS metrics for the EMC Celerra Monitoring (celerra) probe. The probe uses string and numeric monitors. String
monitors generate alarms with text values for resources. Numeric monitors generate Quality of Service (QoS) data and alarms with numeric
values for resources.
The numeric metrics in this probe are categorized according to the monitored resource as follows:
Each QoS monitor is listed below with its unit of measurement.

QOS_STORAGE_CFG_TOTAL_CAPACITY: GB
QOS_STORAGE_CFG_FREE_CAPACITY: GB
QOS_STORAGE_CFG_FREE_CAPACITY_PERCENT: Percent
QOS_STORAGE_CFG_USED_CAPACITY: GB
QOS_STORAGE_RAW_TOTAL_CAPACITY: GB
QOS_STORAGE_RAW_FREE_CAPACITY: GB
QOS_STORAGE_RAW_FREE_CAPACITY_PERCENT: Percent
QOS_STORAGE_RAW_USED_CAPACITY: GB
QOS_STORAGE_NUM_OF_CONFIGURED_DISKS: Count
QOS_STORAGE_NUM_OF_DEVICES: Count
QOS_STORAGE_NUM_OF_DISKS: Count
QOS_STORAGE_NUM_OF_HOT_SPARES: Count
QOS_STORAGE_NUM_OF_UNCONFIGURED_DISKS: Count

QOS_DMBM_BLOCK_MAP_CONSUMED: KB
QOS_DMBM_BLOCK_MAP_QUOTA: KB
QOS_DMBM_PAGE_IN_RATE: Count
QOS_DMBM_PAGE_OUT_RATE: Count
QOS_DMBM_TOTAL_PAGED_IN: Count
QOS_DMBM_TOTAL_PAGED_OUT: Count

QOS_DMFS_AVAILABLE_CAPACITY: KB
QOS_DMFS_CAPACITY_FREE_PERCENT: Percent
QOS_DMSF_CAPACITY_USED_PERCENT: Percent
QOS_DMFS_TOTAL_CAPACITY: KB
QOS_DMFS_USED_CAPACITY: KB

QOS_DMSS_IDLE_CPU_PERCENT: Percent
QOS_DMSS_MEMORY_FREE: KB
QOS_DMSS_THREADS_BLOCKED: Count
QOS_DMSS_THREADS_IJZ: Count
QOS_DMSS_THREADS_RUNABLE: Count

Network - ICMP
QOS_ICMP_CALLS_TO_ERROR: Count
QOS_ICMP_MESSAGES_RECEIVED: Count
QOS_ICMP_MESSAGES_SENT: Count

Network - IP
QOS_IP_TOTAL_PACKETS_RECEIVED: Count
QOS_IP_BAD_HEADER_CHECKSUMS: Count
QOS_IP_WITH_UNKNOWN_PROTOCOL: Count
QOS_IP_FRAGMENTS_RECEIVED: Count
QOS_IP_FRAGMENTS_DROPPED: Count
QOS_IP_FRAGMENTS_DROPPED_AFTER_TIMEOUT: Count
QOS_IP_PACKETS_REASSEMBLED: Count
QOS_IP_PACKETS_FORWARDED: Count
QOS_IP_PACKETS_NOT_FORWARDABLE: Count
QOS_IP_NO_ROUTES: Count
QOS_IP_PACKETS_DELIVERED: Count
QOS_IP_TOTAL_PACKETS_SENT: Count
QOS_IP_PACKETS_FRAGMENTED: Count
QOS_IP_PACKETS_NOT_FRAGMENTABLE: Count
QOS_IP_FRAGMENTS_CREATED: Count

Network - TCP
QOS_TCP_PACKETS_SENT: Count
QOS_TCP_DATA_PACKETS_RETRANSMITTED: Count
QOS_TCP_RESETS: Count
QOS_TCP_PACKETS_RECEIVED: Count
QOS_TCP_CONNECTION_REQUESTS: Count
QOS_TCP_CONNECTIONS_LINGERED: Count

Network - UDP
QOS_UDP_BAD_PORTS: Count
QOS_UDP_INPUT_PACKETS_DELIVERED: Count
QOS_UDP_INCOMPLETE_HEADERS: Count
QOS_UDP_PACKETS_SENT: Count

Storage Systems
QOS_SS_CACHE_PAGE_SIZE: Count
QOS_SS_HIGH_WATER_MARK: Count
QOS_SS_LOW_WATER_MARK: Count
QOS_SS_NUMBER_OF_DEVICES: Count
QOS_SS_NUMBER_OF_DISKS: Count
QOS_SS_NUMBER_OF_PHYSICAL_DEVICES: Count
QOS_SS_NUMBER_OF_RAID_GROUPS: Count
QOS_SS_NUMBER_OF_STORAGE_GROUPS: Count
QOS_SS_UNASSIGNED_CACHE: Count

QOS_SSDG_USED_PERCENT: Percent
QOS_SSDG_FREE_PERCENT: Percent
QOS_SSDG_LOGICAL_CAPACITY: Bytes
QOS_SSDG_RAW_CAPACITY: Bytes
QOS_SSDG_USED_CAPACITY: Bytes
QOS_SSDG_FREE_CAPACITY: Bytes

QOS_SSPIN_CAPACITY: Bytes
QOS_SSPIN_USED_CAPACITY: Bytes

QOS_SSSP_FREE_MEMORY: Count
QOS_SSSP_RAID_3_MEMORY_SIZE: Count
QOS_SSSP_READ_CACHE: Count
QOS_SSSP_SYSTEM_BUFFER: Count
QOS_SSSP_WRITE_CACHE: Count
QOS_SSSP_PHYSICAL_MEMORY: Count

QOS_SVD_PERCENT_USED: Percent
QOS_SVD_SIZE_TOTAL: MB
QOS_SVD_SIZE_AVAILABLE: MB
QOS_SVD_SIZE_USED: MB

QOS_SVG_PERCENT_USED: Percent
QOS_SVG_SIZE_TOTAL: MB
QOS_SVG_SIZE_AVAILABLE: MB
QOS_SVG_SIZE_USED: MB

QOS_SVM_PERCENT_USED: Percent
QOS_SVM_SIZE_TOTAL: MB
QOS_SVM_SIZE_AVAILABLE: MB
QOS_SVM_SIZE_USED: MB

QOS_SVSL_PERCENT_USED: Percent
QOS_SVSL_SIZE_TOTAL: MB
QOS_SVSL_SIZE_AVAILABLE: MB
QOS_SVSL_SIZE_USED: MB
QOS_SVSL_SIZE: Count
QOS_SVSL_OFFSET: Count

QOS_SVST_PERCENT_USED: Percent
QOS_SVST_SIZE_TOTAL: MB
QOS_SVST_SIZE_AVAILABLE: MB
QOS_SVST_SIZE_USED: MB
QOS_SVST_SIZE: Count
More information:
cisco_monitor (Cisco SNMP Device Monitoring) Release Notes
cisco_monitor IM Configuration
You can configure the cisco_monitor probe to map selected SNMP enabled Cisco hardware devices and define variable settings to monitor the
performance of these devices.
SNMP V3 Support
The cisco_monitor probe enables you to monitor hosts/agents based on the SNMP V3 protocol. Adhere to the following guidelines when
monitoring SNMP V3 hosts:
If the same probe instance monitors multiple SNMP V3 hosts/agents, ensure that the EngineID of each host/agent is unique. The
absence of unique EngineIDs causes sporadic connection timeouts and failure alarms.
The probe does not support creating multiple monitoring profiles for one V3 host/agent. Adding such duplicate profiles is disabled in
most places in the probe GUI, except the Bulk Configure screen. Do not use the Raw Configure option or edit the configuration file
directly to create multiple profiles for the same V3 host/agent; this can cause unpredictable results.
The following diagram outlines the process to configure the cisco_monitor probe.
Contents
Verify Prerequisites
Configure General Properties
Configure Agent Group
Create SNMP Agent Profile
Configure the Variable Properties
Set Up Bulk Configuration of SNMP Agents
Get Oid Values
Monitor a Variable
Monitor Devices Using Ping Sweep
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see cisco_monitor (Cisco SNMP
Device Monitoring) Release Notes.
SNMP request timeout: specifies the timeout value for the SNMP requests. You can also override this value for each profile.
Default alarm message string: specifies the default alarm message issued when alarm situations occur.
Log-level: specifies the level of details that are written to the log file.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
3. Click OK.
The general properties of the probe are now configured.
Note: You can drag and drop agents to move them between groups. Deleting a group also deletes the agents of that group. A
message appears asking you to confirm whether to delete the group along with its agent definitions.
Rename
Lets you rename the selected group or agent. You are not allowed to rename the Default group.
Ping
Pings the selected agent. If the agent responds, you are notified with the ping round-trip time.
This option is available only when an agent is selected.
Information
Opens a dialog containing system and configuration information for the selected agent.
This option is available only when an agent is selected.
Follow these steps:
1. Right-click the Default Group and select New Group. You can also click the Create a New Agent Group button in the toolbar.
A folder New Group(n) is created in the tree.
2. Rename the group as required.
3. Click OK to save the group.
When an agent is unavailable, a red icon is displayed next to the agent name.
Note: Reduce this interval to generate alarms more frequently. However, a shorter interval also increases the system load.
Severity: specifies the severity for alarm messages if the agent host does not respond.
Group: specifies the agent group in which the agent is to be placed.
SNMP Settings: specifies the SNMP parameters for the profile. For more information, see the Configure SNMP Parameters section.
QoS Identification Method: enables you to define a QoS source.
3. Click Test to send a test SNMP query.
The probe displays a successful connection message.
4. Click OK and Apply to save the configuration.
The profile is created for the required SNMP Agent. The probe queries the SNMP agent and displays an indicator showing
that the host is active and available for monitoring.
7. Select Show password to display the string in the Community/password field as plain text.
Default: Not Selected
8. Specify a username to access the monitored device.
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when SNMPv3 is selected from
the SNMP Version field.
9. Specify the security level for the user. Valid levels are NoAuthNoPriv, AuthNoPriv, and AuthPriv.
Note: The Priv. Protocol and Priv. Passphrase fields are only enabled when AuthPriv is selected.
10. Specify the privacy protocol to use in the Priv. Protocol field.
11. Specify your privacy passphrase in the Priv. Passphrase field. It must be at least eight characters long.
The SNMP parameters are configured for the profile.
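The SNMPv3 rules above (Priv. fields apply only at the AuthPriv level, and the privacy passphrase must be at least eight characters long) can be sketched as a small validation helper. This is an illustrative sketch only, not probe code; the function and message texts are assumptions.

```python
# Sketch of the SNMPv3 profile validation rules described above.
VALID_LEVELS = {"NoAuthNoPriv", "AuthNoPriv", "AuthPriv"}

def validate_v3_settings(security_level, priv_protocol=None, priv_passphrase=None):
    """Return a list of problems with the given SNMPv3 profile settings."""
    problems = []
    if security_level not in VALID_LEVELS:
        problems.append("unknown security level: %s" % security_level)
    # Privacy settings are only relevant (and required) at the AuthPriv level.
    if security_level == "AuthPriv":
        if not priv_protocol:
            problems.append("AuthPriv requires a privacy protocol")
        if not priv_passphrase or len(priv_passphrase) < 8:
            problems.append("privacy passphrase must be at least 8 characters")
    return problems

print(validate_v3_settings("AuthPriv", "AES", "short"))
# ['privacy passphrase must be at least 8 characters']
```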
Important! You must set a rule before the Extend Rule option is selected. The Extend Rule tab is disabled if you select
the operator = in the Rule tab.
For example: If you create a rule as: Operator =>, Threshold Value 5, Severity Level Major, and you create an extended rule as: Operator =>,
Threshold Value 10, Severity Level Critical, then if the measured value is 6, the alarm message specified on
the Rule tab (severity level Major) is issued. If the measured value is 13, the alarm message specified on the Extended Rule tab
(severity level Critical) is issued.
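The interaction of the rule and the extended rule in the example above can be sketched as follows. This is an illustrative sketch, not the probe's internal logic; ">=" stands in for the "=>" operator, and each rule is modeled as a (threshold, severity) pair.

```python
# When both thresholds are breached, the extended rule's severity wins,
# matching the example: 6 breaches only the rule, 13 breaches both.
def evaluate(value, rule, extended_rule=None):
    """Return the severity to issue, or None if no threshold is breached."""
    severity = None
    if value >= rule[0]:
        severity = rule[1]
    if extended_rule is not None and value >= extended_rule[0]:
        severity = extended_rule[1]
    return severity

print(evaluate(6, (5, "Major"), (10, "Critical")))   # Major
print(evaluate(13, (5, "Major"), (10, "Critical")))  # Critical
```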
Operator: select an operator when defining a threshold value for alarms to be issued.
Threshold Value: specify an alarm threshold value, which when breached, issues an alarm.
Severity level: select the severity level of the alarms, issued when the specified threshold value is breached.
Unit: indicates the unit that is combined with the threshold value (for example, MB or %).
Message String: specifies a text string that describes the alarm situation. If this field is kept blank, the default message (defined
under the General Properties for the probe) is displayed.
Clear Message String: specifies a text string that overrules the default clear message text. If this field is kept blank, the default clear
message is displayed.
Publish Quality of Service (QoS): if selected, enables you to send QoS data at the specified check interval for the profile.
2. Click OK.
Notes:
The Average Value is calculated on the basis of the value provided in the Calculate average based on samples field in the
Configure the Variable Properties section.
For state variables (Fan State, Temperature State, and so on), the cisco_monitor probe does not calculate the average value. It
displays the actual values of these state variables as average values under the Average Value column. The alarms and QoS
for these variables are also generated based on actual values.
The probe generates alarms on the basis of values displayed under the Average Value column, depending on the conditions
specified in the Configure the Variable Properties dialog (through Operator and Threshold Value fields).
The probe always generates QoS on actual value.
The Misses OIDs (Big Buffer Misses, Huge Buffer Misses, Large Buffer Misses, and so on) generate alarms on a change in
average value and QoS on a change in actual value.
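The value handling described in the notes above can be sketched as follows: alarms compare a rolling average of recent samples against the threshold, QoS always carries the actual value, and the Misses counters are evaluated as deltas between polls. This is a minimal illustration under those stated assumptions, not probe code.

```python
from collections import deque

class VariableSamples:
    """Keeps the last N samples, per the 'Calculate average based on samples' field."""
    def __init__(self, max_samples):
        self.samples = deque(maxlen=max_samples)

    def add(self, value):
        self.samples.append(value)

    def average(self):
        return sum(self.samples) / len(self.samples)

def delta(current, previous):
    """Change in a cumulative counter (e.g. Big Buffer Misses) between polls."""
    return current - previous

v = VariableSamples(max_samples=4)
for sample in (10, 20, 30, 40, 50):   # the oldest sample (10) falls out of the window
    v.add(sample)
print(v.average())                    # 35.0
print(delta(120, 100))                # 20
```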
Monitor a Variable
The Monitor option for a variable allows continuous monitoring of specific variables, and runs independently of the probe's general QoS/alarm
monitoring of an agent. You can right-click an SNMP variable and select the Monitor option to monitor the variable.
Note: Click the Interval field in the monitor window and select the sample interval from the interval drop-down list that appears. This
sample interval is for the monitor window only and has nothing to do with the sample rate defined in the agent profile.
1.
The Ping Sweep dialog appears.
2. Enter the starting IP address of an IP address range in the IP Address field and the number of hosts in the range in the Number field.
3. Click Generate.
The specified range appears in the form of a table in the Ping Sweep dialog.
Note: The probe ignores duplicate IP addresses. You can generate multiple ranges before starting the ping sweep.
4. Select the connection information that you want to use for the router detection from the Connection information drop-down list. The
Advanced tab in the Setup window is used to add connection information, which the Ping Sweep feature uses to connect to the
device.
5. Click Start.
If the Auto option is selected in the Connection information dialog, each IP address connection is checked. A red icon indicates
failure and a green icon indicates a successful connection. Further, for device discovery, the sysname OID is checked for Cisco strings.
If the sysname does not contain a Cisco string, ping sweep does not detect the Cisco device.
6. You can drag one or more (multi-select) discovered agents and drop them into any group in the left pane or at the root level.
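The range generation in steps 2-3 above can be sketched with the standard library: a starting IP address plus a host count expands to consecutive addresses, and duplicates across multiple generated ranges are ignored, as the note states. This is an illustrative sketch, not the probe's implementation.

```python
import ipaddress

def generate_range(start_ip, number, seen=None):
    """Return `number` consecutive IPv4 addresses starting at start_ip,
    skipping any address already generated (tracked in `seen`)."""
    seen = set() if seen is None else seen
    start = ipaddress.IPv4Address(start_ip)
    addresses = []
    for offset in range(number):
        addr = str(start + offset)
        if addr not in seen:          # the probe ignores duplicate IP addresses
            seen.add(addr)
            addresses.append(addr)
    return addresses

seen = set()
print(generate_range("192.168.1.10", 3, seen))  # ['192.168.1.10', '192.168.1.11', '192.168.1.12']
print(generate_range("192.168.1.11", 3, seen))  # ['192.168.1.13'] -- overlapping addresses ignored
```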
1.
Username: specifies the username to be used (valid only for SNMPv3).
Security: specifies the security descriptor to be used (valid only for SNMPv3).
Privacy Protocol: specifies the privacy protocol to be used (valid only for SNMPv3).
Privacy Passphrase: specifies the privacy passphrase to be used (valid only for SNMPv3).
2. Click OK.
The connection information is now added.
Note: If you do not want to use this connection, select the connection and click Delete.
cisco_monitor Metrics
This article describes the metrics that can be configured using the Cisco SNMP Device Monitoring (cisco_monitor) probe.
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the cisco_monitor probe.
Monitor Name | Units | Description | Version
QOS_CISCO_BUFFER_MISSES | Count | The average number of times that the router failed to queue a packet in a buffer. | 3.2
QOS_CISCO_ENVIRONMENT | State | | 3.2
QOS_CPU_USAGE | Percent | | 3.2
QOS_MEMORY_USAGE | Megabytes | | 3.2
QOS_MEMORY_USAGE_PERC | Percent | | 3.2
Note: The QoS for the Cisco Buffer Misses monitors is calculated based on the delta value. Thus, this QoS is delayed by one
check interval the first time.
Monitor Name | Error Threshold | Error Severity | Description
 | 90 | Minor |
 | 90 | Minor |
 | 90 | Minor |
Small Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a small buffer (64 to 104 bytes).
Medium Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a medium buffer (105 to 600 bytes).
Big Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a big buffer (601 to 1524 bytes).
Large Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a large buffer (1525 to 5024 bytes).
Very Large Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a very large buffer.
Huge Buffer Misses | 10240 | Minor | The average number of times that the router failed to queue a packet in a huge buffer (5025 to 120924 bytes).
Memory Free | | Minor | Indicates the number of megabytes from the memory pool that are currently unused on the managed device.
Memory Used | 10 | Minor | Indicates the number of megabytes from the memory pool that are currently in use by applications on the managed device.
Memory Percent Free | 10 | Minor |
 | | Minor |
Temperature State | | Minor |
Fan State | | Minor |
Voltage State | | Minor |
Operational Status | | Critical |
More information:
cisco_qos (Cisco Class-Based QoS Statistics Monitoring) Release Notes
cisco_qos Metrics
The following table describes the QoS metrics that can be configured using the cisco_qos probe.
QoS Name | Units | Description | Version
QOS_CISCO_DROP_BITRATE | Bits/sec | | 1.2
QOS_CISCO_DROP_BYTE | Bytes/sec | | 1.2
QOS_CISCO_DROP_PKTS | Packets | Dropped Packets | 1.2
QOS_CISCO_POST_POLICY_BITRATE | Bits/sec | | 1.2
QOS_CISCO_POST_POLICY_BYTE | Bytes/sec | | 1.2
QOS_CISCO_PRE_POLICY_BITRATE | Bits/sec | | 1.2
The following table describes the default settings for the cisco_qos alert metrics.
QoS Metric | Error Threshold | Error Severity | Description | Version
MsgAgentError | | Critical | | 1.2
MsgDropBitrate | | Minor | | 1.2
MsgPostPolicyBitrate | | Minor | | 1.2
MsgDropPackets | | Minor | Alarms to be issued when the dropped packets per second is below threshold. | 1.2
MsgDropBytes | | Minor | | 1.2
MsgPostPolicyBytes | | Minor | | 1.2
MsgPrePolicyBitrate | | Minor | | 1.2
ClassMap - A user-defined traffic class that contains one or more match statements used to classify packets into different categories.
Feature Action - An action is a QoS feature. Features include police, traffic-shaping, queueing, random detect, and packet marking (set). After the
traffic is classified, these actions can be applied to each traffic class.
PolicyMap - A user-defined policy that associates each QoS action with a user-defined traffic class (ClassMap).
Service Policy - A service policy is a policymap that is attached to a logical interface. Because a policymap can also be part of a
hierarchical structure (inside a classmap), only a policymap that is directly attached to a logical interface is considered a service policy. Each
service policy is uniquely identified by an index called cbQosPolicyIndex. This number is usually identical to its cbQosObjectsIndex as a
policymap.
Probe GUI
The cisco_qos probe is configured by double-clicking the line representing the probe in the Infrastructure Manager. This brings up the
configuration tool.
The window consists of two window panes: Left Pane and Right Pane.
Delete
Lets you delete the selected group or host from the list of monitored devices. Note that you are not allowed to delete the Default group.
Expand group
Expands/opens the group folder to show all associated hosts.
Collapse group
Collapses/closes the group folder to hide all associated hosts.
Show all agents
Selecting this option shows a folder called All Agents containing all agents (SNMP hosts) defined. If you clear this option, it will hide the
folder.
Note: This overrides the Show 'All Agents' option in the Setup dialog. In other words, if the option is selected here, but not in
the Setup dialog, the All Agents folder will be shown.
Query SNMP Agent Ctrl + Q
Selecting this option sends a query to the selected agent to check the current status. A successful response should look like the one
shown below:
Rediscover interfaces/policies
Rediscovers all interfaces and service policies on the selected host. The option is useful if one or more interfaces have been added on
the host.
Note: All interfaces on the host will be rediscovered, and you will be asked if you want to use the default monitoring parameters
(these can be edited by clicking the Set the default classmap parameters button in the toolbar).
You must click Yes to the following question to proceed, and No or Cancel to exit.
The last dialog asks if you wish to clear the existing interfaces and policies.
If you click Yes, all interfaces and policies on the host are rediscovered, and the default parameters are used for all of them.
If you click No, all interfaces and policies on the host are rediscovered:
The ones already monitored (listed in the right pane when the host is selected in the left pane) will keep their monitoring properties.
The policies rediscovered without a current monitoring profile will be assigned the default monitoring parameters.
Probe Configuration
This section contains specific configuration for the probe.
General Setup
Click the Setup icon to open the Setup dialog for the probe. The Setup dialog contains two tabs: the General tab and the Advanced tab.
General Tab
Note: Right-clicking in the left pane of the user interface and selecting Show All Agents overrides this option. In other words,
even if the option is not selected here but is selected in the right-click menu of the left pane, the All Agents folder will be
shown.
Log-size
Sets the size of the probe's log file to which probe-internal log messages are written. The default size is 100 KB.
When this size is reached, the contents of the file are cleared.
Advanced Tab
You may create a new folder/group by selecting the New folder button
1. Select the group it should belong to and press the New SNMP host icon
in the menu-bar.
The Host Profile [New] dialog will appear and prompt you for the hostname or IP address of the device/system.
2. Enter the requested data for the fields as described below.
Host address
Defines the host name or IP address of the host to be monitored.
Poll interval
Specifies the poll interval individually specified for the selected host. You may select the default polling interval, which means the
general value set for the probe (General setup), or you may select another value from the drop-down list.
SNMP version
Defines the SNMP software version number supported by the monitored device.
Authentication (SNMPv3 only)
Specifies the type of authentication strategy (none, HMAC-MD5-96 or HMAC-SHA-96).
Port
Defines the port to be used by the SNMP device. Default is 161.
Timeout
Specifies the timeout value for the SNMP requests. Use the default value, or select another value from the drop-down list. The default
value is 1 second.
Retries
Sets the number of times to send SNMP requests without a response from the device before giving up and regarding the device as not
available. The default number of retries is 5.
Community / Password
Specifies a password for the profile.
Username (SNMPv3 only)
Specifies a username defined on the monitored device.
Show Community/ Password
When selected, the entry in the password field will be shown as plain text.
Security
Specifies the security level for the user. Valid levels are NoAuthNoPriv, AuthNoPriv, and AuthPriv.
Priv. Protocol
Specifies the privacy protocol to use.
Priv Passphrase
Defines your privacy passphrase; not needed if the security level is NoAuthNoPriv or AuthNoPriv. Must be at least eight characters
long.
Monitoring group
Specifies the name of the group to which you want the host to belong.
Description
Provides the description of the profile.
Alarm Message
Selects the alarm message to be sent on agent error: Either use the default message ID, or another one defined in the Message Pool
Manager.
Message Identification Settings
Alarm identification method
Select one of the Alarm identification methods in order to specify the alarm source.
QoS identification method
Select one of the QoS identification methods in order to specify the QoS source.
Start Query
Sends a query to the host to check the response and that the communication works.
3. When finished, press the Start Query button and wait for about 10 seconds.
If the query is successful, a dialog box showing some system information about the device appears and the OK button
becomes active.
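The Timeout and Retries fields described for host profiles above amount to a simple resend loop: the request is retried up to the configured number of times without a response before the device is regarded as unavailable. The sketch below illustrates that loop with a stand-in `send_request` callable; it is not a real probe or SNMP API.

```python
def query_with_retries(send_request, retries=5):
    """Return the response, or None if every attempt goes unanswered."""
    for attempt in range(1 + retries):   # one initial try plus `retries` resends
        response = send_request()
        if response is not None:
            return response
    return None                          # device regarded as not available

def make_flaky(fail_count):
    """Test double: a sender that times out `fail_count` times, then answers."""
    state = {"calls": 0}
    def send():
        state["calls"] += 1
        return None if state["calls"] <= fail_count else "sysDescr=Cisco IOS"
    return send

print(query_with_retries(make_flaky(2), retries=5))   # 'sysDescr=Cisco IOS'
print(query_with_retries(make_flaky(10), retries=5))  # None
```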
3. Set the SNMP properties (refer to the properties for host profiles as described in the section Create New Host/Device Profile) and Monitoring
group (the name of the group in which you want to place the hosts).
4. Click the Start Query button to collect SNMP data from the hosts.
5. Click OK to activate the hosts answering the query (indicated with a green indicator in the SNMP query window).
The alarm messages for each alarm situation are stored in the Message Pool. Using the Message Pool Manager, you can customize the alarm
text, and you may also create your own messages.
Note that variable expansion in the message text is supported. If you type a $ in the Alarm text field, a dialog pops up, offering a set of variables to
choose from.
Click the Set the default classmap parameters icon in the toolbar to open the Default Classmap Settings dialog, enabling you to set the
default monitoring properties for the service policies.
These default parameters will be used when running the Rediscover interfaces/policy function. This is activated by right-clicking a host in the
left pane and selecting Rediscover interfaces/policy, if not selecting to merge data.
Note that all interfaces on the host will be rediscovered, and you will be asked if you want to use the default monitoring parameters (defined by
clicking the Default Service policy parameters button in the toolbar of the UI).
You must click Yes to the following question to proceed, and No or Cancel to exit.
The last dialog asks if you wish to clear the existing interfaces and policies.
If you click Yes, all interfaces and policies on the host are rediscovered, and the default parameters are used for all of them.
If you click No, all interfaces and policies on the host are rediscovered:
The ones already monitored (listed in the right pane when the host is selected in the left pane) will keep their monitoring properties.
The policies rediscovered without a current monitoring profile will be assigned the default monitoring parameters.
More information:
cisco_ucm (Cisco Unified Communications Manager Monitoring) Release Notes
cisco_ucm AC Configuration
This article describes the configuration concepts and procedures to set up the Cisco Unified Communications Manager Monitoring
(cisco_ucm) probe. The cisco_ucm probe allows you to monitor the health and performance of UCM systems and services.
The probe is configured to create two types of monitoring profiles:
Host Profile: This profile represents the system on which the probe is deployed. You can add custom checkpoints to the host profile and
can enable respective monitors.
CAR Profile: CAR stands for CDR (Call Data Record) Analysis Report. CAR profile lets you track the call statistics and generate health
and performance report of network calls. The cisco_ucm probe uses File Transfer Protocol (FTP) to retrieve CDR files and generate the
CDR Analysis Report. You can configure the FTP settings through the probe.
The following diagram outlines the procedure to configure the probe.
Contents
Verify Prerequisites
Configure General Properties
Create a Resource
Create Host Profile
Select and Configure Host Counters
Create CAR Profile
Alarm Thresholds
View Properties of the Alarm Message
Verify Prerequisites
Verify that the required hardware and software are available and installation prerequisites are met before you configure the probe. For more
information, see cisco_ucm (Cisco Unified Communications Manager Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log Size: specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes.
Note: New log file entries are added and the older entries are deleted when the log file is of the specified size.
Send Session Alarms: enables you to generate alarms for deactivated sessions.
Check Interval: specifies the time interval to retrieve monitoring information from the UCM system.
Default: 1
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Check Interval Unit: allows you to select the unit for the Check Interval field.
Default: Minutes
Sample Interval: specifies the time duration for calculation of the average value for the monitors.
Example: if you specify 12 hours, the probe calculates the average value for the monitors from the last 12 hours.
Default: 6
Sample Interval Unit: allows you to select the unit for the Sample Interval field.
Default: Hours
AXL Timeout (sec): specifies the threshold value (in seconds) to connect to the Cisco UCM server through Administrative XML Layer
(AXL). The probe generates an alarm after the connection attempt time exceeds the timeout value.
Default: 60
3. Set or modify the following fields to configure CAR properties of the probe, as required:
Report Timeout (sec): specifies the time interval (in seconds) within which the cisco_ucm probe generates the call session report.
Default: 60
CAR Interval: specifies the time interval between each instance when the probe retrieves the CDR data.
CAR Interval Unit: allows you to select the unit for the CAR Interval field.
Maximum Num of Threads: specifies the total number of profiles that the probe can simultaneously monitor.
Default: 10
Insert CAR Details to SLM NIS Database: enables the probe to write the data that is collected for the CAR profile to the tbnCDR
database. The database resides at the path that is specified in the Data Engine Address field.
Note: The probe supports Microsoft SQL Server database as the backend database for the Insert CAR Details to SLM NIS
Database option.
Data Engine Address: specifies the address of the data engine of the robot.
4. Select Validate Data Engine from the Actions menu to verify the data engine path.
Note: The Validate Data Engine is required only for CAR reporting.
5. Click Save.
The general configuration settings of the probe are saved.
Create a Resource
You can add a resource in the probe. The resource connects to the Cisco UCM server and is monitored through host and CAR monitoring profiles.
Follow these steps:
1. Click the Options (icon) next to the cisco_ucm node in the navigation pane.
Note: Enter valid user credentials with administrative privileges to the Cisco UCM server.
Node List: allows you to select the service of the Cisco UCM to use for monitoring the resource. Each name in the list represents a
service of the Cisco UCM available for monitoring.
The list of values for this field is automatically populated after valid details are provided in the previous fields.
4. Click Submit.
The resource is visible as the hostname node under the cisco_ucm node.
5. Navigate to the hostname node.
The properties of the resource are displayed in the Resource Configuration section.
6. Select Validate Data Engine from the Actions menu to generate a list of nodes available for monitoring.
Note: You can also delete a resource to prevent the probe from connecting to the Cisco UCM server. Select Delete from the Options (icon)
next to the hostname node.
Note: For an invalid node name, the probe selects the default node name that is specified in the resource profile.
4. Click Submit.
The host profile is created and is visible as the ProfileName node.
5. Navigate to the ProfileName node.
6. Select Test Profile from the Actions menu to verify the connection information.
7. Click Save.
Note: You can also delete a host profile to prevent the probe from monitoring the services of the Cisco UCM. Select Delete from the Options
(icon) next to the ProfileName node.
Note: You can also double-click a counter from the Selected list or click
4. Click Submit.
The selected counters are available under the ProfileName node in the navigation pane.
5. Select the required counter.
A list of configurable monitors is displayed as CounterName nodes.
6. Select the required monitor from this list.
7. Set or modify the following fields to configure the monitors to generate QoS data and alarms:
Publish data: allows you to enable QoS.
Note: When you select Publish Data, the value of Data column for the monitor in the table changes from Off to On.
Note: The Operator, Threshold, Unit, and Message Token fields are enabled only when Publish Alarms is selected.
Note: Select Description from the Actions menu to retrieve a predefined description for the monitor. You can also copy the
description text to the Description field.
Last Interval: specifies the time interval when the probe collects values to compare with the threshold value.
Last Interval Unit: allows you to select the unit of measurement for the Last Interval field.
Default: Minutes
Value Definition: specifies the type of value that is compared with the threshold value. The value is used for QoS messages and
alarms. You can select one of the following value types:
Current Value: uses the last retrieved value for comparison.
Average Value: uses the average of retrieved values for comparison. The sample time is specified in the Sample Interval field in
the General Configuration section.
Delta Value (Current - Previous): uses the difference between the current value and previous value for comparison.
Operator: specifies the comparison operator for thresholds.
Threshold: defines the threshold value of the monitor.
Unit: defines the unit of measurement of the threshold value.
Important! Do not specify a value in Unit for monitors that represent the status of the monitored object.
Message Token: specifies the alarm message to be issued when the specified threshold is breached.
QoS Name: specifies the default QoS that are available for the monitor.
8. Click Save to save the configuration.
The probe monitors the specified monitors in the selected counters.
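The three Value Definition options described in step 7 above determine which value is actually compared with the threshold. The sketch below illustrates that choice; the function name and the "current"/"average"/"delta" keys are assumptions for illustration, not probe identifiers.

```python
def value_for_comparison(definition, samples):
    """samples: chronologically ordered retrieved values (oldest first)."""
    if definition == "current":
        return samples[-1]                  # last retrieved value
    if definition == "average":
        return sum(samples) / len(samples)  # average over the sample interval
    if definition == "delta":
        return samples[-1] - samples[-2]    # current value minus previous value
    raise ValueError("unknown value definition: %s" % definition)

samples = [10, 30, 50]
print(value_for_comparison("current", samples))  # 50
print(value_for_comparison("average", samples))  # 30.0
print(value_for_comparison("delta", samples))    # 20
```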
Note: You can also delete a counter in the host profile. Select Delete Counter from the Options (icon) next to the CounterName node.
Note: For an invalid node name, the probe selects the default node name that is specified in the resource profile.
FTP Hostname: defines the host name or IP address of the FTP server.
FTP Username: defines a valid user name to access the FTP server.
FTP Password: defines the password for the user account that is specified in the FTP Username field.
Remote Directory: defines the path of the directory with the FTP client to store the CDR data.
FTP Actual path: specifies the path of the directory from where the probe retrieves the CDR data.
Secure FTP: allows you to specify that the FTP connection is secure.
Time Zone Offset: specifies the time zone difference between the location of the Cisco UCM server and the robot with the probe.
QoS Identification Method: specifies whether the FTP server is identified through the profile name or the host name.
4. Click Submit.
The CAR profile is created and is visible as the CARProfileName node.
5. Click Save.
The profile configuration is saved and a list of available monitors is displayed in the created CAR profile.
6. Select the required monitor from the list.
7. Set or modify the following fields to configure the monitors to generate QoS data:
Publish Data: enables the probe to generate QoS data for the selected monitor.
Note: When you select Publish Data, the value of Data column for the monitor in the table changes from Off to On.
Note: You can also delete a CAR profile to prevent the probe from monitoring Call Data Records. Select Delete from the Options (icon) next
to the CARProfileName node.
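The Time Zone Offset field in the CAR profile above compensates for the time zone difference between the Cisco UCM server and the robot. A hypothetical illustration of that adjustment, with an assumed function name and an offset expressed in hours:

```python
from datetime import datetime, timedelta

def adjust_cdr_timestamp(cdr_timestamp, offset_hours):
    """Shift a CDR timestamp by the UCM-to-robot time zone offset."""
    return cdr_timestamp + timedelta(hours=offset_hours)

# UCM server two hours behind the robot: an offset of +2 aligns the records.
t = datetime(2024, 1, 1, 10, 30)
print(adjust_cdr_timestamp(t, 2))   # 2024-01-01 12:30:00
```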
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
cisco_ucm IM Configuration
This article describes the configuration concepts and procedures to set up the Cisco Unified Communications Manager Monitoring
(cisco_ucm) probe. The cisco_ucm probe allows you to monitor the health and performance of UCM systems and services.
The following diagram outlines the procedure to configure the probe.
Contents
Verify Prerequisites
Configure General Properties
Create a Group
Create a Profile
Apply Monitors to Profile
Configure Monitor Properties
Create and Configure Templates
Display Current Monitor Values
Create and Configure Custom QoS
Launch the Message Pool Manager
Verify Prerequisites
Verify that the required hardware and software are available and installation prerequisites are met before you configure the probe. For more
information, see cisco_ucm (Cisco Unified Communications Manager Monitoring) Release Notes.
1. Click the Setup icon.
The Setup dialog appears.
2. Open the General tab.
3. Set or modify the following fields to configure the general probe properties, as required:
Log-level: specifies the level of details that are written to the log file.
Default: 0
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Logfile size: specifies the size of the log file where the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Send Session Alarms: enables you to generate alarms for deactivated sessions.
4. Open the Advanced tab.
5. Set or modify the following fields to configure interval and timeout properties of the probe, as required:
Check interval: specifies the time interval to retrieve monitoring information from the UCM system.
Default: 1 min
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Sample period: specifies the time duration for calculation of the average value for the monitors.
Example: if you specify 12 hours, the probe calculates the average value for the monitors from the last 12 hours.
Default: 6 hours
AXL timeout: specifies the threshold value (in seconds) to connect to the Cisco UCM server through Administrative XML Layer (AXL).
The probe generates an alarm after the connection attempt time exceeds the timeout value.
Default: 60
6. Set or modify the following fields to configure CAR properties of the probe, as required:
Report timeout: specifies the time interval (in seconds) when the cisco_ucm probe generates the call session report.
Default: 60
Maximum num of threads: specifies the total number of profiles that the probe can simultaneously monitor.
Default: 10
CAR Interval: specifies the time interval between each instance when the probe retrieves the CDR data.
Default: 1 min
Insert CAR details to SLM NIS database: enables the probe to write the data that is collected for the CAR profile to the tbnCDR
database. The database resides at the path that is selected from the drop-down.
Note: The probe supports Microsoft SQL Server database as the backend database for the Insert CAR Details to SLM NIS
Database option.
7. Click Validate to verify the data engine path.
8. Click OK.
The general configuration settings of the probe are saved.
Create a Group
You can create a group to categorize profiles in the probe. You can also use the Default Group that is available when you deploy the
probe. Groups can be used to segregate profiles according to your requirement. On selecting a group, all host profiles in that group are listed in
the right pane.
You can right-click a group for the following options:
New Host: opens the profile dialog, enabling you to define a new host to be monitored.
New Group: opens the profile dialog, enabling you to define a new group. Use the group folders to place the hosts in logical groups.
You can select a group and right-click in the right pane for the following options:
New: opens the profile dialog to define a new host to be monitored or a QoS definition.
Edit: opens the profile dialog for the selected host, enabling you to modify the properties for the host.
Delete: allows you to delete the selected host.
Rename: allows you to rename the selected host.
Note: If you do not want to use a group, right-click on the group and click Delete. You cannot rename or delete the Default
Group.
Create a Profile
You can create monitoring profiles in the probe. A probe can simultaneously monitor multiple Cisco UCM hosts. You can also create multiple
profiles to monitor the same host. The maximum number of active profiles is specified in the Maximum num of threads field of the Advanced tab
of Setup window. On selecting a profile, all activated Cisco UCM monitors are listed in the right pane.
The probe can be configured to create the following types of monitoring profiles:
Host Profile: This profile represents the system where the probe is deployed. You can add custom checkpoints to the host profile and can
enable respective monitors.
CAR Profile: CAR stands for CDR (Call Data Record) Analysis Report. A CAR profile allows you to track call statistics and generate
health and performance reports of network calls. The cisco_ucm probe uses File Transfer Protocol (FTP) to retrieve CDR files and
generate the CDR Analysis Report. You can configure the FTP settings through the probe. For more information, see Create and
Configure CAR Profile.
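As described above, the probe retrieves CDR files over FTP before it builds the CAR report. The following is a minimal sketch of that retrieval step, not the probe's internal implementation; the credentials, directory layout, and the "cdr_"/"cmr_" flat-file name prefixes are assumptions for illustration:

```python
from ftplib import FTP

def is_record_file(name):
    # CDR/CMR flat files conventionally use "cdr_"/"cmr_" name prefixes
    return name.startswith(("cdr_", "cmr_"))

def fetch_cdr_files(host, user, password, remote_dir, local_dir="."):
    # Download every CDR/CMR file found in the remote directory.
    # Use ftplib.FTP_TLS instead when the Secure FTP option is enabled.
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        for name in filter(is_record_file, ftp.nlst()):
            with open(f"{local_dir}/{name}", "wb") as out:
                ftp.retrbinary(f"RETR {name}", out.write)
```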
The profile status is indicated by one of the following indicators:
indicates that the host responds and can access all checkpoints.
indicates that the host does not respond (such as host stopped).
indicates that the host responds, but there are some problems.
indicates that the host is initializing and waiting for measured values from the profile.
indicates that the profile is created and stored, but is not active.
Follow these steps:
1. Click
from the toolbar.
The Select Unified Server Type dialog appears.
2. Select the type of Cisco UCM server that is monitored.
A wizard is launched guiding you through the steps necessary to define a new host.
3. Read the instructions in the Welcome dialog and click Next to continue.
4. Set or modify the following fields to specify the server details:
Hostname or IP Address: defines the hostname or IP address of the Cisco UCM server.
Port: specifies the communication port to use to connect to the Cisco UCM server.
Default: 8443
5. Click Next to continue.
6. Set or modify the following fields to specify the authentication details of the Cisco UCM server:
Username: defines a valid username to access the Cisco UCM server.
Password: defines the password for the user account that is specified in the Username field.
Note: Enter valid user credentials with administrative privileges to the Cisco UCM server.
14. Set or modify the following fields to configure the monitors to generate QoS data:
Name: defines a unique name for the monitor.
Key: displays the name of the UCM node and monitor.
Publish Quality of Service (QoS): enables the probe to generate QoS data for the selected monitor.
QoS Name: specifies the default QoS definitions that are available for the monitor.
15. Click OK to save the configuration.
16. Click Apply.
The probe monitors the specified monitors in the selected profile.
Note: You can also delete a CAR profile to prevent the probe from monitoring Call Data Records. Right-click the profile name and select Delete.
Note: If phone details are not available, the probe generates a blank Phone Table report.
Important! CA recommends that you use auto-configuration to create auto-monitors to monitor the Cisco UCM system. You can
configure static monitors for specific instances, but multiple static monitors can increase the size of the .cfg file and cause performance
issues.
You can view and configure all monitors that are applied to a profile in the All Monitors node.
Note: The profile monitors are not activated initially. You can activate the ones that you want to monitor. The Enable Monitoring option
on the Monitor Properties dialog for the checkpoint is then automatically selected.
Important! If the templates are applied to the inventory tree, the inventory is recursively navigated to create static monitors. The
monitors are created for every checkpoint applicable to the template. This auto-monitor behavior in static form creates a large .cfg file
and can degrade the performance of your probe.
You can add monitors and templates to the Auto Configurations node of a resource. Auto configuration applies the selected monitors to all
components of a resource. Monitors or checkpoints are automatically created for devices that are currently not monitored.
The Auto Configuration feature consists of two nodes that are located under the resource node in the left pane:
Auto Configurations: displays a list of monitor configurations that are automatically applied to the profile.
Auto Monitors: displays a list of all monitors that are configured using the monitor configurations in the Auto Configurations node.
Follow these steps:
1. Select a monitor from a template or the list of available monitors.
2. Drag-and-drop the selected monitor to the auto configuration node.
You can drag-and-drop a template to apply configurations of all monitors in the template.
3. Click the Apply button and restart the probe to activate the changes.
A list of monitors with the applied configurations is displayed in the Auto Monitors node.
Note: Adding multiple monitors or templates to the Auto Configurations node results in multiple Auto Monitors.
Monitoring Object: displays the name of the UCM node, counter, and monitor.
Description: defines additional information about the monitor.
Note: Click Query Description to retrieve a predefined description for the monitor. Closing the window copies the
description text to the Description field.
Value Definition: specifies the type of value that is compared with the threshold value. The value is used for QoS messages and
alarms. You can select one of the following value types:
Current Value: uses the last retrieved value for comparison.
Average Value: uses the average of retrieved values for comparison. The sample time is specified in the Sample Interval field in
the General Configuration section.
Delta Value (Current - Previous): uses the difference between the current value and previous value for comparison.
last (mins): specifies the time interval over which the probe collects values to compare with the threshold value.
Enable Monitoring: allows you to enable alarm message generation.
Note: The Operator, Threshold Value, Unit, and Message Token fields are enabled only when Publish Alarms is
selected.
Important! Do not specify a value in Unit for monitors that represent the status of the monitored object.
Message Token: specifies the alarm message to be issued when the specified threshold is breached.
Publish Quality of Service (QoS): allows you to enable QoS generation.
QoS Name: specifies the default QoS definitions that are available for the monitor.
4. Click OK to save the configuration.
5. Select the checkbox next to the monitor name to activate the monitor.
6. Click Apply.
The probe uses the specified monitor.
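The three Value Definition modes described above (Current, Average, and Delta) can be sketched as follows; the function names and sample data are illustrative, not part of the probe API:

```python
def current_value(samples):
    return samples[-1]                  # last retrieved value

def average_value(samples):
    return sum(samples) / len(samples)  # mean over the sample interval

def delta_value(samples):
    return samples[-1] - samples[-2]    # current minus previous value

samples = [10.0, 14.0, 12.0]
print(current_value(samples))   # 12.0
print(average_value(samples))   # 12.0
print(delta_value(samples))     # -2.0
```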
Note: The horizontal red line in the graph indicates the alarm threshold that is defined for the monitor.
When you click and hold the left mouse button inside the graph, a red vertical line appears. You can continue to hold the mouse button
and move the cursor. The window displays the exact value at different points in the graph. The value is displayed in the upper part of the
graph in the format: <Day> <Time> <Value>.
Right-clicking inside the monitor window lets you select the backlog (the time range that is shown in the monitor window). In addition, the
right-click menu lets you select the option Show Average. A horizontal blue line is added in the graph, representing the average sample value.
The following information can be found in the status bar at the bottom of the monitor window:
The number of samples from the probe start
The minimum value measured
The average value measured
Note: You can drag-and-drop a template to the Auto Configurations node to apply all the monitors in the template as Auto Monitors,
as applicable. For more information, see Auto Configure Monitors.
Note: Variable expansion in the message text is supported. For more information, see Variable Expansion.
1. Click
from the toolbar.
The Message Pool window appears.
2. Click
from the toolbar.
The Message Properties window appears.
3. Specify the following values, as applicable:
Identification Name: specifies an identification code as a name for the message.
Token: allows you to select the message token to identify the type of message.
Error Alarm Text: specifies the alarm text in case an error occurs.
Clear Alarm Text (OK): specifies the alarm text in case no error occurs.
Error Severity: allows you to select the severity level for the error message.
Subsystem string/id: allows you to select the subsystem id for the message.
4. Click OK.
The new message is added.
Variable Expansion
You can enter a $ symbol in the alarm message text to select from the list of variables. The following variables are available in the alarm message
text:
Profile variables
profile
host
node
description
Monitor variables
name
key
value
oper
threshold
unit
state
The values of these variables are retrieved from the monitored system.
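The $-variable expansion described above can be sketched with Python's `string.Template`, which uses the same `$name` syntax; the variable names come from the lists above, while the sample values and the `expand` helper are invented for illustration:

```python
from string import Template

def expand(message, **variables):
    # safe_substitute leaves unknown $tokens untouched instead of raising
    return Template(message).safe_substitute(variables)

text = "$name on $host is $value $unit (threshold: $oper $threshold)"
print(expand(text, name="CallsActive", host="ucm01", value=120,
             unit="calls", oper=">=", threshold=100))
# CallsActive on ucm01 is 120 calls (threshold: >= 100)
```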
Contents
Verify Prerequisites
Configure General Properties
Create a Resource
Create Host Profile
Select and Configure Host Counters
Create CAR Profile
Alarm Thresholds
View Properties of the Alarm Message
Verify Prerequisites
Verify that the required hardware and software are available and installation prerequisites are met before you configure the probe. For more
information, see cisco_ucm (Cisco Unified Communications Manager Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log Size: specifies the size, in kilobytes, of the log file to which the internal log messages of the probe are written.
Note: When the log file reaches the specified size, new entries are added and the oldest entries are deleted.
Send Session Alarms: enables you to generate alarms for deactivated sessions.
Check Interval: specifies the time interval to retrieve monitoring information from the UCM system.
Default: 1
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Check Interval Unit: allows you to select the unit for the Check Interval field.
Default: Minutes
Sample Interval: specifies the time duration for calculation of the average value for the monitors.
Example: if you specify 12 hours, the probe calculates the average value for the monitors from the last 12 hours.
Default: 6
Sample Interval Unit: allows you to select the unit for the Sample Interval field.
Default: Hours
AXL Timeout (sec): specifies the threshold value (in seconds) to connect to the Cisco UCM server through Administrative XML Layer
(AXL). The probe generates an alarm after the connection attempt time exceeds the timeout value.
Default: 60
3. Set or modify the following fields to configure CAR properties of the probe, as required:
Report Timeout (sec): specifies the time interval (in seconds) within which the cisco_ucm probe generates the call session report.
Default: 60
CAR Interval: specifies the time interval between each instance when the probe retrieves the CDR data.
CAR Interval Unit: allows you to select the unit for the CAR Interval field.
Maximum Num of Threads: specifies the total number of profiles that the probe can simultaneously monitor.
Default: 10
Insert CAR Details to SLM NIS Database: enables the probe to write the data that is collected for the CAR profile to the tbnCDR
database. The database resides at the path that is specified in the Data Engine Address field.
Note: The probe supports Microsoft SQL Server database as the backend database for the Insert CAR Details to SLM NIS
Database option.
Data Engine Address: specifies the address of the data engine of the robot.
4. Select Validate Data Engine from the Actions menu to verify the data engine path.
Note: The Validate Data Engine action is required only for CAR reporting.
5. Click Save.
The general configuration settings of the probe are saved.
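The size-capped behavior described for the Log Size field (oldest entries dropped as new entries arrive) can be sketched as follows; the sketch works on lines rather than kilobytes for simplicity:

```python
def append_capped(lines, entry, max_bytes):
    # Add the new entry, then drop the oldest entries until the
    # total size is back under the configured limit.
    lines.append(entry)
    while sum(len(line) for line in lines) > max_bytes:
        lines.pop(0)
    return lines

log = []
for i in range(5):
    append_capped(log, f"msg{i}", max_bytes=15)
print(log)   # ['msg2', 'msg3', 'msg4']
```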
Create a Resource
You can add a resource in the probe. The resource connects to the Cisco UCM server and monitors it through host and CAR monitoring profiles.
Follow these steps:
1. Click the Options (icon) next to the cisco_ucm node in the navigation pane.
2. Select Add New Resource.
The Resource Configuration dialog appears.
3. Specify the following field values of the monitored Cisco UCM system:
Unified Server Type: allows you to select the version of the Cisco UCM server.
Hostname or IP Address: defines the hostname or IP address of the Cisco UCM server.
Port: specifies the communication port to use to connect to the Cisco UCM server.
Default: 8443
Username: defines a valid username to access the Cisco UCM server.
Password: defines the password for the user account that is specified in the Username field.
Note: Enter valid user credentials with administrative privileges to the Cisco UCM server.
Node List: allows you to select the service of the Cisco UCM to use for monitoring the resource. Each name in the list represents a
service of the Cisco UCM available for monitoring.
The list of values for this field is automatically populated after valid details are provided in the previous fields.
4. Click Submit.
The resource is visible as the hostname node under the cisco_ucm node.
5. Navigate to the hostname node.
The properties of the resource are displayed in the Resource Configuration section.
6. Select Validate Data Engine from the Actions menu to generate a list of nodes available for monitoring.
Note: You can also delete a resource to prevent the probe from connecting to the Cisco UCM server. Select Delete from the Options (icon)
next to the hostname node.
Note: If an invalid node name is specified, the probe selects the default node name that is defined in the resource profile.
Note: You can also double-click a counter from the Selected list, or click the icon, to remove counters from monitoring.
4. Click Submit.
The selected counters are available under the ProfileName node in the navigation pane.
5. Select the required counter.
A list of configurable monitors is displayed as CounterName nodes.
6. Select the required monitor from this list.
7. Set or modify the following fields to configure the monitors to generate QoS data and alarms:
Publish data: allows you to enable QoS.
Note: When you select Publish Data, the value of the Data column for the monitor in the table changes from Off to On.
Note: The Operator, Threshold, Unit, and Message Token fields are enabled only when Publish Alarms is selected.
Note: Select Description from the Actions menu to retrieve a predefined description for the monitor. You can also copy the
description text to the Description field.
Last Interval: specifies the time interval when the probe collects values to compare with the threshold value.
Last Interval Unit: allows you to select the unit of measurement for the Last Interval field.
Default: Minutes
Value Definition: specifies the type of value that is compared with the threshold value. The value is used for QoS messages and
alarms. You can select one of the following value types:
Current Value: uses the last retrieved value for comparison.
Average Value: uses the average of retrieved values for comparison. The sample time is specified in the Sample Interval field in
the General Configuration section.
Delta Value (Current - Previous): uses the difference between the current value and previous value for comparison.
Operator: specifies the comparison operator for thresholds.
Threshold: defines the threshold value of the monitor.
Unit: defines the unit of measurement of the threshold value.
Important! Do not specify a value in Unit for monitors that represent the status of the monitored object.
Message Token: specifies the alarm message to be issued when the specified threshold is breached.
QoS Name: specifies the default QoS definitions that are available for the monitor.
8. Click Save to save the configuration.
The probe monitors the specified monitors in the selected counters.
Note: You can also delete a counter in the host profile. Select Delete Counter from the Options (icon) next to the CounterName node.
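The Operator/Threshold comparison described in the fields above can be sketched as follows; the operator symbols are assumptions for illustration and the probe's actual set may differ:

```python
import operator

# Hypothetical operator set mapping symbols to comparisons.
OPERATORS = {">": operator.gt, ">=": operator.ge,
             "<": operator.lt, "<=": operator.le,
             "=": operator.eq, "!=": operator.ne}

def breached(value, op, threshold):
    # True means the threshold is breached and the alarm message
    # token would be issued; False means a clear message instead.
    return OPERATORS[op](value, threshold)

print(breached(120, ">=", 100))   # True
print(breached(80, ">=", 100))    # False
```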
3. Specify the following field values of the monitored Cisco UCM system:
Profile Name: defines a unique name for the CAR profile.
Active: activates the CAR profile, on creation.
Node: defines the Cisco UCM service that the CAR profile uses to monitor call sessions.
Note: If an invalid node name is specified, the probe selects the default node name that is defined in the resource profile.
FTP Hostname: defines the host name or IP address of the FTP server.
FTP Username: defines a valid user name to access the FTP server.
FTP Password: defines the password for the user account that is specified in the FTP Username field.
Remote Directory: defines the path of the directory where the FTP client stores the CDR data.
FTP Actual path: specifies the path of the directory from where the probe retrieves the CDR data.
Secure FTP: allows you to specify that the FTP connection is secure.
Time Zone Offset: specifies the time zone difference between the location of the Cisco UCM server and the robot with the probe.
QoS Identification Method: specifies whether the FTP server is identified through the profile name or the host name.
4. Click Submit.
The CAR profile is created and is visible as the CARProfileName node.
5. Click Save.
The profile configuration is saved and a list of available monitors is displayed in the created CAR profile.
6. Select the required monitor from the list.
7. Set or modify the following fields to configure the monitors to generate QoS data:
Publish Data: enables the probe to generate QoS data for the selected monitor.
Note: When you select Publish Data, the value of the Data column for the monitor in the table changes from Off to On.
Note: You can also delete a CAR profile to prevent the probe from monitoring Call Data Records. Select Delete from the Options (icon) next
to the CARProfileName node.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Contents
Verify Prerequisites
Configure General Properties
Create a Group
Create a Profile
Apply Monitors to Profile
Configure Monitor Properties
Create and Configure Templates
Display Current Monitor Values
Verify Prerequisites
Verify that the required hardware and software are available and installation prerequisites are met before you configure the probe. For more
information, see cisco_ucm (Cisco Unified Communications Manager Monitoring) Release Notes.
1. Click
The Setup dialog appears.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Logfile size: specifies the size of the log file where the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Send Session Alarms: enables you to generate alarms for deactivated sessions.
4. Open the Advanced tab.
5. Set or modify the following fields to configure interval and timeout properties of the probe, as required:
Check interval: specifies the time interval to retrieve monitoring information from the UCM system.
Default: 1 min
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Sample period: specifies the time duration for calculation of the average value for the monitors.
Example: if you specify 12 hours, the probe calculates the average value for the monitors from the last 12 hours.
Default: 6 hours
AXL timeout: specifies the threshold value (in seconds) to connect to the Cisco UCM server through Administrative XML Layer (AXL).
The probe generates an alarm after the connection attempt time exceeds the timeout value.
Default: 60
6. Set or modify the following fields to configure CAR properties of the probe, as required:
Report timeout: specifies the time interval (in seconds) within which the cisco_ucm probe generates the call session report.
Default: 60
Maximum num of threads: specifies the total number of profiles that the probe can simultaneously monitor.
Default: 10
CAR Interval: specifies the time interval between each instance when the probe retrieves the CDR data.
Default: 1 min
Insert CAR details to SLM NIS database: enables the probe to write the data that is collected for the CAR profile to the tbnCDR
database. The database resides at the path that is selected from the drop-down.
Note: The probe supports Microsoft SQL Server database as the backend database for the Insert CAR Details to SLM NIS
Database option.
7. Click Validate to verify the data engine path.
8. Click OK.
The general configuration settings of the probe are saved.
Create a Group
You can create a group to categorize profiles in the probe. You can also use the Default Group that is available when you deploy the
probe. Groups can be used to segregate profiles according to your requirement. On selecting a group, all host profiles in that group are listed in
the right pane.
You can right-click a group for the following options:
New Host: opens the profile dialog, enabling you to define a new host to be monitored.
New Group: opens the profile dialog, enabling you to define a new group. Use the group folders to place the hosts in logical groups.
You can select a group and right-click in the right pane for the following options:
New: opens the profile dialog to define a new host to be monitored or a QoS definition.
Edit: opens the profile dialog for the selected host, enabling you to modify the properties for the host.
Delete: allows you to delete the selected host.
Rename: allows you to rename the selected host.
Activate: includes the selected host for monitoring.
Deactivate: removes the selected host from monitoring.
Refresh: retrieves the current values of the objects that are listed in the right window pane.
Reload: retrieves updated configuration information from the selected agent.
Information: opens an informational window, containing system and configuration information about the selected host.
Product Info: opens the window containing information about the product that this profile monitors.
Follow these steps:
1. Right-click on the Default Group and select New Group.
A new group is created in the navigation tree.
2. Rename the group, as required.
Note: If you do not want to use a group, right-click on the group and click Delete. You cannot rename or delete the Default
Group.
You can move a host from one group to another. Select the host and drag-and-drop it to the required group.
Create a Profile
You can create monitoring profiles in the probe. A probe can simultaneously monitor multiple Cisco UCM hosts. You can also create multiple
profiles to monitor the same host. The maximum number of active profiles is specified in the Maximum num of threads field of the Advanced tab
of the Setup window. On selecting a profile, all activated Cisco UCM monitors are listed in the right pane.
The probe can be configured to create the following types of monitoring profiles:
Host Profile: This profile represents the system where the probe is deployed. You can add custom checkpoints to the host profile and can
enable respective monitors.
CAR Profile: CAR stands for CDR (Call Data Record) Analysis Report. A CAR profile allows you to track call statistics and generate
health and performance reports of network calls. The cisco_ucm probe uses File Transfer Protocol (FTP) to retrieve CDR files and
generate the CDR Analysis Report. You can configure the FTP settings through the probe. For more information, see Create and
Configure CAR Profile.
The profile status is indicated by one of the following indicators:
indicates that the host responds and can access all checkpoints.
indicates that the host does not respond (such as host stopped).
indicates that the host responds, but there are some problems.
indicates that the host is initializing and waiting for measured values from the profile.
indicates that the profile is created and stored, but is not active.
Follow these steps:
1. Click
from the toolbar.
The Select Unified Server Type dialog appears.
2. Select the type of Cisco UCM server that is monitored.
A wizard is launched guiding you through the steps necessary to define a new host.
3. Read the instructions in the Welcome dialog and click Next to continue.
4. Set or modify the following fields to specify the server details:
Hostname or IP Address: defines the hostname or IP address of the Cisco UCM server.
Port: specifies the communication port to use to connect to the Cisco UCM server.
Default: 8443
5. Click Next to continue.
6. Set or modify the following fields to specify the authentication details of the Cisco UCM server:
Username: defines a valid username to access the Cisco UCM server.
Password: defines the password for the user account that is specified in the Username field.
Note: Enter valid user credentials with administrative privileges to the Cisco UCM server.
15. Set or modify the following fields to configure the identification properties of the profile:
QoS Identification Method: specifies the QoS source.
Default: Host Address
Alarm Identification Method: specifies the alarm source.
Default: Host Address
16. Click Test to verify the connection to the Cisco UCM server.
17. Click OK.
The profile is active and available for monitoring.
Note: You can also delete a profile to prevent the probe from connecting to the Cisco UCM server. Right-click the profile name and select Delete.
You can create CAR profiles in the probe to monitor Call Data Records for Cisco UCM. CAR profiles can generate QoS messages.
You can generate reports and view additional information for CAR profiles, as follows:
Generate CAR Analysis Report
View Phone Table
Custom CAR Analysis Reporting
Follow these steps:
1. Click
8. Set or modify the following fields to configure the FTP properties of the profile:
Host Name: defines the host name or IP address of the FTP server.
Remote Directory: defines the path of the directory where the FTP client stores the CDR data.
User Name: defines a valid user name to access the FTP server.
Password: defines the password for the user account that is specified in the User Name field.
FTP Actual path: specifies the path of the directory from where the probe retrieves the CDR data.
You can also browse to the required path.
Secure FTP: allows you to specify that the FTP connection is secure.
Time Zone Offset: specifies the time zone difference between the location of the Cisco UCM server and the robot with the probe.
9. Set or modify the following fields to configure the identification properties of the profile:
QoS Identification Method: specifies the QoS source.
Default: Host Address
10. Click OK.
The profile configuration is saved.
11. Click Apply.
A list of available monitors is displayed in the created CAR profile.
12. Select the required monitors from the list to activate QoS generation for the monitors.
13. (Optional) Right-click a monitor and select Edit.
The CAR Monitor window is displayed.
14. Set or modify the following fields to configure the monitors to generate QoS data:
Name: defines a unique name for the monitor.
Key: displays the name of the UCM node and monitor.
Publish Quality of Service (QoS): enables the probe to generate QoS data for the selected monitor.
QoS Name: specifies the default QoS definitions that are available for the monitor.
15. Click OK to save the configuration.
16. Click Apply.
The probe monitors the specified monitors in the selected profile.
Note: You can also delete a CAR profile to prevent the probe from monitoring Call Data Records. Right-click the profile name and select Delete.
You can generate the CAR Analysis Report by clicking the Show CAR Analysis Report button. In the left pane of this report, you can see the Call
Identifier, which is a unique ID for each call. You can also see the Call From and Call To details and the Time when the call was made. When you
click a Call Identifier, its respective CMR details are displayed in the right pane.
In the CMR Average Summary Analysis section, Jitter indicates the total jitter of all CMRs and No of Packets Lost indicates the total number
of packets lost in all CMRs. The remaining fields in the CMR Average Summary Analysis section display the average of their respective values.
The fields in the dialog are explained as follows:
Jitter: provides an estimate of the statistical variance of the RTP data packet interarrival time. The time is measured in milliseconds and
expressed as an unsigned integer. The interarrival jitter J specifies the mean deviation (smoothed absolute value) of the difference D in
packet spacing at the receiver compared to the sender for a pair of packets. RFC 1889 contains detailed computation algorithms. The
value remains zero if the connection was set in "send only" mode. Default is 0.
No. of Packets Lost: displays the number of packets that are lost during the call.
CCR - Cumulative Conceal Ratio: represents the cumulative ratio of concealment time over speech time that is observed after starting a
call.
ICR - Interval Conceal Ratio: represents an interval-based average concealment rate. The rate is the ratio of concealment time over
speech time for the last 3 seconds of active speech.
ICRmx - Interval Conceal Ratio Max: represents the maximum concealment ratio that is observed during the call.
CS - Conceal Secs: represents the time when some concealment is observed during a call.
SCS - Severely Conceal Secs: represents the time when a significant amount of concealment is observed. If the observed concealment
is greater than 50 milliseconds or 5 percent, the speech is not audible.
MLQK - MOS Listening Quality K-factor: provides an estimate of the MOS score of the last 8 seconds of speech on the reception
signal path.
MLQKmn - MOS Listening Quality K-factor Min: represents the minimum score that is observed from the beginning of a call and
represents the worst sounding 8-second interval.
MLQKmx - MOS Listening Quality K-factor Max: represents the maximum score that is observed from the beginning of a call and
represents the best sounding 8-second interval.
MLQKav - MOS Listening Quality K-factor Avg: represents the running average of scores that are observed from the beginning of a
call.
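The interarrival jitter estimate described above follows RFC 1889, which smooths J for each packet pair as J += (|D| - J) / 16, where D is the difference in packet spacing at the receiver compared to the sender. A sketch with invented spacing values:

```python
def update_jitter(jitter, arrival_delta, send_delta):
    # D is the difference in packet spacing at the receiver
    # compared to the sender; J is smoothed with gain 1/16 (RFC 1889).
    d = abs(arrival_delta - send_delta)
    return jitter + (d - jitter) / 16.0

j = 0.0
# receiver vs sender spacing (ms) for three packet pairs, invented values
for arr, snd in [(21.0, 20.0), (19.0, 20.0), (24.0, 20.0)]:
    j = update_jitter(j, arr, snd)
print(round(j, 4))   # 0.3635
```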
View Phone Table
The Show Phone Table button in the toolbar allows you to view information about registered phone devices.
Follow these steps:
1. Select a configured profile in the left pane.
2. In the toolbar, click Show Phone Table.
The phone details report appears.
Note: If phone details are not available, the probe generates a blank Phone Table report.
The CAR data is stored in the tbnCDR and tbnCMR database tables in your NIS database. Using reporting tools, such as Unified Reporter, you
can create your own CAR Analysis reports.
Important! CA recommends that you use auto-configuration to create auto-monitors to monitor the Cisco UCM system. You can
configure static monitors for specific instances, but multiple static monitors can increase the size of the .cfg file and cause performance
issues.
You can view and configure all monitors that are applied to a profile in the All Monitors node.
Manually Select Monitors
You can manually select a monitor from the list of monitors in the profile.
Note: The profile monitors are not activated initially. You can activate the ones that you want to monitor. The Enable Monitoring option
on the Monitor Properties dialog for the checkpoint is then automatically selected.
Templates can be applied to multiple profiles to measure the same parameters on multiple Cisco UCM systems. A default template is available.
You can drag-and-drop a template on a host profile to apply it to all elements of the resource.
You can create templates and can define a set of monitors in that template. For information about creating templates, see Create and Configure
Templates.
Important! If the templates are applied to the inventory tree, the inventory is recursively navigated to create static monitors. The
monitors are created for every checkpoint applicable to the template. This auto-monitor behavior in static form creates a large .cfg file
and can degrade the performance of your probe.
You can add monitors and templates to the Auto Configurations node of a resource. Auto configuration applies the selected monitors to all
components of a resource. Monitors or checkpoints are automatically created for devices that are currently not monitored.
The Auto Configuration feature consists of two nodes that are located under the resource node in the left pane:
Auto Configurations: displays a list of monitor configurations that are automatically applied to the profile.
Auto Monitors: displays a list of all monitors that are configured using the monitor configurations in the Auto Configurations node.
Follow these steps:
1. Select a monitor from a template or the list of available monitors.
2. Drag-and-drop the selected monitor to the auto configuration node.
You can drag-and-drop a template to apply configurations of all monitors in the template.
3. Click the Apply button and restart the probe to activate the changes.
A list of monitors with the applied configurations is displayed in the Auto Monitors node.
Note: Adding multiple monitors or templates to the Auto Configurations node results in multiple Auto Monitors.
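Conceptually, auto configuration fans each monitor configuration out across every matching component of the resource. The following sketch is illustrative only; the dict-based data model (`components`, `checkpoints`) is an assumption for the example, not the probe's internal format.

```python
# Hypothetical sketch of how auto configuration creates auto-monitors:
# one monitor instance per component that exposes the checkpoint.
def apply_auto_config(resource, monitor_configs):
    """Fan monitor configurations out across matching components."""
    auto_monitors = []
    for component in resource["components"]:
        for config in monitor_configs:
            # A configuration applies only where its checkpoint exists.
            if config["checkpoint"] in component["checkpoints"]:
                auto_monitors.append(
                    {"component": component["name"],
                     "checkpoint": config["checkpoint"]})
    return auto_monitors

ucm = {"components": [
    {"name": "node1", "checkpoints": {"CallsActive", "CallsCompleted"}},
    {"name": "node2", "checkpoints": {"CallsActive"}},
]}
configs = [{"checkpoint": "CallsActive"}, {"checkpoint": "CallsCompleted"}]
# Two configurations yield three auto-monitors across the two nodes.
print(len(apply_auto_config(ucm, configs)))  # 3
```

This is why a few configurations can produce many Auto Monitors: the count grows with the number of components that match each checkpoint.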
Edit: opens the Monitor Properties dialog, enabling you to modify the monitoring properties for the selected monitor.
Delete: deletes the monitor from the list. To monitor it again, locate the monitor in its group in the left pane and activate it.
Activate: activates the selected monitor to be monitored by the probe. You can also activate it by clicking the check box belonging to the
checkpoint.
Deactivate: deactivates the selected monitor (if activated), and the probe stops monitoring the monitor.
Monitor: opens the Monitor window. The window displays the values that are recorded for the selected monitor during the sample
period. For more information, see View Graphical Monitoring.
Add to Template: allows you to add this monitor to a template.
Follow these steps:
1. Navigate to the monitor category or template in the left pane.
2. Double-click the required monitor from the list of monitors.
3. Set or modify the following fields to configure the monitors to generate QoS data and alarms:
Name: defines a custom name for the checkpoint.
Monitoring Object: displays the name of the UCM node, counter, and monitor.
Description: defines additional information about the monitor.
Note: Click Query Description to retrieve a predefined description for the monitor. Closing the window copies the
description text to the Description field.
Value Definition: specifies the type of value that is compared with the threshold value. The value is used for QoS messages and
alarms. You can select one of the following value types:
Current Value: uses the last retrieved value for comparison.
Average Value: uses the average of retrieved values for comparison. The sample time is specified in the Sample Interval field in
the General Configuration section.
Delta Value (Current - Previous): uses the difference between the current value and previous value for comparison.
last (mins): specifies the time interval over which the probe collects values to compare with the threshold value.
Enable Monitoring: allows you to enable alarm message generation.
Note: The Operator, Threshold Value, Unit, and Message Token fields are enabled only when Publish Alarms is
selected.
Important! Do not specify a value in Unit for monitors that represent the status of the monitored object.
Message Token: specifies the alarm message to be issued when the specified threshold is breached.
Publish Quality of Service (QoS): allows you to enable QoS generation.
QoS Name: specifies the default QoS measurements that are available for the monitor.
4. Click OK to save the configuration.
5. Select the checkbox next to the monitor name to activate the monitor.
6. Click Apply.
The probe uses the specified monitor.
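The three Value Definition modes derive the value that is compared with the threshold in different ways. The sketch below is a simplified illustration of those modes; the sample handling and the `>=` operator are assumptions for the example, not the probe's exact logic.

```python
# Illustrative sketch of the Value Definition modes described above.
def value_for_comparison(samples, mode):
    """Return the value that is compared against the threshold."""
    if mode == "current":
        return samples[-1]                  # last retrieved value
    if mode == "average":
        return sum(samples) / len(samples)  # mean over the sample interval
    if mode == "delta":
        return samples[-1] - samples[-2]    # current minus previous value
    raise ValueError(mode)

samples = [40, 50, 80]
assert value_for_comparison(samples, "current") == 80
assert value_for_comparison(samples, "delta") == 30
# An alarm is raised when the derived value breaches the threshold
# (here with an assumed >= operator):
threshold = 75
print(value_for_comparison(samples, "current") >= threshold)  # True
```

Note how the same samples can breach a threshold in one mode (Current) but not another (Average), which is why the mode is part of the monitor configuration.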
View Graphical Monitoring
The Monitor window displays the values that are recorded for the selected monitor during the sample period. The sample period in the Setup
section sets the maximum time range that is shown in the window. You can set this time range to a lower value, which reduces the backlog
accordingly.
Note: The horizontal red line in the graph indicates the alarm threshold that is defined for the monitor.
Click and hold the left mouse button inside the graph to display a red vertical line. While holding the button, move the cursor to display
the exact value at different points in the graph. The value is displayed in the upper part of the graph in
the format: <Day> <Time> <Value>.
Right-clicking inside the monitor window lets you select the backlog (the time range that is shown in the monitor window). In addition, the
right-click menu lets you select the option Show Average. A horizontal blue line is added in the graph, representing the average sample value.
The following information can be found in the status bar at the bottom of the monitor window:
The number of samples from the probe start
The minimum value measured
The average value measured
The maximum value measured
The backlog is the time range that is shown in the monitor window. The backlog can be selected to 6, 12, 24 or 48 hours by right-clicking
inside the graph.
Note: The graph cannot display time ranges that are greater than the selected Sample Period in the Setup section.
Note: You can drag-and-drop a template to the Auto Configurations node to apply all the monitors in the template as Auto Monitors,
as applicable. For more information, see Auto Configure Monitors.
You can verify the current values for the monitors by clicking the Get Values button in the toolbar. Click a profile to display the current values in
the Values column.
Note: Variable expansion in the message text is supported. For more information, see Variable Expansion.
1. Click the message pool button in the toolbar.
The Message Pool window appears.
2. Click the new message button in the toolbar.
The Message Properties window appears.
3. Specify the following values, as applicable:
Identification Name: specifies an identification code as a name for the message.
Token: allows you to select the message token to identify the type of message.
Error Alarm Text: specifies the alarm text when an error occurs.
Clear Alarm Text (OK): specifies the alarm text when no error occurs.
Error Severity: allows you to select the severity level for the error message.
Subsystem string/id: allows you to select the subsystem ID for the message.
4. Click OK.
The new message is added.
Variable Expansion
You can enter a $ symbol in the alarm message text to select from the list of variables. The following variables are available in the alarm message
text:
Profile variables
profile
host
node
description
Monitor variables
name
key
value
oper
threshold
unit
state
The values of these variables are retrieved from the monitored system.
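The substitution works like a simple token replacement: each $-prefixed variable in the message text is replaced with its value from the monitored system. The sketch below is a minimal illustration of that expansion; the variable names come from the lists above, but the substitution mechanics and sample values are assumptions.

```python
import re

# Minimal sketch of $-variable expansion in alarm message text.
def expand(message, variables):
    """Replace each $name token with its value; unknown tokens stay as-is."""
    return re.sub(r"\$(\w+)",
                  lambda m: str(variables.get(m.group(1), m.group(0))),
                  message)

values = {"profile": "ucm-east", "name": "CallsActive",
          "value": 812, "oper": ">=", "threshold": 800, "unit": "Calls"}
text = "$profile: $name is $value $unit ($oper $threshold)"
print(expand(text, values))
# ucm-east: CallsActive is 812 Calls (>= 800)
```

A message template written once in the message pool can therefore produce a specific alarm text for every monitor that uses it.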
cisco_ucm Metrics
This section describes the metrics that can be configured for the Cisco Unified Communications Manager (cisco_ucm) probe.
Contents
QoS Metrics
Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the QoS metrics that can be configured using the cisco_ucm probe.
Monitor Name | Units | Description | Version
QOS_CCM_CALLS | Calls | | v1.8
QOS_CCM_DEVICES | Value | | v1.8
QOS_CCM_DISK_USAGE | MBytes | | v1.8
QOS_CCM_RESOURCE | Resources | | v1.8
QOS_CPU_USAGE | Percent | CPU Usage | v1.8
QOS_DISK_USAGE_PERC | Percent | | v1.8
QOS_MEMORY_USAGE_CCM | Kilobytes | | v1.8
QOS_MEMORY_USAGE_PERC | Percent | | v1.8
QOS_PROCESS_CPU | Percent | | v1.8
QOS_PROCESS_MEMORY | Kilobytes | | v1.8
QOS_SERVICE_STATE | State | | v1.8
Metrics
The following table provides the description of the monitors that can be configured using the cisco_ucm probe.
Resource
Monitor Name
Units
Description
Cisco
Annunciator
Device
(ANN_2)\OutOfResources
Cisco
Annunciator
Device
(ANN_2)\ResourceActive
Cisco
Annunciator
Device
(ANN_2)\ResourceAvailable
Cisco
Annunciator
Device
(ANN_2)\ResourceTotal
ThrottleCount
ThrottleState
Cisco Call
Restriction
AdHocConferenceFailures
Cisco Call
Restriction
BasicCallFailures
Cisco Call
Restriction
ForwardingFailures
Cisco Call
Restriction
LogicalPartitionFailuresTotal
Cisco Call
Restriction
MeetMeConferenceFailures
Cisco Call
Restriction
MidCallFailures
Cisco Call
Restriction
ParkRetrievalFailures
Cisco Call
Restriction
PickUpFailures
Cisco Call
Restriction
SharedLineFailures
Cisco Call
Restriction
TransferFailures
Cisco Call
Manager
AnnunciatorOutOfResources
Cisco Call
Manager
AnnunciatorResourceActive
Cisco Call
Manager
AnnunciatorResourceAvailable
Cisco Call
Manager
AnnunciatorResourceTotal
Cisco Call
Manager
AuthenticatedCallsActive
Cisco Call
Manager
AuthenticatedCallsCompleted
Cisco Call
Manager
AuthenticatedPartiallyRegisteredPhone
Cisco Call
Manager
AuthenticatedRegisteredPhones
Cisco Call
Manager
BRIChannelsActive
Cisco Call
Manager
BRISpansInService
Cisco Call
Manager
CallManagerHeartBeat
Cisco Call
Manager
CallsActive
Cisco Call
Manager
CallsAttempted
Cisco Call
Manager
CallsCompleted
Cisco Call
Manager
CallsInProgress
Cisco Call
Manager
CumulativeAllocatedResourceCannotOpenPort
Cisco Call
Manager
EncryptedCallsActive
Cisco Call
Manager
EncryptedCallsCompleted
Cisco Call
Manager
EncryptedPartiallyRegisteredPhones
Cisco Call
Manager
EncryptedRegisteredPhones
Cisco Call
Manager
ExternalCallControlEnabledCallsAttempted
Cisco Call
Manager
ExternalCallControlEnabledCallsCompleted
Cisco Call
Manager
ExternalCallControlEnabledFailureTreatmentApplied
Cisco Call
Manager
FXOPortsActive
Cisco Call
Manager
FXOPortsInService
Cisco Call
Manager
FXSPortsActive
Cisco Call
Manager
FXSPortsInService
Cisco Call
Manager
HuntListsInService
Cisco Call
Manager
HWConferenceActive
Cisco Call
Manager
HWConferenceCompleted
Cisco Call
Manager
HWConferenceResourceActive
Cisco Call
Manager
HWConferenceResourceAvailable
Cisco Call
Manager
HWConferenceResourceTotal
Cisco Call
Manager
InitializationState
Cisco Call
Manager
LocationOutOfResources
Cisco Call
Manager
MCUConferencesActive
Cisco Call
Manager
MCUConferencesCompleted
Cisco Call
Manager
MCUHttpConnectionErrors
Cisco Call
Manager
MCUHttpNon200OkResponse
Cisco Call
Manager
MCUOutOfResources
Cisco Call
Manager
MOHMulticastResourceActive
Cisco Call
Manager
MOHMulticastResourceAvailable
Cisco Call
Manager
MOHOutOfResources
Cisco Call
Manager
MOHTotalMulticastResources
Cisco Call
Manager
MOHTotalUnicastResources
Cisco Call
Manager
MOHUnicastResourceActive
Cisco Call
Manager
MOHUnicastResourceAvailable
Cisco Call
Manager
MTPOutOfResources
Cisco Call
Manager
MTPRequestsThrottled
Cisco Call
Manager
MTPResourceActive
Cisco Call
Manager
MTPResourceAvailable
Cisco Call
Manager
MTPResourceTotal
Cisco Call
Manager
PartiallyRegisteredPhone
Cisco Call
Manager
PRIChannelsActive
Cisco Call
Manager
PRISpansInService
Cisco Call
Manager
RegisteredAnalogAccess
Cisco Call
Manager
RegisteredHardwarePhones
Cisco Call
Manager
RegisteredMGCPGateway
Cisco Call
Manager
RegisteredOtherStationDevices
Cisco Call
Manager
SIPLineServerAuthorizationChallenges
Cisco Call
Manager
SIPLineServerAuthorizationFailures
Cisco Call
Manager
SIPTrunkApplicationAuthorizationFailures
Cisco Call
Manager
SIPTrunkApplicationAuthorizations
Cisco Call
Manager
SIPTrunkAuthorizationFailures
Cisco Call
Manager
SIPTrunkAuthorizations
Cisco Call
Manager
SIPTrunkServerAuthenticationChallenges
Cisco Call
Manager
SIPTrunkServerAuthenticationFailures
Cisco Call
Manager
SWConferenceActive
Cisco Call
Manager
SWConferenceCompleted
Cisco Call
Manager
SWConferenceOutOfResources
Cisco Call
Manager
SWConferenceResourceActive
Cisco Call
Manager
SWConferenceResourceAvailable
Cisco Call
Manager
SWConferenceResourceTotal
Cisco Call
Manager
SystemCallsAttempted
Cisco Call
Manager
T1ChannelsActive
Cisco Call
Manager
T1SpansInService
Cisco Call
Manager
TLSConnectedSIPTrunks
Cisco Call
Manager
TLSConnectedWSM
Cisco Call
Manager
TranscoderOutOfResources
Cisco Call
Manager
TranscoderRequestsThrottled
Cisco Call
Manager
TranscoderResourceActive
Cisco Call
Manager
TranscoderResourceAvailable
Cisco Call
Manager
TranscoderResourceTotal
Cisco Call
Manager
VCBConferencesActive
Cisco Call
Manager
VCBConferencesAvailable
Cisco Call
Manager
VCBConferencesCompleted
Cisco Call
Manager
VCBConferencesTotal
Cisco Call
Manager
VCBOutOfConferences
Cisco Call
Manager
VCBOutOfResources
Cisco Call
Manager
VCBResourceActive
Cisco Call
Manager
VCBResourceAvailable
Cisco Call
Manager
VCBResourceTotal
Cisco Call
Manager
VideoCallsActive
Cisco Call
Manager
VideoCallsCompleted
Cisco Call
Manager
VideoOutOfResources
Cisco Unified
Communications
Manager
System
Performance
AverageExpectedDelay
Cisco Unified
Communications
Manager
System
Performance
CallsRejectedDueToThrottling
Cisco Unified
Communications
Manager
System
Performance
CallThrottlingGenericCounter3
Cisco Unified
Communications
Manager
System
Performance
CodeRedEntryExit
Cisco Unified
Communications
Manager
System
Performance
CodeYellowEntryExit
Cisco Unified
Communications
Manager
System
Performance
EngineeringCounter1
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 1-High
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 2-Normal
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 3-Low
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 4-Lowest
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 5-Database
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsPresent 6-Interleaved
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 1-High
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 2-Normal
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 3-Low
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 4-Lowest
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 5-Database
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed 6-Interleaved
Cisco Unified
Communications
Manager
System
Performance
QueueSignalsProcessed Total
Cisco Unified
Communications
Manager
System
Performance
SkinnyDevicesThrottled
Cisco Unified
Communications
Manager
System
Performance
ThrottlingSampleActivity
Cisco Unified
Communications
Manager
System
Performance
TotalCodeYellowEntry
Cisco Car
Database
(CAR_DB_Perfmon_Instance)\CARDBSpaceUsed
Cisco Car
Database
(CAR_DB_Perfmon_Instance)\CARTempDBSpaceUsed
Cisco Car
Database
(CAR_DB_Perfmon_Instance)\FreeSharedMemory
Cisco Car
Database
(CAR_DB_Perfmon_Instance)\RootDBSpaceUsed
Cisco Car
Database
(CAR_DB_Perfmon_Instance)\UsedSharedMemory
Cisco CTI
Manager
CcmLinkActive
Cisco CTI
Manager
CTIConnectionActive
Cisco CTI
Manager
DevicesOpen
Cisco CTI
Manager
LinesOpen
Cisco CTI
Manager
QBEVersion
Cisco Extension
Mobility
Cisco Extension
Mobility
Cisco Extension
Mobility
Cisco Extension
Mobility
Requests Handled
Cisco Extension
Mobility
Requests In Progress
Cisco Extension
Mobility
Requests Throttled
Cisco Extension
Mobility
Successful Logins
Cisco Extension
Mobility
Successful Logouts
Cisco Extension
Mobility
Cisco Extension
Mobility
Cisco Extension
Mobility
Cisco Extension
Mobility
Cisco H323
(InterClusterTrunkToAvaya)\CallsActive
Cisco H323
(InterClusterTrunkToAvaya)\CallsAttempted
Cisco H323
(InterClusterTrunkToAvaya)\CallsCompleted
Cisco H323
(InterClusterTrunkToAvaya)\CallsInProgress
Cisco H323
(InterClusterTrunkToAvaya)\CallsRejectedDueToICTCallThrottling
Cisco H323
(InterClusterTrunkToAvaya)\VideoCallsActive
Cisco H323
(InterClusterTrunkToAvaya)\VideoCallsCompleted
Cisco IP
Manager
Assistant
AssistantsActive
Cisco IP
Manager
Assistant
LinesOpen
Cisco IP
Manager
Assistant
ManagersActive
Cisco IP
Manager
Assistant
SessionsCurrent
Cisco Lines
(5001)\Active
Cisco Locations
(Hub_None)\BandwidthAvailable
Cisco Locations
(Hub_None)\BandwidthMaximum
Cisco Locations
(Hub_None)\CallsInProgress
Cisco Locations
(Hub_None)\OutOfResources
Cisco Locations
(Hub_None)\RSVP AudioReservationErrorCounts
Cisco Locations
(Hub_None)\RSVP MandatoryConnectionsInProgress
Cisco Locations
(Hub_None)\RSVP OptionalConnectionsInProgress
Cisco Locations
(Hub_None)\RSVP TotalCallsFailed
Cisco Locations
(Hub_None)\RSVP VideoCallsFailed
Cisco Locations
(Hub_None)\RSVP VideoReservationErrorCounts
Cisco Locations
(Hub_None)\VideoBandwidthAvailable
Cisco Locations
(Hub_None)\VideoBandwidthMaximum
Cisco Locations
(Hub_None)\VideoOutOfResources
Cisco Media
Streaming App
ANNConnectionsLost
Cisco Media
Streaming App
ANNConnectionState
Cisco Media
Streaming App
ANNConnectionsTotal
Cisco Media
Streaming App
ANNInstancesActive
Cisco Media
Streaming App
ANNStreamsActive
Cisco Media
Streaming App
ANNStreamsAvailable
Cisco Media
Streaming App
ANNStreamsTotal
Cisco Media
Streaming App
CFBConferencesActive
Cisco Media
Streaming App
CFBConferencesTotal
Cisco Media
Streaming App
CFBConferencesLost
Cisco Media
Streaming App
CFBConferenceState
Cisco Media
Streaming App
CFBStreamsActive
Cisco Media
Streaming App
CFBStreamsAvailable
Cisco Media
Streaming App
CFBStreamsTotal
Cisco Media
Streaming App
MOHAudioSourcesActive
Cisco Media
Streaming App
MOHConnectionsLost
Cisco Media
Streaming App
MOHConnectionState
Cisco Media
Streaming App
MOHStreamsActive
Cisco Media
Streaming App
MOHStreamsAvailable
Cisco Media
Streaming App
MOHStreamsTotal
Cisco Media
Streaming App
MTPConnectionsLost
Cisco Media
Streaming App
MTPConnectionsState
Cisco Media
Streaming App
MTPConnectionsTotal
Cisco Media
Streaming App
MTPInstancesActive
Cisco Media
Streaming App
MTPStreamsActive
Cisco Media
Streaming App
MTPStreamsAvailable
Cisco Media
Streaming App
MTPStreamsTotal
Cisco
Messaging
Interface
HeartBeat
Cisco
Messaging
Interface
SMDIMessageCountInbound
Cisco
Messaging
Interface
SMDIMessageCountInbound24Hour
Cisco
Messaging
Interface
SMDIMessageCountOutbound
Cisco
Messaging
Interface
SMDIMessageCountOutbound24Hour
Cisco
Messaging
Interface
StartTime
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\BRIChannelsActive
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\BRISpansInService
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\FXOPortsActive
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\FXOPortsInService
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\FXSPortsActive
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\FXSPortsInService
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\PRIChannelsActive
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\PRISpansInService
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\T1ChannelsActive
Cisco MGCP
Gateways
(nclabrtr01.ca.com)\T1SpansInService
Cisco MGCP
PRI Device
(nclabrtr01.ca.com::S0_SU1_DS1-0)\CallsActive
Cisco MGCP
PRI Device
(nclabrtr01.ca.com::S0_SU1_DS1-0)\CallsCompleted
Cisco MGCP
PRI Device
(nclabrtr01.ca.com::S0_SU1_DS1-0)\Channel 1 Status
Cisco MGCP
PRI Device
(nclabrtr01.ca.com::S0_SU1_DS1-0)\DatalinkInService
Cisco MGCP
PRI Device
(nclabrtr01.ca.com::S0_SU1_DS1-0)\OutboundBusyAttempts
Cisco Mobility
Manager
MobileCallsAnchored
Cisco Mobility
Manager
MobileHandinsCompleted
Cisco Mobility
Manager
MobileHandoutsCompleted
Cisco Mobility
Manager
MobileHandoutsFailed
Cisco Mobility
Manager
MobilityFollowMeCallsAttempted
Cisco Mobility
Manager
MobilityFollowMeCallsIgnoredDueToAnswerTooSoon
Cisco Mobility
Manager
MobilityHandinsAborted
Cisco Mobility
Manager
MobilityHandinsFailed
Cisco Mobility
Manager
MobilityHandoutsAborted
Cisco Mobility
Manager
MobilityIVRCallsAttempted
Cisco Mobility
Manager
MobilityIVRCallsFailed
Cisco Mobility
Manager
MobilityIVRCallsSucceeded
Cisco Mobility
Manager
MobilitySCCPDualModeRegistered
Cisco Mobility
Manager
MobilitySIPDualModeRegistered
Cisco MOH
Device
(MOH_2)\MOHHighestActiveResources
Cisco MOH
Device
(MOH_2)\MOHMulticastResourceActive
Cisco MOH
Device
(MOH_2)\MOHMulticastResourceAvailable
Cisco MOH
Device
(MOH_2)\MOHOutOfResources
Cisco MOH
Device
(MOH_2)\MOHTotalMulticastResources
Cisco MOH
Device
(MOH_2)\MOHTotalUnicastResources
Cisco MOH
Device
(MOH_2)\MOHUnicastResourceActive
Cisco MOH
Device
(MOH_2)\MOHUnicastResourceAvailable
Cisco MTP
Device
(MTP_2)\AllocatedResourceCannotOpenPort
Cisco MTP
Device
(MTP_2)\OutOfResources
Cisco MTP
Device
(MTP_2)\RequestsThrottled
Cisco MTP
Device
(MTP_2)\ResourceActive
Cisco MTP
Device
(MTP_2)\ResourceAvailable
Cisco MTP
Device
(MTP_2)\ResourceTotal
Cisco Phones
(SEP00170EF03493)\CallsAttempted
Cisco Presence
Features
ActiveCallListAndTrunkSubscriptions
Cisco Presence
Features
ActiveSubscriptions
Cisco Presence
Features
CallListAndTrunkSubscriptionsThrottled
Cisco Presence
Features
IncomingLineSideSubscriptions
Cisco Presence
Features
IncomingTrunkSideSubscriptions
Cisco Presence
Features
OutgoingTrunkSideSubscriptions
Cisco QSIG
Features
(CiscoQSIGFeatureObject)\CallForwardByRerouteCompleted
Cisco QSIG
Features
(CiscoQSIGFeatureObject)\PathReplacementCompleted
Cisco SAF
Client
SAFFConnectionsFailed
Cisco SAF
Client
SAFFConnectionsSucceeded
Cisco Signaling
SIPTCPConnectionsClosed
Cisco Signaling
TCPSIPMaxIncomingMessageHeadersExceeded
Cisco Signaling
TCPSIPMaxIncomingMessageSizeExceeded
Cisco Signaling
UDPPacketsThrottled
Cisco Signaling
UDPSIPMaxIncomingMessageHeadersExceeded
Cisco Signaling
UDPSIPMaxIncomingMessageSizeExceeded
Cisco SIP
(InterClusterTrunkToCM80)\CallsActive
Cisco SIP
(InterClusterTrunkToCM80)\CallsAttempted
Cisco SIP
(InterClusterTrunkToCM80)\CallsCompleted
Cisco SIP
(InterClusterTrunkToCM80)\CallsInProgress
Cisco SIP
(InterClusterTrunkToCM80)\VideoCallsActive
Cisco SIP
(InterClusterTrunkToCM80)\VideoCallsCompleted
AckIns
AckOuts
ActiveTcpTlsConnections
ByeIns
ByeOuts
CancelIns
CancelOuts
CCBsAllocated
GlobalFailedClassIns
GlobalFailedClassOuts
InfoClassIns
InfoClassOuts
InfoIns
InfoOuts
InviteIns
InviteOuts
NotifyIns
NotifyOuts
OptionsIns
OptionsOuts
PRAckIns
PRAckOuts
PublishIns
PublishOuts
RedirClassIns
RedirClassOuts
ReferIns
ReferOuts
RegisterIns
RegisterOuts
RequestsFailedClassIns
RequestsFailedClassOuts
RetryByes
RetryCancels
RetryInfo
RetryInvites
RetryNotify
RetryPRAck
RetryPublish
RetryRefer
RetryRegisters
RetryRel1xx
RetryRequestsOut
RetryResponsesFinal
RetryResponsesNonFinal
RetrySubscribe
RetryUpdate
SCBsAllocated
ServerFailedClassIns
ServerFailedClassOuts
SIPHandlerSDLQueueSignalsPresent
SIPMessagesAllocated
SIPNewRegistrationPending
StatusCode100Ins
StatusCode100Outs
StatusCode180Ins
StatusCode180Outs
StatusCode181Ins
StatusCode181Outs
StatusCode182Ins
StatusCode182Outs
StatusCode183Ins
StatusCode183Outs
StatusCode200Ins
StatusCode200Outs
StatusCode202Ins
StatusCode202Outs
StatusCode300Ins
StatusCode301Ins
StatusCode302Ins
StatusCode302Outs
StatusCode303Ins
StatusCode305Ins
StatusCode380Ins
StatusCode400Ins
StatusCode400Outs
StatusCode401Ins
StatusCode401Outs
StatusCode402Ins
StatusCode402Outs
StatusCode403Ins
StatusCode403Outs
StatusCode404Ins
StatusCode404Outs
StatusCode405Ins
StatusCode405Outs
StatusCode406Ins
StatusCode406Outs
StatusCode407Ins
StatusCode407Outs
StatusCode408Ins
StatusCode408Outs
StatusCode409Ins
StatusCode409Outs
StatusCode410Ins
StatusCode410Outs
StatusCode413Ins
StatusCode413Outs
StatusCode414Ins
StatusCode414Outs
StatusCode415Ins
StatusCode415Outs
StatusCode416Ins
StatusCode416Outs
StatusCode417Ins
StatusCode417Outs
StatusCode420Ins
StatusCode420Outs
StatusCode422Ins
StatusCode422Outs
StatusCode423Ins
StatusCode423Outs
StatusCode424Ins
StatusCode424Outs
StatusCode480Ins
StatusCode480Outs
StatusCode481Ins
StatusCode481Outs
StatusCode482Ins
StatusCode482Outs
StatusCode483Ins
StatusCode483Outs
StatusCode484Ins
StatusCode484Outs
StatusCode485Ins
StatusCode485Outs
StatusCode486Ins
StatusCode486Outs
StatusCode487Ins
StatusCode487Outs
StatusCode488Ins
StatusCode488Outs
StatusCode489Ins
StatusCode489Outs
StatusCode491Ins
StatusCode491Outs
StatusCode500Ins
StatusCode500Outs
StatusCode501Ins
StatusCode501Outs
StatusCode502Ins
StatusCode502Outs
StatusCode503Ins
StatusCode503Outs
StatusCode504Ins
StatusCode504Outs
StatusCode505Ins
StatusCode505Outs
StatusCode580Ins
StatusCode580Outs
StatusCode600Ins
StatusCode600Outs
StatusCode603Ins
StatusCode603Outs
StatusCode604Ins
StatusCode604Outs
StatusCode606Ins
StatusCode606Outs
SubscribeIns
SubscribeOuts
SuccessClassIns
SuccessClassOuts
SummaryRequestsIn
SummaryRequestsOut
SummaryResponsesIn
SummaryResponsesOut
UpdateIns
UpdateOuts
Cisco SIP
Station
ConfigMismatchesPersistent
Cisco SIP
Station
ConfigMismatchesTemporary
Cisco SIP
Station
ConfigMismatchesTracking
Cisco SIP
Station
ConnectionsDedicated
Cisco SIP
Station
ConnectionsShared
Cisco SIP
Station
DBTimeouts
Cisco SIP
Station
DeviceEntries
Cisco SIP
Station
DevicesByContactSocket
Cisco SIP
Station
DevicesByName
Cisco SIP
Station
DeviceTypeAssociated
Cisco SIP
Station
DeviceTypeDualMode
Cisco SIP
Station
NewRegAccepted
Cisco SIP
Station
NewRegQueueSize
Cisco SIP
Station
NewRegRejected
Cisco SIP
Station
StationErrors
Cisco SIP
Station
StationErrorsMsgRouting
Cisco SIP
Station
TokensAccepted
Cisco SIP
Station
TokensOutstanding
Cisco SIP
Station
TokensRejected
Cisco SW
Conference
Bridge Device
(CFB_2)\AllocatedResourceCannotOpenPort
Cisco SW
Conference
Bridge Device
(CFB_2)\OutOfResources
Cisco SW
Conference
Bridge Device
(CFB_2)\ResourceActive
Cisco SW
Conference
Bridge Device
(CFB_2)\ResourceAvailable
Cisco SW
Conference
Bridge Device
(CFB_2)\ResourceTotal
Cisco SW
Conference
Bridge Device
(CFB_2)\SWConferenceActive
Cisco SW
Conference
Bridge Device
(CFB_2)\SWConferenceCompleted
Cisco TFTP
BuildAbortCount
Cisco TFTP
BuildCount
Cisco TFTP
BuildDeviceCount
Cisco TFTP
BuildDialruleCount
Cisco TFTP
BuildDuration
Cisco TFTP
BuildFeaturePolicyCount
Cisco TFTP
BuildSignCount
Cisco TFTP
BuildSoftkeyCount
Cisco TFTP
BuildUnitCount
Cisco TFTP
ChangeNotifications
Cisco TFTP
DeviceChangeNotifications
Cisco TFTP
DialruleChangeNotifications
Cisco TFTP
EncryptCount
Cisco TFTP
FeaturePolicyChangeNotifications
Cisco TFTP
GKFoundCount
Cisco TFTP
GKNotFoundCount
Cisco TFTP
HeartBeat
Cisco TFTP
HttpConnectRequests
Cisco TFTP
HttpRequests
Cisco TFTP
HttpRequestsAborted
Cisco TFTP
HttpRequestsNotFound
Cisco TFTP
HttpRequestsOverflow
Cisco TFTP
HttpRequestsProcessed
Cisco TFTP
HttpServedFromDisk
Cisco TFTP
LDFoundCount
Cisco TFTP
LDNotFoundCount
Cisco TFTP
MaxServingCount
Cisco TFTP
Requests
Cisco TFTP
RequestsAborted
Cisco TFTP
RequestsInProgress
Cisco TFTP
RequestsInProgress
Cisco TFTP
RequestsNotFound
Cisco TFTP
RequestsOverflow
Cisco TFTP
RequestsProcessed
Cisco TFTP
SegmentsAcknowledged
Cisco TFTP
SegmentsFromDisk
Cisco TFTP
SegmentsSent
Cisco TFTP
SEPFoundCount
Cisco TFTP
SEPNotFoundCount
Cisco TFTP
SIPFoundCount
Cisco TFTP
SIPNotFoundCount
Cisco TFTP
SoftkeyChangeNotifications
Cisco TFTP
UnitChangeNotifications
Cisco Tomcat
Connector
(http-8080)\Errors
Cisco Tomcat
Connector
(http-8080)\MBytesReceived
Cisco Tomcat
Connector
(http-8080)\MBytesSent
Cisco Tomcat
Connector
(http-8080)\Requests
Cisco Tomcat
Connector
(http-8080)\ThreadsBusy
Cisco Tomcat
Connector
(http-8080)\ThreadsMax
Cisco Tomcat
Connector
(http-8080)\ThreadsTotal
Cisco Tomcat
JVM
KBytesMemoryFree
KBytes
Cisco Tomcat
JVM
KBytesMemoryMax
KBytes
Cisco Tomcat
JVM
KBytesMemoryTotal
KBytes
Cisco Tomcat
Web Application
(_root)\Errors
Cisco Tomcat
Web Application
(_root)\Requests
Cisco Tomcat
Web Application
(_root)\SessionsActive
Cisco WebDialer
CallsCompleted
Cisco WebDialer
CallsFailed
Cisco WebDialer
RedirectorSessionsHandled
Cisco WebDialer
RedirectorSessionsInProgress
Cisco WebDialer
RequestsCompleted
Cisco WebDialer
RequestsFailed
Cisco WebDialer
SessionsHandled
Cisco WebDialer
SessionsInProgress
DB Change
Notification
Client
MessagesProcessed
DB Change
Notification
Client
MessagesProcessing
DB Change
Notification
Client
QueueHeadPointer
DB Change
Notification
Client
QueueMax
DB Change
Notification
Client
QueueTailPointer
DB Change
Notification
Client
TablesSubscribed
DB Change
Notification
Server
Clients
DB Change
Notification
Server
CNProcessed
DB Change
Notification
Server
QueueDelay
DB Change
Notification
Server
QueuedRequestsInDB
DB Change
Notification
Server
QueuedRequestsInMemory
DB Change
Notification
Subscription
SubscribedTable
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\CcmDbSpace_Used
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\CcmtempDbSpace_Used
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\CNDbSpace_Used
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\LocalDSN
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\RootDbSpace_Used
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\SharedMemory_Free
DB Local DSN
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\SharedMemory_Used
DB User Host
Information
Counters
(ccm8_6_2_22900_9:database:cisco-ucm85)\DB:User:Host
Instances
Enterprise
Replication
DBSpace
Monitors
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\ERDbSpace_Used
Enterprise
Replication
DBSpace
Monitors
(DSN=ccm;: NodeName =
cisco-ucm85.HCLT.CORP.HCL.IN)\ERSBDbSpace_Used
External Call
Control
ConnectionsActiveToPDPServer
External Call
Control
ConnectionsLostToPDPServer
External Call
Control
PDPServersInService
External Call
Control
PDPServersOutOfService
External Call
Control
PDPServersTotal
IME Client
CallsAccepted
IME Client
CallsAttempted
IME Client
CallsReceived
IME Client
CallsSetup
IME Client
DomainsUnique
IME Client
FallbackCallsFailed
IME Client
FallbackCallsSuccessful
IME Client
IMESetupsFailed
IME Client
RoutesLearned
IME Client
RoutesPublished
IME Client
RoutesRejected
IME Client
VCRUploadRequests
IP and IP6
Frag Creates
IP and IP6
Frag Fails
IP and IP6
Frag OKs
IP and IP6
In Delivers
IP and IP6
In Discards
IP and IP6
In HdrErrors
IP and IP6
In Receives
IP and IP6
In UnknownProtos
IP and IP6
InOut Requests
IP and IP6
Out Discards
IP and IP6
Out Requests
IP and IP6
Reasm Fails
IP and IP6
Reasm OKs
IP and IP6
Reasm Reqds
Memory
% Mem Used
Percent
Memory
% Page Usage
Percent
Memory
% VM Used
Percent
Memory
Buffers KBytes
KBytes
Memory
Cached KBytes
KBytes
Memory
Free KBytes
KBytes
Memory
KBytes
Memory
HighFree
Memory
HighTotal
Memory
Low Free
Memory
Low Total
Memory
Memory
Memory
Pages
Pages
Memory
Pages Input
Pages
Memory
Pages
Memory
Pages Output
Pages
Memory
Pages
Memory
Shared KBytes
KBytes
Memory
SlabCache
Memory
SwapCached
Memory
Total KBytes
KBytes
Memory
KBytes
Memory
Total VM KBytes
KBytes
Memory
Used KBytes
KBytes
Memory
KBytes
Memory
Used VM KBytes
KBytes
Network
Interface
(eth0)\Rx Bytes
Network
Interface
(eth0)\Rx Dropped
Network
Interface
(eth0)\Rx Errors
Network
Interface
(eth0)\Rx Multicast
Network
Interface
(eth0)\Rx Packets
Network
Interface
(eth0)\Total Bytes
Network
Interface
(eth0)\Total Packets
Number of
Replicates
Created and
State of
Replication
Number of
Replicates
Created and
State of
Replication
(ReplicateCount)\Replicate_State
Partition
Percent
Partition
(Active)\% Used
Percent
Partition
ms
Partition
(Active)\Await Time
ms
Partition
ms
Partition
(Active)\Queue Length
Partition
Bytes per
second
Partition
(Active)\Total Mbytes
Mbytes
Partition
(Active)\Used Mbytes
Mbytes
Partition
Bytes per
second
Process
Percent
Process
Percent
Process
Process
(acpid)\Nice
Process
Process
(acpid)\PID
Process
(acpid)\Process Status
Process
KBytes
Process
(acpid)\STime
Jiffies
Process
(acpid)\Thread Count
Process
Jiffies
Process
(acpid)\UTime
Jiffies
Process
(acpid)\VmData
KBytes
Process
(acpid)\VmRSS
KBytes
Process
(acpid)\VmSize
KBytes
Processor
Percent
Processor
(_Total)\Idle Percentage
Percent
Processor
(_Total)\IOwait Percentage
Percent
Processor
(_Total)\Irq Percentage
Percent
Processor
(_Total)\Nice Percentage
Percent
Processor
(_Total)\Softirq Percentage
Percent
Displays the task's process status: 0 - Running, 1 - Sleeping, 2 - Uninterruptible disk sleep, 3 - Zombie, 4 - Traced or stopped (on a signal), 5 - Paging, 6 - Unknown.
Processor
(_Total)\System Percentage
Percent
Processor
(_Total)\User Percentage
Percent
Ramfs
(ccm_calllogs)\FilesTotal
Ramfs
(ccm_calllogs)\SpaceFree
Ramfs
(ccm_calllogs)\SpaceUsed
System
Allocated FDs
FDs
System
FDs
System
Freed FDs
FDs
System
IOAwait
Milliseconds
System
IOCpuUtil
Percent
System
IOKBytesReadPerSecond
System
IOKBytesWrittenPerSecond
System
IOPerSecond
System
IOReadReqMergedPerSecond
System
IOReadReqPerSecond
System
IOReqQueueSizeAvg
System
IOSectorsReadPerSecond
System
IOSectorsReqSizeAvg
System
IOSectorsWrittenPerSecond
System
IOServiceTime
System
IOWriteReqMergedPerSecond
System
IOWriteReqPerSecond
System
Max FDs
FDs
System
Jiffies
System
Total Processes
System
Total Threads
TCP
Active Opens
TCP
Attempt Fails
TCP
Curr Estab
TCP
Estab Resets
TCP
In Segs
TCP
InOut Segs
TCP
Out Segs
TCP
Passive Opens
TCP
RetransSegs
Thread
Percent
Thread
ccm_8782\PID
Message Name - Error Severity
MsgError - Critical
MsgAgentError - Critical
MsgAuthError - Critical
MsgSessionError - Critical
SessionDataFailed - Critical
MsgServices - Critical
MsgWarning - Warning
MsgCritical - Critical
Overview
Add New Profile
Set Log Level
Set the Number of Connection Retries
Add Monitors
Manually Selecting Monitors to be Measured
Edit Monitor Properties
Monitors of Type Value
Monitors of Type Event
Configuring Alarm Thresholds
Overview
After installing the probe, you must define what to monitor. At a high level there are three steps:
1. Connect to the UCS Manager environment.
2. Add monitors (checkpoints) either manually or with templates.
For adding monitors manually, see the descriptions in the section Adding Monitors. For applying monitors with templates, see the
descriptions on the separate page titled Apply Monitoring with Templates.
3. Configure the properties for the monitors, in which you define QoS data, and define alarms to be sent if specified thresholds are
breached.
Note: You must follow the Java standard of enclosing the IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
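As a sketch of the bracketing convention, the following hypothetical Python helper (not part of the probe) wraps literal IPv6 addresses in square brackets and leaves hostnames and IPv4 addresses unchanged; the sample addresses are illustrative:

```python
import ipaddress

def format_host(host):
    """Wrap a literal IPv6 address in square brackets, per the Java convention."""
    try:
        if isinstance(ipaddress.ip_address(host), ipaddress.IPv6Address):
            return f"[{host}]"
    except ValueError:
        pass  # not a literal IP address; assume it is a hostname
    return host

print(format_host("f0d0::a00:10"))       # [f0d0::a00:10]
print(format_host("ucs01.example.com"))  # ucs01.example.com
```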
Port
The port number for the UCS Manager REST API environment. Default is 443.
Username
A valid username to be used by the probe to log on to the UCS Manager environment.
Note: If Cisco UCS has been set up for LDAP Authentication, the login will require an LDAP label in addition to the user
name. The LDAP label is an arbitrary string used to identify the LDAP realm; it is configured in the Cisco UCS Manager by
the Cisco UCS administrator. You may need to contact your Cisco UCS administrator to get this LDAP label. When you set
up the login to work with LDAP authentication, use this convention: ucs-<LDAP Label>\<username>. The LDAP label is
case-sensitive and the username can be either lower or upper case.
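The login convention can be illustrated with a small Python sketch; the LDAP label CorpAD and the username jsmith are made-up values:

```python
def ucs_login_name(username, ldap_label=None):
    """Build the UCS login string: ucs-<LDAP Label>\\<username> when LDAP auth is used."""
    if ldap_label:
        return f"ucs-{ldap_label}\\{username}"
    return username  # plain local authentication uses the username alone

print(ucs_login_name("jsmith", "CorpAD"))  # ucs-CorpAD\jsmith
print(ucs_login_name("jsmith"))            # jsmith
```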
Password
A valid password to be used by the probe to log on to the UCS Manager environment.
Active
Select this checkbox to activate or deactivate monitoring of the Resource.
Interval
The interval defines how often the probe checks the values of the monitors. This can be set in seconds, minutes or hours. We
recommend polling once every 10 minutes. The polling interval should not be smaller than the time required to collect the data.
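As a rough illustration, a helper like the following (hypothetical, not probe code) converts an interval setting to seconds and checks it against the time a collection cycle takes:

```python
def interval_seconds(value, unit):
    """Convert a check-interval setting to seconds; unit names are illustrative."""
    return value * {"seconds": 1, "minutes": 60, "hours": 3600}[unit]

def interval_ok(interval_secs, collection_secs):
    """The polling interval should not be smaller than the data-collection time."""
    return interval_secs >= collection_secs

print(interval_seconds(10, "minutes"))  # 600, the recommended 10-minute poll
print(interval_ok(600, 45))             # True when a collection cycle takes 45 seconds
```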
Alarm Message
Select the alarm message to be sent if the Resource does not respond.
Actions - Verify Selection
Click Actions > Verify Selection to verify the connection to the Resource.
4. After completing the fields and testing that the connection works, click OK to add the Resource.
The profile is added. The initial data collection/polling cycle starts. The resource hierarchy will populate once the polling cycle has
completed.
Add Monitors
In Admin Console, there are two different ways to add monitors to Cisco UCS entities:
Manually select the monitors
To manually select and enable monitors, navigate to the target entity within the Resource. This lists its monitors in the right pane. Use the
available check-boxes to enable QoS monitoring for the selected metrics. To enable Alarm thresholding, you will need to launch the Edit
Monitor dialog.
Apply Monitoring with Templates
Templates let you define reusable sets of monitors to apply to various Cisco UCS monitored entities.
For more information, see the Apply Monitoring with Templates page.
Note: In the far left column of the details pane, the Monitor column identifies each monitor as either type value (such as Operability)
or type event. Monitors of type value and type event have different properties.
Metric Type
This is the metric type of the monitor. This type will be inserted into this field when the monitor is retrieved from the UCS Manager
environment.
Units
This field specifies the unit of the monitored value (for example, Mbytes). The field is read-only. This unit type will be inserted into this
field when the monitor is retrieved from the UCS Manager environment.
Value Definition
This drop-down list lets you select which value to be used, both for alarming and QoS:
You have the following options:
The current value. The most current value measured will be used.
The delta value (current - previous). The delta value calculated from the current and the previous measured sample will be used.
Delta per second. The delta value calculated from the samples measured within a second will be used.
The average value of the last and current sample: (current + previous) / 2.
The average value of the last n samples. The user specifies a count, and the value is then averaged based on the last "count" items.
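A minimal Python sketch of how these value definitions could be computed from a sample history (the function and mode names are illustrative, not the probe's internals):

```python
def derive_value(samples, mode, interval_secs=600, count=3):
    """Compute the value used for QoS and alarming from raw samples.

    samples: measured values, oldest first. Modes mirror the drop-down options."""
    if mode == "current":
        return samples[-1]
    if mode == "delta":                       # current - previous
        return samples[-1] - samples[-2]
    if mode == "delta_per_second":            # delta normalized by the poll interval
        return (samples[-1] - samples[-2]) / interval_secs
    if mode == "average":                     # (current + previous) / 2
        return (samples[-1] + samples[-2]) / 2
    if mode == "average_last_n":              # mean of the last `count` samples
        return sum(samples[-count:]) / count
    raise ValueError(mode)

print(derive_value([100, 160], "delta"))                               # 60
print(derive_value([100, 160], "delta_per_second", interval_secs=60))  # 1.0
```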
Publish Alarms
Selecting this option activates the alarming.
You can define both a high and a low threshold.
Initially the high threshold is set to the current value. Set this value to match your needs.
The low threshold is initially disabled. If you want to use it, select an operator other than "disabled" from the list and configure the
threshold to match your needs.
Operator
Select from the drop-down list the operator to be used when setting the alarm threshold for the measured value.
Example:
>= 90 means the monitor is in alarm condition if the measured value is equal to or above 90.
= 90 means the monitor is in alarm condition if the measured value is exactly 90.
Threshold
The alarm threshold value. An alarm message is sent when this threshold is violated.
Message ID
Select the alarm message to be issued if the specified threshold value is breached. These messages reside in the message pool.
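The operator and threshold combine into a simple comparison against the measured value. A sketch, using illustrative operator labels:

```python
import operator

# Operator labels mapped to comparisons; the label set is illustrative, so
# check the drop-down in the probe GUI for the exact list.
OPS = {">=": operator.ge, ">": operator.gt, "=": operator.eq,
       "<=": operator.le, "<": operator.lt, "!=": operator.ne}

def breaches(value, op, threshold):
    """Return True when the measured value violates the alarm threshold."""
    return OPS[op](value, threshold)

print(breaches(92.5, ">=", 90))  # True: 92.5 is equal to or above 90
print(breaches(90, "=", 90))     # True: exactly 90
```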
Monitors of Type Event
You can monitor Cisco UCS Faults on each Resource. The event is forwarded as an alarm message and the suppression key is based on the
entity.
The properties for monitors of type event are:
Description
This is a description of the monitor. This description will be inserted into this field when the monitor is retrieved from the UCS Manager
environment.
Metric Type
This is the metric type of the monitor. This type will be inserted into this field when the monitor is retrieved from the UCS Manager
environment.
High Operator
Select the high operator to be used when setting the alarm threshold for the event. This threshold refers to the event's severity level in
Cisco UCS.
High Threshold
The high alarm threshold value. An alarm message is sent when this threshold is violated.
High Message Name
Select the alarm message to be issued if the specified threshold value is breached. These messages reside in the message pool.
Configuring Alarm Thresholds
For more information about configuring thresholds for numeric monitors for the Cisco UCS Server monitoring (cisco_ucs) probe v2.3 or later, see
Configuring Alarm Thresholds. The link takes you to a page that describes how to configure centralized thresholds for select probes in CA Unified
Infrastructure Management.
Note: This article describes how to apply monitoring with templates for a single probe. For more information about how to use policies
to configure templates for multiple probes, see Configure Probes with Policies in the CA Unified Infrastructure Management wiki.
Contents
Overview
Verify Prerequisites
Enable Bulk Configuration
Using Templates
Apply a Default Template
Modify and Apply a Default Template
Create a Template
Create Template Filters
Add Rules to a Template Filter
Add Monitors to a Template Filter
Activate a Template
Using the Template Summary View
View Configuration in the All Monitors Table
Overview
Applying monitoring with templates saves time compared to manual monitor configuration and provides consistent monitoring across multiple
devices. At a high level, applying monitoring with templates is a two-step process in which you:
1. Enable bulk configuration
Before using the template editor, you first enable bulk configuration. This feature is disabled by default. It is also disabled if the probe has
been previously configured. Bulk configuration is possible only on a newly deployed probe (v2.3 or later) with no configuration.
2. Use the template editor
Once bulk configuration is enabled, you can copy and modify default template or create a new template to define unique monitoring
configurations for an individual device or groups of devices in your environment.
Verify Prerequisites
Important! Bulk configuration is only possible on a newly deployed v2.3 (or later) probe with no previous configuration. When you
enable bulk configuration, Infrastructure Manager is disabled and the Template Editor appears in the Admin Console GUI. Once you
enable bulk configuration, there is no supported process for going back to manual configuration. Be sure that you fully understand and
accept the consequences of enabling bulk configuration before enabling it.
Using Templates
The template editor allows you to configure and apply monitoring templates. Templates reduce the time that you need for manual configuration
and provide consistent monitoring across the devices in your network. You can configure monitoring on many targets with a well-defined template.
You can also create multiple templates to define unique configurations for all devices and groups of target devices in your environment.
You can use the template editor to:
Copy and apply a default template
Copy, modify, and apply a default template
Create and apply a new template
You can further customize any template by configuring:
Precedence
Precedence controls the order of template application. The probe applies a template with a precedence of one after a template with a
precedence of two. If there are any overlapping configurations between the two templates, then the settings in the template with a
precedence of one overrides the settings in the other template. If the precedence numbers are equal, then the templates are applied in
alphabetical order.
Note: The default template is applied with a precedence of 100. Be sure to set the precedence of your other templates to a
number lower than 100 so that the probe applies them at a higher priority than the default template. We recommend using
increments of 10, which lets you easily add custom templates and assign them a precedence that results in the probe
applying them in the order that you want.
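The precedence rules above can be sketched as a sort: higher precedence numbers are applied first, so templates with lower numbers are applied last and their settings win on overlap, with ties falling back to alphabetical order. The template names and numbers here are hypothetical:

```python
def apply_order(templates):
    """Return (name, precedence) pairs in the order the probe applies them.

    Higher precedence numbers first, so lower numbers are applied last and
    override overlapping settings; equal precedence sorts alphabetically."""
    return sorted(templates, key=lambda t: (-t[1], t[0]))

templates = [("Default", 100), ("Chassis-Custom", 10), ("Blade-Custom", 10)]
print([name for name, _ in apply_order(templates)])
# ['Default', 'Blade-Custom', 'Chassis-Custom']
```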
Filters
Filters let you control how the probe applies monitors based on attributes of the target device.
Rules
Rules apply to a device filter to create divisions within a group of systems or reduce the set of devices that the probe monitors.
Monitors
Monitors collect quality of service (QoS), event, and alarm data.
Note: Wait for the component discovery process to complete before using templates. Some QoS metrics are only applied to
components on specific devices. Determine what device types exist in your environment before you activate a template.
The default templates contain settings for a recommended monitor configuration. These default configurations include high-value metrics for
supported interfaces and network devices. Using these default configurations helps you to quickly start collecting data for your environment. To
save these recommended monitor configurations, the default templates are read-only. Because a default template is read-only, you first copy and
rename it before you apply it.
You can modify the copy to meet your specific needs. When your modifications are complete, you activate the template. The probe then
applies the template to the appropriate devices and components.
Follow these steps:
1. In Admin Console, select the probe and click Configure.
2. Click cisco_ucs > Template Editor.
3. Select any default template.
BladeServer
Chassis
FabricInterconnects
Vblock UCS Dashboard
VM
4. Click Options (...) > Copy.
5. Enter the name of the template and a description.
6. (Optional) Determine if you want to modify the default precedence setting.
7. Click Submit.
8. Click Save.
9. (Optional) Create template filters.
10. (Optional) Add rules to a template filter.
11. (Optional) Add monitors to a template filter.
12. In the navigation tree, select the template that you created in steps 1-5.
The template set up dialog appears in the detail pane.
13. Check Active.
The probe applies the template with the modified settings.
Create a Template
Note: Do not activate the template if you want to configure template monitoring filters or rules. If you change the template state
to active, the probe immediately applies the configuration.
6. Click Submit.
7. Click Save.
The system creates a template that you can configure and activate.
For more information, see Create Template Filters, Add Rules to a Template Filter, Add Monitors to a Template Filter, and Activate a
Template.
Create Template Filters
Filters let you control how the probe applies monitors based on attributes of the target device.
Follow these steps:
1. In the template editor, select the template that you created.
2. Choose any node that has the Options (...) icon next to it.
3. Enter a descriptive name for the filter and a precedence.
4. Repeat the previous steps for every template that requires filters.
5. Click Submit.
The template filter is created.
Note: You must activate the template for the probe to apply the filter configuration. When you change the template state to
active, the probe immediately applies all template configuration, including filters, rules, and monitors.
A filter allows you to control which devices and monitors are associated with a particular template. You specify more device criteria by using rules.
Filters usually contain one or more rules to define the types of devices for the template. You can add rules to a device filter to create divisions
within a group of systems or reduce the set of devices that the probe monitors. For example, you can add a rule to apply a configuration to all
devices with the name Cisco.
Note: If no rules exist, the probe always applies the monitor configuration in an active template to all applicable devices.
4. Select an operator for the rule:
Contains
Does not contain
Ends with
Equals
Not equals
Regex (regular expression)
Starts with
5. Enter a value for the rule.
6. Click Save.
The rule is created.
Note: You must activate the template for the probe to apply the rule configuration. When you change the template state to active, the
probe immediately applies all template configuration, including filters, rules, and monitors.
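The rule conditions listed above behave like simple string matches against a device attribute. A sketch, assuming the condition names shown in the rule dialog:

```python
import re

def rule_matches(attr_value, condition, pattern):
    """Evaluate one filter rule against a device attribute value."""
    if condition == "Contains":
        return pattern in attr_value
    if condition == "Does not contain":
        return pattern not in attr_value
    if condition == "Ends with":
        return attr_value.endswith(pattern)
    if condition == "Equals":
        return attr_value == pattern
    if condition == "Not equals":
        return attr_value != pattern
    if condition == "Regex":
        return re.search(pattern, attr_value) is not None
    if condition == "Starts with":
        return attr_value.startswith(pattern)
    raise ValueError(condition)

# Example from the text: apply a configuration to all devices with the name Cisco.
print(rule_matches("Cisco-Chassis-1", "Contains", "Cisco"))  # True
```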
Device filters contain a set of commonly used monitors that you can configure to meet your specific needs.
Follow these steps:
1. Click cisco_ucs > Template Editor > cisco_ucs probe > template name.
2. Click the desired device filter.
3. In the Detail pane, in the Monitors section, select any monitor.
A configuration dialog for the monitor appears below.
4. Check Include in Template to turn on the monitor.
5. (Optional) Check Publish Data.
6. (Optional) Check Publish Alarms.
Enter the desired settings for these required fields:
Value Definition
High Threshold
High Message Name
Low Operator
Low Threshold
Low Message Name
See Configure Alarm Thresholds for details.
7. Click Save.
The monitor is added.
Note: You must activate the template for the probe to apply the monitor configuration. When you change the template state to active,
the probe immediately applies all template configuration, including filters, rules, and monitors.
Activate a Template
The probe does not automatically apply the template configuration. The probe only applies templates in an active state. The template icon in the
navigation pane indicates the state of the template: inactive or active. You must activate a template to apply it.
Important! The monitor settings in a template override any monitor settings in the probe configuration GUI.
3.
The probe applies the monitoring configuration.
Note: You cannot modify a configured monitor in the probe configuration GUI once you activate a template.
You can view all of the monitors included in any template configuration in the All Monitors table.
Follow these steps:
1. In the left-hand navigation tree, select a template.
The Template Summary view appears in the detail pane. In the Monitors Included in Template table, you see all of the monitors available
for the template and their configuration status.
2. To see details for a specific monitor, click on it.
The configuration details appear below the table.
Edit Configuration in Context
When you copy a default read-only template or create your own new template, you can edit the template's monitors from the Template Summary
View.
Note: For a default read-only template, you can view but cannot edit the template's monitor configuration in Template Summary View.
Tree Hierarchy
Tree Icons
cisco_ucs Node
Template Editor
Profiles Node
Resource Node
<Managed System> Node
<Device> Nodes
Tree Hierarchy
Once a Resource profile is added, the components of the Resource are displayed in the tree. Click a node in the tree to see the alarms, events, or
monitors available for that component. The tree contains a hierarchical representation of the components that exist in the Cisco UCS
environment.
Tree Icons
The icons in the tree indicate the type of object the node contains.
- Closed folder. Organizational node that is used to group similar objects. Click the node to expand it.
- Open folder. Organizational node that is used to group similar objects. Click the triangle next to the folder to collapse it.
- Detached Configuration folder. Organizational node that is used to group objects that once had configuration that is no longer valid.
- Unknown.
- Storage pool
- Storage Controller
- Chassis
- Network interface
- Fan
- Port
cisco_ucs Node
Navigation: cisco_ucs Node
Important! Enabling this feature and applying monitoring with templates overwrites any manual monitoring configurations. Before
enabling probe-oriented bulk configuration, see v2.3 cisco_ucs AC Apply Monitoring with Templates.
Log Level
The probe's default log level is 3 (Recommended). You can set the log level to 4 or 5 for troubleshooting and return it to 3 for normal
operations.
cisco_ucs > (...) >Add New Profile
You can manually add a device profile. The profile icon indicates the status of subcomponent discovery.
Fields to know:
Hostname
The hostname or IP address of the server you want to monitor.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for connection to establish.
Username
A valid username to be used by the probe to log in to the monitored server.
Password
A valid password to be used by the probe to log in to the monitored server.
Alarm Message
The alarm message to be sent if the resource does not respond.
Template Editor
You use the template editor to apply monitoring templates, with optional filters, rules, and monitors, to one or more resources.
The template editor contains the following default templates:
VM
Chassis
BladeServer
FabricInterconnects
Vblock UCS Dashboard
The default templates contain settings for a recommended monitor configuration. These default configurations include high-value metrics for
supported interfaces and network devices. Using these default configurations helps you to quickly start collecting data for your environment. To
save these recommended monitor configurations, the default templates are read-only. Because a default template is read-only, you first copy and
rename it before you apply it.
Note: For more information about how to use the Template Editor, see v2.3 cisco_ucs AC Apply Monitoring with Templates.
Profiles Node
This section describes how to manage your device configuration. You can modify or delete a device profile, and validate the credentials for each
device.
Navigation: cisco_ucs Node > profile name
profile name > (...) > Delete Profile
Select this option to delete the profile.
profile name > Actions > Verify Selection
Use this option to verify the connection to the resource.
profile name > Resource Setup
Use this section to modify settings for the profile.
Fields to know:
Hostname
The hostname or IP address of the server you want to monitor.
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for connection to establish.
Username
A valid username to be used by the probe to log in to the monitored server.
Password
A valid password to be used by the probe to log in to the monitored server.
Alarm Message
The alarm message to be sent if the resource does not respond.
Resource Node
The Resource node contains the managed systems that are associated with the resource.
Navigation: cisco_ucs > profile name > resource name
Click to view a Hosts node containing the managed systems that are on the resource.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value that is measured will be used.
Delta Value (Current - Previous) -- The delta value that is calculated from the current and the previous measured sample is used.
Delta Per Second -- The delta value that is calculated from the samples that are measured within a second will be used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Note: The value definition affects the QoS publication interval. For example: If you set the value definition to an "average
of n," the probe will wait n cycles before it sends any QoS data to the Discovery server. If you set the value definition to
"delta," the probe will wait two cycles before it sends any QoS data to the Discovery server.
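The warm-up behavior described in the note can be sketched as a publisher that withholds QoS until enough samples exist (illustrative code, not the probe's implementation):

```python
class QosPublisher:
    """Buffer samples and publish only once enough history exists:
    'average of n' needs n samples, 'delta' needs two, 'current' needs one."""

    def __init__(self, mode, n=1):
        self.mode, self.n = mode, n
        self.samples = []

    def poll(self, value):
        self.samples.append(value)
        need = {"current": 1, "delta": 2, "average_last_n": self.n}[self.mode]
        if len(self.samples) < need:
            return None                       # still warming up; nothing published
        if self.mode == "current":
            return self.samples[-1]
        if self.mode == "delta":
            return self.samples[-1] - self.samples[-2]
        return sum(self.samples[-self.n:]) / self.n

p = QosPublisher("average_last_n", n=3)
print([p.poll(v) for v in (10, 20, 30, 40)])  # [None, None, 20.0, 30.0]
```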
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means the monitor is in alarm condition if the measured value is equal to or above 90.
= 90 means the monitor is in alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
<Device> Nodes
<Device> nodes exist for managed system devices.
Note: If you click a <Device> node, you might see a table with QoS metrics or more named device nodes. You must click the named device
nodes to view a table with QoS metrics.
<Device> > Monitors
You can change monitor settings in the fields below the table. Select a monitor in the table to view the monitor configuration fields.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example %, Mbytes etc.). The field is read-only.
Metric Type Id
Identifies the unique ID of the QoS metric.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value that is measured
Delta Value (Current - Previous) -- The delta value that is calculated from the current and the previous measured sample
Delta Per Second -- The delta value that is calculated from the samples that are measured within a second
Average Value Last n Samples -- The user specifies a count and the value is averaged based on the last "count" items
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means the monitor is in alarm condition if the measured value is equal to or above 90.
= 90 means the monitor is in alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
Important!
Thresholds for the following three types of monitors can only be configured in Admin Console:
Event Forwarding Monitors
Alarm Forwarding Monitors
Monitors with Boolean, Enumeration, or String-only Metrics
This article describes how to configure the Cisco UCS Server Monitoring (cisco_ucs) probe in the Infrastructure Manager GUI.
Contents
Overview
General Setup
Create a New Resource
Using the Message Pool Manager
Add a New Alarm Message
Edit an Alarm Message
Delete an Alarm Message
Setting the Number of Connection Retries
Adding Monitors
Manually Selecting Monitors to be Measured
Enabling the Monitors for QoS and Alarming
Edit Monitor Properties
Monitors of Type Value
Monitors of Type Event
Using Templates
Copy a Default Template
Create a New Template
Overview
After installing the probe, you must define what to monitor. At a high level there are three steps:
1. Connect to the UCS Manager environment.
2. Add monitors (checkpoints). See the description in the section Adding Monitors.
3. Configure the properties for the checkpoints, in which you define QoS data, and define alarms to be sent if specified thresholds are
breached.
Important! Configuration of the probe -- through the Unified Management Portal (UMP), using the Admin Console portlet (AC) -- is not
compatible with the configuration through the Infrastructure Manager interface described here. Do not mix or interchange configuration
methods! If you do, the result will be unpredictable monitoring of your system.
General Setup
Click the General Setup button to set the level of details written to the log file for the probe. Log as little as possible during normal operation to
minimize disk consumption. This is a sliding scale with the range of information logged being fatal errors all the way to extremely detailed
information used for debugging.
Click the Apply button to implement the new log level immediately.
Note: The probe allows you to change the log level without restarting the probe.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Port
The port number for the UCS Manager REST API environment. Default is 443.
Active
Select this checkbox to activate or deactivate monitoring of the Resource.
Username
A valid username to be used by the probe to log on to the UCS Manager environment.
Note: If Cisco UCS has been set up for LDAP Authentication, the login will require an LDAP label in addition to the user name.
The LDAP label is an arbitrary string used to identify the LDAP realm; it is configured in the Cisco UCS Manager by the Cisco
UCS administrator. You may need to contact your Cisco UCS administrator to get this LDAP label. When you set up the login
to work with LDAP authentication, use this convention: ucs-<LDAP Label>\<username>. The LDAP label is case-sensitive and
the username can be either lower or upper case.
Password
A valid password to be used by the probe to log on to the UCS Manager environment.
Alarm Message
Select the alarm message to be sent if the Resource does not respond.
Note: You can edit the message or define a new message using the Message Pool Manager.
Check Interval
The check interval defines how often the probe checks the values of the monitors. This can be set in seconds, minutes or hours. We
recommend polling once every 10 minutes. The polling interval should not be smaller than the time required to collect the data.
Test button
Click the Test button to verify the connection to the Resource.
After completing the fields and testing that the connection works, click OK to add the Resource. The initial data collection/polling cycle starts. The
resource hierarchy will populate once the polling cycle has completed.
Adding Monitors
There are three different ways to add monitors to Cisco UCS entities:
Manually select the monitors
To manually select and enable monitors, navigate to the target entity within the Resource. This lists its monitors in the right pane. Use the
available check-boxes to enable QoS monitoring for the selected metrics. To enable Alarm thresholding, you will need to launch the Edit
Monitor dialog.
Use Templates
Templates let you define reusable sets of monitors to apply to various Cisco UCS monitored entities.
See the section Using Templates for further information.
Use Auto Configurations
Auto Configuration is a powerful way to automatically add monitors to be measured. Monitors are created for new devices (that is, ones
not currently monitored) that would otherwise need manual configuration to be monitored.
Example: Auto Configuration contains an auto-monitor for server 'Front Temperature'. When a new server is added to the Cisco UCS, the Auto
Configuration feature creates a monitor automatically for monitoring the new server.
See the section Using Automatic Configurations for further information.
Note: You can also add monitors to be measured using templates (see the section Using Templates).
Select the All Monitors node to list all monitors currently being measured in the right pane. You can select or deselect monitors here as well.
Green icon - the monitor is configured and active
Gray icon - the monitor is configured but not active
Black icon - the monitor is not configured
Note: If a monitor name is in italics you have changed the configuration but have not applied the changes.
Unit
This field specifies the unit of the monitored value (for example %, Mbytes etc.). The field is read-only.
Message ID
Select the alarm message to be issued if the specified threshold value is breached. These messages reside in the message pool. You
can modify the messages in the Message Pool Manager.
Publish Quality of Service
Select this option if you want QoS messages to be issued on the monitor.
QoS Name
The unique QoS metric. This is a read-only field.
Monitors of Type Event
You can monitor Cisco UCS Faults on each Resource. The event is forwarded as an alarm message and the suppression key is based on the
entity.
The properties for monitors of type event are:
Name
This is the name of the monitor. The name will be inserted into this field when the monitor is retrieved from the Cisco UCS, and you are
allowed to modify the name.
Key
This is a read-only field, describing the monitor key.
Description
This is a description of the monitor. This description will be inserted into this field when the monitor is retrieved from the Cisco UCS. This
is a read-only field.
Subscribe
Select this option to send an alarm when this event is triggered.
Operator
Select the operator to be used when setting the alarm threshold for the event.
This threshold refers to the event's severity level in Cisco UCS.
Example: >= 1 means alarm condition if the event is triggered, and the severity level in Cisco UCS is equal to or higher than 1 (Warning).
Severity
The threshold severity level for the event in Cisco UCS.
Message Token
Select the alarm message to be issued if the specified threshold value is breached. These messages are kept in the message pool. The
messages can be modified in the Message Pool Manager.
Important! Monitoring events may cause a larger than expected increase in alarm messages and a possible decrease in system
performance.
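The operator and severity settings above combine into a simple threshold test. The following sketch illustrates that behavior; the class and method names are our own and not the probe's actual code:

```java
public class EventThreshold {
    // Evaluate an event's Cisco UCS severity against the configured
    // operator and threshold. A true result means an alarm is raised.
    static boolean breaches(String operator, int threshold, int severity) {
        switch (operator) {
            case ">=": return severity >= threshold;
            case ">":  return severity > threshold;
            case "=":  return severity == threshold;
            case "<=": return severity <= threshold;
            case "<":  return severity < threshold;
            default:   return false;
        }
    }
}
```

With operator >= and threshold 1, any triggered event at severity 1 (Warning) or above breaches the threshold and produces an alarm.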
Using Templates
Templates let you define reusable sets of monitors to be measured on multiple Equipment, Pools, and UCS Service Profiles. They provide an
easy way to accomplish consistent monitoring of your UCS Manager environment.
The following default templates come with the probe:
Chassis
BladeServer
FabricInterconnects
Vblock UCS Dashboard
VM
The default templates contain commonly used monitoring configurations that enable you to quickly apply monitoring.
You can also create your own templates and define a set of monitors belonging to each. You can then apply these templates to anything in the
Resources or Auto Configurations hierarchies in the navigation pane by dragging the template and dropping it on the appropriate item. This
assigns the template monitors to the drop point and everything below it.
If you apply a template to the Auto Configuration, its monitors are applied to all Cisco UCS monitored entities as they appear in the system. If you
need a finer level of control, you can apply a template to anything in the Resources hierarchy; in this case the monitors are applied to the
drop-point and everything subordinate to it. Any templates applied within the Resources hierarchy are static monitors. The static monitors override
any auto monitors for that specific resource entity.
Note: You can do both: place general-purpose templates in Auto Configuration, and apply special-purpose templates that override
the Auto Configuration templates on specific nodes, for specific purposes.
See the Using Automatic Configurations section for details on Auto Configuration.
Copy a Default Template
You can apply a default template as you do any other template. However, you may want to copy the default template and then apply the copy.
Copying the default template allows you to make modifications to the copies without losing the original default template's monitor settings.
Follow these steps:
1. Click the Templates node in the navigation pane.
2. Select the default template.
3. Right-click the template and select Copy Template.
The Templates Properties dialog appears.
4. Give the copy a name and description.
The default template is copied and appears under the Templates node and in the content pane.
Create a New Template
Apply a Template
Drag the template to the Auto Configuration node or the Resource (Equipment, Pools, or UCS Service Profiles) where you want it applied, and
drop it there.
Note: You can drop the template on an object containing multiple subordinate objects. This applies the template to the entity and all its
subordinate entities. A static monitor is created for this entity.
This node lists Auto Monitors, created based on the contents added to the Auto Configuration node. The Auto Monitors are only created
for content without a pre-existing static monitor.
Important! If you are experiencing performance problems, we recommend increasing the polling cycle and/or the memory configuration
for the probe. Increase the memory when the probe is running out of memory. Increase the polling cycle when the collection takes longer
than the configured interval.
Note: You must click the Apply button and restart the probe to activate configuration changes.
You can add a single monitor (checkpoint) to the Auto Configurations node.
To list available monitors:
1. Select the Resource node in the navigation pane and navigate to the point of interest.
2. Select an object to list its monitors in the right pane.
3. Add the monitor to the Auto Configurations node by dragging the monitor to the Auto Configurations node and dropping it there.
4. Click the Auto Configurations node and verify that the monitor was successfully added.
Note: You must click the Apply button and restart the probe to activate configuration changes.
To verify that the monitors were successfully added, click the Auto Configurations node in the navigation pane.
To edit the properties for a monitor, right-click in the list and choose Edit from the menu. See the section Edit Monitor Properties for
detailed information.
To delete a monitor from the list, right-click in the list and choose Delete from the menu.
Note: You must click the Apply button and restart the probe to activate configuration changes.
Note: You must click the Apply button and restart the probe to activate the Auto Configuration feature.
When you restart the probe, it searches through the Resource's entities. For each one that is currently not monitored, an Auto Monitor is created
for each of the monitors listed under the Auto Configurations node.
All defined Auto Monitors are listed under the Auto Monitors node.
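The restart pass described above can be sketched as follows. This is an illustrative model only, with invented names; the probe's actual implementation is not shown in this guide:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class AutoConfigPass {
    record Monitor(String entity, String checkpoint) {}

    // For each Resource entity that has no pre-existing static monitor,
    // create one Auto Monitor per checkpoint listed under the Auto
    // Configurations node.
    static List<Monitor> buildAutoMonitors(List<String> entities,
                                           Set<String> staticallyMonitored,
                                           List<String> autoCheckpoints) {
        List<Monitor> result = new ArrayList<>();
        for (String entity : entities) {
            if (staticallyMonitored.contains(entity)) continue;  // static wins
            for (String cp : autoCheckpoints) {
                result.add(new Monitor(entity, cp));
            }
        }
        return result;
    }
}
```

This mirrors the rule that static monitors override auto monitors: an entity covered by a static monitor is skipped by the auto-configuration pass.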
This section contains the basic Infrastructure Manager GUI information for the Cisco UCS Server Monitoring (cisco_ucs) probe.
Contents
The configuration interface consists of a row of tool buttons above a window split into two parts:
The Navigation pane
The Content pane
In addition, a status bar at the bottom of the window shows version information and date and time when the probe was last started.
Note: The probe is only configurable in Admin Console if you see the following message: "The probe is running in bulk configure mode.
This GUI is not supported in bulk configure mode. Exiting." For more information, see the cisco_ucs AC Configuration guide.
The General Setup button allows you to configure the log level for the probe.
The New Resource button allows you to add a new resource.
The Message Pool Manager button allows you to add, remove or edit alarm messages.
The Create New Template button allows you to create a new template.
The Navigation (left-side) Pane
The division on the left side of the window is the navigation pane. It displays the monitored Resources and any Templates you have created.
Resources
You can create a new Resource by clicking the New Resource button, or by right-clicking Resources and selecting New Resource.
Note: If you use an IPv6 address for a resource, you must follow the Java standard of enclosing the IPv6 address in square brackets.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace
error that includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
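The bracket requirement can be enforced defensively before a host string is used. The helper below is our own illustrative sketch, not part of the probe:

```java
public class Ipv6Brackets {
    // Wrap a literal IPv6 address in square brackets if it is not already
    // wrapped, so that "host:port" strings parse unambiguously. A colon in
    // the host is used as a heuristic for an IPv6 literal.
    static String normalizeHost(String host) {
        boolean looksIpv6 = host.indexOf(':') >= 0;
        if (looksIpv6 && !host.startsWith("[")) {
            return "[" + host + "]";
        }
        return host;
    }
}
```

For example, normalizeHost("f0d0::1") returns "[f0d0::1]", while an IPv4 address or an already-bracketed address passes through unchanged.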
The Resource is configured as a link to the UCS Manager environment. Note the following icons for the Resource node:
Resource is inactive
Resource is marked for deletion
Resource is unable to connect
New resource (not yet saved)
Resource is loading inventory. Not ready to browse
Resource is connected and inventory is ready to browse
The Resources node contains the following sub-hierarchies:
The Auto Configurations node
One or more checkpoints (or templates) can be added to this node, using drag and drop. These checkpoints can be used for
auto-configuring unmonitored devices. See the section Using Automatic Configurations for further information.
The Auto Monitors node
This is a list of the monitors that have been created based on the Auto-Configuration entries and the inventory available on the Resource.
See the section Using Automatic Configurations for further information.
The All Monitors node
This node contains the complete list of Monitors for the Resource. This includes Auto Monitors and manually configured Monitors. See
the section Using Automatic Configurations for further information.
The Cisco UCS hierarchy
This is a list of the Equipment, Pools, UCS Service Profiles and child components available in the UCS Manager environment for
monitoring.
Templates
Right-clicking the hostname or IP address node in the navigation pane opens a pop-up menu with items for managing the selected object
or creating new objects of its type. Options typically include: New, Edit, Delete, Deactivate, and Refresh.
Note: When available, the Refresh menu item retrieves updated values and refreshes the display.
The content of the right pane depends on the current selection in the navigation pane.
If you select a Resources node in the navigation pane, the content pane lists the UCS Manager environments.
If you select Equipment, Pools, UCS Service Profiles or a child component in the navigation pane, the content pane lists the available monitors.
Active Monitors are check-marked. The following icons can appear:
Monitor is active but not enabled to send alarms. The Enable Monitoring checkbox is not selected for this monitor.
Black: Monitor is NOT activated. The Action option is not set in the properties dialog for the monitor.
Green: Monitor is activated for monitoring and, if an alarm threshold is set, the threshold value defined in the properties dialog for the monitor
is not exceeded.
Other colors: Monitor is activated for monitoring and the threshold value defined in the properties dialog for the monitor is exceeded. The
color reflects the message token selected in the properties dialog for the monitor.
No value has been measured.
Note: A monitor name in italics indicates that the monitor has been modified and you must apply the changes before the monitor results
are updated.
Right-clicking objects in the content pane opens a pop-up menu with items for managing the selected object type (Edit, Delete, and Add
to Template).
Note: When available, the Refresh menu item fetches updated values and refreshes the display.
Equipment
Chassis Interface Card
The server blade interface cards provide the traffic metric states.
If configured, the NICs on the server blade interface cards also provide traffic metrics (number of packets sent, packets dropped, errors,
etc.).
Information about the disks attached to the server blades is available under Equipment > Chassis/Rack Mounts > Servers. The metrics
collect the disk status values.
Fabric Interconnects
The ports in the Fixed Module / Expansion Module(s) are grouped into the following categories, based on the "ifRole" property in the Cisco UCS
Manager.
Expansion Module
Storage FC Ports
Unconfigured Ports
Uplink FC Ports
Fixed Module
Appliance Ports
FCoE Storage Ports
Server Ports
Unconfigured Ports
Uplink Ethernet Ports
Uplink FC Ports
The NIC nodes display an aggregated view of bandwidth (the bandwidth available for the Chassis, for Uplink to VSANs etc.).
Rack Mounts
The Rack Mounts class is available in the Equipment node. The child nodes of this class are FEX and Servers. Rack Mounts have many
similarities with the equipmentChassis class. The cisco_ucs probe should automatically detect whether any RackUnits are managed by the UCS
Manager.
Pools
The Cisco UCS contains a number of pools, such as a MAC pool (a list of available MAC addresses), a WWNN pool, and WWPN pools. It is
useful for system administrators to be aware of how many of these pools are empty, how many are assigned, and so on.
The cisco_ucs probe supports the following pools. They are displayed in the tree view of the probe GUI.
MAC Pools
Server Pools
UUID Pools
WWNN Pools
WWPN Pools
UCS Service Profiles
The service profiles created in the Cisco UCSM are displayed in the tree view of cisco_ucs probe GUI under the UCS Service Profiles node.
Contents
General Setup
Create a New Resource
Using the Message Pool Manager
Add a New Alarm Message
Edit an Alarm Message
Important! Configuring the probe through the Unified Management Portal (UMP) using the Admin Console (AC) portlet is not
compatible with configuring it through the Infrastructure Manager interface described here. Do not mix or interchange configuration
methods! If you do, the result will be unpredictable monitoring of your system.
After installing the cisco_ucs probe, you must define what to monitor. At a high level there are three steps:
1. Connect to the UCS Manager environment.
2. Add monitors (checkpoints). See the description in the section Adding Monitors (Checkpoints).
3. Configure the properties for the checkpoints, in which you define QoS data, and define alarms to be sent if specified thresholds are
breached.
Note: You must always click the Apply button to activate any configuration changes.
General Setup
Click the General Setup button to set the level of detail written to the log file for the cisco_ucs probe. Log as little as possible during normal
operation to minimize disk consumption. The log level is a sliding scale, ranging from fatal errors only to the extremely detailed information
used for debugging.
Click the Apply button to implement the new log level immediately.
Note: The probe allows you to change the log level without restarting the probe.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Port
The port number for the UCS Manager REST API environment. Default is 443.
Active
Select this checkbox to activate or deactivate monitoring of the Resource.
Username
A valid username to be used by the probe to log on to the UCS Manager environment.
Note: If Cisco UCS has been set up for LDAP Authentication, the login will require an LDAP label in addition to the user name.
The LDAP label is an arbitrary string used to identify the LDAP realm; it is configured in the Cisco UCS Manager by the Cisco
UCS administrator. You may need to contact your Cisco UCS administrator to get this LDAP label. When you set up the login
to work with LDAP authentication, use this convention: ucs-<LDAP Label>\<username>. The LDAP label is case-sensitive and
the username can be either lower or upper case.
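The documented ucs-&lt;LDAP Label&gt;\&lt;username&gt; convention can be built programmatically. The helper and the example label below are illustrative only; obtain the real LDAP label from your Cisco UCS administrator:

```java
public class UcsLdapLogin {
    // Build a UCS LDAP login name using the documented convention:
    // ucs-<LDAP Label>\<username>. The label is case-sensitive.
    static String loginName(String ldapLabel, String username) {
        return "ucs-" + ldapLabel + "\\" + username;
    }
}
```

For example, loginName("CorpLdap", "jsmith") yields ucs-CorpLdap\jsmith.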
Password
A valid password to be used by the probe to log on to the UCS Manager environment.
Alarm Message
Select the alarm message to be sent if the Resource does not respond.
Note: You can edit the message or define a new message using the Message Pool Manager.
Check Interval
The check interval defines how often the probe checks the values of the monitors. This can be set in seconds, minutes or hours. We
recommend polling once every 10 minutes. The polling interval should not be smaller than the time required to collect the data.
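The interval guidance above amounts to a simple clamp: never poll faster than a collection cycle can complete. A minimal sketch, with names of our own invention:

```java
public class CheckInterval {
    // The configured check interval should never be shorter than the time
    // a collection actually takes; clamp it up if necessary.
    static long effectiveSeconds(long configured, long lastCollectionTook) {
        return Math.max(configured, lastCollectionTook);
    }
}
```

With the recommended 600-second (10-minute) interval and a collection that takes 45 seconds, the interval stays 600; a 120-second collection against a 60-second interval would be clamped to 120.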
Test button
Click the Test button to verify the connection to the Resource.
After completing the fields and testing that the connection works, click OK to add the Resource. The initial data collection/polling cycle starts. The
resource hierarchy will populate once the polling cycle has completed.
Subsystem string/id
The NAS subsystem ID for the Cisco UCS system.
4. Click OK to save the new message.
Adding Monitors
There are three different ways to add monitors to Cisco UCS entities:
Manually select the monitors
To manually select and enable monitors, navigate to the target entity within the Resource. Its monitors are listed in the right pane. Use the
available check boxes to enable QoS monitoring for the selected metrics. To set alarm thresholds, open the Edit
Monitor dialog. See the section Manually Selecting Monitors to be Measured.
Use Templates
Templates let you define reusable sets of monitors to apply to various Cisco UCS monitored entities.
See the section Using Templates for further information.
Use Auto Configurations
Auto Configuration is a powerful way to automatically add monitors to be measured. Monitors are created for new devices (that is, ones
not currently monitored) that would otherwise need manual configuration to be monitored.
Example: Auto Configuration contains an auto-monitor for server 'Front Temperature'. When a new server is added to the Cisco UCS, the Auto
Configuration feature creates a monitor automatically for monitoring the new server.
Note: You can also add monitors to be measured using templates (see the section Using Templates).
Select the All Monitors node to list all monitors currently being measured in the right pane. You can select or deselect monitors here as well.
Green icon - the monitor is configured and active
Gray icon - the monitor is configured but not active
Black icon - the monitor is not configured
Note: If a monitor name is in italics, you have changed the configuration but have not yet applied the changes.
You can monitor Cisco UCS Faults on each Resource. The event is forwarded as an alarm message and the suppression key is based on the
entity.
The properties for monitors of type event are:
Name
This is the name of the monitor. The name will be inserted into this field when the monitor is retrieved from the Cisco UCS, and you are
allowed to modify the name.
Key
This is a read-only field, describing the monitor key.
Description
This is a description of the monitor. This description will be inserted into this field when the monitor is retrieved from the Cisco UCS. This
is a read-only field.
Subscribe
Select this option to send an alarm when this event is triggered.
Operator
Select the operator to be used when setting the alarm threshold for the event.
This threshold refers to the event's severity level in Cisco UCS.
Example: >= 1 means an alarm condition exists if the event is triggered and its severity level in Cisco UCS is equal to or higher than 1 (Warning).
Severity
The threshold severity level for the event in Cisco UCS.
Message Token
Select the alarm message to be issued if the specified threshold value is breached. These messages are kept in the message pool. The
messages can be modified in the Message Pool Manager.
Important! Monitoring events may cause a larger than expected increase in alarm messages and a possible decrease in system
performance.
Using Templates
Templates let you define reusable sets of monitors to be measured on multiple Equipment, Pools, and UCS Service Profiles. They provide an
easy way to accomplish consistent monitoring of your UCS Manager environment.
You can create your own templates and define a set of monitors belonging to each. You can then apply these templates to anything in the
Resources or Auto Configurations hierarchies in the navigation pane by dragging the template and dropping it on the appropriate item. This
assigns the template monitors to the drop point and everything below it.
If you apply a template to the Auto Configuration, its monitors are applied to all Cisco UCS monitored entities as they appear in the system. If you
need a finer level of control, you can apply a template to anything in the Resources hierarchy; in this case the monitors are applied to the
drop-point and everything subordinate to it. Any templates applied within the Resources hierarchy are static monitors. The static monitors override
any auto monitors for that specific resource entity.
Note: You can do both: place general-purpose templates in Auto Configuration, and apply special-purpose templates that override
the Auto Configuration templates on specific nodes, for specific purposes.
See the Using Automatic Configurations section for details on Auto Configuration.
Create a New Template
Apply a Template
Drag the template to the Auto Configuration node or the Resource (Equipment, Pools, or UCS Service Profiles) where you want it applied, and
drop it there.
Note: You can drop the template on an object containing multiple subordinate objects. This applies the template to the entity and all its
subordinate entities. A static monitor is created for this entity.
You can add a single monitor (checkpoint) to the Auto Configurations node.
To list available monitors:
1. Select the Resource node in the navigation pane and navigate to the point of interest.
2. Select an object to list its monitors in the right pane.
3. Add the monitor to the Auto Configurations node by dragging the monitor to the Auto Configurations node and dropping it there.
4. Click the Auto Configurations node and verify that the monitor was successfully added.
Note: You must click the Apply button and restart the probe to activate configuration changes.
To verify that the monitors were successfully added, click the Auto Configurations node in the navigation pane.
To edit the properties for a monitor, right-click in the list and choose Edit from the menu. See the section Edit Monitor Properties for
detailed information.
To delete a monitor from the list, right-click in the list and choose Delete from the menu.
Note: You must click the Apply button and restart the probe to activate configuration changes.
Note: When monitors have been added to the Auto Configurations node, you must click the Apply button and restart the probe to activate the
Auto Configuration feature.
When you restart the probe, it searches through the Resource's entities. For each one that is currently not monitored, an Auto Monitor is created
for each of the monitors listed under the Auto Configurations node.
All defined Auto Monitors are listed under the Auto Monitors node.
The configuration interface consists of a row of tool buttons above a window split into two parts:
The Navigation pane
The Content pane
In addition, a status bar at the bottom of the window shows version information and date and time when the probe was last started.
The Toolbar Buttons
The General Setup button allows you to configure the log level for the probe.
The New Resource button allows you to add a new resource.
The Message Pool Manager button allows you to add, remove or edit alarm messages.
The Create New Template button allows you to create a new template.
The Navigation (left-side) Pane
The division on the left side of the window is the navigation pane. It displays the monitored Resources and any Templates you have created.
Resources
You can create a new Resource by clicking the New Resource button, or by right-clicking Resources and selecting New Resource.
Note: If you use an IPv6 address for a resource, you must follow the Java standard of enclosing the IPv6 address in square brackets.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace
error that includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
The Resource is configured as a link to the UCS Manager environment. Note the following icons for the Resource node:
Resource is inactive
Resource is marked for deletion
Resource is unable to connect
New resource (not yet saved)
Resource is loading inventory. Not ready to browse
Resource is connected and inventory is ready to browse
The Resources node contains the following sub-hierarchies:
The Auto Configurations node
One or more checkpoints (or templates) can be added to this node, using drag and drop. These checkpoints can be used for
auto-configuring unmonitored devices. See the section Using Automatic Configurations for further information.
The Auto Monitors node
This is a list of the monitors that have been created based on the Auto-Configuration entries and the inventory available on the Resource.
See the section Using Automatic Configurations for further information.
The All Monitors node
This node contains the complete list of Monitors for the Resource. This includes Auto Monitors and manually configured Monitors. See
the section Using Automatic Configurations for further information.
The Cisco UCS hierarchy
This is a list of the Equipment, Pools, UCS Service Profiles and child components available in the UCS Manager environment for
monitoring.
Templates
Templates let you define reusable sets of monitors to apply to the various Cisco UCS components. After you create a template and define a set of
checkpoints belonging to that template, you can either:
Drag and drop the template into the cisco_ucs resource hierarchy where you want to monitor the checkpoints defined for the template.
This creates a static monitor for that resource component and its children (recursively) based on the template contents at the time the
static monitor is created.
Drag and drop the template into the Auto Configuration to add the template contents to the list of auto configuration monitors.
See the section Using Templates for details.
Navigation Pane Updates
Right-clicking the hostname or IP address node in the navigation pane opens a pop-up menu with items for managing the selected object
or creating new objects of its type. Options typically include: New, Edit, Delete, Deactivate, and Refresh.
Note: When available, the Refresh menu item retrieves updated values and refreshes the display.
The content of the right pane depends on the current selection in the navigation pane.
If you select a Resources node in the navigation pane, the content pane lists the UCS Manager environments.
If you select Equipment, Pools, UCS Service Profiles or a child component in the navigation pane, the content pane lists the available monitors.
Active Monitors are check-marked. The following icons can appear:
Monitor is active but not enabled to send alarms. The Enable Monitoring checkbox is not selected for this monitor.
Black: Monitor is NOT activated. The Action option is not set in the properties dialog for the monitor.
Green: Monitor is activated for monitoring and, if an alarm threshold is set, the threshold value defined in the properties dialog for the monitor
is not exceeded.
Other colors: Monitor is activated for monitoring and the threshold value defined in the properties dialog for the monitor is exceeded. The
color reflects the message token selected in the properties dialog for the monitor.
No value has been measured.
Note: A monitor name in italics indicates that the monitor has been modified and you must apply the changes before the monitor results
are updated.
Right-clicking objects in the content pane opens a pop-up menu with items for managing the selected object type (Edit, Delete, and Add
to Template).
Note: When available, the Refresh menu item fetches updated values and refreshes the display.
Equipment
Chassis Interface Card
The server blade interface cards provide the traffic metric states.
If configured, the NICs on the server blade interface cards also provide traffic metrics (number of packets sent, packets dropped, errors,
etc.).
Information about the disks attached to the server blades is available under Equipment > Chassis/Rack Mounts > Servers. The metrics
collect the disk status values.
Fabric Interconnects
The ports in the Fixed Module / Expansion Module(s) are grouped into the following categories, based on the "ifRole" property in the Cisco UCS
Manager.
Expansion Module
Storage FC Ports
Unconfigured Ports
Uplink FC Ports
Fixed Module
Appliance Ports
FCoE Storage Ports
Server Ports
Unconfigured Ports
Uplink Ethernet Ports
Uplink FC Ports
The NIC nodes display an aggregated view of bandwidth (the bandwidth available for the Chassis, for Uplink to VSANs etc.).
Rack Mounts
The Rack Mounts class is available in the Equipment node. The child nodes of this class are FEX and Servers. Rack Mounts have many
similarities with the equipmentChassis class. The cisco_ucs probe should automatically detect whether any RackUnits are managed by the UCS
Manager.
Pools
The Cisco UCS contains a number of pools, such as a MAC pool (a list of available MAC addresses), a WWNN pool, and WWPN pools. It is
useful for system administrators to be aware of how many of these pools are empty, how many are assigned, and so on.
The cisco_ucs probe supports the following pools. They are displayed in the tree view of the probe GUI.
MAC Pools
Server Pools
UUID Pools
WWNN Pools
WWPN Pools
UCS Service Profiles
The service profiles created in the Cisco UCSM are displayed in the tree view of cisco_ucs probe GUI under the UCS Service Profiles node.
cisco_ucs Metrics
The following table contains the metrics for the Cisco UCS Server Monitoring (cisco_ucs) probe.
The metrics are listed below per resource entity, in the form: Metric Name (Unit, Version): Description (where available).

Resource Entity: CHA_EQUIPMENTCHASSIS
Administrative State (State, v1.0)
Config State (State, v1.0)
Input Power (Watts, v1.0)
Operability (State, v1.0)
Output Power (Watts, v1.0)
Overall Status (State, v1.0)
Power (State, v1.0): The power status of the chassis. Possible values are: 0 (unknown), 1 (ok), 2 (failed), 3 (input-failed), 4 (input-degraded), 5 (output-failed), 6 (output-degraded), 7 (redundancy-failed), 8 (redundancy-degraded).
Presence (State, v1.0)
Thermal (State, v1.0): The thermal status of the chassis. Possible values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100 (not-supported).
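The numeric State values listed for these metrics can be decoded into their labels with a simple lookup. The class below is an illustrative sketch (names are ours) for the chassis Power states shown above:

```java
import java.util.Map;

public class ChassisPowerState {
    // Numeric chassis Power states from the metrics list, mapped to labels.
    static final Map<Integer, String> LABELS = Map.of(
        0, "unknown", 1, "ok", 2, "failed", 3, "input-failed",
        4, "input-degraded", 5, "output-failed", 6, "output-degraded",
        7, "redundancy-failed", 8, "redundancy-degraded");

    static String label(int state) {
        return LABELS.getOrDefault(state, "unrecognized");
    }
}
```

The same pattern applies to the Thermal, Operability, and Presence state enumerations that follow.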
Exhaust
Temperature
Celsius
Operability
State
The operability status of the chassis fan module. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
Overall Status
State
The overall status of the chassis fan module. Possible values are: 0 (unknown),
1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
Performance
State
The performance status of the chassis fan module. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
Power
State
The power status of the chassis fan module. Possible values are: 0 (unknown), 1 v1.0
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
Presence
State
The presence status of the chassis fan module. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the chassis fan module. Possible values are: 0 (unknown),
1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the chassis fan module. Possible values are: 0 (unknown),
1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
Operability
State
The operability status of the chassis fan. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the chassis fan. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
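The operability enumeration above recurs throughout this table for chassis, PSU, IO card, adaptor, and storage components. As a minimal sketch (the dictionary and function names below are illustrative, not part of the product API; value labels are copied verbatim from this table, including the product's own spelling "decomissioning"), raw codes can be decoded like this:

```python
# Illustrative lookup for the operability/overall-status enumeration
# used by the chassis fan, PSU, IO card, adaptor, and storage metrics.
OPERABILITY = {
    0: "unknown", 1: "operable", 2: "inoperable", 3: "degraded",
    4: "powered-off", 5: "power-problem", 6: "removed",
    7: "voltage-problem", 8: "thermal-problem", 9: "performance-problem",
    10: "accessibility-problem", 11: "identity-unestablishable",
    12: "bios-post-timeout", 13: "disabled",
    51: "fabric-conn-problem", 52: "fabric-unsupported-conn",
    81: "config", 82: "equipment-problem", 83: "decomissioning",
    84: "chassis-limit-exceeded", 100: "not-supported",
    101: "discovery", 102: "discovery-failed", 103: "identify",
    104: "post-failure", 105: "upgrade-problem",
    106: "peer-comm-problem", 107: "auto-upgrade",
}

def operability_label(code: int) -> str:
    """Return the label for a raw operability code, or 'unknown' if unlisted."""
    return OPERABILITY.get(code, "unknown")
```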
Performance
State
The performance status of the chassis fan. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Power
State
The power status of the chassis fan. Possible values are: 0 (unknown), 1 (on), 2
(test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8 (power-save), 9
(error), 100 (not-supported).
v1.0
Presence
State
The presence status of the chassis fan. Possible values are: 0 (unknown), 1
(empty), 10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary),
20 (equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable),
30 (inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
Speed
RPM
Thermal
State
The thermal status of the chassis fan. Possible values are: 0 (unknown), 1 (ok),
2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
CHA_EQUIPMENTPSU
CHA_EQUIPMENTIOCARD
Voltage
State
The voltage status of the chassis fan. Possible values are: 0 (unknown), 1 (ok), 2
(upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Input 210v
Volts
v1.0
Internal
Temperature
Celsius
v1.0
Operability
State
The operability status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Output 12v
Volts
The output 12v of the chassis power supply unit measured in volts.
v1.0
Output 3v3
Volts
The output 3v3 of the chassis power supply unit measured in volts.
v1.0
Output Current
Amps
The output current of the chassis power supply unit measured in amps.
v1.0
Output Power
Watts
The output power of the chassis power supply unit measured in watts.
v1.0
Overall Status
State
The overall status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the chassis power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
ASIC
Temperature
Celsius
v1.2
Admin Power
State
State
The admin power status of the chassis IO card. Possible values are: 1 (policy), 2
(cycle-immediate), 3 (cycle-wait).
v1.2
Ambient
Temperature
Celsius
v1.2
Config State
State
v1.2
Discovery
State
The discovery status of the chassis IO card. Possible values are: 0 (unknown), 1
(online), 2 (offline), 3 (discovered), 4 (unsupported-connectivity), 5
(auto-upgrading).
v1.2
Operability
State
The operability status of the chassis IO card. Possible values are: 0 (unknown),
1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.2
CHA_ETHERSERVERINTFIO
CHA_ETHERSWITCHINTFIO
CHA_COMPUTEBLADE
Overall Status
State
The overall status of the chassis IO card. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.2
Peer
Communication
Status
State
The peer communication status of the chassis IO card. Possible values are: 0
(unknown), 1 (connected), 2 (disconnected).
v1.2
Performance
State
v1.2
Power
State
The power status of the chassis IO card. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.2
Presence
State
The presence status of the chassis IO card. Possible values are: 0 (unknown), 1
(empty), 10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary),
20 (equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable),
30 (inaccessible), 40 (unauthorized), 100 (not-supported).
v1.2
Thermal
State
The thermal status of the chassis IO card. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.2
Voltage
State
The voltage status of the chassis IO card. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.2
Administrative
State
State
The administrative status of the chassis ethernet server interface IO. Possible
values are: 0 (enabled), 1 (disabled).
v1.2
Overall Status
State
The overall status of the chassis ethernet server interface IO. Possible values
are: 0 (indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5
(no-license), 6 (link-up), 7 (hardware-failure), 8 (software-failure), 9
(error-disabled), 10 (sfp-not-present).
v1.2
Acknowledged
State
The acknowledged status of the chassis ethernet switch interface IO. Possible
values are: 0 (un-initialized), 1 (un-acknowledged), 2 (unsupported-connectivity),
3 (ok), 4 (removing), 6 (ack-in-progress).
v1.2
Administrative
State
State
The administrative status of the chassis ethernet switch interface IO. Possible
values are: 0 (enabled), 1 (disabled).
v1.2
Discovery
State
The discovery status of the chassis ethernet switch interface IO. Possible values
are: 0 (absent), 1 (present), 2 (mis-connect).
v1.2
Overall Status
State
The overall status of the chassis ethernet switch interface IO. Possible values
are: 0 (indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5
(no-license), 6 (link-up), 7 (hardware-failure), 8 (software-failure), 9
(error-disabled), 10 (sfp-not-present).
v1.2
Administrative
State
State
v1.0
Association
State
v1.0
Availability
State
v1.0
Consumed Power
Watts
v1.0
Front
Temperature
Celsius
v1.0
Input Current
Amps
v1.0
Input Voltage
Volts
v1.0
Operability
State
v1.0
CHA_ADAPTORUNIT
CHA_ADAPTOREXTETHIF
CHA_ADAPTORHOSTFCIF
Overall Status
State
v1.0
Power State
State
The power status of the blade. Possible values are: 0 (unknown), 1 (on), 2 (test),
3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8 (power-save), 9 (error),
100 (not-supported).
v1.0
Presence
State
The presence status of the blade. Possible values are: 0 (unknown), 1 (empty),
10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary), 20
(equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable), 30
(inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
Rear
Temperature
Celsius
v1.0
Operability
State
The operability status of the chassis adaptor unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the chassis adaptor unit. Possible values are: 0 (unknown),
1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the chassis adaptor unit. Possible values are: 0 (unknown),
1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Power
State
The power status of the chassis adaptor unit. Possible values are: 0 (unknown),
1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the chassis adaptor unit. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the chassis adaptor unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
Voltage
State
The voltage status of the chassis adaptor unit. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Administrative
State
State
v1.0
Overall Status
State
The overall status of the network facing adaptor interface. Possible values are: 0
(indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5 (no-license),
6 (link-up), 7 (hardware-failure), 8 (software-failure), 9 (error-disabled), 10
(sfp-not-present).
v1.0
Administrative
State
State
The administrative status of the adaptor host fibre channel interface. Possible
values are: 0 (enabled), 44 (reset-connectivity-active), 45
(reset-connectivity-passive), 46 (reset-connectivity).
Operability
State
The operability status of the adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4
(powered-off), 5 (power-problem), 6 (removed), 7 (voltage-problem), 8
(thermal-problem), 9 (performance-problem), 10 (accessibility-problem), 11
(identity-unestablishable), 12 (bios-post-timeout), 13 (disabled), 51
(fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config), 82
(equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
CHA_ADAPTORHOSTETHIF
CHA_MEMORYARRAY
Overall Status
State
The overall status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the adaptor host fibre channel interface. Possible values
are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Administrative
State
State
The administrative status of the adaptor host ethernet interface. Possible values
are: 0 (enabled), 44 (reset-connectivity-active), 45 (reset-connectivity-passive),
46 (reset-connectivity).
v1.0
Operability
State
The operability status of the adaptor host ethernet interface. Possible values are:
0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the adaptor host ethernet interface. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the adaptor host ethernet interface. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the adaptor host ethernet interface. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the adaptor host ethernet interface. Possible values are:
0 (unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the adaptor host ethernet interface. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the adaptor host ethernet interface. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Current Capacity
Megabytes
v1.0
Max Capacity
Megabytes
v1.0
CHA_MEMORYUNITENVSTATS
CHA_PROCESSORENVSTATS
CHA_STORAGECONTROLLER
CHA_STORAGELOCALDISK
Operability
State
The operability status of the memory unit environment. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v2.3
Presence
State
The presence status of the memory unit environment. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Temperature
Celsius
v1.0
Visibility
State
v1.0
CPU
Temperature
Celsius
v1.0
Input Current
Amps
v1.0
Operability
State
v1.0
Visibility
State
v1.0
Operability
State
v1.0
Overall Status
State
The overall status of the storage controller. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
v1.0
Power
State
The power status of the storage controller. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the storage controller. Possible values are: 0 (unknown),
1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the storage controller. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the storage controller. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
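The performance, thermal, and voltage metrics in this table all share one threshold-style enumeration (0 unknown, 1 ok, 2-7 upper/lower threshold crossings, 100 not-supported). As an illustrative sketch (the severity names and function are assumptions for this example, not product output), those codes can be folded into alarm severities:

```python
# Illustrative mapping of the shared performance/thermal/voltage
# threshold enumeration to a generic alarm severity.
THRESHOLD_STATE = {
    0: "unknown", 1: "ok",
    2: "upper-non-recoverable", 3: "upper-critical", 4: "upper-non-critical",
    5: "lower-non-critical", 6: "lower-critical", 7: "lower-non-recoverable",
    100: "not-supported",
}

def severity(code: int) -> str:
    """Classify a threshold-state code; severity labels are illustrative."""
    label = THRESHOLD_STATE.get(code)
    if label in ("ok", "not-supported"):
        return "clear"
    if label in ("upper-non-critical", "lower-non-critical"):
        return "minor"
    if label in ("upper-critical", "lower-critical"):
        return "major"
    if label in ("upper-non-recoverable", "lower-non-recoverable"):
        return "critical"
    return "unknown"
```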
Operability
State
The operability status of the storage local disk. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
CHA_STORAGELOCALLUN
CHA_STORAGERAIDBATTERY
SWENVSTATS
EQUIPMENTFAN
Presence
State
The presence status of the storage local disk. Possible values are: 0 (unknown),
1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Operability
State
v1.0
Presence
State
Operability
State
The operability status of the storage RAID battery. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Presence
State
The presence status of the storage RAID battery. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Available Memory
Megabytes
v1.0
Cached Memory
Megabytes
v1.0
Load
Percent
v1.0
Operability
State
The operability status of the fabric interconnect. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable).
v1.0
fanCtrlrInlet1
Celsius
v1.0
fanCtrlrInlet2
Celsius
v1.0
fanCtrlrInlet3
Celsius
v1.0
fanCtrlrInlet4
Celsius
v1.0
mainBoardOutlet1
Celsius
v1.0
mainBoardOutlet2
Celsius
v1.0
psuCtrlrInlet1
Celsius
The temperature of the fabric interconnect power supply unit controller inlet 1
v1.0
psuCtrlrInlet2
Celsius
The temperature of the fabric interconnect power supply unit controller inlet 2
v1.0
Operability
State
The operability status of the fan. Possible values are: 0 (unknown), 1 (operable),
2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6 (removed), 7
(voltage-problem), 8 (thermal-problem), 9 (performance-problem), 10
(accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout), 13
(disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config),
82 (equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Overall Status
State
The overall status of the fan. Possible values are: 0 (unknown), 1 (operable), 2
(inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6 (removed), 7
(voltage-problem), 8 (thermal-problem), 9 (performance-problem), 10
(accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout), 13
(disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config),
82 (equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Performance
State
The performance status of the fan. Possible values are: 0 (unknown), 1 (ok), 2
(upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Power
State
The power status of the fan. Possible values are: 0 (unknown), 1 (on), 2 (test), 3
(off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8 (power-save), 9 (error),
100 (not-supported).
Presence
State
The presence status of the fan. Possible values are: 0 (unknown), 1 (empty), 10
(equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary), 20
(equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable), 30
(inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
EQUIPMENTPSU
EQUIPMENTSWITCHCARD
ETHERPIO
Thermal
State
The thermal status of the fan. Possible values are: 0 (unknown), 1 (ok), 2
(upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
v1.0
Input Current
Amps
The input current of the equipment power supply unit measured in amps.
Input Power
Watts
The input power of the equipment power supply unit measured in watts.
v1.0
Input Voltage
Volts
v1.0
Operability
State
The operability status of the equipment power supply. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the equipment power supply. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the equipment power supply. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
Power
State
The power status of the equipment power supply. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the equipment power supply. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the equipment power supply. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the equipment power supply. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Operability
State
The operability status of the equipment switch card. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
State
State
The status of the equipment switch card. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Administrative
State
State
The administrative status of the ether port IO. Possible values are: 0 (enabled),
1 (disabled).
v1.0
Align
errors
v1.0
Broadcast
Packets Rx
packets
v1.0
Broadcast
Packets Tx
packets
v1.0
Byte Rate Rx
mbps
v1.0
Byte Rate Tx
mbps
v1.0
Carrier Sense
errors
v1.0
Deferred Tx
errors
v1.0
Excess Collision
errors
v1.0
FCPIO
Fcs
errors
v1.0
Giants
errors
v1.0
Int Mac Rx
errors
v1.0
Int Mac Tx
errors
v1.0
Jumbo Packets
Rx
packets
v1.0
Jumbo Packets
Tx
packets
v1.0
Late Collision
errors
v1.0
Multi Collision
errors
v1.0
Multicast Packets
Rx
packets
v1.0
Multicast Packets
Tx
packets
v1.0
Out Discard
errors
v1.0
Overall Status
State
The overall status of the ether port IO. Possible values are: 0 (indeterminate), 1
(up), 2 (admin-down), 3 (link-down), 4 (failed), 5 (no-license), 6 (link-up), 7
(hardware-failure), 8 (software-failure), 9 (error-disabled), 10 (sfp-not-present).
v1.0
Rcv
errors
v1.0
Recv Pause
pause
v1.0
Resets
resets
v1.0
SQETest
errors
v1.0
Single Collision
errors
v1.0
Symbol
errors
v1.0
Total Bytes Rx
Bytes
v1.0
Total Bytes Tx
Bytes
v1.0
Total Packets Rx
packets
v1.0
Total Packets Tx
packets
v1.0
Under Size
errors
v1.0
Unicast Packets
Rx
packets
v1.0
Unicast Packets
Tx
packets
v1.0
Xmit
errors
v1.0
Xmit Pause
pause
v1.0
ifRole
State
The ifRole status of the ether port IO. Possible values are: 0 (unknown), 1
(network), 2 (server), 3 (mgmt), 4 (diag), 5 (storage), 6 (monitor), 7
(fcoe-storage), 8 (nas-storage), 9 (fcoe-nas-storage), 10 (fcoe-uplink), 11
(network-fcoe-uplink).
v1.0
operSpeed
State
The operation speed status of the ether port IO. Possible values are: 0
(indeterminate), 1 (1gbps), 2 (10gbps), 3 (20gbps), 4 (40gbps).
v1.0
utilization
Percent
v1.0
Administrative
State
State
The administrative status of the fibre channel port IO. Possible values are: 0
(enabled), 1 (disabled).
v1.0
Byte Rate Rx
mbps
v1.0
Byte Rate Tx
mbps
v1.0
Bytes Rx
Bytes
v1.0
Bytes Tx
Bytes
v1.0
Crc Rx
errors
v1.0
Discard Rx
errors
v1.0
Discard Tx
errors
v1.0
Link Failures
errors
v1.0
Overall Status
State
The overall status of the fibre channel port IO. Possible values are: 0
(indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5 (no-license),
6 (link-up), 7 (hardware-failure), 8 (software-failure), 9 (error-disabled), 10
(sfp-not-present).
v1.0
Packets Rx
Packets
v1.0
Packets Tx
Packets
v1.0
Rx
errors
v1.0
Signal Losses
errors
v1.0
Sync Losses
errors
v1.0
Too Long Rx
errors
v1.0
Too Short Rx
errors
v1.0
Tx
errors
v1.0
ifRole
State
The ifRole status of the fibre channel port IO. Possible values are: 0 (unknown),
1 (network), 2 (server), 3 (mgmt), 4 (diag), 5 (storage), 6 (monitor), 7
(fcoe-storage), 8 (nas-storage), 9 (fcoe-nas-storage), 10 (fcoe-uplink), 11
(network-fcoe-uplink).
v1.0
operSpeed
State
The operation speed of the fibre channel port IO. Possible values are: 0
(indeterminate), 1 (1gbps), 2 (2gbps), 3 (4gbps), 4 (8gbps), 5 (auto).
v1.0
utilization
Percent
v1.0
Size
Integer
v1.0
Used
Percent
v1.0
VMHV
Status
State
The status of the VMHV. Possible values are: 0 (unknown), 1 (online), 2 (offline).
v1.0
VMINSTANCE
Status
State
v1.0
LSSERVER
Assign State
State
v1.0
Assoc State
State
v1.0
Config State
State
v1.0
Overall Status
State
v1.0
% Used
Percent
v1.2
Assigned
Count
v1.2
Size
Count
v1.2
% Used
Percent
v1.2
Assigned
Count
v1.2
Size
Count
v1.2
% Used
Percent
v1.2
Assigned
Count
v1.2
Size
Count
v1.2
% Used
Percent
v1.2
Assigned
Count
v1.2
Size
Count
v1.2
% Used
Percent
v1.2
Assigned
Count
v1.0
Size
Count
v1.0
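Each pool (MAC, UUID, WWPN, WWNN, compute) reports the same three metrics: an Assigned count, a Size count, and a derived % Used. As a sketch of how they relate, assuming % Used is simply Assigned over Size (the zero-size guard for empty pools is an assumption):

```python
# Sketch of the relationship between a pool's "Assigned" and "Size"
# counts and its "% Used" metric. The division-by-zero guard for
# empty pools is an assumption.
def pool_used_percent(assigned: int, size: int) -> float:
    """Percentage of pool entries currently assigned."""
    if size <= 0:
        return 0.0
    return 100.0 * assigned / size
```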
Admin State
State
The admin status of the fabric extender. Possible values are: 1 (acknowledged),
2 (re-acknowledge), 3 (decommission), 4 (remove).
v1.0
Config State
State
v1.0
Operability
State
The operability status of the fabric extender. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
STORAGEITEM
MACPOOLPOOL
UUIDPOOLPOOL
WWPNPOOL
WWNNPOOL
COMPUTEPOOL
FEX_EQUIPMENTFEX
FEX_EQUIPMENTFAN
FEX_EQUIPMENTPSU
Overall Status
State
The overall status of the fabric extender. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Power
State
The power status of the fabric extender. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the fabric extender. Possible values are: 0 (unknown), 1
(empty), 10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary),
20 (equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable),
30 (inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
Thermal
State
The thermal status of the fabric extender. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the fabric extender. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
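The thermal and voltage metrics above (and throughout this section) share one sensor-state enumeration. For alarming, those codes are often bucketed by severity; the grouping below and the bucket names ("ok", "warning", "critical") are assumptions for illustration, not product behavior.

```python
# Hypothetical severity bucketing for the shared thermal/voltage
# sensor codes. The bucket names are assumptions chosen to group the
# non-critical, critical, and non-recoverable values.
def sensor_severity(code: int) -> str:
    if code == 1:
        return "ok"
    if code in (4, 5):        # upper/lower-non-critical
        return "warning"
    if code in (2, 3, 6, 7):  # critical and non-recoverable
        return "critical"
    return "unknown"          # 0 (unknown) and 100 (not-supported)
```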
Operability
State
The operability status of the fabric extender fan. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the fabric extender fan. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the fabric extender fan. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the fabric extender fan. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the fabric extender fan. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the fabric extender fan. Possible values are: 0 (unknown),
1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the fabric extender fan. Possible values are: 0 (unknown),
1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Input Current
Amps
The input current of the fabric extender power supply unit, measured in amps.
v1.0
Input Power
Watts
The input power of the fabric extender power supply unit, measured in watts.
v1.0
Input Voltage
Volts
v1.0
Operability
State
The operability status of the fabric extender power supply unit. Possible values
are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
FEX_EQUIPMENTIOCARD
Overall Status
State
The overall status of the fabric extender power supply unit. Possible values are:
0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the fabric extender power supply unit. Possible
values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the fabric extender power supply unit. Possible values are:
0 (unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the fabric extender power supply unit. Possible values
are: 0 (unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the fabric extender power supply unit. Possible values are:
0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the fabric extender power supply unit. Possible values are:
0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Admin Power
State
State
The admin power status of the fabric extender IO card. Possible values are: 1
(policy), 2 (cycle-immediate), 3 (cycle-wait).
v1.4
Config State
State
The configuration status of the fabric extender IO card. Possible values are: 0
(un-initialized), 1 (un-acknowledged), 2 (unsupported-connectivity), 3 (ok), 4
(removing), 6 (ack-in-progress).
v1.4
Discovery
State
The discovery status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (online), 2 (offline), 3 (discovered), 4 (unsupported-connectivity), 5
(auto-upgrading).
v1.4
Operability
State
The operability status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.4
Overall Status
State
The overall status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.4
Peer
Communication
Status
State
The peer communication status of the fabric extender IO card. Possible values
are: 0 (unknown), 1 (connected), 2 (disconnected).
v1.4
Performance
State
The performance status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.4
Power
State
The power status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.4
Presence
State
The presence status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.4
Thermal
State
The thermal status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.4
FEX_ETHERRXSTATS
FEX_ETHERSWITCHINTFIO
Voltage
State
The voltage status of the fabric extender IO card. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.4
Administrative
State
State
The administrative state of the fabric extender ethernet. Possible values are: 0
(enabled), 1 (disabled).
v1.0
Align
errors
v1.0
Broadcast
Packets Rx
packets
v1.0
Broadcast
Packets Tx
packets
v1.0
Carrier Sense
errors
v1.0
Deferred Tx
errors
v1.0
Excess Collision
errors
v1.0
Fcs
errors
v1.0
Giants
errors
v1.0
Int Mac Rx
errors
v1.0
Int Mac Tx
errors
v1.0
Jumbo Packets
Rx
packets
v1.0
Jumbo Packets
Tx
packets
v1.0
Late Collision
errors
v1.0
Multi Collision
errors
v1.0
Multicast Packets
Rx
packets
v1.0
Multicast Packets
Tx
packets
v1.0
Out Discard
errors
v1.0
Overall Status
State
The overall status of the fabric extender ethernet. Possible values are: 0
(indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5 (no-license),
6 (link-up), 7 (hardware-failure), 8 (software-failure), 9 (error-disabled), 10
(sfp-not-present).
v1.0
Rcv
errors
v1.0
Recv Pause
pause
v1.0
Resets
resets
v1.0
SQE test
errors
v1.0
Single Collision
errors
v1.0
Symbol
errors
v1.0
Total Bytes Rx
Bytes
v1.0
Total Bytes Tx
Bytes
v1.0
Total Packets Rx
packets
v1.0
Total Packets Tx
packets
v1.0
Under Size
errors
v1.0
Unicast Packets
Rx
packets
v1.0
Unicast Packets
Tx
packets
v1.0
Xmit
errors
v1.0
Xmit Pause
pause
v1.0
Acknowledged
State
The acknowledged status of the fabric extender ethernet switch interface IO.
Possible values are: 0 (un-initialized), 1 (un-acknowledged), 2
(unsupported-connectivity), 3 (ok), 4 (removing), 6 (ack-in-progress).
v1.0
Administrative
State
State
The administrative status of the fabric extender ethernet switch interface IO.
Possible values are: 0 (enabled), 1 (disabled).
v1.0
Discovery
State
The discovery status of the fabric extender ethernet switch interface IO. Possible
values are: 0 (absent), 1 (present), 2 (mis-connect).
v1.0
Overall Status
State
The overall status of the fabric extender ethernet switch interface IO. Possible
values are: 0 (indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5
(no-license), 6 (link-up), 7 (hardware-failure), 8 (software-failure), 9
(error-disabled), 10 (sfp-not-present).
v1.0
RACK_COMPUTERACKUNIT
RACK_EQUIPMENTFANMODULE
Administrative
State
State
The administrative status of the computer rack unit. Possible values are: 1
(in-service), 2 (out-of-service).
v1.0
Association
State
The association status of the computer rack unit. Possible values are: 0 (none),
1 (establishing), 2 (associated), 3 (removing), 4 (failed), 5 (throttled).
v1.0
Availability
State
The availability status of the computer rack unit. Possible values are: 0
(unavailable), 1 (available).
v1.0
Consumed Power
Watts
v1.0
Front
Temperature
Celsius
v1.0
IO Hub 1
Temperature
Celsius
v1.0
IO Hub 2
Temperature
Celsius
v1.0
Input Current
Amps
v1.0
Input Voltage
Volts
v1.0
Internal
Temperature
Celsius
v1.0
Operability
State
The operability status of the computer rack unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the computer rack unit. Possible values are: 0
(indeterminate), 1 (unassociated), 10 (ok), 11 (discovery), 12 (config), 13
(unconfig), 14 (power-off), 15 (restart), 20 (maintenance), 21 (test), 29
(compute-mismatch), 30 (compute-failed), 31 (degraded), 32 (discovery-failed),
33 (config-failure), 34 (unconfig-failed), 35 (test-failed), 36 (maintenance-failed),
40 (removed), 41 (disabled), 50 (inaccessible), 60 (thermal-problem), 61
(power-problem), 62 (voltage-problem), 63 (inoperable), 101 (decomissioning),
201 (bios-restore), 202 (cmos-reset), 203 (diagnostics), 204 (diagnostics-failed),
210 (pending-reboot), 211 (pending-reassociation).
v1.0
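The computer rack unit overall-status enumeration mixes lifecycle states (discovery, config, maintenance) with failure states. A monitoring script might classify them into health buckets; the grouping below is an assumption for illustration, not part of the product.

```python
# Sketch: classify the rack unit overall-status codes from the table
# above. The "healthy"/"transitional"/"failed" grouping is an
# assumption made for alarming purposes.
FAILED_CODES = {29, 30, 32, 33, 34, 35, 36, 50, 60, 61, 62, 63, 204}
HEALTHY_CODES = {10}  # ok

def rack_unit_health(code: int) -> str:
    if code in HEALTHY_CODES:
        return "healthy"
    if code in FAILED_CODES:
        return "failed"
    return "transitional"
```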
Power State
State
The power status of the computer rack unit. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the computer rack unit. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Rear
Temperature
Celsius
v1.0
Operability
State
The operability status of the rack fan module. Possible values are: 0 (unknown),
1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack fan module. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the rack fan module. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the rack fan module. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack fan module. Possible values are: 0 (unknown), 1
(empty), 10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary),
20 (equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable),
30 (inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
RACK_EQUIPMENTFAN
RACK_EQUIPMENTPSU
Thermal
State
The thermal status of the rack fan module. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the rack fan module. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Operability
State
The operability status of the rack fan. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack fan. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the rack fan. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Power
State
The power status of the rack fan. Possible values are: 0 (unknown), 1 (on), 2
(test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8 (power-save), 9
(error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack fan. Possible values are: 0 (unknown), 1
(empty), 10 (equipped), 11 (missing), 12 (mismatch), 13 (equipped-not-primary),
20 (equipped-identity-unestablishable), 21 (mismatch-identity-unestablishable),
30 (inaccessible), 40 (unauthorized), 100 (not-supported).
v1.0
Speed
RPM
v1.0
Thermal
State
The thermal status of the rack fan. Possible values are: 0 (unknown), 1 (ok), 2
(upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the rack fan. Possible values are: 0 (unknown), 1 (ok), 2
(upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Input Power
Watts
v1.0
Input Voltage
Volts
v1.0
Internal
Temperature
Celsius
v1.0
Operability
State
The operability status of the rack power supply unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Output Current
Amps
v1.0
Output Power
Watts
v1.0
Output Voltage
Volts
v1.0
Overall Status
State
The overall status of the rack power supply unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the rack power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
RACK_MEMORYARRAY
RACK_MEMORYUNITENVSTATS
RACK_PROCESSORENVSTATS
RACK_STORAGECONTROLLER
Power
State
The power status of the rack power supply unit. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack power supply unit. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the rack power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the rack power supply unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Current Capacity
Megabytes
v1.0
Max Capacity
Megabytes
v1.0
Operability
State
The operability status of the rack memory array. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v2.3
Presence
State
The presence status of the rack memory array. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Temperature
Celsius
v1.0
Visibility
State
v2.3
CPU
Temperature
Celsius
v1.0
Input Current
Amps
v1.0
Operability
State
The operability status of the rack processor environment. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v2.3
Visibility
State
v2.3
Operability
State
The operability status of the rack storage controller. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack storage controller. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the rack storage controller. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the rack storage controller. Possible values are: 0
(unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack storage controller. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
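The presence codes are likewise sparse (0, 1, 10–13, 20–21, 30, 40, 100), and only a few of them indicate a fault rather than an informational state. A hypothetical filter, with the alert set chosen for illustration only:

```python
# Presence codes transcribed from the table above.
PRESENCE = {
    0: "unknown", 1: "empty", 10: "equipped", 11: "missing",
    12: "mismatch", 13: "equipped-not-primary",
    20: "equipped-identity-unestablishable",
    21: "mismatch-identity-unestablishable",
    30: "inaccessible", 40: "unauthorized", 100: "not-supported",
}

def needs_attention(code: int) -> bool:
    """True for presence codes that usually warrant an alert (example policy)."""
    return PRESENCE.get(code, "undefined") in {
        "missing", "mismatch", "inaccessible", "unauthorized",
    }
```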
Thermal
State
The thermal status of the rack storage controller. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the rack storage controller. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Operability
State
The operability status of the rack storage local disk. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Presence
State
The presence status of the rack storage local disk. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Operability
State
The operability status of the rack storage local unit. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Presence
State
The presence status of the rack storage local unit. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Operability
State
The operability status of the rack storage RAID battery. Possible values are: 0
(unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5
(power-problem), 6 (removed), 7 (voltage-problem), 8 (thermal-problem), 9
(performance-problem), 10 (accessibility-problem), 11 (identity-unestablishable),
12 (bios-post-timeout), 13 (disabled), 51 (fabric-conn-problem), 52
(fabric-unsupported-conn), 81 (config), 82 (equipment-problem), 83
(decomissioning), 84 (chassis-limit-exceeded), 100 (not-supported), 101
(discovery), 102 (discovery-failed), 103 (identify), 104 (post-failure), 105
(upgrade-problem), 106 (peer-comm-problem), 107 (auto-upgrade).
v1.0
Presence
State
The presence status of the rack storage RAID battery. Possible values are: 0
(unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Event
Integer
v1.0
UCS API
Available
Boolean
v1.0
UCS API
Response Time
v1.0
ETHERNET_UNCONFIGURED_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_NETWORK_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_SERVER_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_MGMT_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_DIAG_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_STORAGE_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_MONITOR_PORTS
Aggregated
Bandwidth
mbps
v1.0
RACK_STORAGELOCALDISK
RACK_STORAGELOCALLUN
RACK_STORAGERAIDBATTERY
RESOURCE
ETHERNET_FCOE_STORAGE_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the ethernet fibre channel over ethernet storage
ports in mbps
v1.0
ETHERNET_NAS_STORAGE_PORTS
Aggregated
Bandwidth
mbps
v1.0
ETHERNET_FCOE_NAS_STORAGE_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the ethernet fibre channel over ethernet NAS ports
in mbps
v1.0
ETHERNET_FCOE_UPLINK_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the ethernet fibre channel over ethernet uplink
ports in mbps
v1.0
ETHERNET_NETWORK_FCOE_UPLINK_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the ethernet network fibre channel over ethernet
uplink ports in mbps
v1.0
FC_UNCONFIGURED_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_NETWORK_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_SERVER_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_MGMT_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_DIAG_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_STORAGE_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_MONITOR_PORTS
Aggregated
Bandwidth
mbps
v1.0
FC_FCOE_STORAGE_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the fibre channel over ethernet storage ports in
mbps
v1.0
FC_NAS_STORAGE_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the fibre channel NAS storage ports in mbps
v1.0
FC_FCOE_NAS_STORAGE_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the fibre channel over ethernet NAS storage ports
in mbps
v1.0
FC_FCOE_UPLINK_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the fibre channel over ethernet uplink ports in
mbps
v1.0
FC_NETWORK_FCOE_UPLINK_PORTS
Aggregated
Bandwidth
mbps
The aggregated bandwidth of the network fibre channel over ethernet uplink
ports in mbps
v1.0
CHA_ADAPTOREXTETHIF_ERROR_RX
Bad CRC
packets
The bad cyclic redundancy check (CRC) Rx packets of the adaptor external
ethernet interface
v1.0
Bad Length
packets
v1.0
MAC Discarded
packets
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Tx packets of the adaptor external
ethernet interface
v1.0
Bad Length
packets
v1.0
MAC Discarded
packets
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Greater Than Or
Equal To 9216
packets
The Rx packets of the adaptor external ethernet interface greater than or equal
to 9216
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 2048
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 4096
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 8192
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The Rx packets of the adaptor external ethernet interface less than or equal to
1518
v1.0
No Breakdown
Greater Than
1518
packets
v1.0
Greater Than Or
Equal To 9216
packets
The Tx packets of the adaptor external ethernet interface greater than or equal
to 9216
v1.0
CHA_ADAPTOREXTETHIF_ERROR_TX
CHA_ADAPTOREXTETHIF_COMM_RX
CHA_ADAPTOREXTETHIF_COMM_TX
CHA_ADAPTOREXTETHIF_LARGE_RX
CHA_ADAPTOREXTETHIF_LARGE_TX
CHA_ADAPTOREXTETHIF_SMALL_RX
CHA_ADAPTOREXTETHIF_SMALL_TX
CHA_ADAPTOREXTETHIF_OUTSIZED_RX
CHA_ADAPTOREXTETHIF_OUTSIZED_TX
CHA_ADAPTOREXTETHIF_PACKETS_RX
CHA_ADAPTOREXTETHIF_PACKETS_TX
CHA_NIC_ADAPTORHOSTETHIF_ERROR_RX
packets
The Tx packets of the adaptor external ethernet interface less than 2048
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 4096
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 8192
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The Tx packets of the adaptor external ethernet interface less than or equal to
1518
v1.0
No Breakdown
Greater Than
1518
packets
v1.0
Equal To 64
packets
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 1024
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 128
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 256
v1.0
packets
The Rx packets of the adaptor external ethernet interface less than 512
v1.0
Less Than 64
packets
v1.0
Equal To 64
packets
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 1024
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 128
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 256
v1.0
packets
The Tx packets of the adaptor external ethernet interface less than 512
v1.0
Less Than 64
packets
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Rx packets of the adaptor
external ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Rx packets of the adaptor
external ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Rx packets of the adaptor
external ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Rx packets of the adaptor
external ethernet interface
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Tx packets of the adaptor
external ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Tx packets of the adaptor
external ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Tx packets of the adaptor
external ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Tx packets of the adaptor
external ethernet interface
v1.0
Good
packets
v1.0
PPP
packets
v1.0
Pause
packets
v1.0
Per Priority
packets
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Good
packets
v1.0
PPP
packets
v1.0
Pause
packets
v1.0
Per Priority
packets
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Rx packets of the NIC adaptor host
ethernet interface
v1.0
Bad Length
packets
The bad length Rx packets of the NIC adaptor host ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Rx packets of the NIC adaptor host ethernet interface
v1.0
CHA_NIC_ADAPTORHOSTETHIF_ERROR_TX
CHA_NIC_ADAPTORHOSTETHIF_COMM_RX
CHA_NIC_ADAPTORHOSTETHIF_COMM_TX
CHA_NIC_ADAPTORHOSTETHIF_LARGE_RX
CHA_NIC_ADAPTORHOSTETHIF_LARGE_TX
CHA_NIC_ADAPTORHOSTETHIF_SMALL_RX
CHA_NIC_ADAPTORHOSTETHIF_SMALL_TX
CHA_NIC_ADAPTORHOSTETHIF_OUTSIZED_RX
CHA_NIC_ADAPTORHOSTETHIF_OUTSIZED_TX
Bad CRC
packets
The bad cyclic redundancy check (CRC) Tx packets of the NIC adaptor host
ethernet interface
v1.0
Bad Length
packets
The bad length Tx packets of the NIC adaptor host ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Tx packets of the NIC adaptor host ethernet interface
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Greater Than Or
Equal To 9216
packets
The NIC adaptor host ethernet interface Rx packets greater than or equal to
9216
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 2048
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 4096
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 8192
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The NIC adaptor host ethernet interface Rx packets less than or equal to 1518
v1.0
No Breakdown
Greater Than
1518
packets
The NIC adaptor host ethernet interface Rx packets with no breakdown greater
than 1518
v1.0
Greater Than Or
Equal To 9216
packets
The NIC adaptor host ethernet interface Tx packets greater than or equal to
9216
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 2048
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 4096
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 8192
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The NIC adaptor host ethernet interface Tx packets less than or equal to 1518
v1.0
No Breakdown
Greater Than
1518
packets
The NIC adaptor host ethernet interface Tx packets with no breakdown greater
than 1518
v1.0
Equal To 64
packets
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 1024
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 128
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 256
v1.0
packets
The NIC adaptor host ethernet interface Rx packets less than 512
v1.0
Less Than 64
packets
v1.0
Equal To 64
packets
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 1024
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 128
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 256
v1.0
packets
The NIC adaptor host ethernet interface Tx packets less than 512
v1.0
Less Than 64
packets
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Rx packets of the NIC
adaptor host ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Rx packets of the NIC
adaptor host ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Rx packets of the NIC
adaptor host ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Rx packets of the NIC
adaptor host ethernet interface
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Tx packets of the NIC
adaptor host ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Tx packets of the NIC
adaptor host ethernet interface
v1.0
CHA_NIC_ADAPTORHOSTETHIF_PACKETS_RX
CHA_NIC_ADAPTORHOSTETHIF_PACKETS_TX
CHA_NIC_ADAPTORHOSTETHIF_VNIC
RACK_ADAPTORUNIT
RACK_ADAPTOREXTETHIF
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Tx packets of the NIC
adaptor host ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Tx packets of the NIC
adaptor host ethernet interface
v1.0
Good
packets
v1.0
PPP
packets
The point-to-point protocol (PPP) Rx packets of the NIC adaptor host ethernet
interface
v1.0
Pause
packets
v1.0
Per Priority
packets
The per priority Rx packets of the NIC adaptor host ethernet interface
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Good
packets
v1.0
PPP
packets
The point-to-point protocol (PPP) Tx packets of the NIC adaptor host ethernet
interface
v1.0
Pause
packets
v1.0
Per Priority
packets
The per priority Tx packets of the NIC adaptor host ethernet interface
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Rx (bytes)
Bytes
v1.0
Rx (packets)
packets
v1.0
Rx Dropped
packets
The dropped Rx packets of the NIC adaptor host ethernet interface VNIC
v1.0
Rx Errors
packets
v1.0
Tx (bytes)
Bytes
v1.0
Tx (packets)
packets
v1.0
Tx Dropped
packets
v1.0
Tx Errors
packets
v1.0
Operability
State
The operability status of the rack adaptor unit. Possible values are: 0 (unknown),
1 (operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack adaptor unit. Possible values are: 0 (unknown), 1
(operable), 2 (inoperable), 3 (degraded), 4 (powered-off), 5 (power-problem), 6
(removed), 7 (voltage-problem), 8 (thermal-problem), 9 (performance-problem),
10 (accessibility-problem), 11 (identity-unestablishable), 12 (bios-post-timeout),
13 (disabled), 51 (fabric-conn-problem), 52 (fabric-unsupported-conn), 81
(config), 82 (equipment-problem), 83 (decomissioning), 84
(chassis-limit-exceeded), 100 (not-supported), 101 (discovery), 102
(discovery-failed), 103 (identify), 104 (post-failure), 105 (upgrade-problem), 106
(peer-comm-problem), 107 (auto-upgrade).
v1.0
Performance
State
The performance status of the rack adaptor unit. Possible values are: 0
(unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the rack adaptor unit. Possible values are: 0 (unknown), 1
(on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7 (degraded), 8
(power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack adaptor unit. Possible values are: 0 (unknown),
1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the rack adaptor unit. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Voltage
State
The voltage status of the rack adaptor unit. Possible values are: 0 (unknown), 1
(ok), 2 (upper-non-recoverable), 3 (upper-critical), 4 (upper-non-critical), 5
(lower-non-critical), 6 (lower-critical), 7 (lower-non-recoverable), 100
(not-supported).
v1.0
Administrative
State
State
v1.0
RACK_ADAPTORHOSTFCIF
RACK_ADAPTORHOSTETHIF
Overall Status
State
The overall status of the adaptor external ethernet interface. Possible values
are: 0 (indeterminate), 1 (up), 2 (admin-down), 3 (link-down), 4 (failed), 5
(no-license), 6 (link-up), 7 (hardware-failure), 8 (software-failure), 9
(error-disabled), 10 (sfp-not-present).
v1.0
Administrative
State
State
The administrative status of the rack adaptor host fibre channel interface.
Possible values are: 0 (enabled), 44 (reset-connectivity-active), 45
(reset-connectivity-passive), 46 (reset-connectivity).
v1.0
Operability
State
The operability status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4
(powered-off), 5 (power-problem), 6 (removed), 7 (voltage-problem), 8
(thermal-problem), 9 (performance-problem), 10 (accessibility-problem), 11
(identity-unestablishable), 12 (bios-post-timeout), 13 (disabled), 51
(fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config), 82
(equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4
(powered-off), 5 (power-problem), 6 (removed), 7 (voltage-problem), 8
(thermal-problem), 9 (performance-problem), 10 (accessibility-problem), 11
(identity-unestablishable), 12 (bios-post-timeout), 13 (disabled), 51
(fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config), 82
(equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Performance
State
The performance status of the rack adaptor host fibre channel interface.
Possible values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3
(upper-critical), 4 (upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty),
7 (degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
Presence
State
The presence status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch),
13 (equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the rack adaptor host fibre channel interface. Possible
values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Administrative
State
State
The administrative status of the rack adaptor host ethernet interface. Possible
values are: 0 (enabled), 44 (reset-connectivity-active), 45
(reset-connectivity-passive), 46 (reset-connectivity).
v1.0
Operability
State
The operability status of the rack adaptor host ethernet interface. Possible
values are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4
(powered-off), 5 (power-problem), 6 (removed), 7 (voltage-problem), 8
(thermal-problem), 9 (performance-problem), 10 (accessibility-problem), 11
(identity-unestablishable), 12 (bios-post-timeout), 13 (disabled), 51
(fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config), 82
(equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Overall Status
State
The overall status of the rack adaptor host ethernet interface. Possible
values are: 0 (unknown), 1 (operable), 2 (inoperable), 3 (degraded), 4
(powered-off), 5 (power-problem), 6 (removed), 7 (voltage-problem), 8
(thermal-problem), 9 (performance-problem), 10 (accessibility-problem), 11
(identity-unestablishable), 12 (bios-post-timeout), 13 (disabled), 51
(fabric-conn-problem), 52 (fabric-unsupported-conn), 81 (config), 82
(equipment-problem), 83 (decomissioning), 84 (chassis-limit-exceeded), 100
(not-supported), 101 (discovery), 102 (discovery-failed), 103 (identify), 104
(post-failure), 105 (upgrade-problem), 106 (peer-comm-problem), 107
(auto-upgrade).
v1.0
Performance
State
The performance status of the rack adaptor host ethernet interface. Possible
values are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Power
State
The power status of the rack adaptor host ethernet interface. Possible values
are: 0 (unknown), 1 (on), 2 (test), 3 (off), 4 (online), 5 (offline), 6 (offduty), 7
(degraded), 8 (power-save), 9 (error), 100 (not-supported).
v1.0
RACK_ADAPTOREXTETHIF_ERROR_RX
RACK_ADAPTOREXTETHIF_ERROR_TX
RACK_ADAPTOREXTETHIF_COMM_RX
RACK_ADAPTOREXTETHIF_COMM_TX
RACK_ADAPTOREXTETHIF_LARGE_RX
RACK_ADAPTOREXTETHIF_LARGE_TX
RACK_ADAPTOREXTETHIF_SMALL_RX
RACK_ADAPTOREXTETHIF_SMALL_TX
Presence
State
The presence status of the rack adaptor host ethernet interface. Possible values
are: 0 (unknown), 1 (empty), 10 (equipped), 11 (missing), 12 (mismatch), 13
(equipped-not-primary), 20 (equipped-identity-unestablishable), 21
(mismatch-identity-unestablishable), 30 (inaccessible), 40 (unauthorized), 100
(not-supported).
v1.0
Thermal
State
The thermal status of the rack adaptor host ethernet interface. Possible values
are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Voltage
State
The voltage status of the rack adaptor host ethernet interface. Possible values
are: 0 (unknown), 1 (ok), 2 (upper-non-recoverable), 3 (upper-critical), 4
(upper-non-critical), 5 (lower-non-critical), 6 (lower-critical), 7
(lower-non-recoverable), 100 (not-supported).
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Rx packets of the rack adaptor external
ethernet interface
v1.0
Bad Length
packets
The bad length Rx packets of the rack adaptor external ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Rx packets of the rack adaptor external ethernet interface
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Tx packets of the rack adaptor external
ethernet interface
v1.0
Bad Length
packets
The bad length Tx packets of the rack adaptor external ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Tx packets of the rack adaptor external ethernet interface
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Broadcast
packets
v1.0
Multicast
packets
v1.0
Unicast
packets
v1.0
Greater Than Or
Equal To 9216
packets
The rack adaptor external ethernet interface Rx packets greater than or equal to
9216
v1.0
packets
The rack adaptor external ethernet interface Rx packets less than 2048
v1.0
packets
The rack adaptor external ethernet interface Rx packets less than 4096
v1.0
packets
The rack adaptor external ethernet interface Rx packets less than 8192
v1.0
packets
The rack adaptor external ethernet interface Rx packets less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The rack adaptor external ethernet interface Rx packets less than or equal to
1518
v1.0
No Breakdown
Greater Than
1518
packets
v1.0
Greater Than Or
Equal To 9216
packets
The rack adaptor external ethernet interface Tx packets greater than or equal to
9216
v1.0
packets
The rack adaptor external ethernet interface Tx packets less than 2048
v1.0
packets
The rack adaptor external ethernet interface Tx packets less than 4096
v1.0
packets
The rack adaptor external ethernet interface Tx packets less than 8192
v1.0
packets
The rack adaptor external ethernet interface Tx packets less than 9216
v1.0
Less Than Or
Equal To 1518
packets
The rack adaptor external ethernet interface Tx packets less than or equal to
1518
v1.0
No Breakdown
Greater Than
1518
packets
v1.0
Equal To 64
packets
The rack adaptor external ethernet interface Rx packets that are equal to 64
v1.0
packets
The rack adaptor external ethernet interface Rx packets that are less than 1024
v1.0
packets
The rack adaptor external ethernet interface Rx packets that are less than 128
v1.0
packets
The rack adaptor external ethernet interface Rx packets that are less than 256
v1.0
packets
The rack adaptor external ethernet interface Rx packets that are less than 512
v1.0
Less Than 64
packets
The rack adaptor external ethernet interface Rx packets that are less than 64
v1.0
Equal To 64
packets
The rack adaptor external ethernet interface Tx packets that are equal to 64
v1.0
packets
The rack adaptor external ethernet interface Tx packets that are less than 1024
v1.0
packets
The rack adaptor external ethernet interface Tx packets that are less than 128
v1.0
packets
The rack adaptor external ethernet interface Tx packets that are less than 256
v1.0
packets
The rack adaptor external ethernet interface Tx packets that are less than 512
v1.0
RACK_ADAPTOREXTETHIF_OUTSIZED_RX
RACK_ADAPTOREXTETHIF_OUTSIZED_TX
RACK_ADAPTOREXTETHIF_PACKETS_RX
RACK_ADAPTOREXTETHIF_PACKETS_TX
RACK_NIC_ADAPTORHOSTETHIF_ERROR_RX
RACK_NIC_ADAPTORHOSTETHIF_ERROR_TX
RACK_NIC_ADAPTORHOSTETHIF_COMM_RX
RACK_NIC_ADAPTORHOSTETHIF_COMM_TX
RACK_NIC_ADAPTORHOSTETHIF_LARGE_RX
Less Than 64
packets
The rack adaptor external ethernet interface Tx packets that are less than 64
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Rx packets of the rack
adaptor external ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Rx packets of the rack
adaptor external ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Rx packets of the rack
adaptor external ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Rx packets of the rack
adaptor external ethernet interface
v1.0
Oversized
packets
v1.0
Oversized Bad
CRC
packets
The oversized bad cyclic redundancy check (CRC) Tx packets of the rack
adaptor external ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized good cyclic redundancy check (CRC) Tx packets of the rack
adaptor external ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized bad cyclic redundancy check (CRC) Tx packets of the rack
adaptor external ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized good cyclic redundancy check (CRC) Tx packets of the rack
adaptor external ethernet interface
v1.0
Good
packets
v1.0
PPP
packets
The point-to-point protocol (PPP) Rx packets of the rack adaptor external
ethernet interface
v1.0
Pause
packets
v1.0
Per Priority
packets
The per priority Rx packets of the rack adaptor external ethernet interface
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Good
packets
v1.0
PPP
packets
The point-to-point protocol (PPP) Tx packets of the rack adaptor external
ethernet interface
v1.0
Pause
packets
v1.0
Per Priority
packets
The per priority Tx packets of the rack adaptor external ethernet interface
v1.0
Total
packets
v1.0
VLAN
packets
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Rx packets of the rack NIC adaptor
host ethernet interface
v1.0
Bad Length
packets
The bad length Rx packets of the rack NIC adaptor host ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Bad CRC
packets
The bad cyclic redundancy check (CRC) Tx packets of the rack NIC adaptor
host ethernet interface
v1.0
Bad Length
packets
The bad length Tx packets of the rack NIC adaptor host ethernet interface
v1.0
MAC Discarded
packets
The MAC discarded Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Broadcast
packets
The broadcast Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Multicast
packets
The multicast Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Unicast
packets
The unicast Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Broadcast
packets
The broadcast Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Multicast
packets
The multicast Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Unicast
packets
The unicast Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Greater Than Or
Equal To 9216
packets
The rack NIC adaptor host ethernet interface large Rx packets that are greater
than or equal to 9216
v1.0
packets
The rack NIC adaptor host ethernet interface large Rx packets that are less than
2048
v1.0
packets
The rack NIC adaptor host ethernet interface large Rx packets that are less than
4096
v1.0
packets
The rack NIC adaptor host ethernet interface large Rx packets that are less than
8192
v1.0
packets
The rack NIC adaptor host ethernet interface large Rx packets that are less than
9216
v1.0
RACK_NIC_ADAPTORhostTHIF_LARGE_TX
RACK_NIC_ADAPTORhostTHIF_SMALL_RX
RACK_NIC_ADAPTORhostTHIF_SMALL_TX
RACK_NIC_ADAPTORhostTHIF_OUTSIZED_RX
RACK_NIC_ADAPTORhostTHIF_OUTSIZED_TX
RACK_NIC_ADAPTORhostTHIF_PACKETS_RX
Less Than Or
Equal To 1518
packets
The rack NIC adaptor host ethernet interface large Rx packets that are less
than or equal to 1518
v1.0
No Breakdown
Greater Than
1518
packets
The rack NIC adaptor host ethernet interface large Rx packets that have no
breakdown greater than 1518
v1.0
Greater Than Or
Equal To 9216
packets
The rack NIC adaptor host ethernet interface large Tx packets that are greater
than or equal to 9216
v1.0
packets
The rack NIC adaptor host ethernet interface large Tx packets that are less than
2048
v1.0
packets
The rack NIC adaptor host ethernet interface large Tx packets that are less than
4096
v1.0
packets
The rack NIC adaptor host ethernet interface large Tx packets that are less than
8192
v1.0
packets
The rack NIC adaptor host ethernet interface large Tx packets that are less than
9216
v1.0
Less Than Or
Equal To 1518
packets
The rack NIC adaptor host ethernet interface large Tx packets that are less than
or equal to 1518
v1.0
No Breakdown
Greater Than
1518
packets
The rack NIC adaptor host ethernet interface large Tx packets that have no
breakdown greater than 1518
v1.0
Equal To 64
packets
The rack NIC adaptor host ethernet interface small Rx packets that are equal to
64
v1.0
packets
The rack NIC adaptor host ethernet interface small Rx packets that are less than
1024
v1.0
packets
The rack NIC adaptor host ethernet interface small Rx packets that are less than
128
v1.0
packets
The rack NIC adaptor host ethernet interface small Rx packets that are less than
256
v1.0
packets
The rack NIC adaptor host ethernet interface small Rx packets that are less than
512
v1.0
Less Than 64
packets
The rack NIC adaptor host ethernet interface small Rx packets that are less than
64
v1.0
Equal To 64
packets
The rack NIC adaptor host ethernet interface small Tx packets that are equal to
64
v1.0
packets
The rack NIC adaptor host ethernet interface small Tx packets that are less than
1024
v1.0
packets
The rack NIC adaptor host ethernet interface small Tx packets that are less than
128
v1.0
packets
The rack NIC adaptor host ethernet interface small Tx packets that are less than
256
v1.0
packets
The rack NIC adaptor host ethernet interface small Tx packets that are less than
512
v1.0
Less Than 64
packets
The rack NIC adaptor host ethernet interface small Tx packets that are less than
64
v1.0
Oversized
packets
The oversized Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Oversized Bad
CRC
packets
The oversized Rx bad cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized Rx good cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized Rx bad cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized Rx good cyclic redundancy check (CRC) packets of the rack
NIC adaptor host ethernet interface
v1.0
Oversized
packets
The oversized Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Oversized Bad
CRC
packets
The oversized Tx bad cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Oversized Good
CRC
packets
The oversized Tx good cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Undersized Bad
CRC
packets
The undersized Tx bad cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Undersized Good
CRC
packets
The undersized Tx good cyclic redundancy check (CRC) packets of the rack NIC
adaptor host ethernet interface
v1.0
Good
packets
The good Rx packets of the rack NIC adaptor host ethernet interface
v1.0
RACK_NIC_ADAPTORhostTHIF_PACKETS_TX
RACK_NIC_ADAPTORhostTHIF_VNIC
PPP
packets
The point-to-point protocol (PPP) Rx packets of the rack NIC adaptor host
ethernet interface
v1.0
Pause
packets
The paused Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Per Priority
packets
The per priority Rx packets of the rack NIC adaptor host ethernet interface
v1.0
Total
packets
The total Rx packets of the rack NIC adaptor host ethernet interface
v1.0
VLAN
packets
v1.0
Good
packets
The good Tx packets of the rack NIC adaptor host ethernet interface
v1.0
PPP
packets
The point-to-point protocol (PPP) Tx packets of the rack NIC adaptor host
ethernet interface
v1.0
Pause
packets
The paused Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Per Priority
packets
The per priority Tx packets of the rack NIC adaptor host ethernet interface
v1.0
Total
packets
The total Tx packets of the rack NIC adaptor host ethernet interface
v1.0
VLAN
packets
The Tx packets of the rack NIC adaptor host ethernet interface VLAN
v1.0
Rx (bytes)
Bytes
The Rx bytes of the rack NIC adaptor host ethernet interface VNIC
v1.0
Rx (packets)
packets
The Rx packets of the rack NIC adaptor host ethernet interface VNIC
v1.0
Rx Dropped
packets
The dropped Rx packets of the rack NIC adaptor host ethernet interface VNIC
v1.0
Rx Errors
errors
The Rx errors of the rack NIC adaptor host ethernet interface VNIC
v1.0
Tx (bytes)
Bytes
The Tx bytes of the rack NIC adaptor host ethernet interface VNIC
v1.0
Tx (packets)
packets
The Tx packets of the rack NIC adaptor host ethernet interface VNIC
v1.0
Tx Dropped
packets
The dropped Tx packets of the rack NIC adaptor host ethernet interface VNIC
v1.0
Tx Errors
errors
The Tx errors of the rack NIC adaptor host ethernet interface VNIC
v1.0
The 2.10 and later versions of the probe include the standard static alarm threshold parameters.
Notes:
The standard static alarm threshold parameters are supported only for version 2.10 or later of the probe, using CA UIM 8.2.
All the probe-specific alarm configurations in the probe monitors are replaced by Static alarm and Time Over Threshold configurations.
More information:
clariion (Clariion Monitoring) Release Notes
Prerequisites
Verify that the required hardware, software, and information are available before you configure the probe. The clariion probe requires the
Navisphere CLI executable (naviseccli.exe) to gather data about the clariion systems. Install the Navisphere CLI on the system where the clariion
probe is deployed. For more information, see clariion (Clariion Monitoring) Release Notes.
6.
Storage Processor A: specifies the host name or IP address of Storage Processor A.
Storage Processor B: specifies the host name or IP address of Storage Processor B.
Source: specifies the source to be used for QoS metrics and alarms. This is typically the IP address of Storage Processor A.
Username: enables you to enter a valid username to be used by the probe to access the clariion system.
Password: enables you to enter a valid password to be used by the probe to access the clariion system.
Active: activates the profile for monitoring. By default, the profile is active.
Interval (seconds): specifies the time interval (in seconds) after which the probe collects the data from the clariion system for the
specific profile.
Default: 600
Alarm Message: specifies the alarm to be generated when the profile is not responding. For example, the profile does not respond if
there is a connection failure or inventory update failure.
Default: ResourceCritical
Send Faults as Alarm: If selected, the probe sends an alarm for all the faults generated in the system.
Send Events as Alarm: If selected, the probe sends an alarm for all the events generated in the system.
Login Scope: defines the login scope for the Clariion system user. There are three types of clariion system users:
0(Global): A global user has access to the entire domain.
1(Local): A local user has access to a single system in the domain.
2(LDAP): LDAP users can access any system that uses the LDAP server to authenticate users.
Minimum Event Severity: enables you to configure the minimum event severity.
7. Click Submit.
The new monitoring profile is created and displayed under the clariion node. The monitoring categories for the storage components are
displayed as nodes below the profile name node.
8. Verify the connection between the probe and the storage server through the Verify Selection button under the Actions drop-down.
9. Configure the alarms and thresholds in the monitors for the desired storage component. For example, to configure the alarms and
thresholds in the monitors for the Fast Cache node:
a. Navigate to clariion > Profile Name > Host Name > Fast Cache > Monitors
b. The performance counters are visible in a tabular form. Select any one counter from the table, and configure its properties.
10. Save the configuration to start monitoring.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe version installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
clariion Node
<Profile Name> Node
<Host Name> Node
<Storage Component Name> Node
clariion Node
The clariion node contains configuration details specific to the clariion probe. This node enables you to view the probe information, configure the
log properties and the location of the Navisphere CLI executable (naviseccli.exe) which is used to gather data about the Clariion system.
Navigation: clariion
clariion > Configuring Clariion Monitoring
This section describes the minimum configuration settings required to monitor a storage system using the clariion probe.
clariion > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
Set or modify the following values if needed:
clariion > Probe Setup
This section lets you configure the detail level of the log file. The default value is 3-info.
clariion > Navisphere CLI
This section enables you to configure the location of naviseccli.exe.
Provide the absolute path of naviseccli.exe by clicking the Browse (icon). Click Save to save the configuration.
The probe now uses this executable to connect to the clariion systems.
clariion > Options (icon)
This button allows you to add a resource as a monitoring profile to the probe.
<Profile Name> Node
This node lets you configure a profile for the storage component you want to monitor. The clariion probe uses each profile as a connection to the
storage component through which the probe collects and stores data and information from the monitored components. You can check the
connection between the probe and the storage server through the Verify Selection button under the Actions drop-down.
Note: This node is referred to as Profile Name node in the article and is user-configurable.
2(LDAP): LDAP users can access any system that uses the LDAP server to authenticate users.
Note: It is recommended to use Global or Local users for monitoring the Clariion system in order to remove any dependency on LDAP
server.
Minimum Event Severity: enables you to configure the minimum event severity.
Profile Name > Options (icon)
This button allows you to delete the monitoring profile from the probe.
<Host Name> Node
This node allows you to enable monitors to measure the performance of different storage components.
The monitors for the clariion elements are visible in a tabular form. You can select any one monitor from the table and configure its properties.
Navigation: clariion > Profile Name > Host Name > Monitors
Set or modify the following values if needed:
Host Name > Monitors
This section lets you edit the properties of a monitor and configure the performance counters for generating QoS.
Note: The performance counters are visible in a tabular form. You can select any one counter in the table and configure its properties.
Value Definition: enables you to select which value to be used for generating alarms and QoS. The following options are available:
Current Value: The most current value measured will be used.
Average Value Last n Samples: The average value of the last and current sample, that is, (current + previous) / 2 will be used.
Delta Value (Current-Previous): The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second: The delta value calculated from the samples measured within a second will be used.
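The four value definitions above can be sketched as a small helper. This is an illustrative reconstruction of the logic described in the list, not the probe's actual code; the function name, mode strings, and sample layout are hypothetical:

```python
def value_definition(samples, mode, interval_seconds):
    """Compute the value used for QoS and alarming from raw samples.

    samples: raw counter readings, oldest first (at least two readings
    for the modes that need a previous sample).
    """
    current = samples[-1]
    if mode == "current":               # Current Value
        return current
    previous = samples[-2]
    if mode == "average":               # Average Value Last n Samples (n = 2)
        return (current + previous) / 2
    if mode == "delta":                 # Delta Value (Current - Previous)
        return current - previous
    if mode == "delta_per_second":      # Delta Per Second
        return (current - previous) / interval_seconds
    raise ValueError("unknown value definition: " + mode)
```

For example, for a counter that read 100 on the previous sample and 160 on the current one, with the default 600-second interval, the delta is 60 and the delta per second is 0.1.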
<Storage Component Name> Node
The Storage Component Name node represents the actual name of the storage component which is available in the clariion storage system. This
node contains only the Monitors section. Select any of the monitors and configure its properties.
Note: The name varies for each storage component. So, this node is referred to as the Storage Component Name node in the
document.
Navigation: clariion > Profile Name > Host Name > Storage Component Name > Monitors
The Storage Component Name node contains the following sub nodes for representing various storage elements:
Fast Cache
Hosts
LUN Folders
Mirror Views
Physical
RAID Groups
Storage Groups
Storage Processor
Thin Pools
This article describes the configuration concepts and procedures for setting up the clariion probe. This probe is configured to monitor the
performance of EMC storage platforms.
The following diagram outlines the process to configure the probe to monitor the storage components.
the probe collects and stores information from the monitored components. Each profile represents one clariion system.
Follow these steps:
1. Click Options next to the clariion node in the navigation pane.
2. Click the Add New Profile option.
3. Update the field information with the storage processor details, the source to be used for QoS metrics and alarms, and valid username
and password.
4. Click Submit.
The new monitoring profile is saved and is visible under the clariion node in the navigation pane. You can now configure the monitors for
the desired storage component.
Delete a profile
You can delete a profile if you do not want the probe to monitor the performance of a specific storage component.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Click the Delete Profile option.
3. Click Save.
The monitoring profile is deleted.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe version installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
clariion node
<Profile Name> node
<Host Name> Node
<Storage Component Name> Node
clariion node
The clariion node contains configuration details specific to the clariion probe. This node enables you to view the probe information, configure the
log properties and the location of the Navisphere CLI executable (naviseccli.exe) which is used to gather data about the Clariion system.
Navigation: clariion
clariion > Configuring Clariion Monitoring
This section describes the minimum configuration settings required to monitor a storage system using the clariion probe.
This node lets you configure a profile for the storage component you want to monitor. The clariion probe uses each profile as a connection to the
storage component through which the probe collects and stores data and information from the monitored components. You can check the
connection between the probe and the storage server through the Verify Selection button under the Actions drop-down.
Note: This node is referred to as Profile Name node in the document and is user-configurable.
Minimum Event Severity: enables you to configure the minimum event severity.
<Host Name> Node
This node allows you to enable monitors to measure the performance of different storage components.
The monitors for the clariion elements are visible in a tabular form. You can select any one monitor from the table and configure its properties.
Navigation: clariion > Profile Name > Host Name > Monitors
Set or modify the following values if needed:
Note: The performance counters are visible in a tabular form. You can select any one counter in the table and configure its properties.
Value Definition: enables you to select which value to be used for generating alarms and QoS. The following options are available:
Current Value: The most current value measured will be used.
Average Value Last n Samples: The average value of the last and current sample, that is, (current + previous) / 2 will be used.
Delta Value (Current-Previous): The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second: The delta value calculated from the samples measured within a second will be used.
<Storage Component Name> Node
The Storage Component Name node represents the actual name of the storage component which is available in the clariion storage system. This
node contains only the Monitors section. Select any of the monitors and configure its properties.
Note: The name varies for each storage component. So, this node is referred to as the Storage Component Name node in the
document.
The Storage Component Name node contains the following sub nodes for representing various storage elements:
Fast Cache
Hosts
LUN Folders
Mirror Views
Physical
RAID Groups
Storage Groups
Storage Processor
Thin Pools
Preconfiguration Requirements
The clariion probe requires the Navisphere CLI executable (naviseccli.exe) to gather data about the clariion systems. Install the
Navisphere CLI on the system where the clariion probe is deployed.
2. Click the Setup tab and open the Navisphere CLI section.
3. Provide the absolute path of naviseccli.exe by clicking Browse.
4. Click OK.
The probe now uses this executable to connect to the clariion systems.
Click the Test button to verify the connection to the host. You should receive a response like the one shown below.
Click the Apply button in the probe GUI to activate the new resource configuration.
The new resource should now appear under the Default group.
You can create a new group by selecting the Create New Group button from the toolbar. The new group appears in the pane with the name New
Group.
Right-click the new group and select Rename to give the group a name of your own choice.
The Resource dialog will be launched, guiding you through the process of creating a resource.
The alarm messages for each alarm situation are stored in the Message Pool. Using the Message Pool Manager, you can customize the alarm
text, and also create your own messages.
Note that the probe supports variable expansion in the message text. If you type a $ in the Alarm text field, a dialog pops up, offering a set of
variables to be selected:
Resource: specifies the resource mentioned in the alarm message.
Source: specifies the source where the alarm condition occurs.
Monitor: specifies the monitor (checkpoint) mentioned in the alarm message.
Descr: specifies the description of the monitor.
Key: specifies the monitor key (normally the same as the name of the monitor).
Value: specifies the value used in the alarm message.
Oper: specifies the operator to be combined with the value and the threshold in the alarm message.
Thr: specifies the alarm threshold.
Unit: specifies the unit to be combined with the value in the alarm message (for example boolean).
Creating a Resource
Adding Monitors to be Measured
Manually Selecting Monitors to Be Measured
Using Templates
Using Auto Configurations
Editing Monitor Properties
Turning on Statistics Monitoring
Creating a Resource
You can edit the properties for a resource by right-clicking a resource and selecting Edit. This brings up the Resource dialog.
Field
Description
Resource
Name
Source
The source to be used for QoS metrics and Alarms. This is typically the IP address of Storage Processor A.
Storage
Processor A
Storage
Processor B
Username
Password
Group
Select which group you want the resource to belong to. Normally you just have the Default group.
Alarm
message
Select the alarm message to be sent if the resource does not respond. Note that you can edit the message or define your
own using the Message Pool Manager.
Check interval
The check interval defines how often the probe checks the values of the monitors.
Login Scope
The login scope for the CLARiiON system user. It is recommended to use Global or Local users for monitoring the CLARiiON
system in order to remove any dependency on an LDAP server.
Faults
When the Send Clariion Faults as Nimsoft Alarms checkbox is selected, the probe sends the alarm with severity major for
all the faults generated in the system.
Events
When the Send Clariion Events as Nimsoft Alarms checkbox is selected, the probe sends all the events whose error code
is greater than or equal to the configured severity.
Minimum
Event Severity
Test button
Press the test button to verify the connection to the CLARiiON system. You should receive a response like the one shown
below.
To select a monitor to be measured for a resource, select the resource in the left pane and browse the tree under the resource node. When you
select a folder in this tree structure, the monitors found are listed in the right pane of the probe GUI, enabling you to select the ones you want to
monitor.
Note that you may also add monitors to be measured using templates (see the section ).
When you select the All Monitors node, all monitors currently being measured are listed in the right pane. Note that you can also select or
deselect monitors here.
Using Templates
Templates are useful tools for defining monitors to be measured on the various elements of a CLARiiON system:
You may create templates and define a set of monitors belonging to that template. These templates can be applied to a folder or element by
dragging and dropping the template on the node in the tree where you want to measure the monitors defined for the template. You may also drop
a template on a resource in the tree structure, and the template will be applied to all elements for the resource.
Creating a template
Right-click the Templates node in the left window pane and select New.
Note that you may also edit an existing template by selecting one of the templates defined (found by expanding the Templates node and selecting
Edit).
The Template Properties dialog appears, letting you specify a Name and a Description for the new template.
You may now edit the properties for the monitors under the template as described in the section .
Applying a template
Drag and drop the template on the element where you want to monitor the checkpoints defined for the template. Note that you may also drop the
template on a folder containing multiple elements.
Auto Configurations provide a powerful method for automatically adding monitors to be measured. "Auto Monitors" will be created for devices that
are currently NOT monitored.
Example:
When new Thin Pools are created on the CLARiiON system, the Auto Configuration feature will, if configured, create Auto Monitor(s) for the new
Thin Pools and automatically start monitoring them.
The Auto Configuration feature consists of two nodes located under the resource node in the left pane:
Auto Configurations
One or more checkpoints (or templates) can be added to this node, using drag and drop. When this is done, you must click the Apply
button and restart the probe to activate the changes.
The probe will search for the appropriate types of objects. Auto Monitors, representing the monitor(s)/template(s) added under the Auto
Configuration node will be created (and listed under the Auto Monitor node, see below) for devices that are currently NOT monitored.
NOTE: Adding many monitors/templates to the Auto Configurations node may result in a very large number of Auto Monitors
(see below).
Auto Monitors
This node lists Auto Monitors created for previously unmonitored devices, based on the contents added to the Auto Configuration node.
The Auto Monitors will only be created for devices that are currently NOT monitored.
Adding a template to the Auto Configurations node
You can add a template (see the section to learn more about templates) by selecting the Templates node in the left pane. All templates available
will now be listed in the right pane. Add a template to the Auto Configurations node by dragging the template, dropping it on the Auto
Configurations node.
Click the Auto Configurations node and verify that the template was successfully added. Note that you must also click the Apply button and restart
the probe to activate the configuration.
Field
Description
Name
This is the name of the monitor. The name will be inserted into this field when the monitor is created, but you are allowed to
modify the name.
Key
Description
This is a description of the monitor. This description will be inserted into this field when the monitor is created, but you are
allowed to modify it.
Value Definition
This drop-down list lets you select which value to be used, both for alarming and QoS:
You have the following options:
The current value, meaning that the most current value measured will be used.
The delta value (current - previous). This means that the delta value calculated from the current and the previous measured
sample will be used.
Delta per second. This means that the delta value calculated from the samples measured within a second will be used.
The average value of the last and current sample:
(current + previous) / 2.
Active
Enable
Monitoring
Alarms
Operator
Select from the drop-down list the operator to be used when setting the alarm threshold for the measured value.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Unit
Message Token
Select the alarm message to be issued if the specified threshold value is breached. These messages are kept in the
message pool. The messages can be modified in the Message Pool Manager.
Advanced
Post with subject
Checking this option, the alarm messages for the checkpoints will be attached/posted with the displayed subject.
Publish Quality
of Service
Select this option if you want QoS messages to be issued on the monitor.
QoS Name
Note: Select Add New QoS Definition option from the QoS Name dropdown to add a custom QoS. Enter the
details in the QoS Definition dialog that appears and click OK.
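The Operator and Threshold fields above combine into the alarm condition. A minimal sketch of that evaluation follows; the exact operator symbol set offered by the GUI is an assumption based on the examples in this table:

```python
import operator

# Map operator symbols to comparison functions. The symbol set here is
# assumed from the ">= 90" and "= 90" examples, not taken from the probe.
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "=": operator.eq,
    "<=": operator.le,
    "<": operator.lt,
}

def breaches_threshold(value, op_symbol, threshold):
    """Return True when the measured value meets the alarm condition."""
    return OPERATORS[op_symbol](value, threshold)
```

With an operator of >= and a threshold of 90, a measured value of 95 breaches the threshold and triggers the selected alarm message, while 85 does not.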
Statistics reporting may have a slight performance impact on the CLARiiON system. Administrators may want to disable statistics reporting except
when troubleshooting the CLARiiON system.
You can determine whether such monitoring is enabled by viewing a setting in the clariion probe configuration GUI. Turning statistics monitoring
on or off is done via the command line interface for the CLARiiON system.
Note: You cannot enable or disable the collection of statistics on the CLARiiON system using the clariion probe GUI. This is because
this action is typically restricted to CLARiiON administrators.
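As a sketch, toggling statistics logging from the command line is typically done with the Navisphere CLI's setstats subcommand. The command layout below is an assumption based on common naviseccli usage and should be verified against your CLI version; the helper only builds the argument list:

```python
def build_setstats_cmd(sp_host, enable):
    """Build a naviseccli command line that toggles statistics logging.

    The 'setstats' subcommand and '-on'/'-off' flags are assumptions
    based on typical Navisphere CLI usage; verify with your version.
    """
    return ["naviseccli", "-h", sp_host, "setstats", "-on" if enable else "-off"]
```

A CLARiiON administrator could then run the resulting command (for example with subprocess.run) from the host where the Navisphere CLI is installed.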
The window consists of a row of tool buttons and two window panes. In addition, a status bar is located at the bottom of the window, showing GUI
probe version information and when the probe was started.
Groups
You can create new groups by right-clicking a group and selecting New Group (or by clicking the New Group toolbar button).
Resources
A group contains one or more resources. On this probe you normally just create one resource. This resource is configured as a link to the
clariion system. It is possible to move a resource from one group to another, using drag and drop.
Templates
Templates are a useful tool for defining reusable sets of checkpoints to be monitored on the various clariion components.
You can create templates and define a set of monitors belonging to that template. Drag-and-drop the template on a node containing objects you
want to monitor.
Right-clicking the Templates node lets you add a template.
Right-clicking one of the templates defined lets you edit or delete the selected template.
The properties for a template are Name and Description.
QoS
This node contains the standard QoS definitions included with the probe package. These can be selected when editing the monitoring properties
for a monitor. To define your own QoS definitions, right-click the QoS node and select New.
Right-clicking in the left pane
Right-clicking in the pane opens a pop-up menu, displaying the following options:
When a Resource node is selected:
New Resource
Opens the Resource dialog, enabling you to define a new resource to be monitored.
New Group
Creates a new group where you may place resources. The new group will appear in the pane with the name New Group. Right-click the
new group and select Rename to give the group a name of your own choice.
Edit
Lets you edit the properties for the selected resource.
Delete
Lets you delete the selected resource. Note that the Default group cannot be deleted, but if you remove all elements from the group, it will
not appear the next time you restart the probe.
Rename
Lets you rename the selected resource.
Reload
Refreshes the window to display the most current measured values for the monitors.
Note: When selecting reload, you will not get updated values until the next time the probe has polled the CLARiiON system.
This interval is set by the Check Interval set on the properties dialog for the Resource.
General Setup
Clicking the General Setup button opens the General Setup dialog.
Log-level: Sets the level of details written to the log file. Log as little as possible during normal operation, to minimize disk consumption.
The available options are: 0-Fatal errors, 1-Errors, 2-Warnings, 3-Information.
Enable GUI auto-refresh: When this option is selected, the GUI is refreshed every 60 seconds, reflecting the most current
measured values from the checkpoints and the status of the nodes in the tree structure that you see when navigating through the clariion
system under a resource node. If this option is not selected, press F5 to refresh the GUI.
Note: When pressing F5 to refresh, you will not get updated values until the next time the probe has polled the clariion system. This
interval is set by the Check Interval option on the properties dialog for the Resource.
clariion Metrics
The following table describes the checkpoint metrics that can be configured using the Clariion Monitoring (clariion) probe.
Monitor Name
QoS Name
Units
Description
Version
QOS_STORAGE_CFG_TOTAL_CAPACITY
Gigabytes
1.6
Number of LUNs
QOS_STORAGE_DISK_LUNS
Count
1.6
Capacity
QOS_STORAGE_DISK_CAPACITY
Megabytes
1.6
Percent Busy
QOS_STORAGE_DISK_PCT_BUSY
Percent
1.6
Percent Idle
QOS_STORAGE_DISK_PCT_IDLE
Percent
1.6
QOS_STORAGE_DISK_PCT_BUSY_LT
Percent
1.6
QOS_STORAGE_DISK_PCT_IDLE_LT
Percent
1.6
Percent Rebuilt
QOS_STORAGE_DISK_PCT_REBUILT
Percent
1.6
QOS_STORAGE_FAN_PRESENT
Boolean
1.6
QOS_STORAGE_FAST_CACHE_PCT_DIRTY_SPA
Percent
1.6
QOS_STORAGE_FAST_CACHE_PCT_DIRTY_SPB
Percent
1.6
Size
QOS_STORAGE_FAST_CACHE_SIZE
Gigabytes
1.6
QOS_STORAGE_LCC_PRESENT
Boolean
1.6
QOS_STORAGE_IOMODULE_PRESENT
Boolean
1.6
Service Time
QOS_STORAGE_LUN_SERVICETIME
Milliseconds
1.6
User Capacity
QOS_STORAGE_DISK_USER_CAPACITY
Gigabytes
1.6
LUN Capacity
QOS_STORAGE_LUN_CAP
Megabytes
1.6
QOS_STORAGE_LUN_HARD_ERRORS
Count
1.6
Total IOPS
QOS_STORAGE_LUN_IOPS
IO per second
1.6
QOS_STORAGE_LUN_SOFT_ERRORS
Count
1.6
QOS_STORAGE_LUN_PCT_BUSY
Percent
1.6
QOS_STORAGE_LUN_PCT_IDLE
Percent
1.6
Percent Rebuilt
QOS_STORAGE_LUN_PCT_REBUILT
Percent
1.6
Trespasses
QOS_STORAGE_LUN_TRESPASSES
Count
1.6
Trespassed
QOS_STORAGE_LUN_TRESPASSED
Boolean
1.6
QOS_STORAGE_LUN_TRESPASSES_PH
Count
1.6
Trespasses SPA
QOS_STORAGE_LUN_TRESPASSES_SPA
Count
1.6
Trespasses SPB
QOS_STORAGE_LUN_TRESPASSES_SPB
Count
1.6
Present (For
Management Module)
QOS_STORAGE_MANAGEMENTMODULE_PRESENT
Boolean
1.6
QOS_STORAGE_ML_ACTUAL_USER_CAP
Megabytes
1.6
Percent Expanded
QOS_STORAGE_ML_PERCENT_EXPANDED
Percent
1.6
QOS_STORAGE_ML_TOTAL_USER_CAP
Megabytes
1.6
Faulted
QOS_STORAGE_MIRROR_VIEW_FAULTED
Boolean
1.6
QOS_STORAGE_NUM_OF_CONFIGURED_DISKS
Count
1.6
QOS_STORAGE_NUM_OF_DEVICES
Count
1.6
QOS_STORAGE_NUM_OF_DISKS
Count
1.6
QOS_STORAGE_NUM_OF_HOT_SPARES
Count
1.6
QOS_STORAGE_NUM_OF_UNCONFIGURED_DISKS
Count
1.6
QOS_STORAGE_PSU_PRESENT
Boolean
1.6
QOS_STORAGE_RAW_TOTAL_CAPACITY
Gigabytes
1.6
Free Capacity
QOS_STORAGE_RG_FREE_CAP
Blocks
1.6
Logical Capacity
QOS_STORAGE_RG_LOG_CAP
Blocks
1.6
Number of LUNs
QOS_STORAGE_RG_NUM_LUNS
Count
1.6
Percent Free
QOS_STORAGE_RG_PCT_FREE
Percent
1.6
Raw Capacity
QOS_STORAGE_RG_RAW_CAP
Blocks
1.6
Number of Hosts
QOS_STORAGE_SG_NUM_HOSTS
Count
1.6
Number of LUNs
QOS_STORAGE_SG_NUM_LUNS
Count
1.6
Total Capacity
QOS_STORAGE_SG_TOTAL_CAP
Megabytes
1.6
QOS_STORAGE_SP_AverageBusyQueueLength
Count
1.6
QOS_STORAGE_SP_BLOCKS_READ_PER_SECOND
Blocks/sec
1.6
QOS_STORAGE_SP_BLOCKS_WRITTEN_PER_SECOND
Blocks/sec
1.6
Total IOPs
QOS_STORAGE_SP_IOPS
Ops/sec
1.6
QOS_STORAGE_SP_FAULT_ON
Boolean
1.6
Read IOPs
QOS_STORAGE_SP_READ_IOPS
Ops/sec
1.6
Write IOPs
QOS_STORAGE_SP_WRITE_IOPS
Ops/sec
1.6
QOS_STORAGE_SP_PCT_BUSY
Percent
1.6
QOS_STORAGE_SP_PCT_IDLE
Percent
1.6
QOS_STORAGE_SP_READ_CACHE_ENABLED
Boolean
1.6
QOS_STORAGE_SP_WRITE_CACHE_ENABLED
Boolean
1.6
QOS_STORAGE_SP_PCT_DIRTY
Percent
1.6
QOS_STORAGE_SP_PCT_OWNED
Percent
1.6
QOS_STORAGE_SP_RESPONSETIME
Milliseconds
1.6
QOS_STORAGE_SP_SERVICETIME
Milliseconds
1.6
Utilization
QOS_STORAGE_SP_UTILIZATION
Percent
1.6
Number of Faults
QOS_STORAGE_SYS_FAULTS
Count
1.6
QOS_STORAGE_SPS_PRESENT
Boolean
1.6
QOS_STORAGE_SYS_READ_CACHE_COUNT
Count
1.6
QOS_STORAGE_SYS_WRITE_CACHE_COUNT
Count
1.6
Available Capacity
QOS_STORAGE_TP_AVAILABLE_CAPACITY
Gigabytes
1.6
Consumed Capacity
QOS_STORAGE_TP_CONSUMED_CAPACITY
Gigabytes
1.6
Oversubscribed By
QOS_STORAGE_TP_OVERSUBSCRIBED_BY
Gigabytes
1.6
Percent Subscribed
QOS_STORAGE_TP_PERCENT_SUBSCRIBED
Percent
1.6
Percent Available
QOS_STORAGE_TP_PERCENT_AVAILABLE
Percent
1.6
Percent Full
QOS_STORAGE_TP_PERCENT_FULL
Percent
1.6
Relocating
QOS_STORAGE_TP_RELOCATING
Boolean
1.6
Subscribed Capacity
QOS_STORAGE_TP_SUBSCRIBED_CAPACITY
Gigabytes
1.6
User Capacity
QOS_STORAGE_TP_USER_CAPACITY
Gigabytes
1.6
Disk State
String
1.6
Mode
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
String
1.6
Statistics Logging
String
1.6
Relocation Status
String
1.6
Reads IOPs
QOS_STORAGE_LUN_READ_REQUESTS_PER_SECOND
Reads/s
2.1
Write IOPs
QOS_STORAGE_LUN_WRITE_REQUESTS_PER_SECOND
Writes/s
2.1
Read Bandwidth
QOS_STORAGE_LUN_READ_THROUGHPUT_PER_SEC
KB/s
2.1
Write Bandwidth
QOS_STORAGE_LUN_WRITE_THROUGHPUT_PER_SEC
KB/s
2.1
Bandwidth
QOS_STORAGE_LUN_TOTAL_THROUGHPUT_PER_SEC
KB/s
2.1
State Number
QOS_STORAGE_MIRROR_VIEW_STATE_NUMBER
Number
2.1
Blocks Read
QOS_STORAGE_SP_BLOCKS_READ
Count
2.1
Blocks Written
QOS_STORAGE_SP_BLOCKS_WRITTEN
Count
2.1
Total Reads
QOS_STORAGE_SP_READ
Count
2.1
Total Writes
QOS_STORAGE_SP_WRITE
Count
2.1
LUN Utilization
QOS_STORAGE_LUN_UTILIZATION
Percent
2.1
QOS_STORAGE_CFG_FREE_CAPACITY
Gigabytes
1.6
QOS_STORAGE_CFG_FREE_CAPACITY_PERCENT
Gigabytes
1.6
QOS_STORAGE_CFG_USED_CAPACITY
Gigabytes
1.6
QOS_STORAGE_MIRROR_VIEW_ACTIVE
Boolean
1.6
QOS_STORAGE_RAW_FREE_CAPACITY
Gigabytes
1.6
QOS_STORAGE_RAW_FREE_CAPACITY_PERCENT
Gigabytes
1.6
QOS_STORAGE_RAW_USED_CAPACITY
Gigabytes
1.6
Bandwidth (LUN)
QOS_STORAGE_LUN_TOTAL_THROUGHPUT
KB/s
1.6
Supported Probes
The following table lists the supported probes and their corresponding supported cluster environment:
Important! Use the cluster probe with other probes only when you configure the cluster probe on a robot in a clustered environment.
Supported Probe | Supported Cluster Environment
cdm | Active/Passive; N+1 node cluster. The probe supports only disk profile monitoring on cluster version 2.20 and later. Note: For more information, see the Set up Cluster Monitoring in cdm section.
dirscan | 2-node Active/Passive; N+1 node clusters if the profile names are unique
exchange_monitor | The probe supports only Microsoft Cluster Server (MSCS) monitoring on cluster version 1.61 and later.
logmon | Active/Active; Active/Passive; N+1 node cluster; 2-node Active/Passive; N+1 node clusters if the profile names are unique
nperf | 2-node Active/Passive; N+1 node clusters if the profile names are unique
ntservices | 2-node Active/Passive; N+1 node clusters if the profile names are unique
oracle | 2-node Active/Passive; N+1 node clusters if the profile names are unique
processes | 2-node Active/Passive; N+1 node clusters if the profile names are unique
sqlserver | 2-node Active/Passive; N+1 node clusters if the profile names are unique
The cdm probe receives information about the cluster disk resources from the cluster probe. Monitoring profiles are created for the resources that
are based on the fixed_default settings in the cdm probe. The profile is automatically registered with the cluster probe to ensure continuous
monitoring on cluster group failover. In the cdm probe, the cluster IP is used as Alarm and Quality of Service source instead of the cluster node.
You can change the source to cluster name or group name through Infrastructure Manager (IM).
More information:
cluster (Clustered Environment Monitoring) Release Notes
The following diagram outlines the process to configure the cluster probe.
Contents
Verify Prerequisites
Configure General Properties
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see cluster (Clustered Environment
Monitoring) Release Notes.
Notes:
A clustered probe must be deployed on more than one node of the cluster.
Profile(s) created on these probes can have failover support from the cluster probe.
All nodes in the cluster must have the same probes with the same general configuration.
Shared sections can be set up in the cluster probe to follow a Resource Group upon failover. All shared sections that the cluster probe has been configured for are automatically synchronized between the probes running on the different nodes in the cluster. A shared section is linked only to the "/connections" or "/messages" section of the configuration file of the probe that runs on the node. Shared sections represent shared resources that are defined for that resource group.
Note: Any changes to a shared section must be implemented on the probe running on the cluster node currently in control of the
associated Resource Group.
The following procedure enables you to add a shared section to a resource group on the cluster probe.
Follow these steps:
1. Click Options (icon) next to the Shared Section node in the navigation pane.
2. Select Add New Shared Section.
3. Specify the name of the section.
4. Select the probe from the list of deployed probes.
5. Select the shared section of the selected deployed probe.
6. Click Submit.
The shared section is visible under the Shared Section node in the navigation pane.
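Shared sections map to named sections in the probe's own configuration file. As a purely illustrative sketch (the section name, message name, and keys here are hypothetical and vary by probe), a /messages section that could be shared might look like this in a probe's .cfg file:

```
<messages>
   <MsgGroupOffline>
      text = Resource group is offline
      level = major
      subsystem = 1.1.16.2
   </MsgGroupOffline>
</messages>
```

Because the cluster probe synchronizes such a section between nodes, a message edited on the controlling node is reflected on the other cluster nodes automatically.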
Key Name | Value
1.1.16 | Cluster
1.1.16.1 | Node
1.1.16.2 | Resource Group
1.1.16.3 | Package
To update the Subsystem IDs using Admin Console, follow these steps:
1. Open the Raw Configure interface of the nas probe.
2. Click the Subsystems folder.
3. Click the New Key menu item.
4. Enter the Key Name in the Add key window.
5. Click Add. The new key appears in the list of keys with a blank value.
6. Click in the Value column for the newly created key and enter the key value.
7. Repeat this process for all the required subsystem IDs for your probe.
8. Click Apply.
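After the keys above are applied, the Subsystems section of the nas configuration contains the new ID-to-name mappings. As an illustrative sketch of the raw-configure result (the surrounding file layout is simplified; only the key/value pairs come from the table above):

```
<subsystems>
   1.1.16 = Cluster
   1.1.16.1 = Node
   1.1.16.2 = Resource Group
   1.1.16.3 = Package
</subsystems>
```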
cluster Node
<Node Name (selected)> Node
<Resource Group Name> Node
Profiles Node
<Profile Name> Node
<Shared Section> Node
<Shared Section Name> Node
The cluster probe enables failover support for the supported probes. The profiles of these probes can be associated with different resource
groups. The probe configurations that are saved in a resource group remain the same, even when that resource group moves to another cluster
node.
Notes:
A shared section is a common section between different monitoring profiles. CA UIM probes can expose one or more shared sections. The cluster probe also sends alarms when a cluster node or resource group changes state.
In an HP-UX Serviceguard cluster, a package is equivalent to a resource group.
All cluster nodes must have the same set of CA UIM probes installed. The cluster probe updates the changes that are made to the profile setup.
When any probe profile is updated, the changes must be implemented on the probe that runs on the node which is in control of the associated
resource group. The cluster probe automatically updates most of the changes of the probe profiles. However, general probe parameters must be
updated manually on the different nodes.
The collected data must contain the name and state of the cluster nodes, the name and state of each resource group, and the node on which each group runs.
This information identifies the recommended cluster node on which to run the probe configurations.
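To make this data model concrete, here is a minimal Python sketch of the collected cluster state and of looking up the node that controls a resource group. The class and function names are illustrative only, not the probe's actual internals:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Status codes used throughout this document: 1=online, 2=offline, 3=partially online
ONLINE, OFFLINE, PARTIAL = 1, 2, 3

@dataclass
class ResourceGroup:
    name: str
    state: int
    node: str  # cluster node currently in control of the group

@dataclass
class ClusterState:
    nodes: Dict[str, int]        # node name -> state code
    groups: List[ResourceGroup]

def controlling_node(state: ClusterState, group_name: str) -> Optional[str]:
    """Return the node that currently controls the named resource group;
    profile changes for the group must be made on the probe running there."""
    for group in state.groups:
        if group.name == group_name:
            return group.node
    return None
```

For example, with a group "SQL-Group" controlled by "nodeA", `controlling_node` returns "nodeA"; an unknown group name yields None.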
You can view the cluster node status metrics when you select any one of the available cluster nodes. However, you can view the related group
metrics only under that particular cluster node to which the group belongs.
cluster Node
This node lets you view the probe information and set the initial probe configurations.
Navigation: cluster
Set or modify the following values if needed:
cluster > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cluster > Set Up Configuration
This section lets you select the cluster type and the cluster name for the node to which you want to add the resource group. One of the following cluster software packages is required for this probe to work: MSCS (Microsoft Cluster Service; Windows only), VCS (VERITAS Cluster Server; Windows, Linux, Solaris, and AIX), Red Hat Cluster Service (Linux), or Serviceguard (HP-UX).
Select the Cluster Type: specifies the type of cluster software.
This node represents one of the machines of the cluster on which the CA UIM robot is installed. You can configure the resource group properties
and the alarm properties on the active node. All cluster nodes must have the same set of CA UIM probes installed.
Note: This node is referred to as the node name (selected) node in the document and it represents an active node of the cluster. The
Admin Console GUI is initiated from the active cluster node.
Note: QoS data and alarms are not generated after every check interval. They are generated on events like, probe restart, resource
group state change, or node state change.
Status: indicates the node state. Available states are: 1=online, 2=offline, 3=partially online.
<Resource Group Name> Node
This node lets you configure the properties of a resource group. A resource group comprises shared sections and profile sections.
Note: This node is referred to as resource group name node in the document and it represents a resource group on an active cluster
node.
Note: The profiles, or shared sections, that are created when the resource group is online, are not visible when the same group goes
offline.
Status: indicates the resource group state. Available states are: 1=online, 2=offline, 3=partially online.
Alarm messages are sent when a resource group changes its state (for example, when a resource group goes from Online to Pending or from Offline to Pending).
Note: In case a resource group is detected in the FAIL state (not ONLINE), the alarm message displays which resource(s) in that group
is/are not online. In other words, the resource(s) name(s) and status are updated in the alarm message with the resource group status.
The following variables can be used in the alarm message text:
node
state
state_text
Subs ID
This is the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for alarms from the resource groups. You can select different severity levels for state down, state up, and states other than up or down.
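The variable substitution described above can be sketched in Python. This is an illustrative model only; how the variables are delimited in the actual message template is an assumption here:

```python
# State codes as documented: 1=online, 2=offline, 3=partially online
STATE_TEXT = {1: "online", 2: "offline", 3: "partially online"}

def expand_message(template: str, node: str, state: int) -> str:
    """Substitute the documented variables (node, state, state_text) into an
    alarm message template. The '$'-prefixed placeholder syntax is assumed."""
    # Replace $state_text before $state so the longer name is not clobbered.
    return (template
            .replace("$state_text", STATE_TEXT.get(state, "unknown"))
            .replace("$state", str(state))
            .replace("$node", node))
```

For instance, the template "Group on $node is $state_text ($state)" with node "nodeA" and state 2 expands to "Group on nodeA is offline (2)".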
Profiles Node
This node lets you add a profile to the resource group. The profiles of various CA UIM probes are associated with different resource groups and
the configurations of these profiles are transferred with the resource group in a failover scenario.
Navigation: cluster > node name > resource group name > Profiles
Set or modify the following values as required:
Profiles > Options (icon) > Add New Profile
This option lets you add a profile to the resource group and configure its properties.
Profile Name: defines a unique profile name.
Probe: specifies the probe name that you want to associate with the resource group.
Note: Make sure that the specified probe is deployed on the cluster node.
This node lets you configure the properties of the profile that you add to a resource group.
Note: This node is referred to as profile name node in the document and it represents a profile on the resource group.
Navigation: cluster > node name > resource group name > Profiles > profile name
Refer to the Add New Profile section of the Profiles node for field descriptions.
Profile name > Options (icon) > Delete
This option allows you to delete a profile.
<Shared Section> Node
A shared section is a common section between different monitoring profiles. The cluster probe is configured for various shared sections that are
synchronized between the probes that run on different clustered nodes. Any change in the shared section is also applied to the probe that runs on
an active node.
A shared section is linked only to the /connections and /messages sections of the probe configuration file.
Navigation: cluster > node name > resource group name > Shared Section
Set or modify the following values as required:
Shared Section > Options (icon) > Add New Shared Section
This option lets you implement one or more shared sections on the probe configuration.
Shared Section Name: defines the name of the shared section.
Probe: specifies the probe name for which you want to configure the shared section.
This node lets you configure the properties of the shared section that you implement on the probe configuration.
Note: This node is referred to as shared section name node in the document and it represents a shared section on the resource group.
Navigation: cluster > node name > resource group name > Shared Section > shared section name
Refer to the Add New Shared Section option of the Shared Section node for field descriptions.
Shared Section Name > Options (icon) > Delete
This option allows you to delete a shared section.
The following diagram outlines the process to configure the probe to monitor clusters.
Contents
Verify Prerequisites
Create Profile
Add Shared Section
Update NAS Subsystem ID Requirements
Verify Prerequisites
Verify that required hardware, software, and related information is available before you configure the probe. For more information, see cluster
(Clustered Environment Monitoring) Release Notes.
Follow these steps to set up the probe for cluster monitoring:
1. Deploy the cluster probe on the robot.
2. Double-click the probe to open the Probe Installation Wizard window.
3. Select the type of cluster (Microsoft, Red Hat, Veritas or HP-UX service guard) that you want to monitor.
Note: You need to install a separate instance of the probe for each cluster.
Create Profile
You need to create an associated profile for the target probe profile that you want to monitor.
Note: The cluster probe monitors only those probes that are already deployed on the cluster node, and for which the profiles have been created. Both nodes should have the same probes and general configuration. Refer to the respective probe documentation to create the monitoring profile.
Notes:
This should be done on the cluster node currently in control of the Resource group.
In HP_UX service guard cluster, a package is equivalent to a resource group.
Note: Any changes to a shared section must be implemented on the probe running on the Node currently in control of the associated
Resource Group.
The following procedure enables you to add a shared section to a resource group on the cluster probe.
Follow these steps:
1. Right-click a group and select New Shared Section.
The New Shared Section properties dialog appears.
2. Select a probe from the drop-down list.
The description and configured sections are listed in the Description and Section fields, respectively.
3. Select a section from the list and then click OK.
The sections that are saved in the cluster probe are automatically uploaded to the other node.
Key Name | Value
1.1.16 | Cluster
1.1.16.1 | Node
1.1.16.2 | Resource Group
1.1.16.3 | Package
Follow these steps to update the Subsystem IDs using Infrastructure Manager:
1. In Infrastructure Manager, right-click the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key button.
4. Enter the Key Name and Value, then click OK.
5. Repeat this process for all of the required subsystem IDs for your probe.
6. Click Apply.
Note: In an HP-UX Serviceguard cluster, a package is equivalent to a resource group.
A resource group or a package is visible under an active node in the tree structure that currently controls the resources.
When one of the nodes fails, the control of all resources is automatically transferred to another functioning node. The resource group is then
visible under the new functioning node in the tree structure on the GUI.
The CA UIM Cluster probe enables failover support for the standard CA UIM probes. In HP-UX service guard cluster, package failover is also
supported. Probe profiles of standard CA UIM probes can be associated with different resource groups and will then follow that group if it is
transferred to another cluster node.
Note: CA UIM probes can also expose one or more so-called shared sections. The cluster probe also sends alarms if a cluster node or
resource group changes state.
The cluster probe is configured by double-clicking the line representing the probe in the Infrastructure Manager, which brings up the configuration
tool for the cluster probe. The configuration tool consists of a tree-structure in the left pane and a list in the right pane.
The tree detects and displays the cluster structure (Nodes > Resource groups > Resources/Shared sections). The list in the right pane changes its contents according to the selected item in the left pane. When a Resource group is selected in the left pane, the resources/shared sections that are defined for that resource group are listed in the right pane.
Note: While naming resources or resource groups, ensure that there are no preceding or trailing white spaces in the resource name. For example, "Test<space>Group" is a valid resource name; however, names such as "<space>TestGroup" or "TestGroup<space><space>" are not permissible.
In case you have resource names with preceding or trailing white spaces in the existing cluster probe configuration, the upgrade from cluster
probe version 2.3x to any higher version requires a clean install.
You open the Cluster Properties dialog by clicking the icon in the upper left corner or by right-clicking the cluster icon in the tree-structure and
selecting Properties. You can also add and delete nodes from this right-click menu.
The dialog has three tabs:
General
Alarm Settings
QoS Settings
General Tab
This section lets you know how to set the alarm message properties for both Node alarms and Resource Group alarms.
The fields are:
Node Alarm Properties (Active)
These alarm messages are sent when a node changes its state (for example, the node is shut down). Alarms from the cluster nodes are
activated when this option is checked.
Message
Here you can define the message text for alarms coming from nodes in the cluster. The following variables can be used in the text
string:
name
state
state_text
Clear
You can define the text for clear messages coming from nodes in the cluster. The following variables can be used in the text string:
name
state
state_text
Subs ID
Defines the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for alarms from the nodes. You can select different severity levels for state down, state up, and states other than up or down.
Resource Group Alarm Properties (Active)
These alarm messages are sent when a resource group changes its state. The alarm is sent when the state of the resource group changes (for
example, when a resource group goes from Online to Pending or from Offline to Pending).
Note: In case a resource group is detected in the FAIL state (not ONLINE), the alarm message displays which resource(s) in that group
is/are not online. In other words, the resource(s) name(s) and status are updated in the alarm message with the resource group status.
Message
You can define the message text for alarms coming from the resource groups. The following variables can be used in the text string:
state
state_text
Clear
Resource groups can be moved to another node either manually or due to an error situation. A clear message is issued when the
resource group is moved back to the original node again. Here you can define the message text. The following variables can be used
in the text string:
name
node
state
state_text
Subs ID
Displays the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for failover alarms from the resource groups.
QoS Settings Tab
This section lets you know how the cluster probe sends QoS messages on different resource group and node states.
The fields are:
Send QoS Message
Resource Group State
Selecting this check box enables the probe to send QoS messages. These QoS messages are sent when a resource group changes
its state to online, offline, or other.
Node State
Selecting this check box enables the probe to send QoS message when the node has state up, down or other.
Node Properties
You can right-click the node and select the following actions:
Restart Node
You can restart the cluster probes on the different nodes in the cluster by right-clicking the node in the tree-structure and selecting Restart Probe.
You can open the Log viewer window to study the log file for the cluster probe on the different nodes.
Right-click a node in the tree-structure and select View log. The Log Viewer window appears, displaying log information for the selected cluster.
Note: There are two different log files in the cluster probe directory - cluster.log and plugin.log.
cluster.log is the default log file, but you can switch to the plugin.log file by right-clicking the cluster probe in the Infrastructure Manager and selecting Edit.
Change cluster.log to plugin.log in the Log File field of the dialog and click OK.
Note:
cluster.log logs the cluster probe activity, such as the communication between the cluster probes and other tasks that are performed by the cluster probe (sending alarms, activating/deactivating profiles for other probes, and so on).
plugin.log logs information from the cluster software (via the different plugins); this is mainly status from Nodes and Resource Groups.
One plugin fetches information from Microsoft clusters and another from VERITAS clusters.
Properties
The Node Properties dialog is displayed when you right-click a node icon in the tree-structure and select Properties.
After you select the Properties option, the Node properties dialog appears. The dialog has two tabs:
General
Configuration
General
The fields are:
Name
Specifies the name of the node.
Description
Specifies the description of the node.
IP Address
Specifies the IP address of the node computer.
Probe
Specifies the probe address, full path: /Domain/Hub/Node/Probe
Configuration
The Configuration tab contains options for setting check intervals for the node.
Select the Configuration tab. Specify the time interval (in seconds) for Check Group Interval (sets the check interval of the Resource groups on
the node), and Check Node Interval (sets the check interval of the Node).
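The effect of the two intervals can be sketched as a simple scheduling decision. This hypothetical helper (not probe code) shows which checks fall due given the last-run timestamps and the two interval settings, all in seconds:

```python
from typing import List

def due_checks(now: float, last_group: float, last_node: float,
               group_interval: float, node_interval: float) -> List[str]:
    """Return which checks are due, based on the Check Group Interval and
    Check Node Interval settings described above."""
    due = []
    if now - last_group >= group_interval:
        due.append("group")  # re-poll the Resource groups on the node
    if now - last_node >= node_interval:
        due.append("node")   # re-poll the Node itself
    return due
```

With both intervals at 60 seconds, a group last checked 100 seconds ago is due while a node checked 10 seconds ago is not.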
Resource Group Properties
You can right-click a Resource Group icon in the tree-structure and select any of the following options:
New Profile
CA UIM probes expose one or more shared sections for the cluster probe. These sections can be set up in the cluster probe to follow a Resource Group upon failover.
Note: You can select all the checkpoints by clicking the option under the Section column. Alternatively, you can also select individual
checkpoints by clicking them.
Properties
The cluster.cfg file displays version numbers and md5 values for the Cluster, Group, and Shared Disks.
You can open the cluster probe configuration file by selecting the cluster probe in the Infrastructure Manager and pressing CTRL + N keys.
When a checkpoint selected under Shared Sections is modified, the version number and md5 value of the relevant Cluster Group or Shared Disks
is updated on the node where the change has occurred.
As per the check intervals that are specified in the Configuration tab of Cluster Node Properties dialog, the version numbers and md5 values
are compared between the groups and nodes. In the case where the version numbers differ, the change is affected across the nodes.
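The version/md5 comparison can be illustrated with a short Python sketch. The fingerprinting helper and the decision rule are assumptions for illustration, not the probe's actual implementation:

```python
import hashlib

def fingerprint(section_text: str) -> str:
    """MD5 digest of a shared section's raw text, analogous to the md5
    values recorded in cluster.cfg."""
    return hashlib.md5(section_text.encode("utf-8")).hexdigest()

def needs_sync(local_version: int, local_md5: str,
               remote_version: int, remote_md5: str) -> bool:
    """A differing version number or md5 value means the shared section
    changed on one node and must be propagated to the other."""
    return (local_version, local_md5) != (remote_version, remote_md5)
```

When the checkpoint under a shared section is edited on one node, its version and md5 change there, `needs_sync` becomes true against the other node, and the change is applied across nodes at the next check interval.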
Supports clustered drives in the cdm probe if Serviceguard uses the Veritas CFS file system.
Supports Serviceguard version A.11.19.00.
Alarms are generated only for discovered packages.
The GUI shows only packages that are attached to a particular node.
The following diagram outlines the process to configure the cluster probe.
Notes:
Each section within the node lets you configure the properties of the probe for adding resource groups to the cluster.
The monitoring profile should be created on the probe, which is already deployed on the cluster node.
Key Name | Value
1.1.16 | Cluster
1.1.16.1 | Node
1.1.16.2 | Resource Group
1.1.16.3 | Package
To update the Subsystem IDs using Admin Console, follow these steps:
1. In the Admin Console, click the black arrow next to the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key menu item.
4. Enter the Key Name in the Add key window and click Add. The new key appears in the list of keys with a blank value.
5. Click in the Value column for the newly created key and enter the key value.
6. Repeat this process for all of the required subsystem IDs for your probe.
7. Click Apply.
This article shows the GUI Reference of Clustered Environment Monitoring (cluster) probe.
Contents
cluster Node
<Node Name (selected)> Node
<Resource Group Name> Node
Profiles Node
<Profile Name> Node
<Shared Section> Node
<Shared Section Name> Node
The cluster probe enables failover support for the supported probes. The profiles of these probes can be associated with different resource
groups. The probe configurations that are saved in a resource group remain the same, even when that resource group moves to another cluster
node.
Notes:
A shared section is a common section between different monitoring profiles. CA UIM probes can expose one or more shared sections. The cluster probe also sends alarms when a cluster node or resource group changes state.
In an HP-UX Serviceguard cluster, a package is equivalent to a resource group.
All cluster nodes must have the same set of CA UIM probes installed. The cluster probe updates the changes that are made to the profile setup.
When any probe profile is updated, the changes must be implemented on the probe that runs on the node which is in control of the associated
resource group. The cluster probe automatically updates most of the changes of the probe profiles. However, general probe parameters must be
updated manually on the different nodes.
The collected data must contain the name and state of the cluster nodes, the name and state of each resource group, and the node on which each group runs.
This information identifies the recommended cluster node on which to run the probe configurations.
You can view the cluster node status metrics when you select any one of the available cluster nodes. However, you can view the related group
metrics only under that particular cluster node to which the group belongs.
cluster Node
This node lets you view the probe information and set the initial probe configurations.
Navigation: cluster
Set or modify the following values as required:
cluster > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cluster > Set Up Configuration
This section lets you select the cluster type and the cluster name for the node to which you want to add the resource group. One of the following cluster software packages is required for this probe to work: MSCS (Microsoft Cluster Service; Windows only), VCS (VERITAS Cluster Server; Windows, Linux, Solaris, and AIX), Red Hat Cluster Service (Linux), or Serviceguard (HP-UX).
Select the Cluster Type: specifies the type of cluster software.
Cluster Name: specifies the name of the cluster.
cluster > General Configuration
This section lets you set the log file detail level.
Log Level: specifies the detail level of the log file.
Default: 0-Fatal
Use Cluster Name as Alarm Source: uses the cluster name as the source of alarm that the probe generates. If this check box is not
selected, then the cluster node name is used as the source of the alarms.
Default: Not Selected
This node represents one of the machines of the cluster on which the CA UIM robot is installed. You can configure the resource group properties
and the alarm properties on the active node. All cluster nodes must have the same set of CA UIM probes installed.
Note: This node is referred to as node name (selected) node in the document and it represents an active node of the cluster. The
Admin Console GUI is initiated from the active cluster node.
Note: QoS data and alarms are not generated after every check interval. They are generated on events like, probe restart, resource
group state change or node state change.
Status: indicates the node state. 1 indicates up, 2 indicates down, and 3 indicates any other state.
<Resource Group Name> Node
This node lets you configure the properties of a resource group. A resource group comprises shared sections and profile sections.
Note: This node is referred to as resource group name node in the document and it represents a resource group on an active cluster
node.
Note: The profiles, or shared sections, that are created when the resource group is online, are not visible when the same group goes
offline.
Status: indicates the resource group state. 1 indicates online, 2 indicates offline, 3 indicates partially online.
Alarm messages are sent as soon as a resource group changes its state (for example, when a resource group goes from "Online" to "Pending" or from "Offline" to "Pending").
Note: In case a resource group is detected in the FAIL state (not ONLINE), the alarm message will display which resource(s) in that
group is/are not online. In other words, the resource(s) name(s) and status are updated in the alarm message along with the resource
group status.
This node lets you add a profile to the resource group. The profiles of various CA UIM probes are associated with different resource groups and
the configurations of these profiles are transferred along with the resource group in a failover scenario.
Navigation: cluster > node name > resource group name > Profiles
Set or modify the following values as required:
Profiles > Add New Profile
This section lets you add a profile to the resource group and configure its properties.
Profile Name: defines a unique profile name.
Probe: specifies the probe name that you want to associate with the resource group.
Note: Make sure that the specified probe is deployed on the cluster node.
This node lets you configure the properties of the profile that you add to a resource group.
Note: This node is referred to as profile name node in the document and it represents a profile on the resource group.
Navigation: cluster > node name > resource group name > Profiles > profile name
Refer to the Add New Profile section of the Profiles node for field descriptions.
<Shared Section> Node
A shared section is a common section between different monitoring profiles. The cluster probe is configured for various shared sections that are
synchronized between the probes that run on different clustered nodes. Any change in the shared section is also applied to the probe that runs on
an active node.
A shared section is linked only to the /connections and /messages sections of the probe configuration file.
Navigation: cluster > node name > resource group name > Shared Section
Set or modify the following values as required:
Shared Section > Add New Shared Section
This section lets you implement one or more shared sections on the probe configuration.
Shared Section Name: defines the name of the shared section.
Probe: specifies the probe name for which you want to configure the shared section.
Description: defines a short description of the specified probe.
Section: specifies one of the shared sections of the probe.
<Shared Section Name> Node
This node lets you configure the properties of the shared section that you implement on the probe configuration.
Note: This node is referred to as shared section name node in the document and it represents a shared section on the resource group.
Navigation: cluster > node name > resource group name > Shared Section > shared section name
Refer to the Add New Shared Section section of the Shared Section node for field descriptions.
The cluster probe is configured by double-clicking the line representing the probe in the Infrastructure Manager. This brings up the configuration
tool for the cluster probe. The configuration tool consists of a tree-structure on the left-hand side and a list on the right-hand side.
Note: CA UIM probes may also expose one or more shared sections; see Add New Shared Section for further information. The cluster
probe also sends alarms if a cluster node or resource group changes state.
The following diagram outlines the process to configure the cluster probe.
Note: Refer to the documentation of the respective probe(s) to create the monitoring profile.
Notes:
The cluster probe only monitors those probes for which a profile has been created.
The monitoring profile should be created on the probe that is already deployed on the cluster node.
Notes:
This should be done on the cluster node currently in control of the resource group.
In an HP-UX Serviceguard cluster, a package is equivalent to a resource group.
Key        Value
1.1.16     Cluster
1.1.16.1   Node
1.1.16.2   Resource Group
1.1.16.3   Package
To update the Subsystem IDs using Infrastructure Manager, follow these steps:
1. In Infrastructure Manager, right-click the NAS probe and select Raw Configure.
2. Click the Subsystems folder.
3. Click the New Key button.
4. Enter the Key Name and Value, and click OK.
5. Repeat this process for all of the subsystem IDs that your probe requires.
6. Click Apply.
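After these steps, the Subsystems section of the NAS configuration contains one key per subsystem ID. A sketch of the resulting key/value pairs, using the IDs from the table above (the angle-bracket section syntax is assumed from the standard probe .cfg file format):

```
<subsystems>
   1.1.16 = Cluster
   1.1.16.1 = Node
   1.1.16.2 = Resource Group
   1.1.16.3 = Package
</subsystems>
```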
Note: In an HP-UX Serviceguard cluster, a package is equivalent to a resource group.
A resource group or a package is visible under an active node in the tree structure that currently controls the resources.
When one of the nodes fails, the control of all resources is automatically transferred to another functioning node. The resource group is then
visible under the new functioning node in the tree structure on the GUI.
The CA UIM Cluster probe enables failover support for the standard CA UIM probes. In an HP-UX Serviceguard cluster, package failover is also
supported. Probe profiles of standard CA UIM probes can be associated with different resource groups and then follow that group if it is
transferred to another cluster node.
Note: CA UIM probes can also expose one or more so-called shared sections; see Add New Shared Section for further information.
The cluster probe also sends alarms if a cluster node or resource group changes state.
If configuring CA UIM probes to monitor shared resources, configure the monitoring profile on the probe residing on the Cluster Node currently in
control of these resources. When these monitoring profiles are defined as resources in the cluster probe, then the profiles are saved in the cluster
probe. The profiles are then automatically uploaded to the other nodes when they gain control of the resource group.
The cluster probe is configured by double-clicking the line representing the probe in the Infrastructure Manager, which brings up the configuration
tool for the cluster probe. The configuration tool consists of a tree-structure in the left pane and a list in the right pane.
The tree detects and displays the cluster structure (Nodes > Resource groups > Resources/Shared sections). The list in the right pane changes
its contents according to the selected item in the left pane. When a Resource group is selected in the left pane, the resources/shared sections
that are defined for that resource group are listed in the right pane.
Note: While naming resources or resource groups, ensure that there are no leading or trailing white spaces in the resource name.
For example, "Test<space>Group" is a valid resource name; however, names such as "<space>TestGroup" or "TestGroup<space><space>"
are not permissible.
If you have resource names with leading or trailing white spaces in the existing cluster probe configuration, the upgrade from cluster
probe version 2.3x to any higher version requires a clean install.
Probe Defaults
Probe Configuration
Probe Defaults
When a probe is deployed on a robot for the first time, a default configuration is deployed automatically. These probe defaults can include
Alarms, QoS, Profiles, and so on, and save the time needed to configure the default settings. Probe defaults appear only on a fresh install,
that is, when no instance of that probe is already available on that robot in an activated or deactivated state.
Probe Configuration
This section contains specific configuration for the cluster probe.
Cluster Properties Dialog
You open the Cluster Properties dialog by clicking the icon in the upper left corner or by right-clicking the cluster icon in the tree-structure and
selecting Properties. You can also add and delete nodes from this right-click menu.
This dialog has three tabs:
General
Alarm Settings
QoS Settings
General Tab
This section describes how to set the alarm message properties for both Node alarms and Resource Group alarms.
The fields are:
Node Alarm Properties (Active)
These alarm messages are sent when a node changes its state (for example, the node is shut down). Alarms from the cluster nodes are
activated when this option is checked.
Message
Here you can define the message text for alarms coming from nodes in the cluster. The following variables can be used in the text
string:
name
state
state_text
Clear
You can define the text for clear messages coming from nodes in the cluster. The following variables can be used in the text string:
name
state
state_text
Subs ID
Defines the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for alarms from the nodes. You can select different severity levels for state down, state up and state other
than up or down.
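The message text can be sketched as a template over the variables listed above (name, state, state_text). The $-prefixed variable syntax in this example is an assumption made for illustration; check the probe GUI for the exact syntax it expects.

```python
# Sketch of expanding an alarm message template over the variables
# listed above (name, state, state_text). The $-prefixed syntax is
# an assumption for this example, not a documented probe feature.
from string import Template

def expand_message(template, name, state, state_text):
    """Fill an alarm message template with the node values."""
    return Template(template).safe_substitute(
        name=name, state=state, state_text=state_text)

msg = expand_message(
    "Node $name changed state to $state_text (state=$state)",
    name="lcl-node1", state=2, state_text="down")
print(msg)  # Node lcl-node1 changed state to down (state=2)
```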
Resource Group Alarm Properties (Active)
These alarm messages are sent when a resource group changes its state. The alarm is sent when the state of the resource group changes (for
example, when a resource group goes from Online to Pending or from Offline to Pending).
Note: In case a resource group is detected in the FAIL state (not ONLINE), the alarm message displays which resource(s) in that group
is/are not online. In other words, the resource(s) name(s) and status are updated in the alarm message with the resource group status.
Active
Alarms from the resource groups are activated when this option is checked.
Message
Here you can define the message text for alarms coming from the resource group if the group has been moved to another node (either
manually or due to an error situation). The following variables can be used in the text string:
name
node
state
state_text
Clear
Resource groups can be moved to another node either manually or due to an error situation. A clear message is issued when the
resource group is moved back to the original node again. Here you can define the message text. The following variables can be used
in the text string:
name
node
state
state_text
Subs ID
Displays the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for failover alarms from the resource groups.
QoS Settings Tab
This section describes how the cluster probe sends QoS messages on different resource group and node states.
The fields are:
Send QoS Message
Resource Group State
Selecting this check box enables the probe to send QoS messages. These QoS messages are sent when a resource group changes
its state to online, offline, or other.
Node State
Selecting this check box enables the probe to send a QoS message when the node has state up, down, or other.
Node Properties
The Node Properties dialog is displayed when you right-click a node icon in the tree-structure and select Properties.
Note: You can also open the Log viewer (through View Log option) window or restart the Cluster probe (through Restart Probe option
) on the selected Node from this right-click menu.
After you select the Properties option, the Node properties dialog appears. The dialog has two tabs:
General
Configuration
General
Configuration
The Configuration tab contains options for setting check intervals for the node.
To set the check intervals for the node, follow these steps:
1. Select the Configuration tab.
2. Specify the time interval (in seconds) for Check Group Interval (sets the check interval of the Resource groups on the node) and Check
Node Interval (sets the check interval of the Node).
3. Click OK.
Resource Group Properties
Right-click a Resource Group icon in the tree-structure and select Properties. The Resource Group Properties dialog appears.
The fields are:
Name
Defines the name of the Resource Group.
Description
Defines the description of the Resource Group.
Allow 'Partially Online'
By selecting this check box, you can make the cluster probe interpret 'Partially Online' resource groups as 'Online'.
1. Select a probe from the drop-down list.
The description and configured profiles are listed in the Description and Profile fields.
2. Select a profile from the list and then click OK.
The monitoring profiles that are saved in the cluster probe are automatically uploaded to the other nodes.
Add New Shared Section
CA UIM probes expose one or more shared sections for the cluster probe. These sections can be set up in the cluster probe to follow a Resource
Group upon failover.
All shared sections that the cluster probe has been configured for are automatically synchronized between the probes running on the different
nodes in the cluster.
The shared section is not removed from the probe running on the node that loses control of the resource group. Any change that is made to that
shared section is reflected in probe configurations on the node taking over control of the associated resource group.
A shared section is linked only to the "/connections" or "/messages" section of the configuration files of the probe that runs on the node. Shared
sections represent shared resources that are defined for that resource group.
A profile, by contrast, is associated with the monitoring profiles of any probe that runs on the node.
Note: Any changes to a shared section must be implemented on the probe running on the Node currently in control of the associated
Resource Group.
Restart Node
You can restart the cluster probes on the different nodes in the cluster by right-clicking the node in the tree-structure and selecting Restart Probe.
You can open the Log viewer window to study the log file for the cluster probe on the different nodes.
Follow these steps:
1. Right-click a node in the tree-structure and select View log.
The Log Viewer window appears, displaying log information for the selected cluster node.
Note: There are two different log files in the cluster probe directory: cluster.log and plugin.log.
Cluster.log is the default log file. To view plugin.log instead, right-click the cluster probe in the Infrastructure Manager, select Edit,
change cluster.log to plugin.log in the Log File field of the dialog, and click OK.
Note:
Cluster.log logs the cluster probe activity, such as the communication between the cluster probes and also other tasks that are
performed by the cluster probe (sending alarms, activating/deactivating profiles for other probes, and so on).
Plugin.log logs information from the cluster software (via the different plugins). This is mainly status from Nodes and Resource
Groups.
One plugin fetches information from Microsoft clusters and another plugin from VERITAS clusters.
The cluster.cfg file displays version numbers and md5 values for the Cluster, Group, and Shared Disks.
You can open the cluster probe configuration file by selecting the cluster probe in the Infrastructure Manager and pressing CTRL + N keys.
When a checkpoint selected under Shared Sections is modified, the version number and md5 value of the relevant Cluster Group or Shared Disks
is updated on the node where the change has occurred.
At the check intervals that are specified in the Configuration tab of the Cluster Node Properties dialog, the version numbers and md5 values
are compared between the groups and nodes. When the version numbers differ, the change is propagated across the nodes.
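The comparison described above can be sketched as follows. This is a simplified illustration of the mechanism, not the probe's actual implementation; the data structures are assumptions made for this example.

```python
# Simplified sketch of the synchronization check described above:
# each node keeps a version number and an md5 digest per shared
# section, and at every check interval the values are compared.
# The dictionaries here are illustrative, not the probe's format.
import hashlib

def section_md5(section_text):
    """md5 digest of a shared section's serialized content."""
    return hashlib.md5(section_text.encode("utf-8")).hexdigest()

def needs_sync(local, remote):
    """True when the remote copy carries a newer, different version."""
    return (remote["version"] > local["version"]
            and remote["md5"] != local["md5"])

local = {"version": 3, "md5": section_md5("interval = 60")}
remote = {"version": 4, "md5": section_md5("interval = 30")}
print(needs_sync(local, remote))  # True
```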
Configure a Node
Add Monitoring Profiles
Delete Monitoring Profiles
Add Shared Section
Delete Shared Section
Configure a Node
cluster Node
<Node Name (selected)> Node
<Resource Group Name> Node
Profiles Node
<Profile Name> Node
<Shared Section> Node
<Shared Section Name> Node
The cluster probe enables failover support for the supported probes. The profiles of these probes can be associated with different resource
groups. The probe configurations that are saved in a resource group remain the same, even when that resource group moves to another cluster
node.
Note: A shared section is a common section between different monitoring profiles. CA UIM probes can expose shared
sections. The cluster probe also sends alarms when a cluster node or resource group changes state.
All cluster nodes must have the same set of CA UIM probes installed. The cluster probe updates the changes that are made to the profile setup.
When any probe profile is updated, the changes must be implemented on the probe that runs on the node which is in control of the associated
resource group. The cluster probe automatically updates most of the changes of the probe profiles. However, general probe parameters must be
updated manually on the different nodes.
The collected data contains the name and state of the cluster nodes, the name and state of each resource group, and the node on which each
group runs. This information is used to identify the recommended cluster node where you can run the probe configurations.
You can view the cluster node status metrics when you select any one of the available cluster nodes. However, you can view the related group
metrics only under that particular cluster node to which the group belongs.
cluster Node
This node lets you view the probe information and set the initial probe configurations.
Navigation: cluster
Set or modify the following values as required:
cluster > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cluster > Set Up Configuration
This section lets you select the cluster type and the cluster name for the node to which you want to add the resource group. The cluster software
MSCS (Microsoft Cluster Service; Windows only) is required for this probe to work.
Select the Cluster Type: specifies the type of cluster software.
Note: The cluster probe supports all Microsoft Cluster configurations on Windows operating systems only.
Log Level: specifies the level of detail that is written to the log file.
Default: 0-Fatal
Use Cluster Name as Alarm Source: uses the cluster name as the source of alarm that the probe generates. If this check box is not
selected, then the cluster node name is used as the source of the alarms.
Default: Not Selected
This node represents one of the machines of the cluster on which the CA UIM robot is installed. You can configure the resource group properties
and the alarm properties on the active node. All cluster nodes must have the same set of CA UIM probes installed.
Note: This node is referred to as node name (selected) node in the document and it represents an active node of the cluster. The
Admin Console GUI is initiated from the active cluster node.
Note: QoS data and alarms are not generated after every check interval. They are generated on events such as probe restart, resource
group state change, or node state change.
Status: indicates the node state. 1 indicates up, 2 indicates down, and 3 indicates any other state.
<Resource Group Name> Node
This node lets you configure the properties of a resource group. A resource group comprises shared sections and profile sections.
Note: This node is referred to as resource group name node in the document and it represents a resource group on an active cluster
node.
Note: The profiles, or shared sections, that are created when the resource group is online, are not visible when the same group goes
offline.
Status: indicates the resource group state. 1 indicates online, 2 indicates offline, 3 indicates partially online.
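The numeric state values reported by the probe can be decoded with a small helper. The mappings below come from the Status descriptions in this section; the function names themselves are illustrative, not part of the probe's API.

```python
# Decode the numeric state values reported by the cluster probe.
# State mappings are taken from the Status descriptions above;
# the helper names are illustrative, not part of the probe API.

NODE_STATES = {1: "up", 2: "down", 3: "other"}
GROUP_STATES = {1: "online", 2: "offline", 3: "partially online"}

def node_state_text(value):
    """Textual node state for a QOS_CLUSTER_NODE_STATE value."""
    return NODE_STATES.get(value, "unknown")

def group_state_text(value):
    """Textual group state for a QOS_CLUSTER_GROUP_STATE value."""
    return GROUP_STATES.get(value, "unknown")

print(node_state_text(1))   # up
print(group_state_text(3))  # partially online
```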
Profiles Node
This node lets you add a profile to the resource group. The profiles of various CA UIM probes are associated with different resource groups and
the configurations of these profiles are transferred along with the resource group in a failover scenario.
Navigation: cluster > node name > resource group name > Profiles
Set or modify the following values as required:
Profiles > Add New Profile
This section lets you add a profile to the resource group and configure its properties.
Profile Name: defines a unique profile name.
Probe: specifies the probe name that you want to associate with the resource group.
Note: Make sure that the specified probe is deployed on the cluster node.
<Profile Name> Node
This node lets you configure the properties of the profile that you add to a resource group.
Note: This node is referred to as profile name node in the document and it represents a profile on the resource group.
Navigation: cluster > node name > resource group name > Profiles > profile name
Refer to the Add New Profile section of the Profiles node for field descriptions.
<Shared Section> Node
A shared section is a common section between different monitoring profiles. The cluster probe is configured for various shared sections that are
synchronized between the probes that run on different clustered nodes. Any change in the shared section is also applied to the probe that runs on
an active node.
A shared section is linked only to the /connections and /messages sections of the probe configuration file.
Navigation: cluster > node name > resource group name > Shared Section
Set or modify the following values as required:
<Shared Section Name> Node
This node lets you configure the properties of the shared section that you implement on the probe configuration.
Note: This node is referred to as shared section name node in the document and it represents a shared section on the resource group.
Navigation: cluster > node name > resource group name > Shared Section > shared section name
Refer to the Add New Shared Section section of the Shared Section node for field descriptions.
Note: CA UIM probes may also expose one or more so-called shared sections; see Add New Shared Section for further information.
The cluster probe also sends alarms if a cluster node or resource group changes state.
Contents
Probe Defaults
Probe Configuration Interface Installation for cluster
Probe Configuration
Cluster Properties Dialog
General Tab
Alarm Settings Tab
QoS Settings Tab
Node Properties
General
Configuration
Resource Group Properties
Add New Shared Section
Restart Node
Open Log Viewer
Configuration File versions
The cluster software MSCS (Microsoft Cluster Service; Windows only), VCS (VERITAS Cluster Service; Windows, Linux, Solaris, and AIX),
or RedHat Cluster Service (Linux) is required for this probe to work.
Each of the nodes in the cluster must have the same set of CA UIM probes installed. The cluster probe automatically updates changes to the
profile setup. These changes must be implemented on the probe running on the node currently in control of the associated resource group.
General probe parameters must be updated manually on the different nodes.
The collected data must contain the name and state of the nodes, the name and state of each resource group, and on what node they are running.
This information is used to manage where the different probe configurations should and should not run.
When the cdm probe runs in a clustered environment in co-operation with the cluster probe:
If the flag /disk/fixed_default/active=yes, then cdm issues alarms about disks that are no longer available on the current node (because
they have been failed over to another node in the cluster). The cluster probe has removed the configuration, but cdm uses the default
settings and issues an alarm about the disk size being 0MB.
The solution is to set the flag /disk/fixed_default/active=no.
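Applied through Raw Configure, the fix corresponds to a cdm configuration fragment like the following (a sketch; the angle-bracket section syntax is assumed from the standard probe .cfg file format, and surrounding keys are omitted):

```
<disk>
   <fixed_default>
      active = no
   </fixed_default>
</disk>
```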
As mentioned previously, each of the nodes in the cluster must have the same set of CA UIM probes installed. These CA UIM probes must be
configured with setup parameters and monitoring parameters according to the descriptions of the different probes involved.
If configuring CA UIM probes to monitor shared resources, you must configure the monitoring profile on the probe residing on the Cluster Node
currently in control of these resources. When these monitoring profiles are defined as resources in the cluster probe then the profiles will be saved
in the cluster probe and will be automatically uploaded to the other nodes when they gain control of the resource group.
The cluster probe is configured by double-clicking the line representing the probe in the Infrastructure Manager. This brings up the configuration
tool for the cluster probe. The configuration tool consists of a tree-structure on the left-hand side and a list on the right-hand side.
The tree detects and displays the cluster structure (Nodes > Resource groups > Resources/Shared sections). The list on the right-hand side
changes its contents according to the selected item on the left-hand side. When a Resource group is selected on the left-hand side, the
resources/shared sections defined for that resource group are listed on the right-hand side.
Note: While naming resources or resource groups, ensure that there are no leading or trailing white spaces in the resource name.
For example, "Test<space>Group" is a valid resource name; however, names such as "<space>TestGroup" or "TestGroup<space><space>"
are not permissible.
If you have resource names with leading or trailing white spaces in the existing cluster probe configuration, the upgrade from cluster
probe version 2.3x to any higher version requires a clean install.
Probe Defaults
When a probe is deployed on a robot for the first time, a default configuration is deployed automatically. These probe defaults can include
Alarms, QoS, Profiles, and so on, and save the time needed to configure the default settings. Probe defaults appear only on a fresh install,
that is, when no instance of that probe is already available on that robot in an activated or deactivated state.
Probe Configuration
This section contains specific configuration for the cluster probe.
This section describes how to set the alarm message properties for both Node alarms and Resource Group alarms.
The fields are:
Node Alarm Properties (Active)
These alarm messages are sent when a node changes its state (for example the node is shut down). Alarms from the cluster nodes are
activated when this option is checked.
Message
Here you can define the message text for alarms coming from nodes in the cluster. The following variables can be used in the text
string:
name
state
state_text
Clear
Here you can define the text for clear messages coming from nodes in the cluster. The following variables can be used in the text
string:
name
state
state_text
Subs ID
This is the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for alarms from the nodes. You can select different severity levels for state down, state up and state other
than up or down.
Resource Group Alarm Properties (Active)
These alarm messages are sent when a resource group changes its state. The alarm is sent as soon as the state of the
resource group changes (for example, when a resource group goes from "Online" to "Pending" or from "Offline" to "Pending").
Note: If a resource group is detected in a FAIL state (not ONLINE), the alarm message displays which resources in that
group are not online. That is, the resource names and statuses are included in the alarm message along with the resource
group status.
Monitoring profiles associated with a Resource Group that is NOT "Online" are deactivated.
Upon state change from "Online" to "Pending":
You get an alarm from the cluster probe indicating the state of the resource group.
No alarms are sent from the monitoring profiles of other CA UIM probes associated with this resource group (these profiles are
deactivated/removed when the state changes to "Pending", and inserted/activated again when the state changes back to "Online").
If this flag is turned OFF:
Monitoring profiles associated with a resource group are kept active independently of the resource group state.
Upon state change from "Online" to "Pending":
No alarm is sent from the cluster probe about the state of the resource group.
The monitoring profiles of other CA UIM probes associated with this resource group send alarms if applicable (without considering
the state of the resource group).
Active
Alarms from the resource groups are activated when this option is checked.
Message
Here you can define the message text for alarms coming from the resource groups. The following variables can be used in the text
string:
name
node
state
state_text
Clear
Here you can define the text for clear messages coming from resource groups in the cluster. The following variables can be used in
the text string:
name
node
state
state_text
Subs ID
This is the NimBUS subsystem ID. The mapping is listed in the NAS (Nimsoft Alarm Server).
Severity
Defines the severity level for alarms from the resource groups. You can select different severity levels for state down, state up and
other state than up or down.
This section describes how the cluster probe sends QoS messages on different resource group and node states.
The fields are:
Send QoS Message
Resource Group State
Selecting this check box enables the probe to send QoS messages. These QoS messages are sent when a resource group changes
its state to online, offline, or other.
Node State
Selecting this check box enables the probe to send a QoS message when the node has state up, down, or other.
Node Properties
The Node Properties dialog is displayed when you right-click a node icon in the tree-structure and select Properties.
Note: You may also open the Log viewer (through View Log option) window or restart the Cluster probe (through Restart Probe option) on the
selected Node from this right-click menu.
After you select the Properties option, the Node properties dialog appears. The dialog has two tabs:
General
Configuration
General
Configuration
The Configuration tab contains options for setting check intervals for the node.
To set the check intervals for the node, follow these steps:
1. Select the Configuration tab.
2. Specify the time interval (in seconds) for Check Group Interval (sets the check interval of the Resource groups on the node) and Check
Node Interval (sets the check interval of the Node).
3. Click OK.
Note: This should be done on the Cluster currently in control of the Resource group.
Note: Any changes to a shared section must be implemented on the probe running on the Node currently in control of the associated
Resource Group.
1. Right-click a Resource Group icon in the tree-structure and select New Shared Section.
The New Shared Section properties dialog appears.
2. Select a probe from the drop-down list.
The configured sections are listed in the profile window.
3. Select a section from the list and then click OK.
The sections that are saved in the cluster probe are automatically uploaded to the other nodes.
Restart Node
You can restart the cluster probes on the different nodes in the cluster by right-clicking the node in the tree-structure and selecting Restart Probe.
Note: There are two different log files in the cluster probe directory: cluster.log and plugin.log.
Cluster.log is the default log file. To view plugin.log instead, right-click the cluster probe in the Infrastructure Manager, select Edit,
change cluster.log to plugin.log in the Log File field of the dialog, and click OK.
Notes:
Cluster.log logs the cluster probe activity, such as the communication between the cluster probes, and also other tasks performed by the
cluster probe (sending alarms, activating/deactivating profiles for other probes, and so on).
Plugin.log logs information from the cluster software (via the different plugins). This is mainly status from Nodes and Resource Groups.
There is one plugin fetching information from Microsoft clusters, and another plugin for VERITAS clusters.
cluster Metrics
This article describes the metrics that can be configured for the Clustered Environment Monitoring (cluster) probe.
Contents
QoS Metrics
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the cluster probe.
QoS Name                   Units   Description   Version
QOS_CLUSTER_GROUP_STATE    State                 3.0
QOS_CLUSTER_NODE_STATE     State                 3.0

Alert Metric                    Warning Threshold   Warning Severity   Error Threshold   Error Severity   Description
Node Alarm                      none                minor              none              none
Resource Group Alarm            none                minor              none              none
Resource Group Failover Alarm   none                minor              none              none             Resource group has moved to another node due to failover. This alarm is not cleared until the resource group moves back to the original cluster node.
QoS Name                    Units   Description   Version
QOS_CLUSTER_PACKAGE_STATE   State                 3.2
QOS_CLUSTER_NODE_STATE      State                 3.2

Alert Metric             Warning Threshold   Warning Severity   Error Threshold   Error Severity   Description
Node Alarm               none                minor              none              none
Package State Alarm      none                minor              none              none
Package Failover Alarm   none                minor              none              none             Package has moved to another node due to failover. This alarm is not cleared until the package moves back to the original cluster node.
cluster Troubleshooting
This article describes the troubleshooting steps for the Clustered Environment Monitoring (cluster) probe.
Symptom:
The cluster node name and its corresponding hostname do not match. Hence, the probe gives the error "The node has no valid cluster address ***"
while creating the profile/section.
Solution:
For example, you have 2 nodes with their cluster node names lcl-node1 and lclnode2, and their corresponding hostnames
HN1 and HN2, respectively.
Follow these steps:
1. Open the cluster Raw Configuration on node HN1.
2. In the Setup section, enter lcl-node1 as the value of the 'node' key.
3. Click OK and restart the probe.
4. Repeat the above steps for hostname HN2 and cluster node name lclnode2.
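After step 2, the setup section of the cluster configuration on HN1 contains the node key. A sketch of the fragment (other setup keys are omitted; the angle-bracket section syntax is assumed from the standard probe .cfg file format):

```
<setup>
   node = lcl-node1
</setup>
```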
More information:
See the CA Unified Infrastructure Management site for information on how to Discover Systems to Monitor.
controller
The controller probe is a core component of the CA UIM robot.
Controllers schedule, start, and stop the probes that the robot manages.
Controllers keep configured probes running according to the probe configurations.
Controllers maintain contact with parent hubs and handle CA UIM operations, including hub state, name-lookup services, and licenses.
Controllers respond to directives in the robot.cfg and the controller.cfg files, and to commands issued over the
registered port TCP/48000.
Note: SSL communication mode options are more meaningful with the release of controller v7.70. The controller creates the robot.pem
certificate file during startup. The file enables encrypted communication over the OpenSSL transport layer. The treatment of
the robot.pem certificate file has changed. For details, see Impact of Hub SSL Mode When Upgrading Nontunneled Hubs in the
hub Release Notes. Changes in treatment impact communication for:
Hubs set to mode 0 (unencrypted)
Hubs set to mode 2 (SSL only)
Hub managed components
Important! The cloning of robots is not supported. Use the cloud installation option for the robot to ensure that it starts after the
instance is created. For more information, see Install a Windows Robot in the CA Unified Infrastructure Management documentation.
More information:
Controller Release Notes
See the Robot Attribute Reference in the CA Unified Infrastructure Management documentation for information about required
and optional robot attributes.
Controller
The Controller node lets you view information about the hub and adjust settings.
Probe Information (read only) provides the probe name, start time, version, and vendor.
Hub Connectivity (read only) provides the name of the current hub, the names of the primary (parent) hub and nonprimary hub that this
robot attaches to during failover.
General Configuration
Identification property User Tag 1 and Identification property User Tag 2
User tags are optional values that can be attached to probe-generated messages to control access to the data in USM.
On a robot system, user tags are specified in robot.cfg.
In hub v7.80, if user tags do not exist in the hub.cfg file and the os_user_retrieve_from_robot option is true
(default):
The user tags are copied from the robot.cfg file to the hub.cfg file when the hub restarts.
After the user tags are copied, the os_user_retrieve_from_robot option is set to false in the hub.cfg file.
When the option is set to false, a user can clear the user tags in the hub.cfg file.
In hub v7.70, user tags on a hub system are specified in hub.cfg. User tags that are defined in robot.cfg are ignored.
Before hub v7.70, user tags on a hub system are read from robot.cfg.
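For illustration, user tags on a robot system might look like the following robot.cfg fragment. The key names os_user1 and os_user2 are an assumption inferred from the os_user_retrieve_from_robot option name; verify the exact keys in Raw Configure before relying on them:

<controller>
   os_user1 = finance
   os_user2 = emea
</controller>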
Log Level
Sets the amount of detail to be logged in the log file
Log as little as possible during normal operation to reduce disk consumption
Increase the log level for debugging. Default, 0 - Fatal
Log Size (KB)
Change the size of the log file
When the log size is reached, new entries are added and the oldest log file entries are deleted. Default, 1024 KB
Hub update interval (minutes)
The interval at which the hub is contacted with an alive message.
The range is 1 through 180 minutes. Default, 15 minutes
The hub is notified on a shorter interval when changes occur in the probe list.
On Robot uptime, reported as state changes: If this option is selected, QoS messages on robot uptime are sent.
The QoS message status = up is sent when the robot starts, and status = down is sent when the robot stops.
Set QoS source to robot name instead of computer host name: Select this option to use the robot name for the QoS source for alarms.
By default, the host name of the computer hosting the probe is used as the QoS source. Default, Enabled
Status (read only) provides information about the status of the robot.
Setup
The controller, Setup node lets you view and modify robot configuration settings.
Robot Name
The default name is the computer host name, and is auto-detected by default. Default, Auto Detect (recommended)
Secondary Hub Robot Name
If the robot parent hub is unavailable, specify the method to use to locate a temporary parent hub.
By default, the robot auto-detects the nearest active hub (recommended).
Search the subnet for a temporary hub when primary and nonprimary hubs are unavailable
Select this option to enable the robot process and port controller to search the subnet for a temporary hub. Default, Automatically
Detect (Search the subnet)
Advanced
Automatically unregister from hub on shutdown
When the robot stops, an unregister message is sent to the parent hub.
When selected, the robot disappears from Infrastructure Manager.
When not selected, the stopped robot displays a red icon, enabling the operator to detect the situation.
Suspend all probes when no network connection is available
When a robot does not have a network connection, the robot can remain active or can enter a sleep mode. In sleep mode, all
probes are suspended until a network connection is available.
If this option is not selected, the alarm messages are spooled and flushed when a network connection becomes available. Default,
selected (recommended)
Configuration file locking
Lock the configuration.
First Probe port number
Daemon type probes typically register a command port, which is allocated at runtime during probe start-up.
Use this option to specify a specific range of port numbers for router or firewall purposes. Default, 48000
Time offset from UTC (override the local time zone setting) sec
This option overrides the local time zone setting
The time specification must be entered as a time offset from UTC in seconds
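These Advanced settings live in the controller section of robot.cfg. The following sketch assumes the key names first_probe_port and time_offset, which are not confirmed by this document; check Raw Configure for the exact keys. For a robot five hours behind UTC, the offset is -5 * 3600 = -18000 seconds:

<controller>
   first_probe_port = 48000
   time_offset = -18000
</controller>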
When no contact with hub
An unmanaged robot is one that has lost its connection with a hub. Default, Allow move only within domain (recommended)
Environment Variables
Variable Name The name of an environment variable that is defined in the robot. The robot manages the probes that inherit the
variable.
Variable Value The value that is assigned to the variable name.
Alarms
Probe restart when the probe does not respond Select the alarm to send when a probe does not respond and is restarted. Default,
no alarm
Dispatch error (internal) Select the alarm to send when there is a dispatch error for the probe. Default, major
Max restart attempts reached for the probe Select the alarm to send when the maximum number of restart attempts are reached.
Default, major
Error starting probe Select the alarm to send when there is an error starting the probe. Default, major
Timed probe not finished at the next start time Select the alarm to send when a timed probe is not complete at the next start time.
Default, warning
Timed probe did not return 0 on termination Select the alarm to send when the probe does not return a value of 0. Default, warning
Probe unregisters port but keeps running Select the alarm to send when a running probe unregisters the port. Default, major
Probe distribution request error Select the alarm to send when there is an error during probe distribution. Default, major
Distribution post-install error Select the alarm to send when there is an error after the probe is distributed and installed. Default, no
alarm
Time interval at which alarm will be resent (minutes) Select the number of minutes elapsed before resending an alarm. Default, 0
Virtual: Remote probes install and run on computers without a robot. The remote probe is configured with the path to the robot. Virtual
enables the controller to set up the communication between the robot and the remote probe. Virtual robots, using the
proxy probe, are created for remote probes.
Important! The Netware System probe is the only probe that can be set up on a virtual robot.
NAT lets you allocate an external IP address to a robot residing on an IP network with addressing which is incompatible to the hub. The
robot is able to send alarms and QoS messages. The hub is not able to communicate with the robot. When Network Address Translation
(NAT) is used, the robot must be configured to provide the correct address to the hub. All communication from the hub to the robot uses
this address. NAT can be set for the primary and the nonprimary hubs.
IP of the Robot as seen from the primary hub - Enter the name of the robot as shown in the primary hub configuration. Default,
Real_IP
IP of the Robot as seen from the secondary hub - Enter the name of the robot as shown in the secondary hub configuration.
Default, Real_IP
IP
Robot IP Address - Select the option for detecting the IP address, either automatically detect or enter a specific hub address.
IP version support - Select the IP version support for the robot.
Local IP validation - Select this option to perform local IP validation.
Data origin
Origin - Enter the origin name to associate with the robot.
Note: Changing the origin might cause probes (other than hub, controller, and spooler) to restart to pick up the changes.
Marketplace Settings
Marketplace Username - The username that is used to run the marketplace probes.
Important! The username applies to all marketplace probes on the same robot. We highly recommend that you do not use
the root or the Administrator account. Grant to the marketplace user the least privileges required to run the probes.
Changing the marketplace settings might cause marketplace probes to start or restart to pick up the changes.
Important! If the controller is not in passive mode, select Passive Mode Required. If passive mode is not required, the
robot allows marketplace probes to run anyway. For security reasons, we highly recommend that marketplace probes run
on passive robots. Changing the marketplace settings might cause marketplace probes to start or restart to pick up the
changes.
Setup
Nimsoft
Misc
Advanced
Environment
Virtual
Alarm
NAT
IP
Status
Advanced Configuration
Using DNS Names for the Hub Server in robot.cfg
Limiting the Number of Open Ports
Robot Maintenance Mode
Setup
Nimsoft
Set up the robot and a secondary failover hub in the Nimsoft tab.
Note: If the Search the subnet for a temporary hub option is selected, this option changes to the Automatically detect option.
Automatically detect (searches the subnet) - the robot searches the subnet for a responding hub within the domain when the
parent and secondary hubs are unavailable.
Note: This option is displayed if the Search the subnet for a temporary hub option is selected.
Specify Hub (IP/name) - the robot attaches to the hub specified by IP address or host name
Specify Hub (domain name) - the robot attaches to the hub within the specified CA UIM domain
Search the subnet for a temporary hub when primary and secondary hubs are unavailable - If the parent and secondary hubs
are unavailable, the robot searches the subnet for a failover hub
Note: Selecting this option changes the Wait for the primary hub to become available option to the Automatically
detect (searches the subnet) option.
Misc
Use the Misc tab to set logging information, identification properties, quality of service, and the robot interaction mode.
Identification properties - specify User tag 1 and User tag 2. User tags are user-defined tags which are used as a grouping and
locating mechanism. The tags display in various lists in Infrastructure Manager.
Log Level - set the amount of detail to be logged to the log file. Log as little as possible during normal operation to reduce disk
consumption. Increase the level of detail when you are debugging.
Log Size - change the maximum size of the log file. The default size of the log file is 1024 KB.
Hub update Interval - the interval, in minutes, the hub is contacted with an alive message. The range is 1 through 180 minutes. The
hub is notified at a shorter interval when changes occur in the probe list.
Quality of Service Robot mode - select normal or passive. Passive robots do not initiate the communication with a hub. The hub
initiates all contact with the robot.
On Robot uptime, reported as state change - sends QoS messages for robot up time. The QoS message status = up is
sent when the robot starts, and status = down is sent when the robot stops.
Set QoS source to robot name instead of computer hostname - use the robot name that is specified on the Setup, Nimsoft tab.
Note: This option is disabled unless the Set specific name (override) option is selected on the Setup, Nimsoft tab.
Advanced
Use the Advanced tab to set robot options, and the data origin.
Environment
The robot controller reads the variable, value pair in the list and inserts it into the robot environment. The probes inherit the environment from the
robot.
You can add, edit, or delete environment variables. Right-click and select the appropriate menu option. Only add environment variables that all
probes on the robot use.
Virtual
Virtual robots are robots for probes that are installed on computers lacking a local robot. Probes of this type are named remote probes. The
Virtual tab lists the virtual robots.
The netware probe is the only probe that can be set up on a virtual robot.
The proxy probe creates virtual robots for remote probes. The configuration of the remote probe determines which virtual robot to use.
In Infrastructure Manager, the remote probe appears as a probe on a virtual robot.
To add user tags for a virtual robot:
1. Select the hub in Infrastructure Manager. The robots are listed in the main window.
The Version column identifies the virtual robots.
In the following example, lab5 is a virtual robot on xpkost. The robot xpkost serves the virtual robot lab5.
2. Right-click the xpkost robot in the navigation pane, and select Properties to open the controller configuration tool.
3. Select the Setup, Virtual tab. Right-click to define user tags for the virtual robot.
Alarm
The controller issues alarms for error conditions. The Alarm tab lists the internal alarm messages. Select the severity level for each alarm
message. You can select no alarm for the condition Probe restart when the probe does not respond.
Time interval at which alarms will be resent - the time interval in minutes between alarms which are sent for an error condition.
If this field is left blank, the controller does not resend the alarms.
NAT
Use the Network Address Translation (NAT) feature to allocate an external IP address to a robot. Use NAT when the network addresses of the
hub and robot are incompatible.
Use the Setup, NAT tab to set up NAT.
If the robot and the hub are separated by a NAT device, the robot can send alarms and QoS messages. The hub cannot respond. To solve the
problem, set up static NAT mappings for each robot behind the NAT device. Use Raw Configure to add the key robotip_alias to the
controller section of the robot.cfg file on the robot. This key changes the IP address that is registered when the robot initially
connects to the hub.
robotip_alias specifies the static NAT to IP address mapping. The hub and the clients on the other side of the NAT device use the
mapping to access the robot. For example:
<controller>
robotip_alias = 193.71.55.153
...
</controller>
IP
Use the IP tab to configure the IP information for the robot.
Robot IP address:
Automatically detect
The host IP address is used
Set specific address(es) (override)
Specify an IP address or set of IP addresses for the robot. An override is typically used when a host has more than one network
interface.
Separate multiple IP addresses with a comma. IP addresses can contain asterisk wildcards.
Valid entries:
198.2.3.5, 138.3.4.10
198.2.*.*, 138.3.4.10
The controller and the probes it starts only listen for connections on the NIC addressed by the specified addresses. With the example
addresses, the controller and the probes listen on 198.2.*.* and 138.3.4.10.
Status
Robot name
The actual or current name of the robot
Robot IP address
The actual or current IP address of the robot
Robot version
The version of the robot
Op. Sys.
Operating system name
Op. Sys. type
The type of the operating system (UNIX, Windows)
Op. Sys descr
Operating system description
Indicator and status information
The status indicator, together with a status message, gives information about the current controller status.
Green - OK
Red - error condition
Right-click the indicator to put the robot into maintenance mode for the period specified.
Installed packages
The Installed Packages button opens a window which lists all the packages which are installed on the robot.
Robot environment
The Robot Environment button opens the Environment window, which displays the variables and values for the computer running the
robot.
Note: Any probes that the robot starts inherit the environment.
Hub connectivity
Display the current, primary, and nonprimary hubs
Advanced Configuration
Using DNS Names for the Hub Server in robot.cfg
Version 2.80 or higher of the controller allows the DNS name of the hub server to be used in the robot.cfg file in two ways:
use the full DNS name instead of the IP address in the hubip parameter
use the full DNS name in the hub_dns_name parameter
Use the full DNS name of the server as the hubip parameter to allow the robot to recognize the main hub and return to it after a failover.
The robot move operation replaces the hubip parameter with an IP address. If the DNS server is down, the robot can fail over to a different
hub.
Use the full DNS name in the parameter hub_dns_name to allow the robot to use the hub_dns_name to look up the hub IP address. If
the lookup fails, the hubip parameter is used. When the DNS name lookup is successful, and the IP address is different from the hubip par
ameter, the parameter is replaced.
The same functionality is available for the secondary hub, using the secondary_hub_dns_name parameter.
The hub_dns_name is lost on a robot move, and the secondary_hub_dns_name is lost when the controller configuration tool changes
the secondary hub.
<controller>
hubip =
hub_dns_name =
secondary_hub_dns_name =
</controller>
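A filled-in version of the fragment above, with hypothetical host names substituted for the empty values:

<controller>
   hubip = hub1.example.com
   hub_dns_name = hub1.example.com
   secondary_hub_dns_name = hub2.example.com
</controller>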
Limiting the Number of Open Ports
In robot.cfg, set proxy_mode = 1 to send all the traffic through a specific port, for example, port 48000.
Robot Maintenance Mode
In maintenance mode, the robot runs only the core probes (spooler, hdb, hub, distsrv, nas, and proxy). Any other probes are suspended while the robot is in maintenance mode.
When the maintenance mode is invoked, supply the stop time for the maintenance mode. When the stop time is reached, the robot returns to
normal operation. Any suspended probes are restarted.
Probe distribution and activation can be performed while the robot is in maintenance mode. When the maintenance mode is complete, the
affected probes are started.
Controller
Setup
To access the controller configuration interface, select the robot in the Admin Console navigation pane. In the Probes list, click the arrow to the
left of the controller probe and select Configure.
Controller
The Controller node lets you view information about the hub and adjust log file settings.
Probe Information (read only) provides the probe name, start time, version, and vendor.
Hub Connectivity (read only) provides the name of the hub, the names of the primary (parent) hub and secondary hub that this hub's
robots will attach to during failover.
General Configuration lets you configure the following items.
Identification property User Tag 1 and Identification property User Tag 2: User tags are optional values that can be attached to
probe-generated messages to control access to the data in USM. On a robot system, user tags are specified in robot.cfg. As of hub
v7.70, user tags on a hub system are specified in hub.cfg, and user tags defined in robot.cfg are ignored. Prior to hub v7.70, user tags
on a hub system were read from robot.cfg. Default: blank
Log Level: Sets the amount of detail to be logged to the log file. Log as little as possible during normal operation to reduce disk
consumption, and increase the level of detail when debugging. Default: 0 - Fatal
Log Size (KB): Allows you to change the size of the log file according to your needs. When the log file size is reached, new entries
are added and the oldest log file entries will be deleted. Default: 1024 KB
Hub update interval (minutes): Determines at what interval the hub is contacted with an "alive" message. The range is 1 to 180
minutes. Note that the hub is notified on a shorter interval when changes occur in the probe list.
On Robot uptime, reported as state changes: If this option is selected, QoS messages on robot uptime will be sent. The QoS
message status = up is sent when the robot starts, and status = down is sent when the robot stops.
Set QoS source to robot name instead of computer host name: Select this option to use the robot name for the QoS source when
sending alarms. By default, the host name of the computer hosting the probe is used as the QoS source. Default: Disabled
Status provides the status of the robot and is read-only.
Setup
The controller > Setup node lets you view and modify robot configuration settings.
Nimsoft lets you specify:
Robot Name: The default name is the computer host name, and is auto-detected by default (recommended). Default: Auto Detect
(recommended).
Secondary Hub Robot Name: Defines the method used to locate a temporary parent hub if the robot's own parent hub is
unavailable. By default, the robot will auto-detect the nearest active hub (recommended).
Search the subnet for a temporary hub when primary and secondary hubs are unavailable: Select this option to enable the
robot process and port controller to search the subnet for a temporary hub. Default: not selected.
Advanced lets you specify:
Automatically unregister from hub at shutdown: When the robot is stopped, an unregister message is sent to the hub on which it
is registered. This will make the robot disappear from Infrastructure Manager. If this is not selected, the stopped robot will appear with
a red icon, enabling the operator to detect the situation.
Suspend all probes when no network connection is available: When running the robot on a computer with no network connection,
you can determine whether the robot should be active or enter a sleep mode where all probes are suspended until a network
connection is again available. If this option is not selected the alarm messages will be spooled and flushed when a network connection
is again available. Default: selected (recommended).
Important! The Netware System probe is the only probe that can be set up on a virtual robot.
NAT lets you allocate an external IP address to a robot that resides on another IP network with incompatible addressing. The robot will
be able to send alarms and QoS messages, but the hub will not be able to communicate back to the robot. When Network Address
Translation (NAT) is in effect between the robot and the hub, the robot must be configured to provide the correct address to the hub. All
communication from the hub to the robot will use this address. NAT can be set for the primary and secondary hubs.
IP of the Robot as seen from the primary hub: Enter the name of the robot as shown in the primary hub configuration.
IP of the Robot as seen from the secondary hub: Enter the name of the robot as shown in the secondary hub configuration.
IP
Robot IP Address: Select the option for detecting the IP address, either automatically detect or enter a specific hub address.
IP version support: Select the IP version support for the robot.
Local IP validation: Select this option if you want local IP validation performed.
Data origin
Origin: Enter the origin name you would like associated with this robot.
Update button: Click this button to update the origin name associated with this robot.
Setup
Nimsoft
Misc
Advanced
Environment
Virtual
Alarm
NAT
IP
Status
Advanced Configuration
Using DNS Names for the Hub Computer in robot.cfg
Limiting the Number of Open Ports
Robot Maintenance Mode
Setup
The Setup tab contains the following sections:
Nimsoft
Misc
Advanced
Environment
Virtual
Alarms
NAT
IP
Nimsoft
The Nimsoft tab allows you to set up the robot and a secondary hub.
Secondary HUB defines the method used to determine the secondary hub. The secondary hub will be used if the primary hub is
unavailable. Options are:
Wait for primary hub to become available prevents the robot from attaching to another hub. It will attach to its parent hub when the
hub is active.
Note: This option is replaced by the Automatically detect option if the Search the subnet for a temporary hub option is
selected.
Automatically detect (searches the subnet) allows the robot to search the subnet for a responding hub within the domain when the
parent and secondary hubs are unavailable.
Note: This option is displayed if the Search the subnet for a temporary hub option is selected.
Specify Hub (IP/name) allows the robot to attach to the hub specified by IP address or host name.
Specify Hub (domain name) allows the robot to attach to the hub within the specified domain.
Search the subnet for a temporary hub when primary and secondary hubs are unavailable allows the robot to search for a
temporary hub if the parent and secondary hubs are unavailable. Selecting this option changes the Wait for the primary hub to
become available option to Automatically detect (searches the subnet) option.
Misc
The Misc tab allows you to set logging information, identification properties, quality of service and robot interaction mode.
Identification properties lets you specify User tag 1 and User tag 2, which are user-defined tags to be used as a grouping/locating
mechanism. The tags are displayed in various lists in Infrastructure Manager.
Log Level sets the amount of detail to be logged to the log file. Log as little as possible during normal operation to reduce disk
consumption, and increase the level of detail when debugging.
Log Size allows you to change the size of the log file according to your needs. The default size of the log file is 1024 KB.
Hub update Interval determines at what interval the hub is contacted with an "alive" message. The range is 1 to 180 minutes. Note that
the hub is notified at a shorter interval when changes occur in the probe list.
Quality of Service:
On Robot uptime, reported as state change sends QoS messages on robot uptime. The QoS message status = up is sent when the
robot starts; status = down is sent when the robot stops.
Set QoS source to robot name instead of computer hostname uses the robot name specified on the Setup > Nimsoft tab.
Note: This option is disabled unless the Set specific name (override) option is selected on the Setup > Nimsoft tab.
Robot mode is normal or passive. Passive robots cannot initiate communication with a hub. All contact must be initiated by the hub.
Advanced
The Advanced tab allows you to set robot options and the data origin.
Environment
The robot controller will read the variable/value pair in the list and insert it into the robot environment. This environment is inherited by the probes
managed by the robot.
You can add, edit or delete these environment variables by right-clicking in the screen and selecting the appropriate menu option. Only add
environment variables that you want all probes on this robot to use.
Virtual
This tab lists virtual robots served by the robot controller. The netware probe is the only probe that can be set up on a virtual robot.
Virtual robots will, via the proxy probe, be created for remote probes (probes installed and running on computers without a robot). The remote
probe is configured to know which robot to be served by.
In Infrastructure Manager, the remote probe will appear as a probe on a virtual robot.
Alarm
This tab contains the internal alarm messages issued by the controller on the different error situations that may occur. You are allowed to select
the severity level for each alarm message. For the condition Probe restart when the probe does not respond, you can also select that no alarm
message is issued.
The option Time interval at which alarms will be resent defines the time interval in minutes between alarms being sent on an error condition.
If this field is left blank, the controller will never attempt to resend the alarms.
NAT
The feature allows you to allocate an external IP address to a robot that resides on another IP network with incompatible addressing.
This tab contains the setup for NAT (Network Address Translation).
If the robot is separated from the hub by a NAT (Network Address Translation) device, the robot will be able to send alarms and QoS messages,
but the hub will be unable to communicate back. One solution is to set up static NAT mappings for each robot behind the NAT device. Using Raw
Configure, you can then add the key robotip_alias to the controller section of the robot.cfg file on the robot computer. This key changes the IP
address that gets registered when the robot initially connects to the hub.
The key robotip_alias should specify the static NAT mapping IP address that the hub and the clients on the other side of the NAT device should
use to access the robot. For example:
<controller>
robotip_alias = 193.71.55.153
...
</controller>
IP
The IP tab allows you to configure the IP information for this robot.
Robot IP-address:
Automatically detect
Status
This section describes the properties for the Status tab.
Robot name
The actual/current name of the robot.
Robot IP-addr.
The actual/current ip-address of the robot.
Robot version
The version of the robot.
Op. Sys.
Operating system name.
Op. Sys. type
The type of the operating system (UNIX/Windows etc.).
Op. Sys descr
Operating system description.
Indicator and status information
The status indicator will, together with a status message, give information about the current controller status (Green icon means OK, Red
icon means an error situation).
Right-clicking on the indicator lets you set the robot into maintenance mode for the period specified. See the section for more information.
Installed packages
Lists all packages installed on the robot in a separate window.
Robot environment
Opens the Robot Environment window. This window displays the variables and values for the computer running the robot.
Note: The values are inherited by any probe started by this robot.
Hub connectivity
Displays the current, primary and secondary hubs.
Advanced Configuration
Using DNS Names for the Hub Computer in robot.cfg
Version 2.80 or higher of the probe allows the DNS name of the hub machine to be used in the robot.cfg file in two ways: using the full DNS name
instead of the IP address in the hubip parameter or using the full DNS name in the hub_dns_name parameter.
Using the full DNS name of the computer instead of the IP address (as the hubip parameter) allows the robot to recognize its main hub and return
to it after a failover situation.
The hubip parameter will be replaced by an IP address on a robot move operation. The robot can fail over to a different hub if the DNS is down.
Using the full DNS name in the parameter hub_dns_name allows the robot to attempt to use the hub_dns_name to look up the hub IP address. If
this fails, the hubip parameter is used. When the DNS name lookup is successful and the ip address found is different from the hubip parameter,
this parameter is replaced.
The same functionality is available for the secondary hub, using the secondary_hub_dns_name parameter.
Note that the hub_dns_name is lost on robot move and secondary_hub_dns_name is lost on change of secondary hub initiated by the controller
configuration tool.
<controller>
hubip =
hub_dns_name =
secondary_hub_dns_name =
</controller>
Controller
Setup
To access the controller configuration interface, select the robot in the Admin Console navigation pane. In the Probes list, click the arrow to the
left of the controller probe and select Configure.
Controller
Navigation: controller
This section lets you view information about the hub and adjust log file settings.
Probe Information
This section provides the basic probe information and is read-only.
Hub Connectivity
This section provides the hub information and is read-only.
General Configuration
This section lets you configure the following items.
Identification property User Tag 1: User defined tag used as a grouping/locating mechanism.
Default: blank
Identification property User Tag 2: User defined tag used as a grouping/locating mechanism.
Default: blank
Log Level: Sets the amount of detail to be logged to the log file. Log as little as possible during normal operation to reduce disk
consumption, and increase the level of detail when debugging.
Default: 0 - Fatal
Log Size (KB): Allows you to change the size of the log file according to your needs. When the maximum log file size is reached, new
entries are added and the oldest entries are deleted.
Default: 1024 KB
Hub update interval (minutes): Determines at what interval the hub is contacted with an "alive" message. The range is 1 to 180
minutes. Note that the hub is notified on a shorter interval when changes occur in the probe list.
On Robot uptime, reported as state changes: If this option is selected, QoS messages on robot uptime will be sent. The QoS
message status = up is sent when the robot starts, and status = down is sent when the robot stops.
Set QoS source to robot name instead of computer host name: Select this option to use the robot name as the QoS source when
sending QoS messages. By default, the host name of the computer hosting the probe is used as the QoS source.
Default: Disabled
Status
This section provides the status of the robot and is read-only.
Setup
Navigation: controller > Setup
This section lets you view and modify robot configuration settings.
Nimsoft
This section lets you specify:
Robot Name: The default name is the computer host name, and is auto-detected by default (recommended).
Default: Auto Detect (Recommended)
Secondary Hub Robot Name: Defines the method used to locate a temporary parent hub if the robot's own parent hub is
unavailable. By default, the robot will auto-detect the nearest active hub (recommended).
Search the subnet for a temporary hub when primary and secondary hubs are unavailable: Select this option to enable the
robot process and port controller to search the subnet for a temporary hub. Default: not selected.
Advanced
Automatically unregister from hub at shutdown: When the robot is stopped, an unregister message is sent to the hub on which it
is registered. This will make the robot disappear from Infrastructure Manager. If this is not selected, the stopped robot will appear with
a red icon, enabling the operator to detect the situation.
Suspend all probes when no network connection is available: When running the robot on a computer with no network connection,
you can determine whether the robot should be active or enter a sleep mode where all probes are suspended until a network
connection is again available. If this option is not selected the alarm messages will be spooled and flushed when a network connection
is again available.
Time offset from UTC (override the local time zone setting) sec: Overrides the local time zone setting. The time specification must
be entered as time offset from UTC (in seconds).
When no contact with hub: Set limitations for attempts to connect an unmanaged robot (a robot that has lost contact with the
hub) to a hub.
Default: Allow move only within domain (recommended)
Environment Variables
Variable Name: Environment variable for your UIM system. This variable is inherited by all the probes managed by this robot.
Variable Value: The value assigned to the variable name.
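As a sketch, assuming the controller stores these pairs in an <environment> subsection of robot.cfg (the variable name and value below are hypothetical):

```
<controller>
   <environment>
      CUSTOM_PROBE_HOME = /opt/custom/probes
   </environment>
</controller>
```

Every probe started by this robot would then see CUSTOM_PROBE_HOME in its environment.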
Alarms
Probe restart when the probe does not respond: Select the alarm to be sent when a probe is not responding and is restarted.
Dispatch error (internal): Select the alarm to be sent when there is a dispatch error for the probe.
Max restart attempts reached for the probe: Select the alarm to be sent when the maximum restart attempts have been reached.
Error starting probe: Select the alarm to be sent when there is an error starting the probe.
Timed probe did not return 0 on termination: Select the alarm to be sent when the probe does not return a value of 0.
Probe unregisters port but keeps running: Select the alarm to be sent when the probe unregisters the port but continues to run.
Probe distribution request error: Select the alarm to be sent when there is an error during probe distribution.
Distribution post-install error: Select the alarm to be sent when there is an error after the probe is distributed and installed.
Time interval at which alarm will be resent (minutes): Select the interval (in minutes) at which an alarm is resent.
Virtual
The Controller probe sets up the communication between the robot and the virtual probe running a remote probe.
Important! The Netware System probe is the only probe that can be set up on a virtual robot.
Virtual robots, using the proxy probe, will be created for remote probes. Remote probes are installed and running on computers without a
robot. The remote probe is configured with the path to the robot.
NAT
You can allocate an external IP address to a robot that resides on another IP network with incompatible addressing using Network
Address Translation. The robot will be able to send alarms and QoS messages, but the hub will not be able to communicate back to the
robot.
When Network Address Translation (NAT) is in effect between the robot and the hub, the robot must be configured to provide the correct
address to the hub. All communication from the hub to the robot will use this address. NAT can be set for the primary and secondary
hubs.
IP of the Robot as seen from the primary hub: Enter the IP address of the robot as it is seen from the primary hub.
IP of the Robot as seen from the secondary hub: Enter the IP address of the robot as it is seen from the secondary hub.
IP
Robot IP Address: Select the option for detecting the IP address: either automatically detect it or enter a specific address.
IP version support: Select the IP version support for the robot.
Local IP validation: Select this option if you want local IP validation performed.
Data origin
Origin: Enter the origin name you would like associated with this robot.
Update button: Click this button to update the origin name associated with this robot.
Setup
Nimsoft
Misc
Advanced
Environment
Virtual
Alarm
NAT
IP
Status
Advanced Configuration
Using DNS Names for the Hub Computer in robot.cfg
Limiting the Number of Open Ports
Robot Maintenance Mode
Double-clicking the controller probe in the Infrastructure Manager application brings up the configuration GUI.
Setup
The Setup tab contains the following sections:
Nimsoft
Misc
Advanced
Environment
Virtual
Alarms
NAT
IP
Nimsoft
The Nimsoft tab allows you to set up the robot and a secondary hub.
Wait for primary hub to become available prevents the robot from attaching to another hub. It will attach to its parent hub when the
hub is active.
Note: This option is replaced by the Automatically detect option if the Search the subnet for a temporary hub option is
selected.
Automatically detect (searches the subnet) allows the robot to search the subnet for a responding hub within the domain when the
parent and secondary hubs are unavailable.
Note: This option is displayed if the Search the subnet for a temporary hub option is selected.
Specify Hub (IP/name) allows the robot to attach to the specified hub.
Specify Hub (domain name) allows the robot to attach to the specified hub.
Search the subnet for a temporary hub when primary and secondary hubs are unavailable allows the robot to search for a
temporary hub if the parent and secondary hubs are unavailable. Selecting this option changes the Wait for the primary hub to
become available option to Automatically detect (searches the subnet) option.
Misc
The Misc tab allows you to set logging information, identification properties, quality of service and robot interaction mode.
Identification properties lets you specify User tag 1 and User tag 2, which are user-defined tags to be used as a grouping/locating
mechanism. The tags are displayed in various lists in Infrastructure Manager.
Log Level sets the amount of detail to be logged to the log file. Log as little as possible during normal operation to reduce disk
consumption, and increase the level of detail when debugging.
Log Size allows you to change the size of the log file according to your needs. The default size of the log file is 1024 KB.
Hub update Interval determines at what interval the hub is contacted with an "alive" message. The range is 1 to 180 minutes. Note that
the hub is notified at a shorter interval when changes occur in the probe list.
Quality of Service:
On Robot uptime, reported as state change sends QoS messages on robot uptime. The QoS message status = up is sent when the
robot starts; status = down is sent when the robot stops.
Set QoS source to robot name instead of computer hostname uses the robot name specified on the Setup > Nimsoft tab.
Note: This option is disabled unless the Set specific name (override) option is selected on the Setup > Nimsoft tab.
Robot mode is normal or passive. Passive robots cannot initiate communication with a hub. All contact must be initiated by the hub.
Advanced
The Advanced tab allows you to set robot options and the data origin.
Environment
The robot controller will read the variable/value pairs in the list and insert them into the robot environment. This environment is inherited by the probes
managed by the robot.
You can add, edit or delete these environment variables by right-clicking in the screen and selecting the appropriate menu option. Only add
environment variables that you want all probes on this robot to use.
Virtual
This tab lists virtual robots served by the robot controller. The netware probe is the only probe that can be set up on a virtual robot.
Virtual robots will, via the proxy probe, be created for remote probes (probes installed and running on computers without a robot). The remote
probe is configured to know which robot it is served by.
In Infrastructure Manager, the remote probe will appear as a probe on a virtual robot.
Alarm
This tab contains the internal alarm messages issued by the controller for the different error situations that may occur. You can select
the severity level for each alarm message. For the condition Probe restart when the probe does not respond, you can also specify that no
alarm message is issued.
The option Time interval at which alarms will be resent defines the time interval in minutes between alarms being sent on an error condition.
If this field is left blank, the controller will never attempt to resend the alarms.
NAT
The feature allows you to allocate an external IP address to a robot that resides on another IP network with incompatible addressing.
This tab contains the setup for NAT (Network Address Translation).
If the robot is separated from the hub by a NAT (Network Address Translation) device, the robot will be able to send alarms and QoS messages,
but the hub will be unable to communicate back. One solution is to set up static NAT mappings for each robot behind the NAT device. Using Raw
Configure, you can then add the key robotip_alias to the controller section of the robot.cfg file on the robot computer. This key changes the IP
address that gets registered when the robot initially connects to the hub.
The key robotip_alias should specify the static NAT mapping IP address that the hub and the clients on the other side of the NAT device should
use to access the robot. For example:
<controller>
robotip_alias = 193.71.55.153
...
</controller>
IP
The IP tab allows you to configure the IP information for this robot.
Robot IP-address:
Automatically detect
The host IP address will be used.
Set specific address(es) (override)
Specify an IP-address or set of IP-addresses for the robot. This is typically used when a host has more than one network interface.
For more than one IP-address, the addresses must be separated by a comma. IP addresses can also contain wildcards (*).
Valid entries:
198.2.3.5, 138.3.4.10
198.2.*.*, 138.3.4.10
The controller and all probes it starts listen for connections only on the NIC attached to addresses that start with 198.2. If no such
addresses exist, they listen on 138.3.4.10.
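The wildcard matching described above can be sketched as follows. This is only an illustration of the matching rule, not the controller's actual implementation; the function name is hypothetical:

```python
from fnmatch import fnmatch

def matching_addresses(host_addresses, configured_patterns):
    """Return the host addresses that match any configured pattern.

    Patterns may contain wildcards (*), for example "198.2.*.*".
    """
    return [addr for addr in host_addresses
            if any(fnmatch(addr, pat) for pat in configured_patterns)]

# With "198.2.*.*, 138.3.4.10" configured, addresses starting with
# 198.2 match the first pattern, and 138.3.4.10 matches exactly.
print(matching_addresses(["198.2.3.5", "10.0.0.1", "138.3.4.10"],
                         ["198.2.*.*", "138.3.4.10"]))
```

Addresses that match no pattern (such as 10.0.0.1 above) are not listened on.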
IP version support Select the appropriate IP version you are running.
Local IP validation With this option enabled, if the robot IP address is not set to localhost, it is checked against a list of IP addresses
that are known valid for that server before it is used.
IP-binding:
Listen only on the first valid address from configured IP addresses
You can only select this option if you set a specific address in the Robot IP-address section of this screen. On servers with multiple
NICs, the controller will then listen only on a specific IP address.
Status
Robot name
The actual/current name of the robot.
Robot IP-addr.
The actual/current IP address of the robot.
Robot version
The version of the robot.
Op. Sys.
Operating system name.
Op. Sys. type
The type of the operating system (UNIX/Windows etc.).
Op. Sys descr
Operating system description.
Indicator and status information
The status indicator will, together with a status message, give information about the current controller status (Green icon means OK, Red
icon means an error situation).
Right-clicking on the indicator lets you set the robot into maintenance mode for the period specified. See the Robot Maintenance Mode section for more information.
Installed packages
Lists all packages installed on the robot in a separate window.
Robot environment
Opens the Robot Environment window. This window displays the variables and values for the computer running the robot.
Note: The values are inherited by any probe started by this robot.
Hub connectivity
Displays the current, primary and secondary hubs.
Advanced Configuration
Using DNS Names for the Hub Computer in robot.cfg
Version 2.80 or higher of the probe allows the DNS name of the hub machine to be used in the robot.cfg file in two ways: using the full DNS name
instead of the IP address in the hubip parameter or using the full DNS name in the hub_dns_name parameter.
Using the full DNS name of the computer instead of the IP address (as the hubip parameter) allows the robot to recognize its main hub and return
to it after a failover situation.
The hubip parameter will be replaced by an IP address on a robot move operation. The robot can fail over to a different hub if the DNS is down.
Using the full DNS name in the parameter hub_dns_name allows the robot to attempt to use the hub_dns_name to look up the hub IP address. If
this fails, the hubip parameter is used. When the DNS name lookup is successful and the IP address found is different from the hubip parameter,
the hubip parameter is replaced.
The same functionality is available for the secondary hub, using the secondary_hub_dns_name parameter.
Note that the hub_dns_name is lost on robot move and secondary_hub_dns_name is lost on change of secondary hub initiated by the controller
configuration tool.
<controller>
hubip =
hub_dns_name =
secondary_hub_dns_name =
</controller>
Note: The integration of CA Unified Infrastructure Management Cloud User Experience Monitor (CUE) in UMP 2.6 with the cuegtw
probe is supported only for the English locale.
When upgrading the probe from a previous version to 1.07 in the German locale, you must enter the password for each profile and save
changes. The process ensures that the probe reads the password and lets you connect to the probe.
More information:
cuegtw (Cloud Monitoring Gateway) Release Notes
cuegtw Node
CA Unified Infrastructure Management Cloud Monitor URL Node
<Profile Name> Node
Configure a Node
Manage Profiles
Delete Profile
cuegtw Alert Metrics Default Settings
cuegtw Node
The cuegtw node lets you configure general properties and the internet proxy settings. You can view the probe information and the alarm details.
Navigation: cuegtw
Set or modify the following values as required:
cuegtw > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
cuegtw > General Configuration
This section is used to configure the log level of the probe.
Log Level: specifies the detail level of the log file.
Default: 2 - Warn
Interval: defines the time interval for fetching the new RSS feeds from the CA Unified Infrastructure Management Cloud Monitor.
Default: 5
Interval Unit: specifies the time interval unit.
Default: Minutes
cuegtw > Proxy Configuration
This section lets you configure the proxy user authentication details for the probe to connect with the CA Unified Infrastructure
Management Cloud Monitor API.
Enable Proxy: enables the proxy settings.
Proxy Type: specifies the proxy type of your networking environment.
User ID: defines the user name for authenticating the probe on the proxy server.
Proxy URL: defines the IP address of the proxy server.
Port: defines the port number where the proxy server listens to the incoming requests.
Domain Name: defines the domain of the user. This option is applicable only when the proxy type is NTLM.
cuegtw > Message Configuration
This section lets you view the alarm messages of the probe. You can view the message name, message text, and the message severity.
The probe has two alarm messages, which are read-only.
Important: The username and password for the CA Unified Infrastructure Management Cloud Monitor web site and API can be
different. The probe requires credentials for the cloud monitor API. You can change the API password by selecting the
Subscription > Change Password option after logging on to the cloud monitor web site.
profile name > Alert/Reminder Monitor Configuration
This section lets you generate an alarm when the probe receives an RSS feed from the cloud monitor API.
Publish Alarms: activates the profile for generating the alarm.
Alert: specifies the alarm message when the RSS feed type is alert.
Reminder: specifies the alarm message when the RSS feed type is reminder.
Configure a Node
This procedure provides the information to configure a particular section within a node. Each section in a node lets you configure probe properties
for connecting to the cloud monitor website and fetching RSS feeds.
Follow these steps:
1. Navigate to the section in a node that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured.
Manage Profiles
This procedure provides the information to create a monitoring profile for connecting to the cloud monitor API. A monitoring profile contains valid
user details to fetch RSS feeds for generating alarms.
Follow these steps:
1. Click the Options icon next to the CA Unified Infrastructure Management Cloud Monitor URL node in the navigation pane.
2. Click the Add New Profile option.
3. Enter profile details in the Profile Configuration dialog and click Submit.
The profile is saved for fetching the RSS feeds from the cloud monitor API. You can configure alarms for each RSS feed for populating
the NAS.
Delete Profile
You can delete a monitoring profile when you no longer need to monitor the RSS feeds of the cloud monitor.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Click the Delete option.
3. Click Save.
The profile is deleted.
The default cuegtw alert metric settings are:

Metric     Warning Threshold   Warning Severity   Error Threshold   Error Severity   Description
Alert      -                   -                  -                 Critical         -
Reminder   -                   -                  -                 Warning          -
data_engine
The data_engine probe manages and maintains data that is collected by Quality of Service (QoS) enabled probes. The data_engine creates all
tables and stored procedures necessary to manage the collected data.
Data that is produced by the QoS probes is stored in the UIM database, in tables named raw data tables (table prefix RN_). Raw data is kept
for a user-defined period, then summarized and aggregated into hourly data (HN_ tables). Hourly data is then summarized and aggregated into
daily data (DN_ tables).
Note: The data_engine probe does not generate alarms and has no metrics.
Tables
How the data_engine Probe Collects and Maintains QoS Data
RN_QOS_DATA Table Columns
RN_table Indexes
Data_engine Start Up
Parallel Mode
Serial Mode
Tables
The tables created in the UIM database have prefixes indicating the type of data they contain.
The naming convention for the tables is as follows:
S_ for tables used to store system data
D_ for data tables
H_ for tables containing historic data
HN_ for data tables containing hourly/compressed data
DN_ for data tables containing daily/compressed data
RN_ for data tables containing unprocessed (raw) data directly from the probes
The QoS data structure is dynamically created by the data_engine on the first startup, and when the first unique QOS_DEFINITION or
QOS_MESSAGE message is received from a probe. The S_QOS_DEFINITION table contains the definitions of known QoS types (for example,
QOS_CPU_USAGE), and is updated when a probe sends a QOS_DEFINITION describing a new QoS type.
The S_QOS_DATA table contains an index of all data tables for the QoS objects. When a probe sends a QOS_MESSAGE containing a QoS
object that is not already defined in the S_QOS_DATA table, a new entry is added to the table and the data is inserted into the table referenced
in column r_table (typically RN_QOS_DATA_nnnn) with the table_id that the new row is given when inserted into the S_QOS_DATA table.
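As an illustration of this lookup, the following queries read the raw samples for one QoS series. The RN table suffix is environment-specific, so read it from r_table first; the suffix 0001 and the table_id value below are only examples:

```sql
-- 1. Find the data table and table_id for a QoS series.
SELECT table_id, r_table
FROM   S_QOS_DATA
WHERE  qos = 'QOS_CPU_USAGE';

-- 2. Read the raw samples from the table named in r_table
--    (RN_QOS_DATA_0001 is a hypothetical suffix).
SELECT sampletime, samplevalue
FROM   RN_QOS_DATA_0001
WHERE  table_id = 42   -- the table_id returned by the first query
ORDER BY sampletime;
```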
Note: Do not drop the data tables manually. Instead delete the entry from the S_QOS_DATA table, and the tables will be dropped by a maintenance job.
RN_QOS_DATA Table Columns

Column        Description
TableID       The table_id assigned to the QoS object in the S_QOS_DATA table
Sampletime    Time of the sample
Samplevalue   QoS value
Samplestdev   Standard deviation of the sample
Samplerate    Rate of sampling
Tz_offset     Time zone offset
RN_table Indexes
The default indexes in RN_tables are optimized for writing data:
Index   Description
Idx0    sampletime, table_id
Idx1    -

The RN_QOS_DATA tables do not have primary keys, as both table_id and sampletime can be duplicated.
Data_engine Start Up
By default, the data_engine commits data to a database in parallel mode using ten parallel threads. In parallel mode, the data_engine uses
multiple threads to do work in parallel and makes the data_engine less vulnerable to performance issues when commits to a particular table are
slow. For more details, see Parallel Mode, Serial Mode, and Thread Count.
When the data_engine probe starts, it loads both S_QOS_DATA and S_QOS_DEFINITION into memory and establishes a bulk connection to the
database.
Parallel Mode
The data_engine probe can run in either parallel or serial mode. The mode is determined by the thread count, which is the number of preallocated
threads used to commit data to a database. Earlier versions of data_engine ran in serial mode by default, but the current default is parallel mode.
When the thread_count_insert parameter in Raw Configure is set to a value greater than one, the data_engine commits data to the database in
parallel mode; when it is set to a value less than one, serial mode is used.
4. All objects are marked as ready to commit and a reference is placed onto another list.
5. Concurrently, a thread continuously runs to validate, sort, and place messages into a list that is written to the database.
6. A thread pool of worker threads writes any messages that have been marked as 'ready to commit' to the database.
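The mode selection described above maps to a single raw-configuration key in the data_engine configuration; as a sketch (the value is illustrative):

```
<data_engine>
   thread_count_insert = 10
</data_engine>
```

A value greater than one selects parallel mode with that many insert threads; zero selects serial mode.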
Serial Mode
Serial mode is the original mode for the data_engine probe. If the thread_count_insert parameter is set to zero in the Raw Configure menu, the
data_engine probe defaults to serial mode.
In this mode, the data_engine:
1. Reads messages from the hub for a given time period (default around 1 second).
From 1 through 20 messages are read at a time, depending on how many are in the queue or if the hub queue size has changed
(default 20).
The read messages are then validated and sorted into lists that can be quickly inserted into the database.
2. Stops reading new messages and iterates over all the lists, checking to see if any are full. By default the list is flushed if it contains more
than 5000 messages or if it has not been flushed in the last 5 seconds.
3. Goes back to reading messages from hub. If one bulk object takes too long to insert, then the writing of all data to the database is
delayed.
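The flush rule in step 2 can be sketched as follows. This is a simplified illustration under the defaults stated above (5000 messages, 5 seconds), not data_engine source code; all names are hypothetical:

```python
import time

MAX_MESSAGES = 5000    # flush when a list holds more than 5000 messages
MAX_AGE_SECONDS = 5.0  # ...or when it has not been flushed for 5 seconds

class MessageList:
    def __init__(self):
        self.messages = []
        self.last_flush = time.monotonic()

    def add(self, msg):
        self.messages.append(msg)

    def should_flush(self, now=None):
        now = time.monotonic() if now is None else now
        return (len(self.messages) > MAX_MESSAGES
                or now - self.last_flush >= MAX_AGE_SECONDS)

    def flush(self):
        # In data_engine, the drained batch is written to the database.
        batch, self.messages = self.messages, []
        self.last_flush = time.monotonic()
        return batch
```

The serial-mode loop would call should_flush() on each list between reads and write out any list that reports true.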
Important! (MS SQL Server) It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only
reorganize individual partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing might
not complete in a reasonable amount of time.
Partition a Database
Microsoft SQL Server Enterprise Edition
Oracle
MySQL 5.6
Note: When you install a new database or database maintenance is performed, 14 extra partitions are created to allow you up to two
weeks to correct an issue if an error occurs during data maintenance.
For the data_engine to perform the partitioning process, you must license the Oracle Partitioning Option (from Oracle) and it must be enabled for
the Oracle database on UIM Servers.
Although partitioning your Oracle database provides enhanced performance, be aware that depending on the size and number of data tables in
your database, the partitioning process can take a few minutes, several hours, or possibly several days and can be resource intensive.
In addition to the processing time required to complete partitioning, the partitioning process requires the availability of enough free disk space to
create a partitioned interim table. The partitioning process copies the contents from the original data tables into the interim data tables, swaps the
original data tables and interim data tables, and finally deletes the original data tables. This process is performed while the tables and indexes are
online.
If there is not enough free disk space for the interim tables, the partitioning process will not be able to finish. The process is similar to the one
described in this Oracle article.
We recommend you perform the following tasks before selecting the Partition Data Tables option.
1. The partitioning process requires the default tablespace (for the interim table) to be 1.6 times larger than the largest data table to be
partitioned. In general, RN tables are the largest data tables in the database. To find the largest RN table, generate and run the following
script:
select segment_name table_name,
       sum(bytes)/(1024*1024) table_size_meg
from user_extents
where (segment_type = 'TABLE' or segment_type = 'TABLE PARTITION')
  and segment_name like 'RN_QOS_DATA_%'
group by segment_name
order by table_size_meg desc;
2. Initial partitioning of the tables online also consumes disk space in the SYSTEM tablespace. The disk space required prior to partitioning
is approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the system tables by
generating and running the following script:
3. Initial partitioning of the tables also consumes storage in the TEMP tablespace. The free disk space required prior to partitioning is
approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the TEMP tablespace by
generating and running the following script:
Note: data_engine might have a slower data throughput rate with Oracle Database 12c than with Oracle Database 11g.
Oracle databases are partitioned online, meaning the data_engine can continue to write data to tables. You can set up partitioning to improve
performance when accessing the raw sample data tables.
Follow these steps:
1. In the data_engine probe configuration menu, click the Database Configuration folder.
2. In the Connection Information field select the type of database from the drop-down menu.
3. Select the Partition data tables check box.
4. Click Save.
Partitioning will be performed during the next maintenance period. The time required to execute the partitioning is dependent on both the amount
of data and the performance of the disk subsystem. Partitioning can take up to several days on especially large installations.
By default, the MySQL 5.6 innodb_file_per_table parameter is ON. When working with larger MySQL databases, it is recommended to turn off
innodb_file_per_table to improve data_engine's insert rate and thus performance.
To change the setting for the innodb_file_per_table parameter, follow these steps:
1. Depending on your operating system, access the my.ini or my.cnf file.
2. Set the parameter to OFF by entering:
innodb_file_per_table = 0
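After restarting MySQL, you can confirm the setting with a standard status query:

```sql
SHOW VARIABLES LIKE 'innodb_file_per_table';
```

The Value column should show OFF.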
MySQL 5.6 databases are partitioned offline, meaning data is blocked in the queue while a data table is being partitioned, and then written to the
database when the partitioning process is finished. Depending on the size of the table being partitioned (for example, if a table is 10 GB or larger)
and the rate of the data being sent to the data_engine, you might want to stop partitioning periodically to allow the data_engine to write the
backed-up data to the database (see step 7 in the following procedure).
Note: Due to database performance issues, we do not support partitioning on MySQL 5.5 and earlier. CA UIM Server requires that you
are using MySQL 5.6 to support partitioning.
2. For the Recurring fields, select Yearly and leave default values in the remaining fields.
This temporarily turns off daily maintenance.
3. In the data_engine probe configuration menu, click the Database Configuration folder.
4. In the Connection Information field, verify the value is set to MySQL.
5. Select the Partition data tables check box.
6. Open the Probe Utility from the data_engine context menu and run the run_admin_now callback to perform manual maintenance that will
partition data tables.
7. If the partitioning process runs for 24 hours:
a. Restart the data_engine. It might take the data_engine several hours to write all the data from the queue to the database.
b. Run the run_admin_now probe utility again to continue the partitioning process.
8. Repeat step 7 every 24 hours until all the tables in the database are partitioned.
More information:
See v8.3 data_engine AC GUI Reference for more details about partitioning a database.
data_engine
Probe Information
General Configuration
Quality of Service Type Status
Database Configuration
MySQL 5.6
Microsoft SQL Server
Oracle
Quality of Service
Scheduler
To access the data_engine configuration interface, select the robot that the data_engine probe resides on in the Admin Console navigation pane.
In the Probes list, click the arrow to the left of the probe and select Configure.
data_engine
Navigation: data_engine
This section lets you view probe and QoS information, change the log level, and set data management values.
Probe Information
Note: Do not enter a zero for delete raw, historic, or daily average data; a zero means the data is never deleted.
Action Button: Click the button located on the right side of the page to display a table populated with the QoS status data.
Quality of Service Type Status
Click Actions > QoS Status (located on the right side of the page) to display a table populated with the QoS status data. This table is read-only.
Note: The status information is created based on statistics that are generated by the database provider. If incorrect information is
displayed, you might need to update the table statistics. For more information, see the "Out-of-date Information in the Quality of Service
Type Status Table" section in the Troubleshooting article.
Database Configuration
Important! The database connection properties should only be changed in limited circumstances such as recovery operations.
Changing the Database Vendor can cause connection issues. If you are changing database vendors, CA recommends reinstalling CA
UIM.
The Database Configuration section allows you to specify the database connection settings. These settings are different for each database
vendor (MySQL, MS SQL Server, and Oracle).
To test the connection for all vendors, select Actions > Test Connection at the top of the page.
MySQL 5.6
Note: The password cannot have any special characters, such as ";".
Partition Data Tables: Select this check box to perform partitioning on the raw sample data tables.
Microsoft SQL Server
The following summarizes Partition Data Tables and Index Maintenance support for each SQL Server edition:

SQL Server Enterprise Edition
Partition Data Tables: Online or offline
Index Maintenance: Supported
Mode for Index Maintenance:
Dynamic - select dynamic when you're not sure which edition of SQL Server is used in your environment, or you want data_engine to choose the reorganize or rebuild mode based on the Fragmentation Level threshold settings.
Reorganize - a partition or an entire index for a table can be reorganized online or offline when the Partition Data Table option is selected.
Rebuild - only an entire index for a table can be rebuilt online (this process can take some time), but a partition or an entire index for a table can be rebuilt offline when the Partition Data Table option is selected.

SQL Server Standard Edition
Partition Data Tables: Offline
Index Maintenance: Supported
Mode for Index Maintenance:
Dynamic - select dynamic when you're not sure which edition of SQL Server is used in your environment, or you want data_engine to choose the reorganize or rebuild mode based on the Fragmentation Level threshold settings.
Reorganize - only an index for a table can be rebuilt offline when the Partition Data Table option is selected.
Rebuild - only an index for a table can be rebuilt offline when the Partition Data Table option is selected.

SQL Server Express Edition
Partition Data Tables: Not supported
This section lets you configure the connection and maintenance options for a Microsoft SQL Server database.
Provider: The SQL server provider.
Initial Catalog: The database name.
Data Source: The database server.
User ID: The login user.
Password: The login user password.
Note: The password cannot have any special characters, such as ";".
Note: This option is not available in the Microsoft SQL Express edition.
Index Maintenance: Perform table re-indexing with other maintenance routines, which by default are executed every 24 hours.
Note: On very large tables (over 10 GB), running index maintenance may not complete in a reasonable amount of time.
Quality of Service
Scheduler
Navigation: data_engine > Scheduler
This section allows you to schedule database maintenance.
Start time - Select either Now or a specific date and time. Selecting Now begins the new database maintenance schedule immediately.
Ending - Select Never, After x occurrences, or By a specific date and time.
Recurring - Select one of the following occurrence patterns:
Minutely
Hourly
Daily (including a specific time)
Weekly (including a specific time and days of the week)
Monthly (including occurrence, calendar day, and specific time)
Yearly (including month and specific time)
Contents
Prerequisites
The General Tab
Index Maintenance Properties (SQL Server only)
The Quality of Service Type Status Window
The Database Tab
Microsoft SQL Server Options
MySQL Options
Oracle Options
The Quality of Service Tab
The Schedule Tab
Configure the data_engine by selecting the data_engine probe in Infrastructure Manager, then right-click and select Configure. This opens the
data_engine configuration dialog.
Important! It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize
individual partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing may not
complete in a reasonable amount of time.
Index maintenance properties (SQL Server option only): Click the Advanced button to activate or deactivate the index maintenance
properties. For more details, go to the Index Maintenance Properties section. This option will not appear if you are using Oracle or
MySQL.
Partition data tables (SQL Server option only): Option to partition the data tables.
Note: Although you can select the Partition data tables check box for data_engine v8.3, this function has been disabled within
IM. If you select this check box in IM, no actions occur and the data tables are not partitioned. Access the data_engine
configuration in Admin Console to turn on partitioning for a Microsoft SQL Server Enterprise Edition database.
Log Setup: This section allows you to set the log level details.
Log Level: Sets the level of details written to the log file. Log as little as possible during normal operation to minimize disk consumption,
and increase the amount of detail when debugging.
Status: This button opens the Status window and provides the current work status of the probe. For more details, see the Status Window section.
Statistics: This section displays the current transmission rate for transferring QoS messages into the database.
Note: It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize individual
partitions. Performing automatic indexing for very large tables from the data_engine is strongly discouraged, as indexing may
not complete in a reasonable amount of time.
Online mode: Defines the effect of maintenance on concurrent use of the QoS tables.
Dynamic: The maintenance is determined by the edition of SQL Server. If SQL Server is the Enterprise Edition, then Online mode will be
used for maintenance (if the chosen maintenance mode supports it); otherwise, offline mode will be used.
Online: The QoS tables are available for update and query during the table maintenance period. Online mode offers greater concurrency
but requires more resources.
Offline: The QoS tables are unavailable for update and query during the table maintenance period.
Fragmentation Level: The fragmentation level information is used if Index name pattern is anything other than "All".
Low Threshold: If the fragmentation for an index is less than the low threshold percent value, then no maintenance will be performed.
High Threshold: If Dynamic maintenance mode is selected and fragmentation is between the low and high threshold percentages, then
the Reorganize mode will be used; otherwise the Rebuild mode will be used.
Index name pattern: The indexes that are maintained. The default is blank; a blank entry results in all indexes being considered
for maintenance.
Prerequisites
Microsoft SQL Server
Verify that you have the proper permissions to access a Microsoft SQL Server database when you are not using the sa login. See SQL
Authentication on Microsoft SQL Server.
Oracle
Before partitioning an Oracle database, review prerequisites to ensure that you have enabled the Oracle Partitioning Option and have
enough free disk space to create a partitioned interim table that is used during the partitioning process.
Contents
Prerequisites
Using the Admin Console to Access the data_engine Configuration GUI
Change the Database Connection Properties
Configure the Data Retention Settings
Override the Data Retention Settings on Individual QoS Objects
Set up Index Maintenance for Microsoft SQL Server Databases
Partition a Database
Microsoft SQL Server Prerequisite: SQL Authentication
Oracle Database Prerequisite: Oracle Partitioning Option Required
Oracle Database Prerequisite: Verify Tablespace Before Partitioning
Partition an Oracle or Microsoft SQL Server Enterprise Edition Database
Schedule Database Maintenance
Important! (MS SQL Server) It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only
reorganize individual partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing might
not complete in a reasonable amount of time.
Partition a Database
Prerequisites
Microsoft SQL Server
Oracle database
Procedures
Microsoft SQL Server
Oracle database
data_engine
Probe Information
General Configuration
Quality of Service Type Status
Database Configuration
MySQL 5.6
Microsoft SQL Server
Oracle
Quality of Service
Scheduler
To access the data_engine configuration interface, select the robot that the data_engine probe resides on in the Admin Console navigation pane.
In the Probes list, click the arrow to the left of the probe and select Configure.
Quality of Service
Navigation: data_engine > Quality of Service
The Quality of Service section displays the attributes for the QoS metrics.
Name: The QoS type name.
Description: Description of the QoS type.
QoS Group: The QoS group is a logical group to which the QoS belongs (optional).
Unit: The unit of the QoS data (the abbreviated form of the QoS data unit).
Has Max Value: The data type has an absolute maximum. For example, disk size or memory usage have absolute maximums.
Is Boolean: The data type is logical (yes/no). For example, a host is available/unavailable or printer is up/down.
Type: Different data types:
0 = Automatic (The sample value is read at fixed intervals, which are set individually for each probe).
1 = Asynchronous (The sample value is read only when the value changes, and the new value is read).
Override Raw Age: Select this check box to override the raw age of the QoS metric.
Raw Age: The number of days you want to retain the QoS metric information.
Override History Age: Select this check box to override the history age for the QoS metric.
History Age: The number of days you want to retain the history information.
Override Daily Average Age: Select this check box to override the daily average age for the QoS metric.
Daily Average Age: The number of days you want to retain the daily average information.
Override Compression: Select this check box to override compression settings for data in RN and HN tables.
Compress: Raw data is summarized and aggregated into Hourly (or historic) data before it is deleted from the RN tables. This Hourly
data is then summarized and aggregated into Daily data before it is deleted from the HN tables.
Important! When using the Partitioning feature, schedule maintenance to run daily. The time required to execute the partitioning
depends on the amount of data and the performance of the disk subsystem; for large installations it can take several hours or even
several days.
The sample data tables can be partitioned to improve performance.
The sample data is partitioned by day. For example, if you configure the system to delete raw sample data older than 365 days, each
sample data table (RN_QOS_DATA_xxxx) is configured with 365 partitions (plus a few extra partitions to allow for faster maintenance).
SQL Server: If partitioning is used, the Delete raw data older than property must be between 1 and 900, because SQL Server, up to and
including 2008 SP1, limits a table to 1000 partitions.
Partitioning improves performance when accessing the raw sample data tables:
higher insert rates
faster read access to data
faster data maintenance (delete/compress of sample data)
faster index maintenance
Enable or disable partitioning by selecting or clearing the Partition data tables check box in the data_engine configuration dialog.
If the partitioning state of the data tables changes, you are asked to choose how to execute the partitioning:
If you choose "Start now", you are asked to define how long the partitioning is allowed to run, and a message box shows the progress
of the execution against the time you specified. A completion message is displayed when the partitioning has executed successfully.
If you choose "Run at maintenance", the scheduled maintenance executes the partitioning. In this case, the partitioning is executed
before the actual data maintenance.
When the Partitioning feature is used, all maintenance activities are optimized to take advantage of partitioning. In particular:
Purging of raw sample data is done by dropping partitions.
Index maintenance is done on the last partition only (SQL Server only).
The default settings for partitioned maintenance work well for most installations running daily maintenance, but it is possible to override
them. See Override Default Partition Maintenance Settings for further details.
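With daily partitions, each RN table should hold roughly one partition per retained day (plus the extra partitions mentioned above). As an illustrative sketch for SQL Server, using the RN_QOS_DATA table-name pattern from this section, the partition count per table can be checked against the retention setting and the 1000-partition limit:

```sql
-- Count partitions per raw sample data table (heap or clustered index only)
SELECT t.name AS table_name,
       COUNT(*) AS partition_count
FROM sys.tables t
JOIN sys.partitions p
  ON p.object_id = t.object_id
 AND p.index_id IN (0, 1)
WHERE t.name LIKE 'RN_QOS_DATA%'
GROUP BY t.name
ORDER BY partition_count DESC;
```

A non-partitioned table reports a partition count of 1.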
The Database tab enables you to specify the database connection settings. The currently supported databases include:
Microsoft SQL Server
MySQL
Oracle
The database settings specified here are used by various other probes, such as dashboard_engine, discovery_server, sla_engine, wasp, and
dap.
Important: If you install CA Unified Infrastructure Management (UIM) and MS SQL Server on a nonstandard port, you must configure the
data_engine "server" parameter to include the port. Example: "<server>,<port>".
MySQL Options
Database vendor: Select the MySQL option.
Schema: Enter the database schema name.
Server Host: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Click the Test Connection button to check the current connection setup.
Oracle Options
Database vendor: Select the Oracle option.
Hostname: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Service Name: Enter the service name.
Click the Test Connection button to check the current connection setup.
Is Boolean: Is the data type a logical type? Example: host is available, printer is up.
Type: Different data types:
0 = Automatic (The sample value is read at fixed intervals, set individually for each probe).
1 = Asynchronous (The sample value is read only each time the value changes, and the new value is read).
The individual QoS definitions can be edited to have specific values:
1. Double-click a row in the Quality of Service tab to open the Override window.
2. Select "Override with" to specify a value to change the default QoS definitions.
3. Click OK, then click Apply to save your changes.
SQL Authentication on Microsoft SQL Server
If you are not using the System Administrator (sa) login and require SQL Authentication on Microsoft SQL Server, your user account must have
the following permissions:
The db_owner database role for the UIM database.
Read and update permissions on the master and tempdb system databases.
The serveradmin server role to create and execute stored procedures properly.
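The permissions above could be granted with T-SQL along the following lines. This is a sketch only: the login, user, and database names are placeholders (the UIM database name depends on your installation), the ALTER ROLE syntax requires SQL Server 2012 or later, and users created in tempdb do not survive an instance restart, so work with your DBA on a durable setup:

```sql
CREATE LOGIN uim_login WITH PASSWORD = 'Str0ngPassword';  -- avoid ";" in the password

USE CA_UIM;  -- placeholder for your UIM database name
CREATE USER uim_user FOR LOGIN uim_login;
ALTER ROLE db_owner ADD MEMBER uim_user;                  -- db_owner on the UIM database

USE master;
CREATE USER uim_user FOR LOGIN uim_login;
ALTER ROLE db_datareader ADD MEMBER uim_user;             -- read on master
ALTER ROLE db_datawriter ADD MEMBER uim_user;             -- update on master

USE tempdb;  -- note: tempdb users are dropped when the instance restarts
CREATE USER uim_user FOR LOGIN uim_login;
ALTER ROLE db_datareader ADD MEMBER uim_user;
ALTER ROLE db_datawriter ADD MEMBER uim_user;

ALTER SERVER ROLE serveradmin ADD MEMBER uim_login;       -- server-level serveradmin role
```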
This section provides a list of tasks we recommend you perform before selecting the Partition Data Tables option.
1. The partitioning process requires the default tablespace (for the interim table) to be 1.6 times larger than the largest data table to be
partitioned. In general, RN tables are the largest data tables in the database. To find the largest RN table, generate and run the following
script:
select
segment_name
table_name,
sum(bytes)/(1024*1024) table_size_meg
from
user_extents
where
(segment_type='TABLE' or segment_type='TABLE PARTITION')
and
segment_name like 'RN_QOS_DATA_%'
group by segment_name
order by table_size_meg desc;
2. Initial partitioning of the tables online also consumes disk space in the SYSTEM tablespace. The disk space required prior to partitioning
is approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the system tables by
generating and running the following script:
3. Initial partitioning of the tables also consumes storage in the TEMP tablespace. The free disk space required prior to partitioning is
approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the TEMP tablespace by
generating and running the following script:
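The product's scripts for steps 2 and 3 are not reproduced in this excerpt. As a generic illustration only (not the script the steps refer to), SYSTEM and TEMP tablespace usage can be checked with the standard Oracle dictionary views, given access to DBA-level views:

```sql
-- Free space in the SYSTEM tablespace
SELECT tablespace_name,
       ROUND(SUM(bytes) / (1024 * 1024)) AS free_meg
FROM dba_free_space
WHERE tablespace_name = 'SYSTEM'
GROUP BY tablespace_name;

-- Used and free space in the TEMP tablespace
SELECT tablespace_name,
       ROUND(SUM(bytes_used) / (1024 * 1024)) AS used_meg,
       ROUND(SUM(bytes_free) / (1024 * 1024)) AS free_meg
FROM v$temp_space_header
GROUP BY tablespace_name;
```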
MySQL
This section lets you configure the connection options for a MySQL database.
Schema: The database schema name.
Server Host: The database server name or IP address.
Port: The port number to connect to the database server. The default port number is 3306.
Username: The login user name.
Password: The login user password.
Note: The password cannot have any special characters, such as ";".
Microsoft SQL Server
Index Maintenance: Perform table re-indexing with other maintenance routines, which by default are executed every 24 hours.
Compression mode: The method that is used for data compression:
None: No compression occurs.
Page: Optimizes storage of multiple rows in a page, a super-set of row compression.
Row: Stores fixed-length data types in variable-length storage format.
Maintenance mode: How the indexes are maintained:
Dynamic: Maintenance is performed based on the index statistics.
Reorganize: Maintenance is performed using the "alter index ... reorganize" SQL Server script.
Rebuild: Maintenance is performed using the "alter index ... rebuild" SQL Server script.
Important for users of database partitioning: SQL Server 2012 and earlier can only reorganize the indexes for individual partitions
and cannot rebuild them. For this reason, if you have enabled partitioning for your UIM database, you should set maintenance mode to
Reorganize. However, if you have large tables in your environment, automatic indexing from the data_engine is discouraged, as index
maintenance may not complete in a reasonable amount of time. In this case, you can disable automatic indexing and work with your
DBA to design a custom solution to manage the index maintenance for your database.
Online mode: The effect of maintenance on concurrent use of the QoS tables:
Dynamic: The maintenance is determined by the edition of SQL Server. If SQL Server is the Enterprise Edition, then Online mode is
used for maintenance (if the chosen maintenance mode supports it); otherwise, Offline mode is used.
Online: The QoS tables are available for update and query during the data maintenance period. Online mode offers greater
concurrency but demands more resources.
Note: Setting maintenance mode to Rebuild, and setting online mode to Online requires SQL Server Enterprise Edition.
SQL Server Standard Edition and earlier does not support online index rebuilding. If you're unsure of the SQL Server edition
you're using, set online mode to Dynamic.
Offline: The QoS tables are unavailable for update and query during the data maintenance period. Benign alarms might be generated
for failure to insert QoS data during the maintenance period, but the data will be re-inserted by data_engine at a later time when the
tables are once again made available for updates.
Fragmentation level: Low threshold: If the fragmentation for an index is less than the low threshold percent value, then no maintenance
is performed.
Fragmentation level: High threshold: If dynamic maintenance mode is selected and fragmentation is between the low and high threshold
percentages, then the Reorganize mode is used; otherwise the Rebuild mode is used.
Note: This option is only available for Microsoft SQL Server Enterprise Edition. It is not available when using the Microsoft SQL Express
edition.
Index name pattern: The indexes that are maintained. The default is blank; a blank entry results in all indexes being considered for
maintenance.
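The maintenance modes described above correspond to standard SQL Server operations. As an illustrative sketch (the table name is a placeholder following the RN_QOS_DATA_xxxx pattern used elsewhere in this document):

```sql
-- Check fragmentation, comparable to the Low/High threshold logic above
SELECT i.name AS index_name,
       s.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
       DB_ID(), OBJECT_ID('RN_QOS_DATA_0001'), NULL, NULL, 'LIMITED') AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id
 AND i.index_id = s.index_id;

-- Reorganize mode: always an online operation
ALTER INDEX ALL ON RN_QOS_DATA_0001 REORGANIZE;

-- Rebuild mode: an online rebuild requires Enterprise Edition
ALTER INDEX ALL ON RN_QOS_DATA_0001 REBUILD WITH (ONLINE = ON);
```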
Oracle
Quality of Service
Navigation: data_engine > Quality of Service
The Quality of Service section displays the attributes for the QoS metrics.
Name: The QoS type name.
Description: Description of the QoS type.
QoS Group: The QoS group is a logical group to which the QoS belongs (optional).
Unit: The unit of the QoS data (the abbreviated form of the QoS data unit).
Has Max Value: The data type has an absolute maximum. For example, disk size or memory usage have absolute maximums.
Is Boolean: The data type is logical (yes/no). For example, a host is available/unavailable or printer is up/down.
Scheduler
Navigation: data_engine > Scheduler
This section allows you to schedule database maintenance.
Start time - Select either Now or a specific date and time. Selecting now begins the new database maintenance schedule immediately.
Ending - Select Never, After x occurrences, or By a specific date and time.
Recurring - select one of the following occurrence patterns:
Minutely
Hourly
Daily (including a specific time)
Weekly (including a specific time and days of the week)
Monthly (including occurrence, calendar day, and specific time)
Yearly (including month and specific time)
Prerequisites
The General Tab
Index Maintenance Properties (SQL Server only)
Partitioning of Raw Sample Data (SQL Server)
The Quality of Service Type Status Window
The Database Tab
Microsoft SQL Server Options
MySQL Options
Oracle Options
Important! It is not possible to rebuild the index for single partitions prior to SQLServer 2014. You can only reorganize
individual partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing may not
complete in a reasonable amount of time.
Index maintenance properties (SQL Server option only): Click the Advanced button to activate or deactivate the index maintenance
properties. For more details, go to the Index Maintenance Properties section. This option will not appear if you are using Oracle or
MySQL.
Partition data tables (SQL Server option only): Option to partition the data tables.
Note: This option is only available for Microsoft SQL Server Enterprise Edition. It is not available in the Microsoft SQL Express edition.
Log Setup: This section allows you to set the log level details.
Log Level: Sets the level of detail written to the log file. Log as little as possible during normal operation to minimize disk consumption,
and increase the level of detail when debugging.
Status: This button opens the Status window and provides the current work status of the probe. For more details, see the Status Window section.
Statistics: This section displays the current transmission rate for transferring QoS messages into the database.
Important! It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize individual
partitions. Performing automatic indexing for very large tables from the data_engine is strongly discouraged, as indexing may not
complete in a reasonable amount of time.
Online mode: Defines the effect of maintenance on concurrent use of the QoS tables.
Dynamic: The maintenance is determined by the edition of SQL Server. If SQL Server is the Enterprise Edition, then Online mode will be
used for maintenance (if the chosen maintenance mode supports it); otherwise, offline mode will be used.
Online: The QoS tables are available for update and query during the table maintenance period. Online mode offers greater concurrency
but requires more resources.
Offline: The QoS tables are unavailable for update and query during the table maintenance period.
Fragmentation Level: The fragmentation level information is used if Index name pattern is anything other than "All".
Low Threshold: If the fragmentation for an index is less than the low threshold percent value, then no maintenance will be performed.
High Threshold: If Dynamic maintenance mode is selected and fragmentation is between the low and high threshold percentages, then
the Reorganize mode will be used; otherwise the Rebuild mode will be used.
Index name pattern: The indexes that are maintained. The default is Blank (a blank entry results in all indexes being considered for
maintenance).
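The threshold logic above can be sketched as follows; the function name and the default threshold values are illustrative assumptions, not the probe's actual implementation:

```python
# Sketch of the Dynamic maintenance decision described above (assumed logic,
# not the actual data_engine code; threshold defaults are placeholders).
def maintenance_action(fragmentation_pct, low=10.0, high=30.0):
    """Return the index maintenance action for a given fragmentation level."""
    if fragmentation_pct < low:
        return "none"        # below Low Threshold: no maintenance is performed
    if fragmentation_pct <= high:
        return "reorganize"  # between thresholds: "alter index ... reorganize"
    return "rebuild"         # above High Threshold: "alter index ... rebuild"
```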
The sample data tables can be partitioned to achieve improved performance.
The sample data is partitioned by day. For example, if you have configured the system to delete raw sample data older than 365 days, each
sample data table (RN_QOS_DATA_xxxx) is configured with 365 partitions (plus a few extra partitions to allow for faster
maintenance).
SQL Server: If partitioning is used, the property Delete raw data older than must be between 1 and 900. SQL Server, up to and including 2008
SP1, limits a table to 1000 partitions.
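A small sketch of the retention constraint above (the spare-partition allowance is an assumed value standing in for the "few extra partitions" the probe creates):

```python
# Illustrative model of the SQL Server partition-count constraint: one
# partition per retained day, plus an assumed allowance of spare partitions,
# must fit under the 1000-partition table limit (SQL Server up to 2008 SP1).
def validate_raw_retention(days, max_partitions=1000, spare_partitions=10):
    if not 1 <= days <= 900:
        raise ValueError("Delete raw data older than must be between 1 and 900")
    total = days + spare_partitions  # partitions per RN_QOS_DATA_xxxx table
    assert total <= max_partitions
    return total
```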
The partitioning will contribute to improved performance when accessing the raw sample data tables:
higher insert rates
faster read access to data
faster data maintenance (delete/compress of sample data)
faster index maintenance
You enable or disable partitioning by checking or unchecking the Partition data tables checkbox in the data_engine configuration dialog.
If the state of the data_engine partitioning table changes, you are asked to choose how to execute the partitioning:
If you choose "Start now", you are asked to define how long the partitioning is allowed to run, and a message box shows the progress
of the execution against the length of time you specified. A completion message is displayed when the partitioning has executed
successfully.
You can let the scheduled maintenance execute the partitioning by choosing "Run at maintenance". In this case, the partitioning is
executed before the actual data maintenance.
When the Partitioning feature is used, all maintenance activities are optimized to take advantage of partitioning. In particular:
Purging of raw sample data will be done by dropping partitions
Index maintenance will be done on last partition only (SQL Server only)
While the default settings for partitioned maintenance will work well for most installations running daily maintenance, it is possible to override
these settings. Go to Override Default Partition Maintenance Settings for further details.
The Database tab enables you to specify the database connection settings. The currently supported databases include:
Microsoft SQL Server
MySQL
Oracle
The database settings specified here are used by various other probes, such as dashboard_engine, discovery_server, sla_engine, wasp, and
dap.
Important: If you install CA Unified Infrastructure Management (UIM) and MS-SQL on a nonstandard port you must configure the
data_engine "server" parameter to include the port. Example: "<server>,< port>".
MySQL Options
Database vendor: Select MySQL option.
Schema: Enter the database schema name.
Server Host: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Click the Test Connection button to check the current connection setup.
Oracle Options
Database vendor: Select Oracle option.
Hostname: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Service Name: Enter the service name.
Click the Test Connection button to check the current connection setup.
Prerequisites
SQL Authentication on Microsoft SQL Server
Oracle Database Prerequisite: Oracle Partitioning Option Required
Oracle Database Prerequisite: Verify Tablespace Before Partitioning an Oracle Database
Upgrade Prerequisites for Deployments with an Oracle Database
Using the Admin Console to Access the data_engine Configuration GUI
Change the Database Connection Properties
Configure the Data Retention Settings
Override the Data Retention Settings on Individual QoS Objects
Set up Index Maintenance (MS SQL Server and Oracle)
Set up Partitioning for Raw Sample Data (MS SQL Server and Oracle)
Schedule Database Maintenance
Prerequisites
If you are using SQL Authentication on Microsoft SQL Server, see SQL Authentication on Microsoft SQL Server.
Review Preparing to Partition Your Oracle Database to ensure that you have enabled the Oracle Partitioning Option and have enough free
disk space to create a partitioned interim table that is used during the partitioning process.
Before upgrading UIM Server 8.0 in deployments with an Oracle database, you must grant the appropriate user permissions. See Upgrade
Prerequisites for Deployments with an Oracle Database for details.
select
    segment_name table_name,
    sum(bytes)/(1024*1024) table_size_meg
from
    user_extents
where
    (segment_type='TABLE' or segment_type='TABLE PARTITION')
and
    segment_name like 'RN_QOS_DATA_%'
group by segment_name
order by table_size_meg desc;
2. Initial partitioning of the tables online also consumes disk space in the SYSTEM tablespace. The disk space required prior to partitioning
is approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the system tables by
generating and running the following script:
3. Initial partitioning of the tables also consumes storage in the TEMP tablespace. The free disk space required prior to partitioning is
approximately the same as that required by the largest unpartitioned RN table. Determine the utilization of the TEMP tablespace by
generating and running the following script:
If the upgrade is performed before you grant the appropriate user permissions, UIM Server errors will occur during scheduled maintenance for
data_engine. The following error message appears in the data_engine log file:
SPN_DE_DATAMAINT is invalid
To correct this situation, use the following procedure to grant the appropriate permissions and recompile the stored procedures.
Follow these steps:
1. Log in to the database server as SYSDBA and execute:
2. Use a tool such as Oracle SQL Developer to recompile the following stored procedures:
SPN_DE_DATAMAINT
SPN_DE_DATAMAINTDELETEOLDDATA
SPN_DE_UNPARTITIONTABLE
SPN_DE_PARTITIONTABLE
SPN_DE_UPDATEQOSDEFMETRIC
Set up Partitioning for Raw Sample Data (MS SQL Server and Oracle)
Set up Index Maintenance (MS SQL Server and Oracle)
To open the data_engine configuration GUI:
1. In the Admin Console navigation tree, click the down arrow next to the hub, then the robot that the data_engine probe resides on.
2. Click the down arrow next to the data_engine probe and select Configure.
You can set up Index Maintenance to improve the speed of data retrieval operations.
Follow these steps:
1. In the data_engine probe configuration menu, click the Database Configuration folder.
2. Select the Index Maintenance check box.
3. Change the desired Index Maintenance options. See Database Configuration for more information about the individual fields for each
database vendor.
Important! (Oracle) If you have partitioned your Oracle data tables, there is no need to select the Index Maintenance option.
4. Click Save.
Index Maintenance is performed during the next maintenance period.
Set up Partitioning for Raw Sample Data (MS SQL Server and Oracle)
You can set up partitioning to improve performance when accessing the raw sample data tables.
Follow these steps:
1. In the data_engine probe configuration menu, click the Database Configuration folder.
2. Select the Partition Data Tables check box.
3. Click Save.
Partitioning will be performed during the next maintenance period. The time required to execute the partitioning is dependent on both the amount
of data and the performance of the disk subsystem. Partitioning can take up to several days on especially large installations.
data_engine
Probe Information
General Configuration
Quality of Service Type Status
Database Configuration
MySQL
Microsoft SQL Server
Oracle
Quality of Service
Scheduler
To access the data_engine configuration interface, select the robot that the data_engine probe resides on in the Admin Console navigation pane.
In the Probes list, click the arrow to the left of the probe and select Configure.
data_engine
Navigation: data_engine
This section lets you view probe and QoS information, change the log level, and set data management values.
Probe Information
This section provides data regarding the QoS tables and is read-only.
Note: The status information is created based on statistics that are generated by the database provider. If incorrect information is
displayed, you might need to update the table statistics. For more information, see Out-of-date Information in the Quality of Service
Type Status Table.
Database Configuration
Important! The database connection properties should only be changed in limited circumstances such as recovery operations.
Changing the Database Vendor can cause connection issues. If you are changing database vendors, CA recommends reinstalling CA
Unified Infrastructure Management.
The Database Configuration section allows you to specify the database connection settings. These settings are different for each database
vendor:
MySQL
Microsoft SQL Server
Oracle
To test the connection for all vendors, select Actions > Test Connection at the top of the page.
MySQL
Index Maintenance: Perform table re-indexing with other maintenance routines, which by default are executed every 24 hours.
Compression mode: The method that is used for data compression:
None: No compression occurs.
Page: Optimizes storage of multiple rows in a page, a super-set of row compression.
Row: Stores fixed-length data types in variable-length storage format.
Maintenance mode: How the indexes are maintained:
Dynamic: Maintenance is performed based on the index statistics.
Reorganize: Maintenance is performed using the "alter index ... reorganize" SQL Server script.
Rebuild: Maintenance is performed using the "alter index ... rebuild" SQL Server script.
Important! It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize individual
partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing might not complete in a
reasonable amount of time.
Online mode: The effect of maintenance on concurrent use of the QoS tables:
Dynamic: The maintenance is determined by the edition of SQL Server. If SQL Server is the Enterprise Edition, then Online mode is
used for maintenance (if the chosen maintenance mode supports it); otherwise, Offline mode is used.
Online: The QoS tables are available for update and query during the data maintenance period. Online mode offers greater
concurrency but demands more resources.
Offline: The QoS tables are unavailable for update and query during the data maintenance period.
Fragmentation level: Low threshold: If the fragmentation for an index is less than the low threshold percent value, then no maintenance
is performed.
Fragmentation level: High threshold: If dynamic maintenance mode is selected and fragmentation is between the low and high threshold
percentages, then the Reorganize mode is used; otherwise the Rebuild mode is used.
Note: This option is only available for Microsoft SQL Server Enterprise Edition. It is not available when using the Microsoft SQL Express
edition.
Index name pattern: The indexes that are maintained. The default is Blank (a blank entry results in all indexes being considered for
maintenance).
Oracle
Quality of Service
Navigation: data_engine > Quality of Service
The Quality of Service section displays the attributes for the QoS metrics.
Name: The QoS type name.
Description: Description of the QoS type.
QoS Group: The QoS group is a logical group to which the QoS belongs (optional).
Unit: The unit of the QoS data (the abbreviated form of the QoS data unit).
Has Max Value: The data type has an absolute maximum. For example, disk size or memory usage have absolute maximums.
Is Boolean: The data type is logical (yes/no). For example, a host is available/unavailable or printer is up/down.
Type: Different data types:
0 = Automatic (The sample value is read at fixed intervals, which are set individually for each probe).
1 = Asynchronous (The sample value is read only when the value changes, and the new value is read).
Override Raw Age: Select this check box to override the raw age of the QoS metric.
Raw Age: The number of days you want to retain the QoS metric information.
Override History Age: Select this check box to override the history age for the QoS metric.
History Age: The number of days you want to retain the history information.
Override Daily Average Age: Select this check box to override the daily average age for the QoS metric.
Daily Average Age: The number of days you want to retain the daily average information.
Override Compression: Select this check box to override compression settings for data in RN and HN tables.
Compress: Raw data is summarized and aggregated into Hourly (or historic) data before it is deleted from the RN tables. This Hourly
data is then summarized and aggregated into Daily data before it is deleted from the HN tables.
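The raw-to-hourly-to-daily aggregation described above can be sketched roughly as follows; the use of plain averages and the sample values are assumptions for illustration, not the probe's actual compression algorithm:

```python
# Illustrative model of QoS data compression: raw (RN) samples are
# aggregated into hourly (HN) values, and hourly values into daily (DN)
# averages, before the finer-grained rows are deleted.
from statistics import mean

# two assumed raw samples per hour for one day
raw = {("2023-01-01", h): [1.0 * h, 2.0 * h] for h in range(24)}

# RN -> HN: average the samples within each hour
hourly = {key: mean(samples) for key, samples in raw.items()}

# HN -> DN: average the hourly values within each day
days = {d for d, _ in hourly}
daily = {day: mean(v for (d, _), v in hourly.items() if d == day)
         for day in days}
```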
Scheduler
Navigation: data_engine > Scheduler
This section allows you to schedule database maintenance.
Start time - Select either Now or a specific date and time. Selecting Now begins the new database maintenance schedule immediately.
Ending - Select Never, After x occurrences, or By a specific date and time.
Recurring - select one of the following occurrence patterns:
Minutely
Hourly
Daily (including a specific time)
Weekly (including a specific time and days of the week)
Monthly (including occurrence, calendar day, and specific time)
Yearly (including month and specific time)
Prerequisites
The General Tab
Index Maintenance Properties (SQL Server only)
Partitioning of Raw Sample Data (SQL Server)
The Quality of Service Type Status Window
The Database Tab
Microsoft SQL Server Options
MySQL Options
Oracle Options
The Quality of Service Tab
The Schedule Tab
Configure the data_engine by selecting the data_engine probe in Infrastructure Manager, then right-click and select Configure. This opens the
data_engine configuration dialog.
Important! It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize
individual partitions. Performing automatic indexing for large tables from the data_engine is discouraged, as indexing may not
complete in a reasonable amount of time.
Index maintenance properties (SQL Server option only): Click the Advanced button to activate or deactivate the index maintenance
properties. For more details, go to the Index Maintenance Properties section. This option will not appear if you are using Oracle or
MySQL.
Partition data tables (SQL Server option only): Option to partition the data tables.
Note: This option is only available for Microsoft SQL Server Enterprise Edition. It is not available in the Microsoft SQL Express
edition.
Log Setup: This section allows you to set the log level details.
Log Level: Sets the level of detail written to the log file. Log as little as possible during normal operation to minimize disk consumption,
and increase the level of detail when debugging.
Status: This button opens the Status window and provides the current work status of the probe. For more details, see the Status Window section.
Statistics: This section displays the current transmission rate for transferring QoS messages into the database.
Note: It is not possible to rebuild the index for single partitions prior to SQL Server 2014. You can only reorganize individual
partitions. Performing automatic indexing for very large tables from the data_engine is strongly discouraged, as indexing may
not complete in a reasonable amount of time.
Online mode
Defines the effect of maintenance on concurrent use of the QoS tables.
Note: The online maintenance option is not supported if partitioning is enabled on SQL Server.
The sample data tables can be partitioned to achieve improved performance.
The sample data is partitioned by day. For example, if you have configured the system to delete raw sample data older than 365 days, each
sample data table (RN_QOS_DATA_xxxx) is configured with 365 partitions (plus a few extra partitions to allow for faster
maintenance).
SQL Server: If partitioning is used, the property Delete raw data older than must be between 1 and 900. SQL Server, up to and including 2008
SP1, limits a table to 1000 partitions.
The partitioning will contribute to improved performance when accessing the raw sample data tables:
higher insert rates
faster read access to data
faster data maintenance (delete/compress of sample data)
faster index maintenance
You enable or disable partitioning by checking or unchecking the Partition data tables checkbox in the data_engine configuration dialog.
If the state of the data_engine partitioning table changes, you are asked to choose how to execute the partitioning:
If you choose "Start now", you are asked to define how long the partitioning is allowed to run, and a message box shows the progress
of the execution against the length of time you specified. A completion message is displayed when the partitioning has executed
successfully.
You can let the scheduled maintenance execute the partitioning by choosing "Run at maintenance". In this case, the partitioning is
executed before the actual data maintenance.
When the Partitioning feature is used, all maintenance activities are optimized to take advantage of partitioning. In particular:
Purging of raw sample data will be done by dropping partitions
Index maintenance will be done on last partition only (SQL Server only)
While the default settings for partitioned maintenance will work well for most installations running daily maintenance, it is possible to override
these settings. Go to Override Default Partition Maintenance Settings for further details.
The Database tab enables you to specify the database connection settings. The currently supported databases include:
Microsoft SQL Server
MySQL
Oracle
The database settings specified here are used by various other probes, such as dashboard_engine, discovery_server, sla_engine, wasp, and
dap.
Microsoft SQL Server Options
Database vendor: Select the database vendor. For MS-SQL, select the Microsoft option.
Provider: The ADO provider used for the database connection.
Initial Catalog: The database name.
Data Source: The database server.
User ID: The login user.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Parameters: Other parameters for the OLEDB connection.
Click the Test Connection button to check the current connection setup.
Important: If you install CA Unified Infrastructure Management (UIM) and MS-SQL on a nonstandard port you must configure the
data_engine "server" parameter to include the port. Example: "<server>,< port>".
MySQL Options
Database vendor: Select MySQL option.
Schema: Enter the database schema name.
Server Host: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Click the Test Connection button to check the current connection setup.
Oracle Options
Database vendor: Select Oracle option.
Hostname: Enter the database server name or IP address.
Port: Enter the port number to connect to the database server.
Username: The login user name.
Password: The login user's password. Ensure that the password does NOT contain any special characters (such as ";").
Service Name: Enter the service name.
Click the Test Connection button to check the current connection setup.
Configuration Parameters
bucket_flush_size = 5000
Number of messages allowed in a bucket before it is flushed
bucket_flush_time = 5
Number of seconds before flushing a bucket
dispatcher_time = 1000
Number of milliseconds to spend in dispatcher loop
hub_bulk_size = 20
Number of messages to receive each time from hub
thread_count_insert = 0
Number of preallocated threads used to commit data to the database. If this key is zero or not present, serial mode is used. If this
number is 1 or higher, it indicates the number of threads that will be preallocated for the commit task.
queue_limit_total = 100000
This number indicates how many messages data_engine is allowed to keep in memory before it suspends reading messages
from the hub. This configuration is used in case committing to the database server starts to fall behind.
Note: The more rows you have in memory, the more data will potentially be lost if there is a power surge or data_engine crashes for an
unknown reason.
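Taken together, the bucket parameters above imply a flush rule along these lines; this is an illustrative model, not the probe's actual code:

```python
# Assumed flush rule: a bucket is committed to the database when it reaches
# bucket_flush_size messages, or when it has been open for at least
# bucket_flush_time seconds (defaults taken from the parameter list above).
BUCKET_FLUSH_SIZE = 5000   # messages
BUCKET_FLUSH_TIME = 5      # seconds

def should_flush(message_count, bucket_age_seconds):
    return (message_count >= BUCKET_FLUSH_SIZE
            or bucket_age_seconds >= BUCKET_FLUSH_TIME)
```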
Viewing Output
The text file is updated continuously after each QoS thread cycle, that is, after one round of dispatching messages and one round of iterating RN
buckets to commit data to the database.
You may want to use a utility that can tail files; otherwise, you need to close and reopen the file to see new output.
5. Time callback: The number of milliseconds data_engine used to validate and sort QoS messages into RN bulk objects.
Note: The callback time is included as part of the "dispatcher time", which covers both receiving and validating/sorting
messages.
6. Time commit: The number of milliseconds data_engine used during the last iteration to bulk commit data to the database server.
7. Time total: The total time in milliseconds data_engine spent both dispatching and committing data to the database server. This also
includes the time data_engine used to check whether the connection was still up.
8. Rows committed: The number of rows committed during the last iteration.
9. Lists flushed: The number of RN bulk objects containing data that were just flushed to the database server.
10. Lists flushed size: The number of lists (of the total lists flushed) that were flushed because they exceeded the number of messages
needed before a bulk insert is conducted.
11. Lists flushed time: The number of lists (of the total lists flushed) that were flushed because the age of the data in the lists exceeded the
allowed delay before committing to the database server.
To turn off the logging, call the log_statistics callback again and pass in the parameter log_stats = no.
The output of the log might look something like this for parallel mode:
Start: 2011-01-24 12:17:47 queue: data_engine flags: Unpack, Process, Sort, Commit,
time                | used ms | hub/s proc/s com/s | mem | com
--------------------------------------------------------------
2011-01-24 14:44:51 |     999 |     0     14     0 |  19 |
2011-01-24 14:44:52 |    1000 |     0      0     0 |  19 |
2011-01-24 14:44:53 |    1005 |     0      0    18 |   0 |
2011-01-24 14:44:54 |     999 |     0      0     0 |   0 |
2011-01-24 14:44:55 |    1000 |     0      0     0 |   0 |
2011-01-24 14:44:56 |    1000 |     0      0     0 |   0 |
Thread Count
Multi-threading is not enabled by default in the data_engine probe. To increase data_engine performance, you can increase the number of
threads for the thread_count_insert parameter in Raw Configure. The optimum thread count is highly dependent on several factors, including:
The number of CPUs running on the system.
The number of RN tables in the UIM database.
The size of your CA UIM deployment.
data_engine Troubleshooting
This article provides troubleshooting information for issues you might encounter while upgrading, configuring, or using different versions of the
data_engine probe.
Solution:
The index names for the BN and RN tables used by older versions of data_engine (versions 7.6 and earlier) do not match the index names for the
index tables used by the data_engine v8.0 and later. If you partition the Oracle database after an upgrade, the index names for the BN and RN
tables are updated. However, if you do not partition the Oracle database after upgrading to a newer version of CA UIM, the old index names
remain for the BN and RN tables and data maintenance fails. To correct this issue, run the following stored procedures on your Oracle database.
Script to change IDX2 to IDX0 for all the RN tables:
Data Maintenance on an Oracle Database Doesn't Work After Upgrading From NMS 7.6
Symptom:
(Oracle databases only) I upgraded from NMS 7.6 to a newer version of CA UIM. When I attempt to partition my Oracle database, data
maintenance is not performed and I receive the following error in the data_engine log file:
ORA-01031: insufficient privileges
Solution:
Execute the grant command as sysdba before or after upgrading from NMS 7.6 to CA UIM 8.0 or later.
Follow these steps:
1. Log in to the database server as SYSDBA using a tool such as Oracle SQL Developer and execute:
grant
grant
grant
grant
grant
grant
Note: When you do an upgrade from a fresh install of CA UIM 8.0 or later, the appropriate user permissions to an Oracle
database are granted and it is not necessary to run the grant command.
2. Execute the tool in report mode (-r flag set) with the Java version installed with UIM Server (formerly named NM Server):
3. The patch utility scans the S_QOS_DEF_SKIP_UNIT table in the database and finds QoS Definitions that are suspected to be corrupt.
Important! The S_QOS_DEF_SKIP_UNIT table holds QoS definition values that should not override what is sent by a QoS
probe. This table is pre-populated with three values: variant, none, and user defined.
If you have defined additional custom values, which have been incorrectly overridden by a previous data_engine version, add these
values as new rows to the S_QOS_DEF_SKIP_UNIT table prior to step 5 below, so that the patch utility will find, report, and fix them
as well. (Use standard database management tools to connect to the NIS database and add the rows and new values to the
S_QOS_DEF_SKIP_UNIT table.)
The report generated by the utility shows corrupt QoS Definitions (if any) and the probe or probes associated with that data. Here is an
excerpt from an example report:
List current problems...
ANALYZE table RN_QOS_DATA_XXXX COMPUTE STATISTICS; - The entire table is analyzed using a full table scan and stored in the
data dictionary.
Using the ANALYZE command in MySQL can be a time-consuming operation, especially for large databases. Run the command only
sporadically, and do not use it in automated maintenance tasks.
In MySQL, the ANALYZE command holds a read lock on tables, which can negatively impact database performance.
For more information, refer to the following MySQL documentation:
http://dev.mysql.com/doc/refman/5.5/en/tables-table.html
http://dev.mysql.com/doc/refman/5.6/en/analyze-table.html
statistics_age (default value: 24)
Time in hours. When the stored procedure is called, statistics that are older than this number of hours are updated. This value is used
by the stored procedure, not by data_engine itself.
If this number is set to 0 (zero), statistics are disabled and are not run at all by the data_engine.
statistics_pattern (default value: RN_QOS_DATA%)
statistics_loglevel
statistics_time_pattern (default value: <not set>)
The scheduling string that determines when to run statistics. If this key is empty or not set, the same schedule that is defined for data
management is used. This means the statistics are run when data_engine has finished index maintenance and data management.
If this value is set to a different schedule, the statistics are updated independently of when data management is scheduled.
The string is used by the calendar scheduling library, which is used by various UIM components. It supports RFC2445. See the short
example below.
Some string examples that are copied from the library help file.
/**********************************************************************
** nimCalCreate - Creates a handle to a nimCal structure
**
** PARAMETERS:
**   char *pattern - RFC2445, 'weekly' or 'dts'
**   char *start   - startdate: yyyy-mm-dd [hh:mm:ss] || NULL
**                   : weekly format 1 or 2
**
**   start = 'yyyy-mm-dd [hh:mm:ss]' will expect the 'pattern' to
**           comply with RFC2445.
**         = NULL results in setting start to 'now'
**   e.g.
**     h = nimCalCreate("DTSTART:19970610T090000|RRULE:FREQ=YEARLY;COUNT=10", NULL);
**     h = nimCalCreate("DTSTART:19970610T090000|RRULE:FREQ=YEARLY;COUNT=10", "2007-07-25");
**
** pattern = 'weekly' handles two 'start' formats:
**   1. 0123,10:00,14:00 [,NOT]                    (old NAS format)
**   2. MO=12:00-14:00,15:30-17:00;TU=08:00-16:00  (new, allow 8-16)
**
**     h = nimCalCreate("weekly", "012,10:00,14:00");
**     h = nimCalCreate("weekly", "MO=12:00-14:00,15:30-17:00;TU=08:00-16:00");
**     h = nimCalCreate("dts", "2007-08-20 08:00,2007-08-27 08:00,2007-09-03 08:00,2007-09-10");
**
** Note: Free the handle using nimCalFree.
**********************************************************************/
You can also create a schedule in nas and use the resulting string from there or use data_engine scheduler to create a string.
SELECT sqltext.TEXT,
req.session_id,
req.status,
req.command,
req.cpu_time,
req.total_elapsed_time
FROM sys.dm_exec_requests req
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sqltext;
If the results display multiple "CREATE nonclustered index" statements, you have more than one partitioning job running.
To stop a partitioning job:
KILL [session_id]
The DB2 Database Monitoring probe supports DB2 versions 10.1 and 10.5 only.
The probe monitors around 280 DB2 snapshots and statistic counters, and calculates values such as:
i_agents_created_ratio
i_piped_sorts_rejected
db_pool_hit_ratio
db_avg_sort_time
db_pct_sort_overflows
db_avg_sort_heap
db_pct_hjs_overflows
db_pool_sync_reads
db_pool_sync_writes
db_pool_sync_idx_writes
db_pool_sync_idx_reads
db_pool_avg_async_read_time
db_pool_avg_async_write_time
db_pool_sync_write_time
db_pool_avg_write_time
db_avg_direct_read_time
db_avg_direct_write_time
db_cat_cache_hit_rto
app_avg_sort_time
app_pct_sort_overflows
app_pool_hit_ratio
app_avg_direct_read_time
app_avg_direct_write_time
app_cat_cache_hit_rto
app_pkg_cache_hit_rto
app_locklist_util
bp_pool_hit_ratio
bp_pool_avg_async_read_time
bp_pool_avg_async_write_time
bp_pool_sync_write_time
bp_pool_avg_write_time
bp_avg_direct_read_time
bp_avg_direct_write_time
bp_pool_sync_reads
bp_pool_sync_writes
bp_pool_sync_idx_writes
bp_pool_sync_idx_reads
ts_usable_pages_pct
ts_used_pages_pct
ts_free_pages_pct
ts_max_used_pages_pct
i_pct_active_connections
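The derived values above are computed from raw DB2 snapshot counters. As an illustration only (not the probe's exact formula), a buffer-pool hit ratio such as db_pool_hit_ratio is conventionally the percentage of page requests served from the pool without a physical read; the function name and inputs below are hypothetical:

```python
def pool_hit_ratio(logical_reads: int, physical_reads: int) -> float:
    """Buffer pool hit ratio as a percentage: the share of page
    requests satisfied from the pool without a physical read."""
    if logical_reads == 0:
        return 0.0
    return 100.0 * (1.0 - physical_reads / logical_reads)

# 9,500 of 10,000 page requests served directly from the buffer pool
print(round(pool_hit_ratio(10_000, 500), 1))  # 95.0
```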
More information:
db2 (DB2 Database Monitoring) Release Notes
Contents
Verify Prerequisites
Set up General Properties
Create a Connection
Test the Connection
Create a Profile
Add Profile Checkpoints
Configure Checkpoints
Alarm Thresholds
Create Custom Checkpoints
Add Exclude Patterns
Use Regular Expressions
Verify Prerequisites
Verify that required hardware, software, and related information is available before you configure the probe. For more information, see db2 (DB2
Database Monitoring) Release Notes.
Set up General Properties
1. Select the db2 node and specify the following values as required:
Clear Alarm on Restart Only: enables you to clear all alarms when the probe restarts.
Default: Selected
Alarm Severity Filter: sets a filter on severity levels of an event that can become potential alarms. For example, as a database
administrator, you want to pass important events on to the operations center or help desk, so that the event can trigger emails. The
Alarm Severity Filter considers the events matching or exceeding the selected severity level as alarms. If you select major, then only
the messages with severity level major and above are considered as alarms.
Default: clear
You can select from the following options:
clear
information
warning
minor
major
critical
Log Size(KB): specifies the maximum size of the probe log file.
Default: 100
LogLevel: specifies the level of details that are written to the log file. Log as little as possible during normal operations to minimize the
disk consumption, and increase the amount of detail when debugging.
Default: 2 - High
QoS V2 Compatibility: allows you to provide backward compatibility to the V2 framework.
Default: Not Selected
All Database Status: enables you to retrieve status of all database instances except the default database.
Default: Not Selected
2. Click Save.
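The Alarm Severity Filter behavior described above can be sketched as an ordered comparison: an event becomes an alarm only when its severity matches or exceeds the configured level. A minimal sketch, with hypothetical function and variable names:

```python
# Severity levels in ascending order, as listed in the probe configuration.
SEVERITY_ORDER = ["clear", "information", "warning", "minor", "major", "critical"]

def passes_filter(event_severity: str, filter_level: str) -> bool:
    """An event becomes a potential alarm only if its severity matches
    or exceeds the configured Alarm Severity Filter level."""
    return SEVERITY_ORDER.index(event_severity) >= SEVERITY_ORDER.index(filter_level)

print(passes_filter("critical", "major"))  # True
print(passes_filter("warning", "major"))   # False
```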
Create a Connection
A connection can be used by more than one monitoring profile. The probe allows you to define connections that help you monitor the required
database instances.
Follow these steps:
1. Click the Options icon next to the db2 node in the navigation pane.
2. Click Create new connection option.
3. Enter a name of the connection in the Connection Name field.
4. Specify the User ID and password for connecting to the monitoring server.
5. Specify the DB Default Details as required:
DB Default Name(Manually): allows you to manually specify the database instance node and the associated database node to which you
want to connect.
DB Default Name: allows you to select the database instance node and the associated database node to which you want to connect.
6. Set or modify the following values, as required:
Retry Attempts: allows you to specify the number of attempts made by the probe to connect in case of connection failure.
Default: 3
Retry Delay: allows you to specify the time that the probe waits between two connection attempts.
Default: 5
Retry Delay(Units): allows you to specify the unit to measure the retry delay value.
Default: sec
Timeout: allows you to specify the time after which the probe considers the connection attempt failed.
Default: 0
Timeout(Units): allows you to specify the timeout unit.
Default: sec
7. Click Submit.
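The Retry Attempts and Retry Delay settings above describe a standard retry loop. The sketch below illustrates that behavior under stated assumptions (the `connect` callable and the use of `ConnectionError` are hypothetical, not the probe's actual API):

```python
import time

def connect_with_retry(connect, retry_attempts=3, retry_delay=5):
    """Make an initial connection attempt plus up to `retry_attempts`
    retries, sleeping `retry_delay` seconds between attempts."""
    last_error = None
    for attempt in range(retry_attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < retry_attempts:
                time.sleep(retry_delay)
    raise last_error
```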
Test the Connection
Navigate to the db2 > Connection name > DB2 > Connection name node. Click Actions > Test Connection in the Connection Name node. If the
connection is successful, it returns the instance name and database name. If the connection is unsuccessful, the probe displays a failure
message.
Create a Profile
You can create multiple profiles that monitor the DB2 server performance using various parameters. The profile runs on the DB2 server using the
connections that you have defined.
Follow these steps:
1. Navigate to db2 > Connection name > DB2 > Connection name node and click the Options (icon).
2. Click the Create new profile option.
3. Enter a unique name of the profile in the Name field.
4. Select the connection to be used in the profile from the Connection field.
5. Click Submit.
The profile is saved.
Note: The probe adds default checkpoints to the profile, however, they are inactive by default. For more information about activating the
required checkpoint, see the Add Profile Checkpoints section.
Configure Checkpoints
You can configure the properties of a checkpoint to monitor DB2 server performance and detect unwanted events. The checkpoint monitors the
corresponding real-time object in the DB2 server. When the configured profile runs on the DB2 server, the set criteria for the checkpoints are
scanned against the real-time objects. If any unwanted event occurs, the probe generates alarms or QoS messages.
Follow these steps:
1. Navigate to the checkpoint name node under the required profile.
Note: You can also configure a checkpoint from db2 > Checkpoints > checkpoint name.
All changes that are made to a checkpoint are applied to all instances of the checkpoint in the configured profiles.
Check Interval: allows you to specify the time interval at which the DB2 database is scanned.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Check Interval(Units): allows you to specify the check interval unit. You can select any of the following units:
sec
min
hrs
Samples: allows you to specify the number of samples to calculate an average value. This average value is compared to the specified
alarm threshold.
Default: 1
When the probe starts, the average value from the number of available samples is calculated. For example, if you specify the Samples
value as 3, the probe performs the following actions:
uses the first sample value in the first interval
uses the average of samples 1 and 2 in the second interval
uses the average of samples 1, 2, and 3 in the third and subsequent intervals
Clear message: allows you to specify the message name that is used for the clear alarm.
Clear severity: enables you to select the severity that is used for generating the clear alarm message.
Default: clear
You can select from the following options:
clear
information
warning
minor
major
critical
Use Exclude: enables the exclude list, which defines objects that are excluded from monitoring.
4. Click Save to save the checkpoint configuration details.
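The Samples behavior described above is a rolling average over however many samples exist so far, and that average is what gets compared to the alarm threshold. A minimal sketch (class and method names are illustrative, not the probe's implementation):

```python
from collections import deque

class SampleAverager:
    """Keeps the last `samples` values; the value compared against the
    alarm threshold is the average of the samples available so far."""
    def __init__(self, samples: int):
        self.window = deque(maxlen=samples)

    def add(self, value: float) -> float:
        self.window.append(value)
        return sum(self.window) / len(self.window)

avg = SampleAverager(samples=3)
print(avg.add(10))  # 10.0 - first interval uses sample 1 alone
print(avg.add(20))  # 15.0 - second interval averages samples 1 and 2
print(avg.add(30))  # 20.0 - third interval averages samples 1-3
```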
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when the threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
9. Click Submit.
Note: This test is possible only for running active profiles.
6. Click Save.
The exclude pattern is added to the checkpoint.
db2 Node
Checkpoints Node
<Checkpoint Name> Node
Monitors Node
Connection <Connection Name> Node
<Connection Name> Node
<Profile Name> Node
db2 Node
The db2 node contains the configuration details specific to the db2 probe. This node lets you view the probe information and configure the general
properties of the probe. You can also view a list of all alarm messages that are available in the probe.
Navigation: db2
Set or modify the following values as required:
Checkpoints Node
The Checkpoints node lets you create a checkpoint to monitor the DB2 database. The default checkpoints are in this node, and you can also
create custom checkpoints.
Navigation: db2 > Checkpoints
Set or modify the following values as required:
Checkpoint > Options icon > Create New Checkpoint
This section lets you create a checkpoint.
Name: specifies the name of the checkpoint.
Connection Name: specifies the name of the connection.
Query File: defines the query file name where the query is stored.
Query: defines the query for creating the checkpoint.
Interval Modus: subtracts the variable value from the value that is generated at the end of the interval.
If you select Interval Modus, the probe subtracts the variable value at the beginning of an interval from the value found at the end.
The probe uses this result for checking alarms and adds $column_interval_value.i to the list of message variables for each column
(threshold object). This list can be viewed in the Edit Threshold dialog box.
If you do not select the Interval Modus check box, the value of the variable as returned from the query is used for checking and
generating QoS.
Default: Not selected
Check Interval: specifies the time interval at which the DB2 database is scanned.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Check Interval(Units): specifies the check interval unit. You can select any of the following units:
sec
min
hrs
Samples: specifies the number of samples to calculate an average value. This average value is compared to the specified alarm
threshold.
Default: 1
When the probe starts, the average value from the number of available samples is calculated. For example, if you specify the Samples
value as 3, the probe performs the following actions:
uses the first sample value in the first interval
uses the average of samples 1 and 2 in the second interval
uses the average of samples 1, 2, and 3 in the third and subsequent intervals
Clear severity: enables you to select the severity that is used for generating the clear alarm message.
Default: clear
You can select from the following options:
clear
information
warning
minor
major
critical
Clear message: specifies the message name that is used for the clear alarm.
Use Exclude: enables the exclude list, which defines objects that are excluded from monitoring.
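The Interval Modus option described above replaces a raw cumulative counter with its change over one check interval (end value minus start value). A one-line sketch, with hypothetical names:

```python
def interval_modus_value(start_of_interval: float, end_of_interval: float) -> float:
    """With Interval Modus enabled, the probe evaluates the delta over
    the interval rather than the raw counter value itself."""
    return end_of_interval - start_of_interval

# A cumulative counter read at the beginning and end of one check interval:
print(interval_modus_value(1200, 1260))  # 60 units occurred during the interval
```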
checkpoint name > Exclude List
This section lets you add an exclude list, which defines objects that you do not want to monitor.
Exclude Pattern: defines a regular expression on which the exclude functionality works.
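An exclude pattern filters monitored objects by regular expression, as described above. A minimal sketch of that filtering (the tablespace names are hypothetical examples):

```python
import re

def apply_exclude(objects, exclude_pattern):
    """Drop every monitored object whose name matches the exclude pattern."""
    rx = re.compile(exclude_pattern)
    return [name for name in objects if not rx.search(name)]

tablespaces = ["USERSPACE1", "TEMPSPACE1", "SYSCATSPACE"]
print(apply_exclude(tablespaces, r"^TEMP"))  # ['USERSPACE1', 'SYSCATSPACE']
```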
checkpoint name > Checkpoint Config
This section lets you configure the checkpoint.
Query File: defines the query file name where the query is stored.
Query: defines the query statement.
Interval Modus/ Report Per Second Metric: subtracts the variable value from the value that is generated at the end of the interval.
Note: This section appears only for the custom checkpoints.
Monitors Node
The Monitors node lets you view and define the QoS and thresholds for monitoring the checkpoint.
Navigation: db2 > Checkpoints > checkpoint name > Monitors
Set or modify the following values as required:
Monitors > Quality of Service
This section lets you configure the QoS for the checkpoint.
Description: defines the additional information about the QoS.
Unit: defines the QoS unit.
Abbreviation: defines the QoS abbreviation.
Metric: specifies the metric to be used for the checkpoint QoS.
Max value: specifies the maximum value of the QoS.
Object: specifies the name of the database object that you want to monitor.
Monitors > Threshold
This section lets you configure threshold values for a checkpoint.
Threshold object name: displays the name of the threshold object. The probe uses this field value as default for a checkpoint.
Threshold value: specifies the value that is used for threshold evaluation.
Severity: specifies the severity level of the alarm message that is generated for the threshold.
Message: specifies the message name used.
Message Text: specifies the message text containing variables that are replaced at run time. If the message text is changed from a
profile list, you must create a new message.
Note: The field descriptions are the same as described in Create New Connection section in the db2 node.
Heartbeat: specifies the interval at which all profile checkpoint schedules are tested and executed.
Note: This number is a common denominator to all used check interval values. The higher the value, the lower the profile
overhead.
Heartbeat(Units): specifies the heartbeat unit.
Default: sec
Connection: specifies the connection that is used in the profile. The drop-down list displays the list of connections that you have defined
to connect to the DB2 database.
Check Interval: specifies the time interval at which the DB2 database is scanned. Reduce this interval to generate alarms frequently.
Default: 1
Check Interval(Units): specifies the check interval unit.
Default: min
Clear message: specifies the message name that is used for the timeout clear alarm.
Default: p_timeout_1
SQL Timeout: specifies the SQL query timeout. Every checkpoint query runs asynchronously. In case the query reaches the SQL
timeout, the checkpoint processing is terminated. The probe starts processing the next checkpoint and issues an alarm message.
Default: 15
SQL Timeout(Units): specifies the SQL Timeout unit.
Default: sec
Message: specifies the message name that is used for SQL timeout alarm.
Default: sql_timeout_1
Profile Timeout: specifies the maximum processing time for all checkpoints in the profile. If this timeout is reached, the interval
processing finishes, the probe waits for the next heartbeat to evaluate any checkpoint schedules, and an alarm message is issued.
Default: 10
Profile Timeout(Units): specifies the profile timeout unit.
Default: min
Message: specifies the profile timeout message.
Default: p_timeout_1
Timeout Severity: defines the severity for timeout messages.
Default: major
Alarm Source: overrides the source name of the alarm on the Unified Service Management (USM). If you do not specify a value, the robot
IP address is used.
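The SQL Timeout field above says each checkpoint query runs asynchronously and is abandoned when it exceeds the timeout, after which the probe alarms and moves on. A sketch of that pattern using a worker thread (the function name and query callable are hypothetical, not the probe's code):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def run_checkpoint_query(query_fn, sql_timeout):
    """Run one checkpoint query asynchronously and give up if it
    exceeds the SQL timeout, so the next checkpoint can proceed."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(query_fn)
        try:
            return future.result(timeout=sql_timeout)
        except TimeoutError:
            # here the probe would issue the sql_timeout alarm message
            raise

print(run_checkpoint_query(lambda: "row", sql_timeout=5))
```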
You can configure the db2 probe by defining various profiles and connections to the DB2 server. The connections that you set up are used for
running monitoring profiles on the DB2 server. These profiles monitor real-time events occurring in the server using various checkpoints. You can
select the required checkpoints and activate the profile that notifies you when any unwanted event occurs in the server. The probe provides
predefined checkpoints that you can customize. For example, you can create a profile that uses the db_status checkpoint to monitor the status of
a database at a given time. You can also create custom checkpoints.
These profiles can also be configured to generate alarm and QoS messages.
The following diagram outlines the process to configure the db2 probe.
Contents
Verify Prerequisites
Set up General Properties
Create a Connection
Create a Profile
Add Profile Checkpoints
Configure Checkpoints
Create Custom Checkpoints
Configure Alarm Thresholds
Define Schedules
View Profile Status
Create a Group
Use Regular Expressions
Verify Prerequisites
Verify that required hardware, software, and related information is available before you configure the probe. For more information, see db2 (DB2
Database Monitoring) Release Notes.
Note: The probe verifies the events matching the selected severity level only when you disable the Generate status only field.
Note: The Status Auto-Update value is saved in the configuration file but the checkbox is cleared when you restart the probe.
Create a Connection
A connection can be used by more than one monitoring profile. You can create separate connections to access the Data Source Name (DSN) or
DB2 instances, as required.
Follow these steps:
4. Click the button adjacent to the Instance node field and double-click the required instance that you want to monitor from the Node List dialog.
The Node List dialog displays the list of instance nodes available for the DB2 connection.
5. Click the button adjacent to the Default DB name field and double-click the required database that you want to monitor from the Database List dialog.
The Database List dialog displays the list of databases associated with the selected instance node.
6. Specify values for the following fields:
DSN: select to specify the ODBC Data Source Name. The probe enables only the Password field when you select this check box.
See Define an ODBC Connection for more information on how to create an ODBC connection.
Description: specifies additional information about the connection.
Retry attempts: specifies the number of attempts that the probe makes to connect in case of connection failure.
Retry delay: specifies the time period for which the probe waits between two connection attempts.
Timeout: specifies the connection timeout value.
7. Click Test to verify the status of connection. If the connection is successful, it returns the instance name and database name. If the
connection is unsuccessful, the probe displays an error message.
8. Click OK.
The new connection is added to the Connections tab.
Create a Profile
You can create multiple profiles that monitor the DB2 server to verify that the performance always remains optimal.
Follow these steps:
1. Right-click in the Profiles tab and select New.
2. Enter a unique name for the profile in Add New Profile dialog and click OK.
The New Profile[Profile Name] dialog appears.
3. Set or modify the following values, as required:
Description: specifies additional information about the profile.
Heartbeat: specifies the interval at which all profile checkpoint schedules are tested and executed. This number is a common
denominator to all used check interval values. The higher the value, the lower is the profile overhead.
Default: 1 second
Connection: specifies the connection used in this profile. The drop-down lists all the connections displayed in the Connections tab.
Check Interval: specifies the time interval at which the DB2 server is scanned.
Default: 5 minutes
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Clear message: specifies the message name used for clear alarm.
SQL Timeout: specifies the SQL query timeout. Every checkpoint query runs asynchronously. In case the query reaches the SQL
timeout, the checkpoint processing is terminated. The probe starts processing the next checkpoint and issues an alarm message.
Default: 30 seconds
Note: On every check interval, the SQL timeout alert is first cleared based on the query of the particular checkpoint. The
SQL timeout alert is thrown again if the issue remains.
Message: specifies the message name used for SQL timeout alarm.
Profile Timeout: specifies the maximum processing time for all checkpoints in the profile. If this timeout is reached, the interval
processing is done. The probe waits for next heartbeat to evaluate any checkpoint schedules and issues an alarm message.
Default: 15 minutes
Message: specifies the message name used for profile timeout alarm.
Timeout severity: specifies the severity for timeout messages.
Group: specifies the name of the group. When you select a group, all the checkpoints enabled under the group become available for
monitoring. The drop-down lists the defined groups from the Groups tab.
Alarm Source: overrides the source name of the alarm on the Unified Service Management. If you do not specify a value, robot IP
address is used.
Suspended/Resumed: specifies the profile state as running or suspended. This indicator is green when the profile is active. The
indicator changes to yellow when the profile is suspended and to black when deactivated.
4. Select the required checkpoints that you want to add from the Profile Checkpoints section.
5. Click OK to save the profile for monitoring.
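The Heartbeat and Check Interval fields above interact: the heartbeat is a common denominator of all check intervals, and on each heartbeat tick the probe evaluates which checkpoints are due. A sketch under that assumption (names and intervals are illustrative):

```python
def due_checkpoints(elapsed_seconds, checkpoints):
    """On a heartbeat tick, a checkpoint is due when the elapsed time is a
    multiple of its check interval (the heartbeat divides every interval)."""
    return [name for name, interval in checkpoints.items()
            if elapsed_seconds % interval == 0]

intervals = {"db_status": 60, "bp_pool_hit_ratio": 300}  # seconds
print(due_checkpoints(300, intervals))  # ['db_status', 'bp_pool_hit_ratio']
print(due_checkpoints(120, intervals))  # ['db_status']
```

A larger heartbeat means fewer scheduling evaluations per minute, which is why raising it lowers the profile overhead.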
Configure Checkpoints
You can set the checkpoint attributes to monitor the DB2 server performance. You can use dynamic checkpoint templates, which means that the
checkpoints are defined globally (under the Templates tab) and represent the default settings. So, when you change a template value, the
change is reflected in all profiles that use the dynamic template strategy.
Follow these steps:
1. Double-click the required checkpoint from the Edit Profile dialog.
The Edit template checkpoint dialog appears.
2. Select Active to activate a checkpoint.
3. Set or modify the following values, as required:
Description: specifies additional information about the checkpoint.
Check Interval: specifies the time interval at which the DB2 server is scanned.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Send Quality of Service: enables you to activate QoS values being sent into the QoS database. If no QoS is available for a
checkpoint, the check box is disabled.
QoS List: allows you to define the QoS to be used in the checkpoint. Click the button to open the QoS list dialog that displays the current
QoS definition list. Right-click in the list to add new QoS definitions, or copy or edit any existing QoS definition.
Note: Some checkpoints cannot be used for generating QoS. For such checkpoints, the QoS dialog cannot be activated.
Samples: specifies the number of samples to be used for calculating an average value. This average value should be compared to
the specified alarm threshold.
Use excludes: allows you to specify the patterns to be excluded in the template checkpoint. Click the button to open the Exclude list
dialog and specify the patterns to be excluded for the checkpoint. You can define objects that you do not want to monitor in the
template checkpoint.
Note: This checkbox is available for a custom checkpoint from version 4.9 or later. In previous versions, this check box is
disabled after you create a custom checkpoint.
Scheduling: allows you to specify the schedule settings, if defined. You can select any of the following:
rules: defines rules for running the schedules of the template checkpoint.
exceptions: defines exceptions for running the schedules of the template checkpoint.
Clear message: specifies the message name used for clear alarm message.
Clear severity: specifies the severity used for the message issued when the threshold is not breached. You can select from the following:
clear
information
warning
minor
major
critical
4. Click OK.
The checkpoint is configured. If you want specific settings to be valid for one profile only, right-click the checkpoint in the list and select
Change to Static.
8. After you have entered values for the above mentioned fields, click QoS List.
The QoS list dialog appears.
9. Right-click in the grid view and select New from the context menu.
10. Enter a name and description of the QoS metric that you want to define.
11. Set or modify the following values as required in the Edit dialog:
Unit: indicates the unit of QoS metric.
Abbreviation: specifies the abbreviated name of the QoS metric.
Metric: selects the required QoS metric.
Max value: specifies the maximum value of the QoS metric.
12. Click OK in the Edit dialog and in the QoS list dialog.
Define Schedules
You can specify a schedule for running a checkpoint. A schedule is the execution period of the checkpoint on specified day, time, and date
values. If you are using exceptions, the schedule is considered an execution break.
If the schedules list is empty, the checkpoint is executed every 24 hours. Additionally, a defined number of schedules can exist for each
checkpoint. These schedules define additional rules to the check interval, or exceptions to it. Rules and exceptions cannot be mixed in one
checkpoint.
Follow these steps:
1. Right-click in the Schedules section of General tab and select New from the context menu.
Notes:
First execution of the schedule happens when you specify the value in the Date from and Time from fields.
Run once causes the checkpoint to run only once a day in the defined period (unlike multiple times when Run interval is used).
Severity: specifies the severity of alarm message. Select from the following options:
clear
information
warning
minor
major
critical
4. Click OK.
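The rule/exception distinction above can be sketched as a window check: with rules, the checkpoint runs only inside a scheduled window; with exceptions, the window is an execution break instead. A minimal sketch with hypothetical names and times:

```python
from datetime import time

def may_run(now, windows, exceptions=False):
    """With rule windows, the checkpoint runs only inside a window;
    with exception windows, a window is an execution break instead."""
    inside = any(start <= now <= end for start, end in windows)
    return not inside if exceptions else inside

office_hours = [(time(8, 0), time(16, 0))]
print(may_run(time(10, 0), office_hours))                   # True  (rule window)
print(may_run(time(10, 0), office_hours, exceptions=True))  # False (break window)
```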
Create a Group
You can define a group to enable a set of checkpoints that are logically related. If you select the required group, all the checkpoints defined under
the selected group will be attached to the monitoring profile. This saves time while configuring a profile. You can add, copy, or modify a group.
Follow these steps:
1. Right-click in the Group tab and select the New option.
The Add New Group dialog appears.
2. Enter a unique name for the group and click OK.
The New Group dialog appears.
3. Enter a description for the new group in the Description text box.
4. From the Group checkpoints section, select the checkpoints that you want to enable for the new group.
5. Click OK.
The new group is added in the Groups list.
Setup Tab
Connections Tab
Profiles Tab
Templates Tab
Status Tab
Group Tab
Setup Tab
The Setup tab contains the following two sub-tabs:
General
Message pool
By default, the General sub-tab is selected.
General Tab
The General tab enables you to set the clear alarms on the probe restart, log size, and log level.
This tab contains the following fields:
Generate status only: instructs the probe to only generate status, not to issue an alarm when a threshold is breached. Select the Status
Tab to see the status for the different checkpoints.
Default: Not Selected
Clear Alarm On Restart: allows you to clear alarms on probe restart.
Default: Selected
Alarm severity filter: allows you to set a filter on the severity levels that can become potential alarms. For example, as a database
administrator, you want to pass important events on to the operations center or help desk, so that the event can trigger emails. The
Alarm severity filter considers the events matching or exceeding the selected severity level as alarms. If you select major, then only the
messages with severity level major and above are considered as alarms.
Note: The probe verifies the events matching the selected severity level only when you disable the Generate status only field.
Status Auto-Update: select to enable automatic refresh of the monitoring profiles displayed in the Status tab and to specify the refresh
interval. By default, this checkbox is not selected.
Default: 60 sec
Note: The Status Auto-Update value is saved in the configuration file but the checkbox is cleared when you restart the probe.
Log Size: specifies the maximum size of the probe log file.
Default: 100 KB
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
QoS V2 compatibility: enables the backward-compatibility of the probe to the V2 framework.
All Database Status: enables you to get the status of all existing databases. If not selected, this option enables you to get the status only
for the default database.
Message Pool Tab
The Message Pool tab lets you view the list of all alarm messages defined in the probe.
Connections Tab
This tab displays one predefined connection and allows you to view, create, modify, or delete the connections to different instances of the DB2
server. You need to specify the user name, password, and server name to be used to connect to the server instance. The password information is
encrypted and placed in the configuration file. A connection can be used by more than one profile. The following commands are available in the
right-click menu of the Connections tab:
New: creates a new connection to the DB2 server
Copy: creates a copy of the selected connection
Edit: modifies the fields of the selected connection
Delete: deletes the selected connection
New Connection Dialog
You can right-click in the Connections tab and select New to create a new connection to the DB2 server. Specify a unique name in the
Name field of the Add New Connection dialog and click OK.
Set or modify the following fields in the New Connection[Connection Name] dialog, as needed:
DSN: specifies the ODBC Data Source Name.
Description: specifies the additional information about the connection.
User ID: defines the user identification code with SYSADM, SYSCTRL or SYSMAINT authorization.
Password: defines a valid password for the specified user.
Instance node: specifies the node name under which the DB2 instance (that the probe connects) is cataloged in the node directory.
Default DB name: specifies the database name used for connection tests (for example in i_check_dbalive).
Retry attempts: specifies the number of attempts made by the probe to connect in case of connection failure.
Retry delay: specifies the time for which the probe waits between two connection attempts.
Timeout: specifies the connection timeout value.
Test button: enables you to test the connection status. If the connection is successful, it returns the instance name and version number.
If the connection is unsuccessful, the probe displays an error message.
Profiles Tab
This tab displays the list of profiles that are defined for monitoring the DB2 server. Every profile runs as a separate thread, and multiple profiles
can be used to monitor one instance. Thus, the probe enables you to independently monitor several instances simultaneously.
The color of indicator icon appearing before the profile name indicates the following:
Green icon: profile is active and running.
Yellow icon: profile is active but suspended.
Black icon: profile is inactive.
The following commands are available in the right-click menu of the Profiles tab:
New: enables you to create a profile for monitoring the DB2 server.
Copy: enables you to create a copy of the selected monitoring profile.
Edit: enables you to modify the fields of the selected monitoring profile.
Delete: enables you to delete the selected monitoring profile.
Suspend: enables you to stop the selected profile from monitoring the server.
Resume: enables you to restart the selected profile to monitor the server.
You can add, edit, delete and copy profiles.
New Profile Dialog
The New option on right-clicking in the Profiles tab enables you to create a profile for monitoring the DB2 server. The New Profile [Profile
Name] dialog allows you to configure the general profile properties and available checkpoint properties. The Suspended / Resumed commands
allow you to stop or start profile monitoring dynamically without deactivating or activating the probe.
Set or modify the following values, as needed:
Description: specifies additional information about the profile.
Heartbeat: specifies the interval at which all profile checkpoint schedules are tested and executed.
Default: 5 seconds
Note: This number is a common denominator to all used check interval values. The higher the value of heartbeat, the lower is the
profile overhead.
Connection: specifies the connection used in this profile. You must define the connection in the Connections dialog before creating a
profile.
Check Interval: specifies the time interval at which the DB2 server is scanned. This is used if nothing else is defined in the checkpoint
and overwrites the default checkpoint list setting.
Default: 1 minute
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
At the bottom of New Profile dialog, a list of available checkpoints is displayed. Select the required checkpoints that you want to attach to your
profile. The default checkpoint settings are used globally, unless you modify the settings locally for your profile.
When defining a profile, you can use the following two different strategies while configuring the checkpoints:
Dynamic: allows you to define the checkpoint properties globally (under the Templates tab) that represent the default settings. Every
time you change the template value, the change is reflected in all profiles that use these dynamic templates. Double-click the checkpoint in the
Profile Checkpoints list or Templates tab. When modified, the new settings become valid for all profiles, unless overruled by static
settings in the profile.
Static: allows you to manage the checkpoint properties locally. Set the checkpoint to static in your profile before modifying it. When
modified, the new settings become valid for this profile only.
There can be both template and static checkpoints in one profile. If a checkpoint is static, the checkpoint name appears in the list with
Note: If you want to have specific settings valid for one profile, right-click the checkpoint in the list and select Change to static. The
probe displays a warning message if you attempt to modify a template checkpoint in the Profile dialog without changing it to static.
When deciding which checkpoints to activate/deactivate for a profile, see v4.0 db2 Checkpoint Metrics for a description of the different
checkpoints.
Templates Tab
The Templates tab displays a list of predefined checkpoints that can be used in your profiles. You can modify, create, or delete the required
checkpoints.
Double-click the required checkpoint that you want to modify.
By default, most checkpoints are active with a default threshold value. The checkpoint attributes can be managed either globally (using Template
option) or locally (using Static option) for a profile.
Edit Template Checkpoint Dialog
The Create new option, available by right-clicking in the Templates tab, displays the Edit Template Checkpoint dialog. This dialog consists of the
following sub-tabs:
General tab
Hint Editor tab
Query tab
General Tab
The upper section of the General tab contains general checkpoint settings. The lower section is divided into two parts: the first part displays the list of
thresholds and the second part displays the list of schedules.
Set or modify the following values, as needed:
Description: specifies additional information about the checkpoint.
Active: allows you to activate the checkpoint.
Condition: displays the conditional operator to evaluate the threshold values.
Check Interval: specifies the time interval for scanning the checkpoint. Every checkpoint can have a different check interval value. If no
value is defined here, the probe uses the value from the profile definition; if it is not defined there either, the probe uses the value from the default checkpoint
list.
Note: Reduce this interval to generate alarms more frequently. A shorter interval also increases the system load.
Send Quality of Service: enables sending QoS values to the QoS database. If QoS is not available for a checkpoint, this field is disabled.
QoS List: allows you to define the QoS to be used in the checkpoint. Clicking the icon opens the QoS list dialog, which displays the
current QoS definition list. By default, the current checkpoint definition is listed. Right-clicking in this list enables you to add new QoS
definitions and copy, edit, or delete an existing QoS definition.
The Edit QoS dialog offers available metrics (numerical variables that are reported as QoS) and available object variables (to be added
to the QoS source).
The name of the QoS must start with the checkpoint name. QoS can be activated or deactivated as usual.
Note: Some checkpoints cannot be used for generating QoS. For such checkpoints, the QoS dialog cannot be activated.
Samples: specifies the number of samples for calculating an average value. This average value is compared to the specified alarm
threshold.
When the probe starts running, the average value is calculated from the number of samples that are available.
For example,
if you specify the value of Samples as 3, the probe performs the following:
uses the first sample value in the first interval
uses the average of samples 1 and 2 in the second interval
uses the average of samples 1, 2, and 3 from the third interval onward
if you specify the value of Samples as 1, the probe does not perform any sampling.
if you specify the value of Samples as 0, the probe takes the number of samples from the template
Note: Many checkpoints calculate an interval value, so the probe does not use any value in the first interval (and there is no threshold
checking).
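The sampling behavior described above amounts to a rolling average over the last N measurements, growing from a single sample at startup. A minimal sketch of that behavior (the class name is invented for illustration, not taken from the probe):

```python
from collections import deque

class SampleAverager:
    """Rolling average over the last `samples` measurements,
    mirroring the Samples setting described above (sketch only)."""

    def __init__(self, samples):
        # Samples = 1 means no averaging: only the last value counts.
        self.values = deque(maxlen=max(samples, 1))

    def add(self, value):
        self.values.append(value)
        # Average over however many samples are available so far.
        return sum(self.values) / len(self.values)
```

With Samples set to 3, the first interval reports the first value, the second reports the average of two samples, and from the third interval onward the average covers the last three samples.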
Use excludes: allows you to specify the patterns to be excluded in the template checkpoint. Click the icon to open the Exclude list dialog
and specify the patterns to be excluded for the checkpoint. You can define objects that you do not want to monitor in the
template checkpoint.
Excludes list: displays a list that contains the patterns that are excluded from the template checkpoint. Click the icon to open the Exclude
list dialog that contains these patterns. These patterns are used for the checkpoint if the Use excludes checkbox is selected.
Right-clicking in the list lets you add new excludes or edit, copy or delete existing excludes.
When adding (or editing) an exclude pattern, a Match expression dialog is displayed, letting you edit or define the exclude pattern.
Excludes are defined using regular expression patterns. A test button in the Match expression dialog lets you test the defined exclude
pattern. This test is possible only for running active profiles and checkpoints. The test uses the status list (on the status tab) as input.
Note: If there already are active excludes, the excluded objects are excluded from the status list before the test.
When you click Test, an Exclude test list appears that shows the result of the test. Red text lines show the objects that are
excluded using the tested pattern.
The "object thresholds" function as an "include list": if specific thresholds are defined for a specific object, that object always
stays in, even if the exclude pattern would normally eliminate it. The test function also takes this into account.
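The interaction between exclude patterns and object thresholds can be sketched as follows. The names are hypothetical, and the patterns are ordinary regular expressions as described above:

```python
import re

def filter_objects(objects, exclude_patterns, threshold_objects):
    """Drop objects matching any exclude pattern, except objects
    that have specific thresholds defined -- those always stay
    (illustrative sketch, not the probe's implementation)."""
    excludes = [re.compile(p) for p in exclude_patterns]
    kept = []
    for obj in objects:
        if obj in threshold_objects:
            # Object thresholds act as an include list.
            kept.append(obj)
        elif not any(rx.search(obj) for rx in excludes):
            kept.append(obj)
    return kept
```

Here an object matching an exclude pattern survives filtering whenever a specific threshold exists for it, which is exactly what the test function reflects.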
Scheduling: allows you to specify how to use the schedules settings, if defined. You can select any of the following:
rules: defines rules for running the schedules of the template checkpoint.
exceptions: defines exceptions for running the schedules of the template checkpoint.
For example, if you specify the Scheduling as rules, then the probe runs the defined schedules on the specified dates to generate
alarms/QoS for the checkpoint. If you specify the Scheduling as exceptions, the probe does not run the defined schedules on the
specified dates.
Clear message: specifies the message name used for the clear alarm message.
Clear severity: enables you to select the severity used for the message issued when the threshold is not breached. You can select from the
following:
clear
information
warning
minor
major
critical
Thresholds/Schedules: displays the list of thresholds and schedules defined for the template checkpoint. Refer to the Thresholds and
Schedules sections.
Hint Editor Tab
This tab allows you to specify help text messages to be shared among database administrators.
Query Tab
When you create a new template checkpoint, the probe allows you to define and associate a query to the checkpoint for retrieving the required
data from the DB2 server. For default checkpoints, the query is already defined.
Set or modify the following values, as needed:
Checked value: specifies the name of the columns for which monitoring is required.
Condition: specifies one of the comparison operators: =, !=, >=, and <=.
Row identification: specifies the unique identification code for the row.
Message variables: specifies the variables to be used in the message.
Query File: specifies the file that stores the query.
Query: defines a query to the database. Clicking Test enables you to validate the query.
Read: enables you to read the query from the file specified in the Query file field. The probe reads the query from the file and displays
the query in the Query textbox.
Test: enables you to test whether the defined query is valid or not. Clicking Test displays the Query Result dialog.
Interval modus: enables you to report interval values by subtracting the previous variable value from the value generated at the end of the interval.
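Interval modus effectively converts cumulative counter values into per-interval deltas. A sketch of that behavior (not the probe's actual implementation):

```python
def interval_deltas(readings):
    """Turn cumulative counter readings into per-interval deltas.

    The first interval yields no value, because there is no previous
    reading to subtract (matching the note that many checkpoints do
    no threshold checking in the first interval).
    """
    deltas = []
    previous = None
    for current in readings:
        if previous is not None:
            deltas.append(current - previous)
        previous = current
    return deltas
```

For example, cumulative readings of 100, 140, and 150 yield per-interval values of 40 and 10.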
Thresholds
The Thresholds section contains the list of object profiles that are used for monitoring their corresponding real-time object counterparts. These
object profiles are set to a threshold value that notifies you if any matching real-time objects meet the set criteria. You can define multiple threshold
objects in a template checkpoint. By default, most checkpoints are active with a default threshold value. You modify threshold values by
modifying the checkpoints in the respective profile. Every checkpoint must have at least one threshold object.
The probe allows you to identify a specific monitoring object by defining various parameters for an object name (if applicable), such as a tablespace
name or user ID, and a threshold ID that is numbered from 0. Threshold values must be in descending or ascending order, depending on the condition used
in the checkpoint, starting with the threshold for the highest severity.
Double-clicking an object profile in the Thresholds section displays the Edit threshold dialog.
Set or modify the following values, as needed:
Threshold object name: specifies the monitoring object name, if applicable. If you do not specify the object name, the probe uses the
name as default. Some special checkpoints have a second threshold called count (for example, locked_users).
Threshold value: specifies the value used for threshold evaluation.
Current value: specifies the last measured value, if invoked from the status report.
Severity: specifies the severity of alarm message to be used for the threshold.
Message: specifies the name of the message used for the threshold alarm.
Message text: specifies the message text, which can contain variables. If the message text is changed from a profile list, you must create a new
message.
Variables: displays the list of variables that are available only for the custom checkpoints. For example, $check, $profile, $instance,
$object, $state. The variables for every custom checkpoint are different, depending on the variable query.
Note: To view the list of variables, double-click on the custom checkpoint and then double-click on the threshold object from
the Thresholds/Schedule section. The list of variables is displayed in the Edit threshold dialog.
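Because the thresholds are ordered starting with the highest severity, evaluation can stop at the first threshold the measured value breaches. A hedged sketch of that evaluation logic (function and names invented, not the probe's source):

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}

def evaluate(value, thresholds, condition=">="):
    """Return the severity of the first threshold the value breaches,
    or None when no threshold is breached.

    `thresholds` is a list of (severity, limit) pairs ordered starting
    with the highest severity, as the configuration requires.
    """
    cmp = OPS[condition]
    for severity, limit in thresholds:
        if cmp(value, limit):
            return severity
    return None
```

With thresholds of 85 (major) and 75 (warning) and the >= condition, a value of 90 raises a major alarm, 80 a warning, and 50 none at all.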
Schedules
If the schedules list is empty, the checkpoint is executed at an interval of 24 hours.
You can also define several schedules per checkpoint, each defining additional rules for the check interval, or exceptions to it. Rules and
exceptions cannot be mixed in one checkpoint.
In principle, a schedule is a definition of an execution period (or an execution break, if exceptions are used) with specified days, time from/to, and date
from/to values. If only Date from and Time from are defined, they determine the first execution. Run once causes the checkpoint to run only once a
day in the defined period (unlike multiple runs when Run interval is used).
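The rules/exceptions behavior can be modeled like this: with rules, a checkpoint runs only inside a schedule window; with exceptions, only outside every window. This is an illustrative sketch with simplified schedule fields (days plus a time range), not the probe's full date-range handling:

```python
from datetime import datetime, time

def should_run(now, schedules, mode="rules"):
    """Decide whether a checkpoint may run at `now`.

    mode="rules": run only inside some schedule window.
    mode="exceptions": run only outside all schedule windows.
    Each schedule is a dict with "days" (abbreviated day names),
    "time_from", and "time_to" -- a simplified, hypothetical shape.
    """
    def in_window(sched):
        day_ok = now.strftime("%a") in sched["days"]
        time_ok = sched["time_from"] <= now.time() <= sched["time_to"]
        return day_ok and time_ok

    inside = any(in_window(s) for s in schedules)
    return inside if mode == "rules" else not inside
```

The same schedule definition therefore flips meaning between the two modes: a Monday 09:00-17:00 window permits a Monday-morning run under rules and forbids it under exceptions.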
Status Tab
The Status tab lists the defined profiles and their corresponding checkpoint templates. The status is displayed in a hierarchical fashion, with
profile name nodes in the right pane and one or more checkpoint nodes in the left pane (only active checkpoints are considered here). The highest
status is propagated. Select a checkpoint in the navigation tree (to your left) to bring up the corresponding events.
This tab also enables you to modify the properties of an individual checkpoint object.
Group Tab
This tab lets you create multiple groups that can be associated with profiles. You can add, copy, modify, or delete a group.
i_agents_created_ratio - ratio
Calculated as: (i_agents_created_empty_pool / i_agents_from_pool) * 100.
Description: Monitors % of agents created due to empty agent pool by agents assigned from pool.
i_piped_sorts_rejected - count
Calculated as: i_piped_sorts_requested - i_piped_sorts_accepted.
Description: Monitors number of piped sort requests rejected.
db_pool_hit_ratio - ratio
Calculated as: (1.0 - ((db_pool_data_p_reads + db_pool_index_p_reads) / (db_pool_data_l_reads + db_pool_index_l_reads))) * 100.
Description: Monitors percentage of time a page was found in buffer pool on request.
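As a concrete illustration of the db_pool_hit_ratio formula above, the ratio is the percentage of logical reads that were satisfied from the buffer pool without a physical read:

```python
def pool_hit_ratio(data_p_reads, index_p_reads,
                   data_l_reads, index_l_reads):
    """db_pool_hit_ratio: (1 - (physical reads / logical reads)) * 100."""
    logical = data_l_reads + index_l_reads
    if logical == 0:
        return None  # no buffer pool activity in the interval
    physical = data_p_reads + index_p_reads
    return (1.0 - physical / logical) * 100
```

For example, 15 physical reads against 150 logical reads gives a 90% hit ratio.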
db_avg_sort_time - average
Calculated as: (db_total_sort_time / db_total_sorts) / 1000.
Description: Monitors average sort time in interval in seconds.
db_pct_sort_overflows - ratio
Calculated as: (db_sort_overflows / db_total_sorts) * 100.
Description: Monitors % of sort overflows in interval.
db_avg_sort_heap - average
app_avg_direct_read_time - average
Calculated as: app_direct_read_time / app_direct_reads.
Description: Monitors the average time for direct read in ms.
app_avg_direct_write_time - average
Calculated as: app_direct_write_time / app_direct_writes
Description: Monitors the average time for direct write in ms.
app_cat_cache_hit_rto - ratio
Calculated as: (1 - (app_cat_cache_inserts / app_cat_cache_lookups)) * 100.
Description: Monitors percentage of time table descriptor was found in catalog cache.
app_pkg_cache_hit_rto - ratio
Calculated as: (1 - (app_pkg_cache_inserts / app_pkg_cache_lookups)) * 100.
Description: Monitors percentage of time package section was found in package cache.
app_locklist_util - ratio
Calculated as: ((app_locks_held * locksize) / (app_dbcfg_lock_list * 4096)) * 100.
Description: Monitors lock list utilization by application in percent.
app_sys_cpu_time - count
Calculated as: agent_sys_cpu_time_s + (agent_sys_cpu_time_ms / 1000000.0).
Description: Monitors the total system CPU time (in seconds) used by database manager agent process.
app_usr_cpu_time - count
Calculated as: agent_usr_cpu_time_s + (agent_usr_cpu_time_ms / 1000000.0).
Description: Monitors the total user CPU time (in seconds) used by database manager agent process.
app_uow_elapsed_time - count
Calculated as: uow_elapsed_time_s + (uow_elapsed_time_ms / 1000000.0).
Description: Monitors the elapsed execution time of the most recently completed UOW.
bp_pool_hit_ratio - ratio
Calculated as: (1 - ((bp_pool_data_p_reads + bp_pool_index_p_reads) / (bp_pool_data_l_reads + bp_pool_index_l_reads))) * 100.
Description: Monitors percentage of time a page was found in buffer pool.
bp_pool_avg_async_read_time - average
Calculated as: bp_pool_async_read_time / bp_pool_async_data_reads.
Description: Monitors average asynchronous read time in ms in interval.
bp_pool_avg_async_write_time - average
Calculated as: (bp_pool_async_write_time / (bp_pool_async_data_writes + bp_pool_async_index_writes)).
Description: Monitors average asynchronous write time in ms in interval.
bp_pool_sync_write_time - count
Calculated as: bp_pool_write_time - bp_pool_async_write_time.
Description: Monitors synchronous write time in ms in interval.
bp_pool_avg_write_time - average
Calculated as: bp_pool_async_write_time / (bp_pool_async_data_writes + bp_pool_async_index_writes).
Description: Monitors average asynchronous write time in ms in interval.
bp_avg_direct_read_time - average
Calculated as: bp_direct_read_time / bp_direct_reads.
Description: Monitors average time for direct read in ms in interval.
bp_avg_direct_write_time - average
Calculated as: bp_direct_write_time / bp_direct_writes.
Description: Monitors average time for direct write in ms in interval.
bp_pool_sync_reads - count
Calculated as: bp_pool_data_p_reads - bp_pool_async_data_reads.
Description: Monitors number of synchronous data reads in interval.
bp_pool_sync_writes - count
Calculated as: bp_pool_data_writes - bp_pool_async_data_writes.
Description: Monitors number of synchronous data writes in interval.
bp_pool_sync_idx_writes - count
Calculated as: bp_pool_index_writes - bp_pool_async_index_writes.
Description: Monitors number of synchronous index page writes in interval.
bp_pool_sync_idx_reads - count
Calculated as: bp_pool_index_p_reads - bp_pool_async_index_reads.
Description: Monitors number of synchronous index page reads in interval.
ts_usable_pages_pct - ratio
Calculated as: (ts_usable_pages / ts_total_pages) * 100.
Description: Monitors percent of usable pages in DMS table space (excl. overhead).
ts_used_pages_pct - ratio
Calculated as: (ts_used_pages / ts_total_pages) * 100.
db2 Metrics
The following table describes the QoS metrics that can be configured using the db2 probe.
Monitor Name
Units
Description
Version
QOS_DB2_ACTIVE_CONNECTIONS_PERCENTAGE
4.0
QOS_DB2_ACTIVE_SORTS
count
4.0
QOS_DB2_AGENTS_CREATED_EMPTY_POOL
count
4.0
QOS_DB2_AGENTS_CREATED_RATIO
4.0
QOS_DB2_AGENTS_FROM_POOL
count
4.0
QOS_DB2_AGENTS_REGISTERED
count
4.0
QOS_DB2_AGENTS_REGISTERED_TOP
count
4.0
QOS_DB2_AGENTS_STOLEN
count
Agents Stolen
4.0
QOS_DB2_AGENTS_TOP
count
4.0
QOS_DB2_AGENTS_WAITING_ON_TOKEN
count
4.0
QOS_DB2_AGENTS_WAITING_TOP
count
4.0
QOS_DB2_APPL_SECTION_INSERTS
count
4.0
QOS_DB2_APPL_SECTION_LOOKUPS
count
4.0
QOS_DB2_APPLS_CUR_CONS
count
Database Connected
4.0
QOS_DB2_APPLS_IN_DB2
count
4.0
QOS_DB2_AVG_DIRECT_READ_TIME
ms
4.0
QOS_DB2_AVG_DIRECT_WRITE_TIME
ms
4.0
QOS_DB2_AVG_SORT_HEAP
count
4.0
QOS_DB2_AVG_SORT_TIME
sec
4.0
QOS_DB2_BINDS_PRECOMPILES
count
Database Binds/Precompiles
4.0
QOS_DB2_CAT_CACHE_HEAP_FULL
count
4.0
QOS_DB2_CAT_CACHE_HIT_RTO
4.0
QOS_DB2_CAT_CACHE_INSERTS
count
4.0
QOS_DB2_CAT_CACHE_LOOKUPS
count
4.0
QOS_DB2_CAT_CACHE_OVERFLOWS
count
4.0
QOS_DB2_CHECK_DBALIVE
Availability
4.0
QOS_DB2_COMM_PRIVATE_MEM
Byte
4.0
QOS_DB2_COMMIT_SQL_STMTS
count
Database Commits
4.0
QOS_DB2_CON_LOCAL_DBASES
count
4.0
QOS_DB2_CONNECTIONS_TOP
count
4.0
QOS_DB2_COORD_AGENTS_TOP
count
4.0
QOS_DB2_DB_CONNECT_TIME
seconds
4.0
QOS_DB2_DB_HEAP_TOP
Bytes
4.0
QOS_DB2_DB_LOG_UTIL_RTO
4.0
QOS_DB2_DB_STATUS
status
Database Status
4.0
QOS_DB2_DDL_SQL_STMTS
count
4.0
QOS_DB2_DEADLOCKS
count
Database Deadlocks
4.0
QOS_DB2_DIRECT_READ_REQS
count
4.0
QOS_DB2_DIRECT_READ_TIME
ms
4.0
QOS_DB2_DIRECT_READS
count
Direct Reads
4.0
QOS_DB2_DIRECT_WRITE_REQS
count
4.0
QOS_DB2_DIRECT_WRITE_TIME
ms
4.0
QOS_DB2_DIRECT_WRITES
count
Direct Writes
4.0
QOS_DB2_DYNAMIC_SQL_STMTS
count
4.0
QOS_DB2_FAILED_SQL_STMTS
count
4.0
QOS_DB2_FILES_CLOSED
count
4.0
QOS_DB2_FREE_PAGES
count
4.0
QOS_DB2_FREE_PAGES_PCT
4.0
QOS_DB2_GW_CONS_WAIT_CLIENT
count
4.0
QOS_DB2_GW_CONS_WAIT_HOST
count
4.0
QOS_DB2_GW_CUR_CONS
count
4.0
QOS_DB2_GW_TOTAL_CONS
count
4.0
QOS_DB2_HASH_JOIN_OVERFLOWS
count
4.0
QOS_DB2_HASH_JOIN_SMALL_OVERFLOWS
count
4.0
QOS_DB2_IDLE_AGENTS
count
Unassigned Agents
4.0
QOS_DB2_INT_AUTO_REBINDS
count
4.0
QOS_DB2_INT_COMMITS
count
4.0
QOS_DB2_INT_DEADLOCK_ROLLBACKS
count
4.0
QOS_DB2_INT_ROLLBACKS
count
4.0
QOS_DB2_INT_ROWS_DELETED
count
4.0
QOS_DB2_INT_ROWS_INSERTED
count
4.0
QOS_DB2_INT_ROWS_UPDATED
count
4.0
QOS_DB2_LOCAL_CONS
count
4.0
QOS_DB2_LOCAL_CONS_IN_EXEC
count
4.0
QOS_DB2_LOCK_ESCALS
count
4.0
QOS_DB2_LOCK_LIST_IN_USE
Bytes
4.0
QOS_DB2_LOCK_TIMEOUTS
count
4.0
QOS_DB2_LOCK_WAIT_TIME
ms
4.0
QOS_DB2_LOCK_WAITS
count
4.0
QOS_DB2_LOCKS_HELD
count
4.0
QOS_DB2_LOCKS_WAITING
count
4.0
QOS_DB2_LOG_READS
count
4.0
QOS_DB2_LOG_WRITES
count
4.0
QOS_DB2_MAX_AGENT_OVERFLOWS
count
MAXAGENT Overflows
4.0
QOS_DB2_MAX_USED_PAGES
count
4.0
QOS_DB2_MAX_USED_PAGES_PCT
4.0
QOS_DB2_NUM_ASSOC_AGENTS
count
Database Agents
4.0
QOS_DB2_PCT_HJS_OVERFLOWS
4.0
QOS_DB2_PCT_SORT_OVERFLOWS
4.0
QOS_DB2_PIPED_SORTS_ACCEPTED
count
4.0
QOS_DB2_PIPED_SORTS_REJECTED
count
4.0
QOS_DB2_PIPED_SORTS_REQUESTED
count
4.0
QOS_DB2_PKG_CACHE_INSERTS
count
4.0
QOS_DB2_PKG_CACHE_LOOKUPS
count
4.0
QOS_DB2_POOL_ASYNC_DATA_READ_REQS
count
4.0
QOS_DB2_POOL_ASYNC_DATA_READS
count
4.0
QOS_DB2_POOL_ASYNC_DATA_WRITES
count
4.0
QOS_DB2_POOL_ASYNC_INDEX_READS
count
4.0
QOS_DB2_POOL_ASYNC_INDEX_WRITES
count
4.0
QOS_DB2_POOL_ASYNC_READ_TIME
ms
4.0
QOS_DB2_POOL_ASYNC_WRITE_TIME
ms
4.0
QOS_DB2_POOL_AVG_ASYNC_READ_TIME
ms
4.0
QOS_DB2_POOL_AVG_ASYNC_WRITE_TIME
ms
4.0
QOS_DB2_POOL_AVG_WRITE_TIME
ms
4.0
QOS_DB2_POOL_DATA_FROM_ESTORE
count
4.0
QOS_DB2_POOL_DATA_L_READS
count
4.0
QOS_DB2_POOL_DATA_P_READS
count
4.0
QOS_DB2_POOL_DATA_TO_ESTORE
count
4.0
QOS_DB2_POOL_DATA_WRITES
count
4.0
QOS_DB2_POOL_DRTY_PG_STEAL_CLNS
count
4.0
QOS_DB2_POOL_DRTY_PG_THRSH_CLNS
count
4.0
QOS_DB2_POOL_HIT_RATIO
4.0
QOS_DB2_POOL_INDEX_FROM_ESTORE
count
4.0
QOS_DB2_POOL_INDEX_L_READS
count
4.0
QOS_DB2_POOL_INDEX_P_READS
count
4.0
QOS_DB2_POOL_INDEX_TO_ESTORE
count
4.0
QOS_DB2_POOL_INDEX_WRITES
count
4.0
QOS_DB2_POOL_LSN_GAP_CLNS
count
4.0
QOS_DB2_POOL_READ_TIME
ms
4.0
QOS_DB2_POOL_SYNC_IDX_READS
count
4.0
QOS_DB2_POOL_SYNC_IDX_WRITES
count
4.0
QOS_DB2_POOL_SYNC_READS
count
4.0
QOS_DB2_POOL_SYNC_WRITE_TIME
ms
4.0
QOS_DB2_POOL_SYNC_WRITES
count
4.0
QOS_DB2_POOL_WRITE_TIME
ms
4.0
QOS_DB2_POST_THRESHOLD_HASH_JOINS
count
4.0
QOS_DB2_POST_THRESHOLD_SORTS
count
4.0
QOS_DB2_PREFETCH_WAIT_TIME
ms
4.0
QOS_DB2_REM_CONS_IN
count
4.0
QOS_DB2_REM_CONS_IN_EXEC
count
4.0
QOS_DB2_ROLLBACK_SQL_STMTS
count
Database Rollbacks
4.0
QOS_DB2_ROWS_DELETED
count
4.0
QOS_DB2_ROWS_INSERTED
count
4.0
QOS_DB2_ROWS_SELECTED
count
4.0
QOS_DB2_ROWS_UPDATED
count
4.0
QOS_DB2_SEC_LOG_USED_TOP
Bytes
4.0
QOS_DB2_SEC_LOGS_ALLOCATED
count
4.0
QOS_DB2_SELECT_SQL_STMTS
count
4.0
QOS_DB2_SINCE_LAST_BACKUP
hours
4.0
QOS_DB2_SORT_HEAP_ALLOCATED
count
4.0
QOS_DB2_SORT_OVERFLOWS
count
4.0
QOS_DB2_STATIC_SQL_STMTS
count
4.0
QOS_DB2_TOT_LOG_USED_TOP
Bytes
4.0
QOS_DB2_TOTAL_CONS
count
Database Connects
4.0
QOS_DB2_TOTAL_HASH_JOINS
count
4.0
QOS_DB2_TOTAL_HASH_LOOPS
count
4.0
QOS_DB2_TOTAL_PAGES
count
4.0
QOS_DB2_TOTAL_SEC_CONS
count
4.0
QOS_DB2_TOTAL_SORT_TIME
ms
4.0
QOS_DB2_TOTAL_SORTS
count
4.0
QOS_DB2_TS_DATA_PARTITIONING
4.0
QOS_DB2_TS_STATUS
status
Tablespace Status
4.0
QOS_DB2_UID_SQL_STMTS
count
4.0
QOS_DB2_USABLE_PAGES
count
4.0
QOS_DB2_USABLE_PAGES_PCT
4.0
QOS_DB2_USED_PAGES
count
4.0
QOS_DB2_USED_PAGES_PCT
4.0
QOS_DB2_X_LOCK_ESCALS
count
4.0
This section contains the QoS metric default settings for the db2 probe.
Alert Metric
Warning Threshold
Warning Severity
Error Threshold
Error Severity
Description
app_acc_curs_blk
500
Warning
Monitors the number of times that a request for an I/O block was
accepted
app_agents_stolen
500
Warning
app_appl_idle_time
60
Warning
app_asoc_agents_top
500
Warning
app_avrg_direct_read_time
500
Warning
app_avrg_direct_write_time
500
Warning
app_avrg_sort_time
500
Warning
app_binds_precomplies
500
Warning
app_cat_cache_heap_full
500
Warning
app_cat_cache_hit_rho
75
Warning
app_cat_cache_inserts
500
Warning
app_cat_cache_lookups
500
Warning
app_cat_cache_overflows
500
Warning
app_commit_sql_stmts
500
Warning
app_ddl_sql_stmts
500
Warning
app_deadlocks
Warning
app_direct_read_reqs
500
Warning
app_direct_read_time
500
Warning
app_direct_reads
500
Warning
app_direct_write_reqs
500
Warning
app_direct_write_time
500
Warning
app_direct_writes
500
Warning
app_dynamic_sql_stmts
500
Warning
app_failed_sql_stmts
500
Warning
app_hash_join_overflows
500
Warning
app_hash_join_small_overflows
500
Warning
app_int_auto_rebinds
500
Warning
app_int_commits
500
Warning
app_int_deadlock_rollbacks
500
Warning
app_int_rollbacks
500
Warning
app_int_rows_deleted
500
Warning
app_int_rows_inserted
500
Warning
app_int_rows_updated
500
Warning
app_lock_escals
Warning
app_lock_timeouts
500
Warning
app_lock_wait_time
1000
Warning
app_lock_waits
90
Warning
app_locklist_util
50
Warning
app_locks_held
100
Warning
app_num_agents
500
Warning
app_num_asoc_agents
500
Warning
app_open_loc_curs
500
Warning
app_open_loc_curs_blk
500
Warning
Monitors the number of local blocking cursors currently open for this
application
app_open_rem_curs
500
Warning
app_open_rem_curs_blk
500
Warning
app_pct_sort_overflows
25
Warning
app_pkg_cache_hit_rto
75
Warning
app_pkg_cache_inserts
500
Warning
app_pkg_cache_lookups
500
Warning
app_pool_data_l_reads
500
Warning
app_pool_data_p_reads
500
Warning
app_pool_data_writes
500
Warning
app_pool_hit_ratio
85
Warning
app_pool_index_l_reads
500
Warning
app_pool_index_p_reads
500
Warning
app_pool_index_writes
500
Warning
app_pool_read_time
500
Warning
app_pool_write_time
500
Warning
app_rej_curs_blk
500
Warning
app_rollback_sql_stmts
500
Warning
app_rows_deleted
500
Warning
app_rows_inserted
500
Warning
app_rows_read
500
Warning
app_rows_selected
500
Warning
app_rows_updated
500
Warning
app_rows_written
500
Warning
app_select_sql_stmts
500
Warning
app_sort_overflows
50
Warning
Monitors the total number of sorts that ran out of sort heap
app_static_sql_stmts
500
Warning
app_sys_cpu_time
500
Warning
Monitors the total system CPU time (in seconds) used by database
manager agent process
app_total_hash_joins
500
Warning
app_total_hash_loops
500
Warning
app_total_sort_time
50
Warning
Monitors the total elapsed time (in ms) for all sorts that have been
executed
app_total_sorts
Warning
app_uid_sql_stmts
500
Warning
app_uow_elapsed_time
500
Warning
app_uow_lock_wait_time
500
Warning
Monitors the total amount of elapsed time this unit of work has spent
waiting for locks in ms
app_uow_log_space_used
500
Warning
Monitors the amount of log space (in bytes) used in the current
UOW
app_usr_cpu_time
500
Warning
Monitors the total user CPU time (in seconds) used by database
manager agent process
app_x_lock_escals
Warning
Monitors number of times that (x) locks have been escalated from
several row locks to one exclusive table lock
bp_avg_direct_read_time
65
Warning
bp_avg_direct_write_time
65
Warning
bp_direct_read_reqs
85
Warning
bp_direct_read_time
85
Warning
bp_direct_reads
85
Warning
bp_direct_write_reqs
85
Warning
bp_direct_write_time
85
Warning
bp_direct_writes
85
Warning
bp_files_closed
85
Warning
bp_pool_async_data_read_reqs
85
Warning
bp_pool_async_data_reads
85
Warning
bp_pool_async_data_writes
85
Warning
bp_pool_async_index_reads
85
Warning
bp_pool_async_index_writes
85
Warning
bp_pool_async_read_time
85
Warning
bp_pool_async_write_time
85
Warning
bp_pool_avg_async_read_time
15
Warning
bp_pool_avg_async_write_time
25
Warning
bp_pool_avg_write_time
65
Warning
bp_pool_data_from_estore
85
Warning
bp_pool_data_l_reads
100000
Warning
bp_pool_data_p_reads
85
Warning
bp_pool_data_to_estore
85
Warning
bp_pool_data_writes
85
Warning
bp_pool_hit_ratio
85
Warning
bp_pool_index_from_estore
85
Warning
bp_pool_index_l_reads
85
Warning
bp_pool_index_p_reads
85
Warning
bp_pool_index_to_estore
85
Warning
bp_pool_index_writes
85
Warning
bp_pool_read_time
10
Warning
bp_pool_sync_idx_reads
65
Warning
bp_pool_sync_idx_writes
65
Warning
bp_pool_sync_reads
65
Warning
bp_pool_sync_write_time
65
Warning
bp_pool_sync_writes
65
Warning
bp_pool_write_time
85
Warning
db_active_sorts
800
Warning
db_agents_top
Warning
db_appl_section_inserts
Warning
db_appl_section_lookups
Warning
db_appls_cur_cons
25
Warning
db_appls_in_db2
25
Warning
db_avg_direct_read_time
15
Warning
db_avg_direct_write_time
25
Warning
db_avg_sort_heap
100
Warning
db_avg_sort_time
10
Warning
db_binds_precomplies
25
Warning
db_cat_cache_heap_full
25
Warning
db_cat_cache_hit_rto
50
Warning
db_cat_cache_inserts
25
Warning
db_cat_cache_lookups
25
Warning
db_cat_cache_overflows
25
Warning
db_commit_sql_stmts
25
Warning
db_connect_time
0.1
Warning
db_connections_top
25
Warning
db_coord_agents_top
Warning
db_ddl_sql_stmts
25
Warning
db_deadlocks
Warning
db_direct_read_reqs
25
Warning
db_direct_read_time
25
Warning
db_direct_reads
25
Warning
db_direct_write_reqs
25
Warning
db_direct_write_time
25
Warning
db_direct_writes
25
Warning
db_dynamic_sql_stmts
25
Warning
db_failed_sql_stmts
25
Warning
db_files_closed
Warning
db_hash_join_overflows
Warning
Monitors # of times hash join data exceeded the available sort heap
space
db_hash_join_small_overflows
Warning
Monitors # of times hash join data exceeded the available sort heap
space by less than 10%
db_heap_top
100000
Warning
db_int_auto_rebinds
25
Warning
db_int_commits
25
Warning
db_int_deadlock_rollbacks
25
Warning
db_int_rollbacks
25
Warning
db_int_rows_deleted
25
Warning
db_int_rows_inserted
25
Warning
db_int_rows_updated
25
Warning
db_lock_escals
Warning
db_lock_list_in_use
124000
Warning
db_lock_timeouts
Warning
db_lock_wait_time
Warning
db_lock_waits
Warning
db_locks_held
80
Warning
db_locks_waiting
Warning
db_logs_read
25
Warning
db_log_util_rto
25
Warning
db_log_writes
25
Warning
db_num_assoc_agents
Warning
db_pct_hjs_overflows
10
Warning
db_pct_sort_overflows
10
Warning
db_pkg_cache_inserts
25
Warning
db_pkg_cache_lookups
25
Warning
db_pool_async_data_read_reqs
100
Warning
db_pool_async_data_reads
100
Warning
db_pool_async_data_writes
100
Warning
db_pool_async_index_reads
Warning
db_pool_async_index_writes
100
Warning
db_pool_async_read_time
100
Warning
db_pool_async_write_time
100
Warning
db_pool_avg_async_read_time
15
Warning
db_pool_avg_async_write_time
25
Warning
db_pool_avg_write_time
25
Warning
db_pool_data_from_estore
100
Warning
db_pool_data_l_reads
10
Warning
db_pool_data_p_reads
10
Warning
db_pool_data_to_estore
Warning
db_pool_data_writes
10
Warning
db_pool_drty_pg_steal_clns
100
Warning
db_pool_drty_pg_thrsh_clns
100
Warning
db_pool_hit_ratio
75
Warning
db_pool_index_from_estore
Warning
db_pool_index_l_reads
10
Warning
db_pool_index_p_reads
10
Warning
db_pool_index_to_estore
Warning
db_pool_index_writes
10
Warning
db_pool_lsn_gap_clns
100
Warning
db_pool_read_time
10
Warning
db_pool_sync_idx_reads
10
Warning
db_pool_sync_idx_writes
10
Warning
db_pool_sync_reads
75
Warning
db_pool_sync_idx_write_time
15
Warning
db_pool_sync_writes
10
Warning
db_pool_write_time
10
Warning
db_prefetch_wait_time
25
Warning
db_rollback_sql_stmts
25
Warning
db_rows_deleted
25
Warning
db_rows_inserted
25
Warning
db_rows_selected
25
Warning
db_rows_updated
25
Warning
db_sec_log_used_top
25
Warning
db_sec_logs_allocated
25
Warning
db_select_sql_stmts
25
Warning
db_since_last_backup
24
Warning
db_sort_heap_allocated
Warning
db_sort_overflows
800
Warning
db_static_sql_stmts
25
Warning
db_status
Warning
db_tot_log_used_top
25
Warning
db_total_cons
25
Warning
db_total_hash_joins
Warning
db_total_hash_loops
Warning
Monitors # of times single partition of hash join was larger than the
available sort heap space
db_total_sec_cons
25
Warning
db_total_sort_time
800
Warning
db_total_sorts
Warning
db_uid_sql_stmts
25
Warning
db_x_lock_escals
Warning
i_agents_created_empty_pool
Warning
i_agents_created_ratio
50
Warning
i_agents_from_pool
20
Warning
i_agents_registered
150
Warning
i_agents_registered_top
150
Warning
i_agents_stolen
Warning
i_agents_waiting_on_token
10
Warning
i_agents_waiting_top
10
Warning
Monitors max. # of agents ever waiting for token since DB2 start
i_agents_dbalive
Warning
i_comm_private_memory
1000000
Warning
i_con_local_dbases
Warning
i_coord_agents_top
Warning
i_gw_cons_wait_client
Warning
i_gw_cons_wait_host
Warning
i_gw_cur_cons
Warning
i_gw_total_cons
Warning
i_idle_agents
Warning
i_local_cons
Warning
i_local_cons_in_exec
Warning
i_max_agent_overflows
Warning
i_pct_active_connections
Warning
i_piped_sorts_accepted
Warning
i_piped_sorts_rejected
Warning
i_piped_sorts_requested
Warning
i_post_threshold_hash_joins
Warning
i_post_threshold_sorts
10
Warning
i_rem_cons_in
Warning
i_rem_cons_in_exec
Warning
i_sort_heap_allocated
8000
Warning
ts_data_partitioning
15
Warning
ts_free_pages
1000
Warning
ts_free_pages_pct
10
Warning
ts_max_used_pages
10000
Warning
ts_max_used_pages_pct
95
Warning
ts_status
2114060287
Major
ts_total_pages
3500
Warning
ts_usable_pages
2500
Warning
ts_usable_pages_pct
10
Warning
ts_used_pages
2500
Warning
ts_used_pages_pct
85
Warning
The DCIM probe collects energy and power data from data center devices and provides this information to the CA UIM message bus. The DCIM
probe UI is used to view the QoS monitors and to configure alarms.
Each DCIM probe monitors devices as configured in the DCIM Administrator (dcimadmin) monitoring portlet. DCIM Administrator determines the
data center devices to monitor and the specific device data to collect.
More information:
For complete instructions about deploying, accessing, and using the DCIM probe within the DCIM for UIM environment, go to CA DCIM
for Unified Infrastructure Management.
Contents
Verify Prerequisites
Create and Configure File and Directory Monitoring Profiles
Create and Configure File Integrity Profiles
Add Files for Integrity Monitoring
Alarm Thresholds
Verify Prerequisites
Verify that the required hardware and software are available and all installation considerations are met before you configure the probe. For more
information, see dirscan (File and Directory Scan) Release Notes.
Note: You can also select and modify the sample profile available in the probe.
4. Click Submit.
A node with the specified profile name is created under the Profiles node.
5. Select Active to activate the profile in the Profile Name node.
6. Specify the directory with the files to be monitored.
You can also select a directory using the Browse button.
7. Specify the regular expression pattern of the files to be monitored.
Example: *.txt for all files with the txt extension. Refer to the Using Regular Expressions for Files section in the dirscan Regular
Expressions article for more information.
8. Select Recurse Into Subdirectories to include the files that are present in the subdirectories of the specified directory.
The Exclude Directories Pattern field is enabled.
9. Specify the regular expression pattern of the subdirectory names that will be excluded from monitoring.
Example: %B for all directories with the full name of a month. Refer to the Using Regular Expressions for Directories section in the dirscan
Regular Expressions article for more information.
10. Click Save to save the configuration.
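Steps 6 through 9 above combine an include pattern for files with an optional exclude pattern for subdirectories. The behavior can be sketched as follows; this is an illustration only, not the probe's implementation, and shell-style globs stand in for the probe's pattern syntax:

```python
import fnmatch
import os

def scan(directory, file_pattern, recurse=False, exclude_dir_pattern=None):
    """Collect files matching file_pattern; optionally recurse into
    subdirectories, skipping any whose name matches exclude_dir_pattern."""
    matches = []
    for root, dirs, files in os.walk(directory):
        if exclude_dir_pattern:
            # Prune excluded subdirectories in place so os.walk skips them.
            dirs[:] = [d for d in dirs
                       if not fnmatch.fnmatch(d, exclude_dir_pattern)]
        matches.extend(os.path.join(root, f) for f in files
                       if fnmatch.fnmatch(f, file_pattern))
        if not recurse:
            break  # only the top-level directory is scanned
    return matches
```

For example, `scan(base, "*.txt", recurse=True, exclude_dir_pattern="March")` collects all txt files in the tree except those under subdirectories named March.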
Note: You can also select and modify the sample integrity profile available in the probe.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
dirscan Node
Integrity Profiles Node
<Integrity Profile Name> Node
Profiles Node
<Profile Name> Node
Operator Reversal after Migration
dirscan Node
This node allows you to configure the general properties of the probe. You can also view the list of alarm messages and their properties.
Navigation: dirscan
dirscan > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
dirscan > General Configuration
This section allows you to configure the general properties of the probe.
Check Interval (seconds): specifies the time interval at which the directories are scanned. A shorter interval generates alarms sooner but
increases the system load; a longer interval reduces the system load but delays alarm generation.
Default Message Level: specifies the level of the alarm message.
Log Level: specifies the level of details that are written to the log file.
Default: 0-Fatal
Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when debugging.
Log Size (KB): specifies the size of the file to which the internal probe messages are stored.
User Name (including domain name): defines the user name to access the directory or the file. The user name must include the domain
name.
dirscan > Message Pool
This section allows you to view the list of alarm messages and their properties.
Identification Name: indicates the name of the alarm message.
Token: indicates the checkpoint that the probe sets.
Error Alarm Text: specifies the text of the alarm message.
Clear Alarm Text: specifies the text of the message that is issued after the erroneous values are back within the threshold limits.
Error Severity: indicates the severity of the alarm.
Subsystem string/ID: indicates the ID of the subsystem that issues the alarm.
Note: The integrity profile is added as a child node under the Integrity Profiles node.
Profiles Node
This node allows you to create a monitoring profile for the File and Directory Scan probe.
Navigation: dirscan > Profiles
Set or modify the following values as required:
Profiles > Add Profile > Add
This section allows you to create and configure a monitoring profile for the File and Directory Scan probe.
Profile Name: Defines the name of the monitoring profile.
Note: The monitoring profile is added as a child node under the Profiles node.
Note: The performance counters are visible in a tabular form. You can select any one counter in the table and can configure its
properties.
Note: When specifying read response time, the response time is calculated by reading the first megabyte of the file. If no file is
specified, the largest file in the directory is read to generate the QoS for the File Read Response Time. You can watch the
shortest, the longest, or an individual file in the directory.
Message: specifies the alarm message that is generated when the file size breaches the threshold condition.
Command: specifies the action to be performed when the specified condition is not met.
Size Of: specifies whether to use the size of the newest file, the oldest file, or each individual file when more than one file matches the
pattern.
Contents
Verify Prerequisites
Create and Configure File and Directory Monitoring Profiles
Create and Configure File Integrity Profiles
Manage Patterns for File Integrity
Create Schedules
Example of Creating a Profile
Verify Prerequisites
Verify that the required hardware and software are available and all installation considerations are met before you configure the probe. For more information, see dirscan (File and Directory Scan) Release Notes.
Note: You can also select and modify the sample profile available in the probe.
The New Profile window opens. The dialog contains some general properties and three main tabs - Scan Directory, Alarm Messages,
Quality of Service messages.
2. Specify a name and description for the profile.
3. Select Active to activate the profile immediately after creation.
4. Select the required schedule from the Schedule drop-down list.
This field contains the list of already configured schedules. Refer to Create Schedules for more information.
Note: If a schedule is attached to at least one profile, the schedule cannot be deleted.
5. Specify the directory to be monitored in the Scan Directory tab > Directory field.
You can select a directory using the Browse button.
6. Specify the regular expression pattern of the files to be monitored.
Example: *.txt for all files with the txt extension. Refer to the Using Regular Expressions for Files section in the dirscan Regular
Expressions article for more information.
7. Select Recurse Into Subdirectories to include the files that are present in the subdirectories of the specified directory.
The Exclude Directories Pattern field is enabled.
8. Specify the regular expression pattern of the subdirectory names that are excluded from monitoring.
Example: %B for all directories with the full name of a month. Refer to the Using Regular Expressions for Directories section in the dirscan
Regular Expressions article for more information.
9. Configure the required alarms in the Alarm Messages tab.
You can also click Fetch current values to fetch the current values and start monitoring based on them.
10. Select the required Quality of Service (QoS) messages to be generated in the Quality of Service messages tab.
11. Click OK to create the profile.
1.
Note: You can also select and modify the sample profile available in the probe.
The New Profile window opens. The dialog contains some general properties and three main tabs - File Integrity, Alarm message,
Quality of Service message.
2. Specify a name and description for the profile.
You can also schedule the profile. Refer to Manage Schedules for more information.
3. Select Active to activate the profile immediately after creation.
4. Select the required schedule from the Schedule drop-down list.
This field contains the list of already configured schedules. Refer to Create Schedules for more information.
Note: You can also add a pattern with a regular expression to monitor all matching files in the directory. Refer to Manage
Patterns for File Integrity for more information.
6. Browse for the file, select it, and click OK to add it for monitoring.
7. Click << Recalculate checksum to generate and save a checksum of the file at the current moment.
A message is displayed that the checksum has been recalculated. An error message is displayed if the file does not exist or is not
accessible by the credentials that are specified in the Setup window.
8. Configure the file integrity alarm in the Alarm Messages tab.
9. Select the File Integrity Quality of Service (QoS) message to be generated in the Quality of Service messages tab.
10. Click OK to create the profile.
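The recalculated checksum becomes the baseline against which future integrity checks compare. A minimal sketch of that comparison (illustrative only; MD5 is assumed here, and the probe's actual checksum algorithm is not specified in this article):

```python
import hashlib

def file_checksum(path, chunk_size=65536):
    """Return a hex digest of the file contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def integrity_changed(path, saved_checksum):
    """True when the file no longer matches the recorded baseline."""
    return file_checksum(path) != saved_checksum
```

After recalculating, the stored digest is compared on each check interval; any byte-level change to the file alters the digest and raises the file integrity alarm.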
Note: You can also recalculate the checksum of a file in an existing profile. You can then monitor the future integrity of the file
that is based on the current state.
A message is displayed that the checksum has been recalculated. An error message is displayed if the file does not exist or is not
accessible by the credentials that are specified in the Setup window.
6. Click OK to accept the pattern.
Note: You can right-click in the File Integrity tab and select Edit Pattern or Delete Pattern to edit or delete the pattern. If you edit a
pattern, the Pattern [strPattern] window is displayed with the pattern. Follow Steps 2 through 6 of Manage Patterns for File Integrity to edit
the pattern.
Create Schedules
The Schedules tab enables you to specify the time duration when the probe generates alarms and QoS for the monitored process.
Follow these steps:
1. In the Setup window, click the Schedules tab.
2. Right-click and select Add.
The New Schedule dialog opens.
3. Specify a name for the schedule.
4. Specify the Start and End of the schedule in the Range section.
5. Specify the frequency of the schedule.
6. Specify whether to generate alarms or QoS only for a day or within a time frame on specific days.
7. Click OK.
The schedule is created.
Probe Interface
Tool Buttons
Profile List
Setup Window
General Tab
Schedule Tab
Profiles
Scan Directory Tab
Alarm Messages Tab
Quality of Service Messages Tab
Integrity Profiles
File Integrity Tab
Alarm Message Tab
The Quality of Service Message Tab
Message Pool Manager
Message Properties Window
Probe Interface
The GUI consists of four tool buttons at the top and a list of profiles. All defined monitoring profiles appear in the list.
Tool Buttons
General Setup
Opens the Setup window for the probe and allows you to modify the general probe parameters. Refer General Setup for details.
Create a New Profile
Creates a profile defining a destination folder and files to be monitored.
Create a New Integrity Profile
Creates a profile to verify the integrity of particular files. Profiles of this type appear in the profile list with a different icon.
Message Pool Manager
This option allows you to customize the alarm text, and you can also create your own messages.
Profile List
The probe GUI contains a list of profiles. The default configuration contains two sample profiles. Right-clicking the window makes the following
menu options available:
New Profile
Creates a directory scan profile.
New Integrity Profile
Creates a file integrity profile.
Edit
Edits the currently selected profile.
Delete
Deletes the currently selected profile.
Setup Window
Clicking the General Setup button opens the Setup window, allowing you to modify the general probe parameters. In this screen, you can also
find the Schedules tab in which you can schedule the profile execution according to your requirement.
General Tab
The General tab allows you to modify the general probe parameters.
The fields in the window are explained as follows:
Check interval (seconds)
Indicates the time interval at which the defined directories are scanned. A shorter interval generates alarms sooner but increases the
system load; a longer interval reduces the system load but delays alarm generation.
Default message level
Indicates the default message level for alarm messages. This value can be overridden in each alarm condition.
Log Level
Sets the level of details written to the log file. Log as little as possible during normal operation to minimize disk consumption, and
increase the amount of detail when debugging.
Log Size
Sets the size of the probe's log file to which probe-internal log messages are written. The default size is 100 KB. When this size is
reached, the contents of the file are cleared.
Run as user
Specifies a valid user name and password to allow access to required files and directories. Note that the user name must also include the
Windows domain name (<domain name>/<user name>).
Schedule Tab
The Schedules tab enables you to specify the time duration in which alarms and QoS for the monitored process should be generated.
New Schedule Window
The fields in the window are explained as follows:
Name
Specifies a name for the schedule.
Range
Specifies the time duration for which the alarms/QoS should be generated.
Notes:
To start alarm generation immediately, select option Start Now.
To start alarm generation at a specific date-time in future, select the option Start at and then select the required date-time.
To continue alarm generation indefinitely, select the option No End.
To stop the alarm generation after a specific number of occurrences, select the option End after _ occurrences and enter the number of
occurrences. If the probe restarts, the occurrence count is reset.
To stop the alarm generation at a specific date-time in future, select the option End by and then select the required date-time.
Pattern
Specifies the pattern in which the alarms/QoS are generated. Select a pattern (Secondly, Minutely, Hourly, Daily) and then specify the
value at which the pattern is repeated.
Restrict To
Specifies whether to generate alarms or QoS only for a day or within a time frame on specific days.
Note: The following points need to be taken into consideration while choosing the appropriate time range for the alarms/QoS:
If you select the Start Now option and set Restrict To settings to a later day/time, the alarms/QoS are generated on the day/time
specified in the Restrict To field.
If you set the Restrict To option to, say, 3:00 AM to 4:59 AM, and set the Start at option to a later time (say, 4:00 PM on the same day),
then the alarms/QoS are generated from 4:00 PM.
If you set the End by option to, say, 11:00 AM and set Restrict To settings to an earlier time (say, 10:30 AM), then alarms/QoS
cannot be generated after 10:30 AM.
If you set the End by option to, say, 5:30 PM, and set Restrict To settings to a later time (say, 6:30 PM), then alarms/QoS
cannot be generated after 5:30 PM.
The shortest time frame, based on the combination of all the filters applied, is taken into consideration for generating alarms/QoS
messages.
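The precedence notes above can be approximated as an intersection of the configured time filters. The following is a simplified sketch: it models the "shortest time frame wins" rule and does not reproduce every documented edge case.

```python
from datetime import datetime, time

def allowed(now, start_at=None, end_by=None, restrict=None):
    """Return True when alarms/QoS may be generated at `now`.

    start_at / end_by are absolute datetimes (None means unbounded);
    restrict is an optional (start, end) pair of daily times. The
    effective window is the intersection of all configured filters,
    so the shortest time frame wins."""
    if start_at is not None and now < start_at:
        return False          # before the Start at bound
    if end_by is not None and now > end_by:
        return False          # after the End by bound
    if restrict is not None:  # Restrict To daily window
        lo, hi = restrict
        if not (lo <= now.time() <= hi):
            return False
    return True

# End by 11:00 AM combined with Restrict To 3:00-10:30 AM:
# nothing is generated after 10:30 AM (the shorter frame wins).
window = dict(end_by=datetime(2024, 1, 1, 11, 0),
              restrict=(time(3, 0), time(10, 30)))
```

The dates and times here are hypothetical and only mirror the examples in the note above.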
Profiles
The New Profile window contains some general properties and the following tabs:
Scan Directory
Alarm messages
Quality of Service messages
When creating or editing a profile, the following values must be specified:
Name
Specifies the name of the profile.
Active
Determines whether the profile should be active or not.
Description
Specifies a short description of the profile.
Schedule
Selects the schedule of execution for the profile.
Scan Directory Tab
If the probe runs on a Windows system, remote shares can be monitored when the directory is specified with the format //<computer
name>/<share name> or //<computer name>/<share name>/<directory>. When this syntax is used, two additional fields appear: User and
Password. The user name is specified as <domain>\<user>. You can use the Fetch current values button in the Alarm messages tab to verify
that the specified combination of values is valid. If nothing is specified, the user and password specified under the Setup tab are used.
Notes:
Multiple directories can be specified, separated by a ';' sign.
Remote shares should be placed first, to be able to specify a user and password.
Browse
Opens the Select Directory window where you can select a directory or a file.
Notes:
The Browse function does not work when more than one directory is specified.
Select a directory and press the OK button to fill in the Directory field for you.
Select a file and press OK to fill in both the Directory and Pattern fields.
Pattern
Specifies a pattern, which is matched with the files found in the specified directory. When verifying, only the files matching this pattern are
included. Refer dirscan Regular Expressions for more information.
Recurse into subdirectories
Determines whether the scanning process should also include files residing in subdirectories of the specified directory.
Alarm Messages Tab
Directory check
Verifies if the specified directory is present.
Directory age
Verifies if the directory has changed between each scan. Such a change would happen if a file is created or removed from the directory.
Note: This option monitors only the specified directory even if the Recurse into subdirectories option is set.
Alarm Tabs
The following fields are common for all the alarm tabs:
Fetch current values
Verifies the current value of the monitored target. This might be helpful to do before defining the alarm threshold in the Watch field.
Watch
Specifies the condition you want the measured value to meet. If the specified condition is breached, the selected alarm message
is generated.
Note: The value can be a checkbox to be selected or a combination of an operator, a value, or the unit of measurement.
Alarm message
Selects an alarm message and a clear message from the drop-down list for each alarm. These messages are available in the message
pool, where you can also create your own messages or edit existing ones.
Automatic action > Command
Defines the action that is performed when the specified condition is not met.
You can specify the arguments of the command which can include the following variables:
$watcher: Returns the name of the monitoring profile.
$description: Returns the description of the monitoring profile.
$file: Returns the name of the file or the file name pattern that the probe is monitoring in the monitoring profile. If more than one
file matches, it returns an array of file names and the probe executes the command once for each file.
$size: Returns the size of the file or directory that the probe is monitoring. If more than one file matches, it returns an array of file
sizes and the probe executes the command once for each file.
$unit: Returns the unit of file size that is configured in the monitoring profile of the probe. This variable also converts the size of the
file or directory into this unit. For example, if you are monitoring files greater than 100 bytes and the probe identifies a file of 1 KB,
the probe returns the file size as 1024 bytes.
$limit: Returns the threshold limit of the file or directory size, which is configured in the monitoring profile.
$directory: Returns the complete path of the directory, which is configured in the monitoring profile of the probe.
Notes:
Age of file can be configured to use the age of the newest file, the oldest file, or each individual file found when more than one
file matches the pattern.
Size of file can be configured to use the smallest file, largest file or each individual file.
Read response time can be configured to use the shortest response time, longest response time, or each individual response
times.
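The command variables described in the Automatic action > Command section above can be sketched as a simple per-file template expansion; this is illustrative only, the profile fields and template are hypothetical, and the probe's own expansion logic may differ:

```python
def expand_command(template, profile, matches):
    """Expand $-variables in an automatic-action command template.

    `profile` holds the watcher/description/unit/limit/directory values;
    `matches` is a list of (filename, size_in_unit) pairs. One command
    is produced per matching file, mirroring the behavior where the
    probe executes the action once for each file in the array."""
    commands = []
    for name, size in matches:
        cmd = template
        for key, value in {
            "$watcher": profile["watcher"],
            "$description": profile["description"],
            "$directory": profile["directory"],
            "$unit": profile["unit"],
            "$limit": str(profile["limit"]),
            "$file": name,
            "$size": str(size),
        }.items():
            cmd = cmd.replace(key, value)
        commands.append(cmd)
    return commands

# Hypothetical profile and template, for illustration only.
profile = {"watcher": "logs", "description": "log growth",
           "directory": "/var/log", "unit": "KB", "limit": 100}
cmds = expand_command("notify.sh $file $size$unit limit=$limit",
                      profile, [("a.log", 250), ("b.log", 120)])
```

With two matching files, two commands are produced, one per file.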
Enable the properties for which you want Quality of Service messages to be generated.
The following values can be checked:
Directory exists
Number of matching files
Age of newest matching files
Space used by matching files in kilobytes
Read response time of largest file in Bytes per Second
Integrity Profiles
The New Integrity Profile and Integrity Profile: Profile Name windows contain some general properties and the following tabs:
File Integrity
Alarm message
Quality of Service message
When creating or editing a profile, the values must be specified as described in the Profiles section.
File Integrity Tab
This tab allows you to specify one or more files to be monitored, verifying if any changes have taken place. Right-clicking in the list opens a
browser enabling you to add or delete files to be monitored. If one of the files defined here is modified, an alarm message is sent. You can also
add, edit, or delete patterns.
Insert File...
Opens the Select File dialog to add a file to be monitored for integrity monitoring. You can also access a network directory by specifying
the user and password details.
Refer Scan Directory Tab section for more information.
Remove File
Removes the file from the list of files to be monitored for integrity monitoring.
Add Pattern
Opens the Pattern [New] window to add all files matching a regular expression from a directory. Refer to dirscan Regular
Expressions for more information.
Edit Pattern
Opens the Pattern [strPattern] window to edit the regular expression to match files from a directory.
Remove Pattern
Removes the pattern from the list of patterns to be monitored.
The Alarm message tab allows you to specify the alarm message, in the Message ID field, and action, in the Command field, to be taken when a
file change is detected. The value in the Command field can be configured as described in the Profiles > Alarm Message Tab > Automatic
action > Command section.
The Quality of Service Message Tab
This tab allows you to select if a Quality of Service message should be sent on each check interval.
dirscan Metrics
This article describes the metrics that can be configured for the File and Directory Scan (dirscan) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the dirscan probe.
Monitor Name           Units         Description                                                                              Version
QOS_DIR_AGE            Seconds       The duration that the file has existed in the monitored directory.                      3.1
QOS_DIR_EXISTS         Boolean       -                                                                                        3.1
QOS_DIR_SPACE          Kilobytes     The total size of all files in the specified directory matching the specified pattern.  3.1
QOS_DIR_NUMBER         Number        The total number of files in the specified directory matching the specified pattern.    3.1
QOS_DIR_RESPONSE_TIME  Bytes/Second  -                                                                                        3.1
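As an illustration of how two of these metrics relate to the configured directory and pattern, QOS_DIR_NUMBER and QOS_DIR_SPACE could be computed as follows. This is a sketch under the assumption of a flat, non-recursive scan; it is not the probe's code.

```python
import fnmatch
import os

def dir_metrics(directory, pattern):
    """Return (number_of_matching_files, total_size_in_kilobytes)
    for files in `directory` whose names match `pattern`."""
    sizes = [os.path.getsize(os.path.join(directory, name))
             for name in os.listdir(directory)
             if fnmatch.fnmatch(name, pattern)
             and os.path.isfile(os.path.join(directory, name))]
    # QOS_DIR_NUMBER counts matches; QOS_DIR_SPACE sums sizes in KB.
    return len(sizes), sum(sizes) / 1024.0
```

For example, two txt files of 1 KB and 2 KB in a directory yield a count of 2 and a space value of 3.0 kilobytes for the pattern *.txt.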
The alarms for some QoS metrics change after migration using threshold_migrator, when they use the standard static thresholds instead of
probe-specific alarms. The old and new alarm messages are listed in the table below.
Monitor Name
Original Alarm
Migrated Alarm
QOS_DIR_AGE
QOS_DIR_SPACE
QOS_DIR_RESPONSE_TIME
QOS_DIR_NUMBER
Monitor Name         Warning Threshold  Warning Severity  Error Threshold  Error Severity  Description
FileSizeAlarm        None               None              None             Major           The size of the monitored file is outside of the specified threshold value.
FileNumberAlarm      None               None              None             Major           -
FileSpaceAlarm       None               None              None             Major           -
FileDeltaSpaceAlarm  None               None              None             Major           -
FileAgeAlarm         None               None              None             Major           The duration that the file(s) are present in the monitored directory is outside of the specified threshold value.
ResponseTimeAlarm    None               None              None             Major           The response time of the monitor is outside of the specified threshold value.
FileIntegrityAlarm   None               None              None             Major           -
DirectoryCheckAlarm  None               None              None             Major           -
FileError            None               None              None             Major           -
DirAge               None               None              None             Major           The duration that the monitored directory is present is outside of the specified threshold value.
Pattern          Type of Regular Expression  Explanation
* or ?           Standard                    -
*a               Custom                      Matches all files with extensions that end with the letter a. For a file without an extension, the last letter of the filename must be a.
*.txt            Standard                    -
*.t*             Custom                      Matches all files in the directory with an extension that starts with a t.
?.*              Standard                    Matches all files in the directory with a single-character filename and of any extension.
NewFile?.*       Custom                      Matches all files in the directory with a letter or number after NewFile and of any extension. Examples: NewFile1, NewFilex, etc.
[a]              Standard                    Matches all files in the directory with the letter a in the filename or the extension.
?[a-c]section.*  Custom                      Matches all files in the directory that start with the letters a, b, or c, followed by the word section in the filename, and of any extension. Example: bsection.xlsx
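Several of the file patterns above behave like shell globs, so Python's fnmatch can reproduce a subset of them. This is only an illustration; rows whose semantics differ from plain globbing, such as [a] matching anywhere in the name, are omitted.

```python
import fnmatch

# Shell-glob approximations of several rows in the table above:
# (filename, pattern, expected_match)
examples = [
    ("notes.txt",    "*.txt",      True),   # *.txt: txt extension
    ("data.meta",    "*a",         True),   # *a: last letter is a
    ("report.tsv",   "*.t*",       True),   # *.t*: extension starts with t
    ("a.log",        "?.*",        True),   # ?.*: single-character filename
    ("ab.log",       "?.*",        False),  # two-character name does not match
    ("NewFile1.doc", "NewFile?.*", True),   # one character after NewFile
]

def check(name, pattern):
    """True when the filename matches the shell-style pattern."""
    return fnmatch.fnmatch(name, pattern)
```

The filenames here are hypothetical and chosen only to exercise each pattern.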
Pattern   Type of Regular Expression  Explanation
%B        Standard                    -
31%B      Custom                      Matches all directories with the number 31 followed by the full name of a month. Example: 31March
%B %Y     Standard                    Matches all directories with the full name of a month followed by a year. Example: January 2005
%y/%m/%d  Custom                      -
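The directory patterns in this table use strftime-style date tokens. Formatting a date with the same tokens shows which directory names each pattern describes (an illustrative sketch; the example date is hypothetical):

```python
from datetime import date

d = date(2005, 1, 31)
print(d.strftime("%B"))        # January       -> the %B row
print(d.strftime("31%B"))      # 31January     -> the 31%B row
print(d.strftime("%B %Y"))     # January 2005  -> the %B %Y row
print(d.strftime("%y/%m/%d"))  # 05/01/31      -> the %y/%m/%d row
```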
More Information:
See the CA Unified Infrastructure Management site for information on how to Discover Systems to Monitor.
Note: Even without any discovery_agent probes deployed, the discovery_server probe is still needed to generate the data required by
other components.
More information:
diskstat (iSeries Disk Monitoring) Release Notes
diskstat AC Configuration
This article describes the configuration concepts and procedures to set up the iSeries Disk Monitoring (diskstat) probe. The probe is configured by
creating profiles to monitor the disks on IBM iSeries systems. You can also configure the profiles to set the properties and conditions to generate
alarms and the QoS messages.
The following diagram outlines the process to configure the probe.
Contents
Verify Prerequisites
Set Up General Properties
Create a Profile
Configure Profile Properties
Configure Disk Monitors
Using Regular Expressions
Alarm Thresholds
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see diskstat (iSeries Disk
Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Note: Reduce this interval to generate alarms and QoS frequently. A shorter interval can also increase the system load.
3.
Create a Profile
You can create a profile from the list of currently available disks on the IBM iSeries system. Each profile represents a disk. Using this profile, you
can define the monitoring criteria of the disks which generate the alarms and QoS for the probe.
Follow these steps:
1. Click the Options (icon) next to the Profiles node in the navigation pane.
2. Select Add Profile.
The Add Profile dialog appears.
Note: The profile is automatically populated with the disk details. See Configure Profile Properties for more information on how
to configure the profile properties to configure alarms for the disks.
"hardware problem": identifies a hardware failure within the disk subsystem that does not affect the function or performance of
the disk unit.
"rebuilding parity protection": identifies that the parity protection of the disk unit is being rebuilt.
"not ready": identifies the disk unit as not ready.
"write protected": identifies the disk unit as write protected.
"busy": identifies the disk unit as busy.
"not operational": identifies the disk unit as not operational.
"state not recognized": identifies that the disk unit has returned a status that is not recognizable by the system.
"not accessible": identifies the disk unit as not accessible.
"read+write protected": identifies the disk unit as read/write protected.
Read Requests: indicates the number of read requests per second.
Read KB/s: indicates the data read in KB per second.
Write Requests: indicates the number of write requests per second.
Write KB/s: indicates the data written in KB per second.
To specify these values, select either an existing pattern or a new pattern in the associated When matching field. You can use regular
expressions to specify the pattern. For more information, see Using Regular Expressions.
3. Click Save to save the configuration.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
diskstat IM Configuration
This article describes the configuration concepts and procedures to set up the iSeries Disk Monitoring (diskstat) probe. The probe is configured by
creating profiles to monitor the disks on IBM iSeries systems. You can also configure the profiles to set the properties and conditions to generate
alarms and QoS.
The following diagram outlines the process to configure the probe.
Contents
Verify Prerequisites
Set Up General Properties
Create a Profile
Configure Profile Properties
Configure Disk Properties
Configure Alarms
Configure QoS
Using Regular Expressions
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see diskstat (iSeries Disk
Monitoring) Release Notes.
You can set up the logging and interval details of the probe. If you do not specify these values, the default values are used.
Follow these steps:
1. Click the Setup tab.
2. Update the following information, as required:
Log Level: specifies the level of details that are written to the log file.
Default: 0-Fatal
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Note: Reduce this interval to generate alarms and QoS frequently. A shorter interval can also increase the system load.
Create a Profile
You can create a monitoring profile to define the monitoring criteria of the disks which generate the alarms and QoS for the probe.
Follow these steps:
1. Click the Profiles tab.
2. Right-click in the tab and select New.
The Profile window appears.
Note: The profile is automatically populated with the disk details. See Configure Profile Properties for more information on how
to configure the profile properties to configure alarms for the disks.
Configure Alarms
Configure QoS
Configure Alarms
You can configure the properties of the profile to generate alarms for specified conditions.
Configure QoS
You can select the disk properties for which Quality of Service messages can be sent by the probe.
Follow these steps:
1. Click the Profiles tab.
2. Double click a profile to open the Profile dialog.
3. Click the Quality of Service tab and select the desired disk properties on which the Quality of Service messages can be sent by the
probe. For example, if you select the Total IO requests in requests per second option, the probe reports the sum of read requests and
write requests as a Quality of Service message.
4. Click OK to save the configuration.
Pattern  Type of Regular Expression  Explanation
* or ?   Standard                    -
*a       Custom                      -
a?       Standard                    Matches all two-letter values that start with the letter a.
*t*      Custom                      -
diskstat Metrics
This article describes the metrics that can be configured for the iSeries Disk Monitoring (diskstat) probe.
Monitor Name                   Units        Description  Version
QOS_AS400_DISK_USAGE           Percent      -            1.0
QOS_AS400_DISK_BUSY            Percent      -            1.0
QOS_AS400_DISK_UNIT_CONTROL    State        -            1.0
QOS_AS400_DISK_READ_REQUESTS   Request/sec  -            1.0
QOS_AS400_DISK_READ_KB         KB/sec       -            1.0
QOS_AS400_DISK_WRITE_REQUESTS  Request/sec  -            1.0
QOS_AS400_DISK_WRITE_KB        KB/sec       -            1.0
QOS_AS400_DISK_TOTAL_REQUESTS  Request/sec  -            1.0
QOS_AS400_DISK_TOTAL_KB        KB/sec       -            1.0
Monitor Name   Warning Threshold  Warning Severity  Error Threshold  Error Severity  Description
Busy           None               None              None             Minor           Busy
Disk           None               None              None             Clear           -
IOKb           None               None              None             Minor           Total IO requests in KB
IORequests     None               None              None             Minor           Incoming IO requests
NoDisk         None               None              None             Major           No disk found
ReadKb         None               None              None             Minor           -
ReadRequests   None               None              None             Minor           -
Unit Control   None               None              None             Major           -
Usage          None               None              None             Minor           -
WriteKb        None               None              None             Minor           -
WriteRequests  None               None              None             Minor           -
Contents
Verify Prerequisites
Set Up General Properties
Create a Profile
Configure Profile Properties
Configure Disk Monitors
Using Regular Expressions
Alarm Thresholds
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see diskstat (iSeries Disk
Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Note: Reduce this interval to generate alarms and QoS frequently. A shorter interval can also increase the system load.
Create a Profile
You can create a profile from the list of currently available disks on the IBM iSeries system. Each profile represents a disk. Using this profile, you
can define the monitoring criteria of the disks which generate the alarms and QoS for the probe.
Follow these steps:
1. Click the Options (icon) next to the Profiles node in the navigation pane.
2. Select Add Profile.
The Add Profile dialog appears.
Note: The profile is automatically populated with the disk details. See Configure Profile Properties for more information on how
to configure the profile properties to configure alarms for the disks.
A regular expression (regex for short) is a special text string for describing a search pattern. Constructing regular expression and pattern matching
requires meta characters. The probe supports Perl Compatible Regular Expressions (PCRE). You can use the wild card operator * in the probe to
filter the disks that are included in a profile. The probe matches the patterns to select the applicable disks for monitoring. The following fields in
the profile name node support regular expressions.
When matching (ASP Number)
When matching (Disk Type)
When matching (Disk Model)
When matching (Resource Name)
When matching (Disk Unit Number)
The probe monitors the disks that fulfill the criteria for all these fields.
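The AND-combination of the matching fields can be sketched in Python, using Python's re module as a stand-in for PCRE; the disk records, field names, and patterns below are hypothetical, not probe output:

```python
import re

# Hypothetical disk inventory; keys mirror the profile's
# "When matching" fields (Resource Name, Disk Type, ASP Number, ...).
disks = [
    {"resource_name": "DD001", "disk_type": "6718", "asp_number": "1"},
    {"resource_name": "DD002", "disk_type": "4327", "asp_number": "1"},
    {"resource_name": "DPH01", "disk_type": "4327", "asp_number": "2"},
]

# One pattern per field; a disk is selected only if every pattern
# matches, because the probe requires all fields to fulfill the criteria.
criteria = {
    "resource_name": r"DD.*",   # resource names starting with DD
    "asp_number": r"1",         # disks assigned to ASP 1
}

def matches(disk, criteria):
    return all(re.fullmatch(p, disk[f]) for f, p in criteria.items())

selected = [d["resource_name"] for d in disks if matches(d, criteria)]
print(selected)  # ['DD001', 'DD002']
```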
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
diskstat Node
Disks Node
Profiles Node
<Profile Name> Node
diskstat Node
The diskstat node lets you view the probe and alarm message details, and configure the log properties.
Navigation: diskstat
Set or modify the following values, as needed:
diskstat > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
diskstat > Setup Configuration
This section lets you configure the log properties and the log file size of the diskstat probe.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0 - Fatal
Log Size(KB): specifies a maximum size of the probe log file.
Default: 100 KB
Check Interval in seconds: specifies the time interval after which the probe fetches the disk data. Reducing this interval generates alarms sooner but can increase the system load. Increasing this interval delays alarms and reduces the system load.
Default: 60 seconds
The Disks node contains a list of all disks which are available on the IBM iSeries system. The parameters that are used to recognize a specific
disk are as follows:
Resource Name: identifies the unique resource name of the disk.
ASP Number: indicates the number of the Auxiliary Storage Pool (ASP) to which the disk is assigned.
Note: The probe monitors only the disks assigned to an ASP.
Enables you to create a monitoring profile with information from the currently selected disk. The profile is available under the Profiles node.
Profiles Node
This node is used to create a monitoring profile. You can create multiple monitoring profiles with different criteria to monitor the disks. All
monitoring profiles are displayed under the Profiles node.
<Profile Name> Node
The profile name node lets you define the monitoring criteria of the disks which generate the alarms and QoS for the probe.
Note: The fields in this node are automatically populated if you create the profile using the Disks node.
Note: Several options are selected when a profile is created from the Actions drop-down list. If you create a profile using the Add
Profile option, only the ASP Number field is selected.
Default: ReadKb
Profile Name > Disk Write Requests Monitor
This section lets you configure the Disk Write Requests Monitor for generating QoS and alarms.
Write Requests(>=): specifies the maximum threshold value for disk write requests.
Message: specifies the alarm message when the disk write requests reach the specified limit.
Profile Name > Disk Write KB Monitor
This section lets you configure the Disk Write KB Monitor for generating QoS and alarms.
Write KB(>=): specifies the maximum threshold value for disk write KB.
Message: specifies the alarm message when the disk write KB reaches the specified limit.
Default: WriteKb
Profile Name > Disk IO Requests Monitor
This section lets you configure the Disk IO Requests Monitor for generating QoS and alarms.
Total IO Requests(>=): specifies the maximum threshold value for total disk IO requests.
Message: specifies the alarm message when the disk IO requests reach the specified limit.
Profile Name > Disk Total IO KB Monitor
This section lets you configure the Disk Total IO KB Monitor for generating QoS and alarms.
Total IO KB(>=): specifies the maximum threshold value for total disk IO KB.
Message: specifies the alarm message when the sum of disk data read and disk data written reaches the specified limit.
Default: IOKb
Profile Name > Disk Unit Control Monitor
This section lets you configure the Disk Unit Control Monitor for generating QoS and alarms.
Unit Control Not Matching: specifies that an alarm is generated when the disk unit control (disk status) is not as specified.
Message: specifies the alarm message when the disk unit control does not match the specified status.
Default: UnitControl
Profile Name > No Disk Found
This section lets you configure the No Disk Monitor for generating QoS and alarms.
No Disk Found: specifies that an alarm is generated when no disk is found which matches the criteria specified in the Disk Properties section.
Contents
Verify Prerequisites
Set Up General Properties
Create a Profile
Configure Profile Properties
Configure Disk Properties
Configure Alarms
Configure QoS
Using Regular Expressions
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see diskstat (iSeries Disk
Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Note: Reduce this interval to generate alarms and QoS frequently. A shorter interval can also increase the system load.
Create a Profile
You can create a monitoring profile to define the monitoring criteria of the disks which generate the alarms and QoS for the probe.
Note: The profile is automatically populated with the disk details. See Configure Profile Properties for more information on how
to configure the profile properties to configure alarms for the disks.
You can monitor the disks by adding disk properties such as Disk unit number, Disk type, and Disk model. To specify these values, select either
an existing pattern or create a new pattern in the associated When matching field. You can Use Regular Expressions to specify the pattern.
Follow these steps:
1. Click the Profiles tab.
2. Double click a profile to open the Profile dialog.
3. Set or modify the following information under the Disk Properties tab:
a.
You can configure the properties of the profile to generate alarms for specified conditions.
Follow these steps:
1. Click the Profiles tab.
2. Double click a profile to open the Profile dialog.
3. Click the Alarm tab and specify the alarms that are generated when a threshold is breached. For example, you can select the No disk
found option and specify the alarm message that is used when a disk matching the specified criteria is not found.
4. Click OK to save the configuration.
Configure QoS
You can select the disk properties for which Quality of Service messages can be sent by the probe.
Follow these steps:
1. Click the Profiles tab.
2. Double click a profile to open the Profile dialog.
3. Click the Quality of Service tab and select the desired disk properties on which the Quality of Service messages can be sent by the
probe. For example, if you select the Total IO requests in requests per second option, the probe reports the sum of read requests and
write requests as a Quality of Service message.
4. Click OK to save the configuration.
Using Regular Expressions
A regular expression (regex for short) is a special text string for describing a search pattern within a target string.
The diskstat probe uses regular expressions to search a substring for the selected disk parameter. For example, to search for a disk named DD,
enter /DD/ as the value for When matching field.
You can use regular expressions in the probe to filter the disks that are included in a profile. The probe matches the patterns to select the
applicable disks for monitoring. The following fields in the profile name node support regular expressions:
When matching (ASP Number)
When matching (Disk Type)
When matching (Disk Model)
When matching (Resource Name)
When matching (Disk Unit Number)
The probe monitors the disks that fulfill the criteria for all these fields.
The following table describes some examples of regex and pattern matching for files using the diskstat probe.
Regular Expression | Type | Explanation
* or ? | Standard |
*a | Custom |
a? | Standard | Matches all two letter values that start with the letter a
*t* | Custom |
Setup Tab
Profiles Tab
Profile Window
Disk Properties Tab
Alarm Tab
Quality of Service Tab
Disks Tab
Setup Tab
The Setup tab allows you to view the probe and alarm message details and configure the log properties.
Set or modify the following values, as needed:
Check interval
This section provides information about the check interval of the probe.
Perform check each ... seconds: specifies the time interval at which the disk data in the system is fetched and checked. Reducing this interval generates alarms sooner but can increase the system load. Increasing this interval delays alarms and reduces the system load.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0-Fatal
This tab allows you to view, create, modify, or delete monitoring profiles. You can create multiple profiles with different criteria to monitor the
disks. You can also activate or deactivate profiles in this tab. You can right click in this tab to use the following options:
New: creates a new profile using the Profile window.
Edit: modifies an existing profile using the Profile window.
Delete: deletes the selected profile.
Profile Window
The Profile window allows you to configure a monitoring profile of the probe. This window lets you define the monitoring criteria of the disks which generate the alarms and QoS for the probe. The fields in this window are automatically populated if you create the profile using a disk from the Disks tab.
Disk Properties Tab
This tab lets you monitor the disks by adding disk matching parameters such as Disk unit number, Disk type, and Disk model. The probe
evaluates the matching criteria and selects all disks matching the criteria specified in disk properties.
Set or modify the following values, as needed:
Note: All the disk parameters support regular expressions. For more information, see Using Regular Expressions.
ASP number: indicates the number of auxiliary storage pools (ASP) to which the disk is assigned. The probe monitors only those disks
that are assigned to an ASP.
Disk type: indicates the type of disk specification.
Disk model: indicates the disk model specification.
Resource Name: indicates the unique resource name of the disk.
Disk unit number: indicates the disk unit number within the ASP.
Note: Mirrored disks have the same unit number.
Alarm Tab
This tab enables you to send an alarm message when an alarm condition arises.
Set or modify the following values, as needed:
No disk found
Specifies that an alarm message should be generated if no disk is found which matches the criteria specified in the Disk properties tab.
The Message field determines the alarm message to be used for this alarm situation.
Usage >=: specifies that an alarm message is generated if the disk usage reaches the specified limit.
Busy >=: specifies that an alarm message is generated if the disk busy reaches the specified limit. Disk busy is the percentage of
elapsed time for which the disk is up and running.
Unit control not matching: specifies that an alarm message is generated if the disk unit control (disk status) is not as specified.
The following are possible values:
"": There is no unit control value.
"active" (1): The disk unit is active.
"failed" (2): The disk unit has failed.
"other unit failed" (3): Some other disk unit in the disk subsystem has failed.
"hardware performance degrade" (4): There is a hardware failure within the disk subsystem that affects performance, but does not
affect the function of the disk unit.
"hardware problem" (5): There is a hardware failure within the disk subsystem that does not affect the function or performance of the
disk unit.
"rebuilding parity protection" (6): The disk unit's parity protection is being rebuilt.
"not ready" (7): The disk unit is not ready.
"write protected" (8): The disk unit is write protected.
"busy" (9): The disk unit is busy.
"not operational" (10): The disk unit is not operational.
"state not recognized" (11): The disk unit has returned a status that is not recognizable by the system.
"not accessible" (12): The disk unit cannot be accessed.
"read+write protected" (13): The disk unit is read/write protected.
Read requests>=: specifies that an alarm message is generated if disk read requests value reaches the specified limit.
Read KB>=: specifies that an alarm message is generated if the disk data read reaches the specified limit.
Write requests>=: specifies that an alarm message is generated if disk write requests reaches the specified limit.
Write KB>=: specifies that an alarm message is generated if disk data written reaches the specified limit.
Total IO requests>=: specifies that an alarm message is generated if the sum of disk read requests and disk write requests reaches the
specified limit.
Total IO KB>=: specifies that an alarm message is generated if the sum of disk data read and disk data written reaches the specified limit.
Quality of Service Tab
This tab enables you to select the disk properties for which Quality of Service messages can be sent.
Set or modify the following values, as needed:
Disk usage in %: select this checkbox to send the disk usage as a QoS message.
Disk busy in %: select this checkbox to send the disk busy percentage as a QoS message.
Unit control: select this checkbox to send the unit control (disk state) as a QoS message.
Read requests in requests per second: select this checkbox to send the disk read requests as a QoS message.
Data read in KB per second: select this checkbox to send the disk data read as a QoS message.
Write requests in requests per second: select this checkbox to send the disk write requests as a QoS message.
Data written in KB per second: select this checkbox to send the disk data written as a QoS message.
Total IO requests in requests per second: reports the sum of read requests and write requests as a Quality of Service message.
Total data IO in KB per second: reports the sum of data read and data written as a Quality of Service message.
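The two Total options are simple sums of the read and write samples; the counter values below are hypothetical:

```python
# Sampled counters for one disk (hypothetical values).
read_requests, write_requests = 120.0, 35.0   # requests/sec
read_kb, write_kb = 5120.0, 880.0             # KB/sec

# The Total QoS values are derived from the read and write samples.
total_io_requests = read_requests + write_requests   # requests/sec
total_io_kb = read_kb + write_kb                     # KB/sec

print(total_io_requests, total_io_kb)  # 155.0 6000.0
```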
Disks Tab
The Disks tab contains a list of all disks assigned to an ASP currently in the system.
You can right click a disk in this tab to use the following options:
Refresh: retrieves a fresh list of disks from the monitored system.
Create Profile: enables you to create a monitoring profile for the selected disk. The profile is available in the Profiles tab.
Advanced
Alarm Messages
Forwarding
Navigation: distsrv
Set or modify the following values based on your requirements. When finished, click Save to keep your changes or Discard to cancel them.
Probe Information
This section provides read-only information about the probe name, start time of the probe, probe version, and the vendor who created the
probe.
General Configuration
This section allows you to configure the log properties and timeout settings for the Package Distribution Server probe.
Log level: Specifies the level of details in the log file.
Default: 0 - Fatal
Note: Select a lower log level during normal operation to minimize disk consumption. You can increase the log level while debugging.
Log Size (KB): The default size of the log file is 100 KB. This field allows you to change the size of the log file according to your needs.
Archive Folder: Specifies the directory where the archive packages are stored. Directories are relative to the Nimsoft installation
directory. The default directory is archive.
Note: You can change this parameter if you are running out of space and want the packages to be stored on a different disk. The packages in the archive are not automatically moved; you must move them manually to the new location.
Retry Attempts: Defines the number of times the server should attempt distribution.
Retry Timeout (seconds): Defines the time-out in seconds for distribution retries.
Advanced
This section allows you to fine-tune the use of the distsrv probe.
Navigation: distsrv > Advanced
Advanced
Log Finished Distribution: Check this box to log finished installations to files in subdirectories located in probes\service\distsrv\jobs.
This provides a history of your installations. Default is on.
Alarm on Finished Distribution: Check this box to send an alarm message when an installation is finished. Default is off. The alarm
message that is sent is hard-coded and will send one of the following messages:
Alarm Messages
This section lists the default alarm messages issued when a distribution completes, and allows you to configure alarm attributes.
Navigation: distsrv > Alarm Messages
Click on the message you want to modify to select it, then update its properties in the fields located below. Click New to add a new message.
Click Delete to remove an existing message.
When finished, click Save to keep your changes, or Discard any changes.
Message Definitions
Message Name: The name of the alarm message
Message Text: The complete alarm message to be sent. This field allows the following variables:
$job_description - set on distribution creation; "Created by: ..." when created from Infrastructure Manager.
$job_id - set on distribution creation; normally "system" or "system-n" when the job is created from Infrastructure Manager.
$package_name
$package_version
$result - result string from the distribution.
$robot - target robot.
$status - return code from the installation, corresponds to the "Result code" under the "Installations" tab.
Alarm Level: The alarm severity level
Subsystem: The originating subsystem ID
Default Messages for Alarms:
Error indicates that the distribution failed
OK indicates that the distribution was successful
NoUpdate will be issued when you specify 'update' when creating the job, and the package has not been distributed to the Robot before.
The distribution will be aborted.
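The $-variables in the message text behave like shell-style substitution, which can be sketched with Python's string.Template; the values filled in below are hypothetical, not real distribution output:

```python
from string import Template

# An alarm message text using the variables listed above.
message_text = Template(
    "Distribution of $package_name $package_version to $robot: $result"
)

# Hypothetical values for a finished distribution.
alarm = message_text.substitute(
    package_name="logmon",
    package_version="3.40",
    robot="hub1-robot7",
    result="OK",
)
print(alarm)  # Distribution of logmon 3.40 to hub1-robot7: OK
```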
Forwarding
This screen allows you to configure when to forward updates and allow forwarding of licensing information either with the package or when a
change is detected.
Note: Using this option does not change the forwarding on interval setting. This setting handles changes in the remote
archive(s).
Forwarding Profiles
Define profiles for distributing select groups of packages to destination hubs. Click New to add a profile. Click Delete to remove an
existing profile. When configuring a profile, you can configure these attributes:
Hub Destination: Select the destination hub from the drop-down list.
Profile Active: Select this box to make the profile active.
Profile Type: Choose the type of profile from the drop-down list:
Specific: Forward only those packages checked in the packages list
Update: Forward only those packages already present on the remote distribution server
All: Forward all packages
Licenses: Functions in much the same way as the other forwarding types. This forwarding type may be needed only when a dashboard login is used and the licenses in use are not connected to a specific package. Package-specific licenses can be forwarded more efficiently together with the package.
All Versions: Allows you to forward all versions of the package(s) specified. (Otherwise only the most recent version will be
forwarded). Note that the removal of a version of a package will not be reflected on the destination distribution server.
Archive directory
This tab allows you to define the directory where the archive packages are stored. Directories are relative to the UIM installation directory. The
default directory is archive.
Note: You can change this parameter if you are running out of space and want the packages to be stored on a different disk. The packages in the archive are not automatically moved; you must move them manually to the new location.
Retry
This tab allows you to set parameters to define what the distribution server should do when a distribution is not successfully completed.
Attempts: Defines the number of times the server should attempt distribution.
Delay: The minimum time (in minutes) between distribution attempts.
Forwarding
This tab allows you to configure when to forward updates and allow forwarding of licensing information either with the package or when a change
is detected.
Forwarding active: Select this option if you want to turn on forwarding of license information to other distribution servers within the same
domain.
Forward interval: Define the interval when licenses can be forwarded and package versions are compared to determine if packages are
due for forwarding.
Always forward license with package: Select this option if you want the licenses always to be forwarded with the corresponding
packages.
Immediately perform forwarding when a change is detected: Select this option when you want to forward the license immediately
when a package is added or changed in the archive.
Note: Using this option does not change the forwarding on interval setting. This setting handles changes in the remote
archive(s).
Advanced settings
This tab allows you to fine tune the use of the distsrv probe.
Log finished installations: Logs finished installations to files in sub-directories of probes\service\distsrv\jobs. This provides a history of
your installations.
Alarm on finished installations: Sends an alarm message when an installation is finished. The alarm message is hard-coded and will
send one of the following messages:
information - if the forwarding operation completed successfully.
minor - if update is specified and the package was not present on the target robot.
major - if the distribution fails.
Use remote distsrv on distribution: Enables the distsrv probe to transfer a package to a distsrv probe on another hub. This is
especially useful when distributing to several robots residing under a hub on a different net/subnet, as the package will be distributed to
the remote hub only once.
Accept remote distributions: Allows other distsrv instances to use this distsrv probe for remote distribution.
Use local archive for accepted remote distributions: When a distribution request is received from another distsrv, the package is
distributed from the local archive. If this option is not selected, the package will be requested from the originating distsrv.
Note: This option should be used in conjunction with the package forwarding mechanism.
Keep history: Specifies how long to keep information about finished installations. Default is 7 days.
CRC error retry count: When individual files are transferred during distribution, their consistency is checked. If the check fails, the file is transferred again. After the specified number of retries has been used unsuccessfully, the installation fails. The package might be distributed again according to what is specified in the Retry tab.
Block size: The largest amount of data to be transferred per transaction. The upper limit is currently set at 32000 bytes. Lowering this
value will slow down the overall transfer but each block transfer would be faster.
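The interplay of block size and CRC retries can be sketched as follows; the send_block function is a stand-in for the real transfer, and the checksum comparison merely illustrates the consistency check described above:

```python
import zlib

def send_block(block: bytes) -> int:
    """Stand-in for the real transfer; returns the checksum the
    receiver computed over the block it received."""
    return zlib.crc32(block)

def transfer(data: bytes, block_size: int = 32000, crc_retries: int = 3) -> bool:
    """Send data block by block, re-sending any block whose checksum
    does not verify; give up after crc_retries attempts on one block."""
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        for _attempt in range(crc_retries):
            if send_block(block) == zlib.crc32(block):
                break  # block verified, move to the next block
        else:
            return False  # retries exhausted: the installation fails
    return True

print(transfer(b"x" * 100000))  # True
```

A smaller block_size means more transactions for the same payload (slower overall), while each individual block transfer completes faster.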
Messages
This tab lists the default alarm messages issued when a distribution completes.
Right click in the message section to add a new message, update message properties or delete a message. The alarm message properties
screen appears.
The fields are:
Name: The name of the alarm message.
Text: The complete alarm message to be sent. This field allows the following variables:
$job_description - set on distribution creation; "Created by: ..." when created from Infrastructure Manager.
$job_id - set on distribution creation; normally "system" or "system-n" when the job is created from Infrastructure Manager.
$package_name
$package_version
$result - result string from the distribution.
$robot - target robot.
$status - return code from the installation, corresponds to the "Result code" under the "Installations" tab.
Level: The alarm severity level.
Subsystem: The originating subsystem ID.
Default for alarm situation:
Error indicates that the distribution failed.
OK indicates that the distribution was successful.
NoUpdate will be issued when you specify 'update' when creating the job, and the package has not been distributed to the Robot
before. The distribution will be aborted.
Click OK to save your changes.
Note: This is the same information that appears in the Archive node in Infrastructure Manager.
The following commands are available when you right click in the package list:
New Package
Starts the package editor to let you define a new distributable package, which is placed in the archive. Note that the package editor can
also be started directly from Infrastructure Manager.
Important! Package names cannot contain a dot "." in the filename. Use an underscore "_" or dash "-" instead.
Edit Package
Starts the package editor to let you modify an existing package. Note that the package editor can also be started directly from
Infrastructure Manager.
Remove Package
Removes the selected package. Note that this function is also available directly from Infrastructure Manager.
Copy package
A new package is created based on the selected package with the same contents. Note that if the package contains a probe, the probe is
not given the new name.
Rename package
A new name is given to the selected package. Note that if the package contains a probe, the probe is not given the new name.
Configure archived configuration
This option lets you modify the configuration files (.cfx files) distributed with the probe package.
Note: The configure archived configuration option was introduced in distsrv version 4.7.x, and the first probe package enabled to take advantage of this feature is the logmon probe. The option is also available in the version of Infrastructure Manager delivered with Nimsoft Server 3.60 or newer, where you right-click the probe package and select Configure archived configuration.
Select the cfx file you would like to configure. The probe GUI displays. Update the options and click Apply when you are finished. The cfx file in the probe package is now updated and the package can be distributed to the robots with the modified cfx file.
Convert old package to new format
If you had an earlier version of Nimsoft installed (version 1.5), this option allows you to convert a package created with the old version of
Nimsoft to the current package format. The old packages must be converted to the new format if you still want to be able to distribute
them.
Important! Package names cannot contain a dot "." in the filename. Use an underscore "_" or dash "-" instead.
Important! Adding a forwarding record to the distsrv probe requires you to apply the changes, wait five minutes and then restart the
NAS probe on the remote hub.
Note: The removal of a version of a package will not be reflected on the destination distribution server.
Advanced features
The maximum number of packages to be forwarded simultaneously is set to 10, but can be changed by changing the /setup/max_inst option in
the probe configuration file (distsrv.cfg). You can modify this option using Raw Configure within Infrastructure Manager.
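Assuming the standard Nimsoft section-style configuration format, raising the limit through Raw Configure might produce a distsrv.cfg fragment like the following (other keys in the setup section omitted; the value 20 is only an example):

```
<setup>
   max_inst = 20
</setup>
```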
Note: The distribution server also stores the UIM licenses. These can be maintained, using Infrastructure Manager.
Note: You can also select and modify the sample profile available in the probe.
4. Test the profile in the Profile-<Profile Name> node to verify the connection.
5. Configure the required monitors to generate alarms.
6. Activate the profile and save the configuration to start monitoring.
Creating Profile(s)
A monitoring profile is created to define the DNS server to be monitored by the probe. The monitoring profile defines the conditions and threshold values for generating QoS and alarms. You can also edit the monitoring parameters as your monitoring requirements change.
Follow these steps:
1. Click the Options icon next to the dns_response node in the navigation pane.
2. Click the Add New Profile option.
3. Update the field information in the Add New Profile dialog and click Submit.
The profile is saved and you can configure the profile properties for monitoring the DNS server.
Delete Profile
You can delete a monitoring profile that is no longer in use. Alternatively, you can deactivate the profile so that you do not have to reconfigure its parameters if the requirement arises again.
Follow these steps:
1. Click the Options icon next to the Profile-profile name node in the navigation pane.
2. Click the Delete Profile option.
3. Click Save.
The selected profile is removed from the dns_response node.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
dns_response Node
Profile-<Profile Name> Node
<Profile Name> Node
Network Server Node
dns_response Node
The dns_response node displays the probe information and lets you configure the log properties of the probe. You can set the time interval for
checking the Domain Name Server (DNS). You can also view the list of alarm messages in the message pool.
Navigation: dns_response
dns_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
Note: This node is referred to as the Profile-profile name node throughout this document because the profile name is user-configurable.
This node represents the new profile that is created to monitor the DNS response.
Note: The name of this node is user-configurable. Hence, this node is referred to as the profile name node throughout this document.
The Network Server node is used to configure the monitoring properties of the probe. You can specify various conditions to generate the QoS
and alarms.
Navigation: dns_response > Profile-profile name > profile name > Network Server
Network Server > General Profile Configuration
This section is used to configure the profile-specific monitoring parameters.
Description: defines a brief description of the profile. This description is for information only and is not used for any processing.
Host/Domain: defines a hostname or domain to be checked.
Reverse Lookup IP: defines an IP address for performing a reverse lookup along with the forward lookup from a single profile.
Name Server: allows you to define a new name server for the current profile.
Port: specifies the port number for performing the reverse lookup along with the forward lookup from a single profile.
Default: 1
Protocol: specifies the protocol for performing the reverse lookup along with the forward lookup from a single profile.
Default: TCP
Server Lookup Type: specifies the server lookup type that the probe supports.
Default: Normal IP Address (A)
Server Lookup Class: specifies the server lookup class that the probe supports.
Number of Retries: specifies the number of retry attempts for resolving the host or domain name and the Reverse Lookup IP. Separate retries are
attempted for the Host/Domain and the Reverse Lookup IP.
Timeout (Seconds): specifies the time limit for each request. With multiple retries, the cumulative timeout is the per-request timeout multiplied by
the number of retries.
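The interaction between retries and the per-request timeout can be sketched as follows. This is a minimal Python illustration of the documented behavior, not the probe's actual code; the function names are hypothetical.

```python
import socket
import time

def forward_lookup(host, retries=3, timeout_s=5.0):
    """Attempt a forward lookup up to `retries` times.

    Per the documentation, the cumulative time budget is
    timeout_s * retries, because each request gets its own timeout.
    Per-request timeout enforcement is omitted here; the stdlib
    resolver call does not accept a timeout argument.
    Returns (ip, elapsed_ms) on success, raises the last error otherwise.
    """
    last_err = None
    for _ in range(retries):
        start = time.monotonic()
        try:
            ip = socket.gethostbyname(host)
            return ip, (time.monotonic() - start) * 1000.0
        except socket.gaierror as err:
            last_err = err
    raise last_err

def cumulative_timeout(retries, timeout_s):
    """Worst-case wait before the probe gives up on one profile check."""
    return retries * timeout_s
```

For example, with 3 retries and a 5-second timeout, the probe can spend up to 15 seconds on a single lookup before reporting a failure.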
Network Server > Messages for Alarm Timeout Situations
This section displays a list of alarms for preconfigured timeout situations.
Network Server > Alarm and QoS for Forward Lookup
This section is used to configure alarms and QoS for Forward Lookup. If the threshold value is exceeded, an alarm is issued.
Enable Forward Lookup Alarm: allows you to start receiving the Forward Lookup alarms.
Default: Not Selected
Forward Lookup alarm Threshold (Milliseconds): specifies the time limit before issuing the Forward Lookup alarm.
Enable Forward Lookup Warning: allows you to start receiving the Forward Lookup warnings.
Default: Not selected
Forward Lookup warning Threshold (Milliseconds): specifies the time limit before issuing the Forward Lookup warning.
Source used for QoS and Alarm messages (Allow source override): defines the source of alarm and QoS. If this field is blank, the QoS
source is the hostname and alarm source is the IP address of the system.
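The warning and alarm thresholds above combine as in this illustrative Python fragment; the function name and severity labels are assumptions, not the probe's internals.

```python
def classify_response(response_ms, warn_ms=None, alarm_ms=None):
    """Pick a severity for a measured lookup time.

    A response above the alarm threshold wins over the warning
    threshold; a disabled threshold (None) is simply skipped.
    """
    if alarm_ms is not None and response_ms > alarm_ms:
        return "major"    # Forward Lookup alarm issued
    if warn_ms is not None and response_ms > warn_ms:
        return "warning"  # Forward Lookup warning issued
    return "clear"
```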
Network Server > Lookup Failed
This section allows you to generate alarms when the probe fails to perform the Lookup action.
Network Server > Alarm Time
This section allows you to generate alarms when the response time is greater than the alarm threshold limit.
Network Server > Warning Time
This section allows you to generate alarms when the response time is greater than the warning threshold limit.
Network Server > Parse Error
This section allows you to generate alarms when the probe is not able to read the response time.
Network Server > DNS Fail
This section allows you to generate alarms when the DNS server fails to respond to the probe.
Network Server > Alarm and QoS for Reverse Lookup
This section is used to configure alarms and QoS for Reverse Lookup. If the threshold value is exceeded, an alarm is issued.
Enable Reverse Lookup Alarm: allows you to start receiving the Reverse Lookup alarms.
Default: Not selected
Reverse Lookup alarm Threshold (Milliseconds): specifies the time limit before issuing the Reverse Lookup alarm.
Enable Reverse Lookup Warning: allows you to start receiving the Reverse Lookup warnings.
Default: Not selected
Reverse Lookup warning Threshold (Milliseconds): specifies the time limit before issuing the Reverse Lookup warning.
Source used for QoS and Alarm messages (Allow source override): defines the source of alarm and QoS. If the field is blank, the QoS
source is the hostname and alarm source is the IP address of the system.
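The source-override fallback described above can be sketched as follows (assumed semantics based on the field description; `resolve_sources` is a hypothetical helper):

```python
import socket

def resolve_sources(override=""):
    """Return (qos_source, alarm_source) per the documented fallback.

    When the override field is non-blank it is used for both messages;
    otherwise the QoS source is the hostname and the alarm source is
    the IP address of the system.
    """
    if override:
        return override, override
    hostname = socket.gethostname()
    return hostname, socket.gethostbyname(hostname)
```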
Note: You can also specify a profile-specific name server while creating or configuring profile(s).
4. Create a profile with the domain or host details for the probe.
Refer to Configure Profile(s) for more information.
Note: You can also select and modify the sample profile available in the probe.
Note: UDP will work on port 53 and TCP will work on any port.
Test (button)
This button is only available if the probe is running. This button executes a test of the name server listed through the probe. Use the Test
button to check a specified nameserver and not the system default nameserver.
Note the different icons:
The icon next to the Test button is initially a question mark.
If the test is successful, the icon changes to an OK sign.
If the icon turns red (error), you should check your nameserver entry.
Run interval (sec)
Specifies how often the profiles should be checked. The interval is counted in seconds.
Configure Profile(s)
You can create a new profile and configure its properties to define the specific checks that the dns_response probe should run.
Follow these steps:
1. Click the Create A New Profile button to open the Profile dialog. You may also right-click in the left pane and select New Profile.
Note: You can also right-click an existing profile and select Copy to create a new copy of the old profile.
General Tab
Advanced Tab
Messages Tab
Message Pool Manager
General Tab
This tab contains the general information for the profile. It allows you to set up alarms and to override the default name server if you wish to use
another name server for this lookup.
Advanced Tab
In the Advanced tab, you can set the type of lookup to perform, the number of retries, and the timeout for the command, and turn on Quality of
Service messages.
Messages Tab
The Messages tab lets you select an alarm message for the different alarm situations.
DNS lookup failure
DNS lookup time above alarm threshold
DNS lookup time above warning threshold
DNS does not respond to request.
Unable to read response time
The alarm messages for each alarm situation are stored in the Message Pool. Using the Message Pool Manager, you can customize the alarm
text as well as create your own messages. You can select one of these messages on the Messages tab in the drop-down menus in the Profile dialog.
Note: When upgrading from probe versions prior to 1.23, the alarm messages must be manually cleared before upgrading. The probe
may not function properly if the messages are not cleared before upgrade.
dns_response Troubleshooting
This section contains some troubleshooting points for the dns_response probe.
Ensure that the passDnsTypeAasAny key is set to yes for a successful DNS query using the server lookup type A. If the key is set to no, the probe sends all DNS queries as type Any.
Follow these steps:
1. Open the Raw Configure window of the probe.
2. Select the setup section.
3. Modify the value of the passDnsTypeAasAny key to yes.
dns_response Metrics
This section describes the metrics that can be configured for the DNS Response Monitoring (dns_response) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the dns_response probe.
Monitor Name | Units | Description | Version
QOS_DNS_RESPONSE | Milliseconds | DNS Response | 1.6
QOS_DNS_RESPONSE_REVERSE | Milliseconds | DNS Response | 1.6
Alert Metrics Default Settings
Alert Metric | Error Threshold | Error Severity | Description
LookupFailure | None | Major | Alarm issued when the DNS lookup of the target failed on the nameserver for the profile.
TimeAlarm | None | Major | Alarm issued when the DNS lookup time of the target server breaches the alarm threshold.
TimeWarn | None | Warning | Alarm issued when the DNS lookup time of the target server breaches the warning threshold.
DNSFailure | None | Major | Alarm issued when the DNS server for the profile does not respond to requests.
ParseError | None | Major | Alarm issued when the response time cannot be read from the nameserver for the profile.
More information
e2e_appmon (E2E Application Response Monitoring) Release Notes
e2e_appmon AC Configuration
The e2e_appmon probe is a remote probe, configured to monitor response time and availability of the client applications. Each monitoring profile
can execute multiple scripts. You can also define thresholds for generating alarms when the script execution time exceeds the specified limit. The
QoS messages are also configured for saving the response time of the application.
The following diagram outlines the process to configure the e2e_appmon probe to monitor applications.
Contents
Verify Prerequisites
Specify a Directory
Create a Profile
Alarm Thresholds
Create a Deployable Script Package
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see E2E Application Response
Monitoring (e2e_appmon) Release Notes.
Specify a Directory
This section describes how to enable the probe to run the compiled NimRecorder scripts on the target system.
Follow these steps:
1. Open the e2e_appmon probe.
2. Specify the username and password for the target system where the probe runs the NimRecorder script. The script has the same access
level as the user.
3. Specify the path of the directory with the TaskExec.exe. This TaskExec.exe file is used for executing the compiled scripts on both 32-bit
and 64-bit platforms.
You can also click Browse to navigate to the executable file.
4. Specify the path of the directory with the compiled script files.
You can also click Browse to navigate to the directory.
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
Create a Profile
A monitoring profile specifies the NimRecorder script and its execution properties, which the probe runs on the target remote system. You can
create more than one profile and monitor the response of multiple applications.
Follow these steps:
1. Click the Options (icon) next to the Profiles node in the navigation pane.
2. Select the Add New Profile option.
3. Define the Profile Name and click Submit.
4. Update the sections mentioned below and click Save.
Run Properties
This section lets you specify the NimRecorder script and its execution properties.
1. Specify the compiled script file, which the profile runs.
2. Define the arguments for executing the scripts.
3. Define the maximum execution time of the script.
On Timeout
This section lets you configure the actions when the NimRecorder script execution time exceeds the limit.
1. Select Dump Screen on Timeout to capture the snapshot of the application when the script execution time exceeds the limit.
2. Select Kill Process on Timeout to terminate the script execution process.
On Error Return
This section lets you configure the alarm when the NimRecorder script generates an error after execution.
1. Select Expected Return Values=0 to set 0 as the return code of the script.
2. Select Dump Screen on Error to capture and save the snapshot of the application when the error occurs.
3. Select Alarm Message, which matches with the alarm text and severity.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: Leave the Path/Dependent Files section blank when there are no dependent files.
7. Select the Validate Package Name option from the Actions drop-down list.
This option prevents you from creating a duplicate package with the same name and version.
OR
8. Select the Publish to Archive option from the Actions drop-down list for creating a package.
The probe creates a deployable package in the Archive directory and displays a success message on the probe GUI.
e2e_appmon Node
<Profile Name> Node
Variables Node
Script Node
e2e_appmon Node
The e2e_appmon node lets you view the probe information and configure the properties which are applicable to all monitoring profiles of the
probe.
Navigation: e2e_appmon
Set or modify the following values if needed:
e2e_appmon > Probe Information
This section provides information about the probe name, probe version, probe start time, and the probe vendor.
e2e_appmon > Log Level Configuration
This section is used to configure log properties of the probe.
Log Level: specifies the detail level of the log file.
Default: 0 - Normal
Log File Size (KB): defines the maximum size of the log file.
Default: 100
e2e_appmon > Run as User
This section is used for configuring the user details for providing necessary privileges to execute the NimRecorder script.
Name: defines the user name of the target system where the probe runs the NimRecorder script. The script has the same access level as
the user.
Default: administrator
Password: defines the password for authorizing the specified user name.
User Check to Prevent Script to be Run from the Wrong User: prevents the NimRecorder script execution by any other user, except the
one that is specified in the probe.
Default: Not selected
Reset Registry Settings Right After the User is Logged on: resets the registry settings after the user logs in successfully. Enabling this
option makes the target system less vulnerable to malicious attacks over a network. Also, when the remote desktop connection (RDP) is
used, a legal notice is displayed. However, ensure the presence of an automatic login to the system through the registry settings.
Default: Not selected
Note: This option aborts the login process on a slow processing system.
Note: The default relative paths of the Command and ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
Last Started: identifies the date and time value when the NimRecorder script was last executed.
Running: indicates the NimRecorder script status, whether the script is currently executing.
Times Used (Last Run in Seconds): identifies the time that was consumed during the last NimRecorder script execution.
Return Code (Last Run): identifies the return code of the NimRecorder script after the last execution.
Times Started: identifies the number of times the script has been executed since the probe was last activated.
Times Killed: identifies the number of times the probe has killed the NimRecorder script.
Times Failed to Start: identifies the number of failed NimRecorder script executions.
Maximum Run Time: identifies the maximum run time of the NimRecorder script among all executions.
This section lets you activate the QoS on NimRecorder script execution time. You can also configure dynamic and static thresholds on this QoS.
profile name > Alarm on Start Error
This section lets you view the alarm details when the NimRecorder script fails to start on the target system.
profile name > Alarm on Interval Breach
This section lets you view the alarm details when the start time of the NimRecorder script breaches the maximum start time threshold limit.
profile name > Alarm on Process Kill
This section lets you view the alarm details when the probe has to kill the NimRecorder script execution process.
profile name > Alarm on Disable
This section lets you view the alarm details when the probe has to disable the NimRecorder script execution for any reason.
Note: The Alarm on Start Error, Alarm on Interval Breach, Alarm on Process Kill, and Alarm on Disable are configurable through
the e2e_appmon node. You can only view the alarm details at profile level.
Variables Node
The Variables node lets you define variables, which are used in multiple NimRecorder scripts. For example, you can put a global user name and
password in the variable value which can be used in multiple scripts.
Navigation: e2e_appmon > Robot Name > Variables
Set or modify the following values if needed:
Variables > Variables
This section lets you view the list of variables in the grid. Select any variable from the list and edit the variable value. You can also select the Delete option of the grid to remove the variable from the list.
Note: Use the Options icon next to the Variables node in the navigation pane to add variables.
Script Node
The Script node lets you create independent and deployable NimRecorder script packages. These packages can be deployed on the target robot
(similar to other probe packages) for monitoring. You can add multiple scripts and their dependent files to a deployable package.
Important! Refer to the e2e_appmon Script Considerations article, which provides information to create and deploy the script
packages.
Note: Use the Validate Package Name option from the Actions drop-down list and verify that the package name meets the
naming conventions.
Version: defines the script package version.
Default: 1.0
Description: defines a short description about the package functionality.
List of Scripts: lets you select and move a script from the Available list to the Selected list for creating the script package. The list is
available only when the probe is activated. You can add more than one script to the package file.
The script execution settings are inherited from the associated profile. If more than one profile is associated to a script, the script
execution settings are inherited from the first monitoring profile. If there is no monitoring profile, the default execution settings are used.
Script > Path/Dependent Files
This section lets you add the dependent files that you want to deploy on the target system with the script package. For example, the script refers
to a .dll file during execution. Use the New button to add more than one file to the list.
Rob File: specifies the ROB file for which you can define a dependent file.
Path/Dependent Files: defines the dependent file name and path. Use the browse button for navigating to the correct path.
Note: Use the Save option from the Actions drop-down list for saving the dependent file details.
e2e_appmon IM Configuration
The e2e_appmon probe is a remote probe, configured to monitor response time and availability of the client applications. Each monitoring profile
can execute multiple scripts. You can also define thresholds for generating alarms when the script execution time exceeds the specified limit. The
QoS messages are also configured for saving the response time of the application.
The following diagram outlines the process to configure the e2e_appmon probe to monitor applications.
Verify Prerequisites
Specify a Directory
Create a Profile
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see E2E Application Response
Monitoring (e2e_appmon) Release Notes.
Specify a Directory
This section describes how to enable the probe to run the compiled NimRecorder scripts on the target system.
Follow these steps:
1. Open the e2e_appmon probe configuration interface.
2. Specify the username and password for the target system where the probe runs the script. The script has the same access level as the
user.
3. Specify the path of the directory with the TaskExec.exe. This TaskExec.exe file is used for executing the compiled scripts on both 32-bit
and 64-bit platforms.
You can also click the Browse (...) button to navigate to the executable file.
4. Specify the path of the directory with the compiled script files.
You can also click Browse (...) to navigate to the directory.
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
Create a Profile
A monitoring profile specifies the NimRecorder script and its execution properties, which the probe runs on the target system. You can create
more than one profile and monitor the response of multiple applications.
Follow these steps:
1. Navigate to the Status tab.
2. Right-click in the tab and select Add profile from the context menu.
3. Enter a Name of the profile.
4. Select Active to activate the profile.
On Timeout
This section lets you configure the actions when the NimRecorder script execution time exceeds the limit.
Follow these steps:
1. Select Dump screen on timeout to save a snapshot of the application when the script execution time exceeds the limit.
2. You can also select Kill processes on timeout to terminate the associated processes with the script on the timeout.
On Error Return
This section lets you configure the alarm when the NimRecorder script returns an error after execution.
1. Select Expect return value = 0 to set 0 as the expected return code of the script.
Note: The On error return tab is enabled only if the Expect return value = 0 field is selected.
2. Enable Alarm message to generate alarms when the script returns an error.
3. Select a specific alarm message to be generated when the script returns an error.
Setup Tab
Status Tab
Run Properties Tab
Scheduling Tab
Advanced Tab
Messages Tab
Variables Tab
Scripts Tab
Setup Tab
The Setup tab is used to configure general settings of the probe, which are common for all profiles.
The tab contains the following fields:
Run as user
Defines the login user name and password of the target computer where the probe runs the NimRecorder script.
User check to prevent scripts to be run from the wrong user
Prevents the NimRecorder script from being executed on the target system by unauthorized users.
Reset registry settings right after the user is logged on
Resets the registry settings after successful login of the user. Enable this option to make the target system less vulnerable to malicious
attacks over a network. Ensure the presence of the automatic logon settings in the registry.
Note: This option aborts the login process on a slow processing system.
Log Level
Sets the level of details that are written in the log file.
Note: Select a lower log level during normal operation to minimize disk consumption. You can increase the log level
while debugging.
Log Size
Defines the size of the log file for the probe for writing the log messages.
Default: 100 KB
Note: When the specified size is reached, the content of the file is cleared.
Run properties
Default run interval (seconds)
Defines the frequency at which the script runs.
Default max run time (seconds)
Defines the time duration the script is allowed to use on one run. This value can be overridden for each of the profiles that are defined
under the Status tab.
Command
Defines the path of the directory where the TaskExec.exe file is located. The default location of the TaskExec.exe file is \program
files (x86)\Nimsoft\e2e_scripting\bin. This TaskExec.exe file executes the compiled NimRecorder scripts on both 32-bit and 64-bit
platforms.
.ROB File Directory
Defines the path of the directory where the compiled NimRecorder script files are stored.
Default: \program files (x86)\Nimsoft\e2e_scripting\scripts
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths
and configure the absolute paths manually.
Suspend
Suspends the execution of NimRecorder scripts. However, it is still possible to retrieve the last executed status and view the screenshots
of the script.
Status Tab
The Status tab is used to configure monitoring profiles for the e2e_appmon probe. This tab also displays a list of defined monitoring profiles and
allows you to edit and delete the profiles. Right-click in the window to add, edit, or delete the profiles. Right-click a profile under the Status tab
and select the View screendumps option to view the screen dump for that profile.
Running
Displays the running state of the NimRecorder script. Yes means running and blank means not running.
Time used (last run)
Displays the time duration used in last execution of the NimRecorder script.
Return code (last run)
Displays the status code returned after last execution of the NimRecorder script.
Times started
Displays the number of times the NimRecorder script has been executed since the e2e_appmon probe was last activated.
Times killed
Displays the number of times the NimRecorder script has been killed since the e2e_appmon probe was last activated.
Times failed to start
Displays the number of times the NimRecorder script has failed to start since the e2e_appmon probe was last activated.
Max. run time
Displays the maximum time for running a profile.
The Profile properties dialog appears on selecting the Add profile or Edit profile option. The Profile properties dialog contains the following
field:
Name
Displays the name of the profile.
The Profile properties dialog contains the following sub tabs:
Run properties
Scheduling
Advanced
Run Properties Tab
The Run properties tab is used to specify the NimRecorder script for the profile. This tab is also used to configure the runtime environment for
the script for handling timeout and error situations.
The Run properties tab contains the following fields:
Compiled script (.rob file)
Specifies a NimRecorder script, a ROB file, from the drop-down list. You can select the files from the directory as specified in the ROB
file directory field of the Setup tab.
Arguments
Defines the parameter required to run the script.
Max. run time
Defines the maximum time for which the NimRecorder script is allowed to run.
Note: This value overrides the default value that is specified in the Setup tab.
Specifies the ROB file (a compiled NimRecorder script) from the drop-down list, which runs when the script times out or returns an error.
This script is used to clean up all the previous actions to run other scripts.
Send QoS on total run time
Generates the QoS data that is related to the script run time.
Note: The NimRecorder script itself is configured to send QoS data using the developer version of the probe.
Scheduling Tab
The Scheduling tab is used to schedule the run time of the NimRecorder script. This tab also allows you to configure when the script can or
cannot run.
The Scheduling tab contains the following fields:
Run Interval
Specifies the interval between two consecutive executions of each profile. The interval can either be on every probe interval or be at a
specified time interval.
Only run on
Provides the following options for restricting the NimRecorder script execution:
In time ranges
Specifies a comma-separated list of time ranges. For example, 10:05-11:30, 12:34-16:00
Weekdays
Specifies a comma-separated list of weekdays or range of weekdays. For example, mon, thu-sat
Days of months
Specifies a comma-separated list of month dates. For example, 2-5,14-16,21
Do not run on
Defines a comma-separated list of dates (in day.month format) when the script does not run. For example, 5.1, 9.1
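The comma-separated scheduling lists above can be parsed as in this hedged Python sketch; the helper names are illustrative and the probe's own parser is not documented here.

```python
def parse_time_ranges(spec):
    """Parse '10:05-11:30, 12:34-16:00' into (start, end) minute pairs."""
    def to_minutes(hhmm):
        hours, minutes = hhmm.strip().split(":")
        return int(hours) * 60 + int(minutes)

    ranges = []
    for part in spec.split(","):
        start, end = part.strip().split("-")
        ranges.append((to_minutes(start), to_minutes(end)))
    return ranges

def in_time_ranges(hour, minute, ranges):
    """Check whether a wall-clock time falls inside any allowed range."""
    now = hour * 60 + minute
    return any(start <= now <= end for start, end in ranges)
```

For example, with the ranges `10:05-11:30, 12:34-16:00`, a run at 10:30 is allowed while a run at 12:00 is not.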
Advanced Tab
The Advanced tab allows you to select the source of the QoS and Alarm messages.
Note: This machine name appears as the source in QoS and alarm messages.
Override with
Defines a custom hostname, which appears as the source in QoS and alarm messages.
Messages Tab
The Messages tab is used to maintain a pool of alarm messages. These alarm messages are used across the monitoring profiles of the
e2e_appmon probe. By default, there are five alarm messages. You can add, edit, and delete messages to this message pool.
In the drop-down lists of the Alarm message setup grid, you can specify the alarm message to be issued for five different alarm conditions:
Alarm on start error
Alarm on interval breach
Alarm on process kill
Disable after (specific number of) errors and send a message
Alarm on unexpected returned value
The following options appear on right-clicking the message list:
Add message: This option enables you to define new alarm messages.
Message properties: This option enables you to edit one of the existing alarm messages.
Remove messages: This option enables you to delete alarm messages.
On selecting the Add message or Message properties option, the Message properties dialog appears.
The Message properties dialog contains the following fields:
Name
Defines a unique name of the message. This name is used to refer to the message from the profiles.
Default for
Specifies the alarm situations to be selected as the default alarm message for a specific type of alarm message.
Text
Defines the alarm message text. The following variables can be used:
$profile: the profile name.
$failed: the number of consecutive failures.
$sec: the seconds of delay on an interval breach.
$error: the error message on a start error.
$return: the actual return code from the last run.
$expected_return: the expected return code after a successful run of the script.
Level
Specifies the severity level assigned to the alarm messages.
Subsystem
Defines the subsystem ID of the alarms that the watcher generates, either as a string or as a subsystem ID that is managed by the nas probe.
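Variable expansion in the alarm text can be sketched as follows; this is an assumed implementation of the documented $-variable substitution, not the probe's own code.

```python
import re

def expand_alarm_text(text, values):
    """Replace $name tokens in the alarm text with profile values.

    Unknown variables are left untouched so that typos remain visible
    in the emitted alarm rather than disappearing silently.
    """
    return re.sub(
        r"\$(\w+)",
        lambda match: str(values.get(match.group(1), match.group(0))),
        text,
    )
```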
Variables Tab
The Variables tab lets you define variables that are used in the NimRecorder script. For example, you may use a password in the script and want
to encrypt it to protect the password from being presented in raw form. In such cases, define the password as a variable and select the Encrypt
option.
The Variables tab contains the following field:
Quality of Service Variables
The probe sends QoS messages on the NimRecorder script run time. The probe also exports the QoS to the script, which enables the script to
send QoS messages.
Three options Add variable, Variable properties, and Remove variable appear when you right-click the Variables list. On selecting the Add
variable or Variable properties option, the Variable properties dialog displays. The Variable properties dialog is used to add or modify the
variable properties.
The Variable properties dialog contains the following fields:
Variable name
Defines the variable name, which is unique for each variable.
Crypted
Encrypts the variable value and prevents it from being displayed in human-readable form.
Variable value
Defines a value to be assigned to the variable.
Scripts Tab
The Scripts tab allows you to create independent and deployable NimRecorder script packages. These packages can be deployed on the target
robot (similar to other probe packages) for monitoring. You can add more than one NimRecorder script and their dependent files to one
deployable package.
Important! Refer to the e2e_appmon Script Considerations article, which provides information to create and deploy the script
packages.
The selected profile settings are used to execute the script. Double-click the profile name to edit the profile properties of the script.
If no profile is associated with a script, the script is displayed in red and the default settings are used for its execution. You can
double-click the script name to define the script properties.
Path/Dependent Files
Allows you to browse the dependent files (for example, .bmp files) required to execute the script. You can select more than one file of the
same directory in one attempt. This field remembers the last navigation path.
Add Files
Allows you to add the path of dependent files in the Paths list-box.
Paths List Box
Displays the path list of dependent files to be included while generating the package. You can right-click any of the dependent files and
select Delete to remove the selected file from the list.
Publish to Archive
Publishes the script package to the Archive directory of the primary hub.
A profile is created with the ROB file name in the configuration file after the package is deployed.
A profile and a script cannot have the same name on the target robot; otherwise, either the profile name or the script name is overridden.
The robot does not display any warning message while deploying the probe.
The dependent files can only be added from the C:\ drive. This process is in line with how the current script path is configured in the probe.
After the package is deployed, all the files are copied to folders according to the robot on which the script was recorded.
The robots on which the script package is deployed must be running the e2e_appmon probe. The Distribution Manager deploys the script
package without any validation.
You cannot deploy only one script from a package (in case the package contains more than one script) to the robot.
e2e_appmon API
The e2e_appmon developer edition allows the developer to include checkpoints within the NimRecorder script. This functionality enables the
probe to measure the intermediate time of each process along with the total runtime of the script.
The package contains the CA Nimsoft API to be used with the e2e_appmon probe. The API allows the NimRecorder to use its functions in the
script. These functions enable the programmer to access alarm and quality of service functionality from the Nimsoft API in in-house developed
scripts.
This API contains core Nimsoft functions (for sending Alarms and QoS) and other supporting functions to make scripting easier.
While using the e2e_appmon probe:
Your script must begin with Include Nimsoft-functions.src.
The Nimsoft-functions.src file must be in the same directory as the script.
The file nimmacro.dll must be in the same directory as Nimsoft-functions.src.
SDK_appmon enables advanced scripting by offering the following functionalities:
Gathering multiple QoS points to identify the bottlenecks
Pre and post run cleanup
Synchronization
Error handling
Tuning your script to minimize the resource usage
Hiding or encrypting the passwords
Working with Citrix
Contents
nimGetEnv
nimGetEnvEx
nimStartRun
nimProcessExist
nimWaitForWebContents
nimWaitForWebContentsEx
nimWaitForWindow
nimWaitForWindowText
nimAlarm
nimAlarmSimple
nimInit
nimEnd
nimSetVariable
nimQoSSendTimer
nimQoSSendNull
nimQoSSendValue
nimQoSSendValueStdev
nimQoSStart
nimQoSStop
nimQoSReset
nimQoSGetTimer
nimInitWithMax
nimSetCi
nimLogin
nimActivateTotRule
nimDeactivateTotRule
nimGetEnv
nimGetEnv$(var$,def$)
Parameters
String
var$
String
def$
Returns
String environment value
Description
Returns the contents of the environment variable referred to in var$. If the variable is not in the environment, by default it returns the value of def$.
Example
home$=nimGetEnv("HOMEDRIVE", "C:")
nimGetEnvEx
nimGetEnvEx$(var$,def$,ask,pass)
Parameters
String
var$
String
def$
Number
ask
Number
pass
Returns
String value
Description
Returns the contents of the environment variable referred to in var$. If the variable is not in the environment, it defaults to def$. If ask is set to 1,
a dialog prompts you to input the string. If pass is set to 1, the dialog masks the input string.
Example
user$= nimGetEnvEx("Test-user","admin",1,0)
pass$= nimGetEnvEx("Test-pass","admin",1,1)
nimStartRun
nimStartRun(cmd$)
Parameters
String
cmd$
The cmd$ to be executed from the Run entry on the Start menu.
Returns
0: OK.
1: Failed to get the Run window.
Description
Allows you to execute cmd$ from the Run entry on the Start menu. The caller must ensure that the command escapes <doublequote>
and other special keys, so that the command is understandable to SendKeys.
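A usage sketch (the command string is illustrative):
Example
nimStartRun("notepad")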
nimProcessExist
nimProcessExist(procname$,killproc)
Parameters
String
procname$
Number
killproc
Terminates the specified process: soft if 1, hard if 2.
Returns
0: process not found.
1: process found.
Description
Locates the named process (procname$) and terminates it if killproc is set (1=soft, 2=hard).
Note: This method is deprecated; in WinTask 2.6a, the killapp() function was added, which does the same thing natively.
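A usage sketch (the process name is illustrative):
Example
ret = nimProcessExist("notepad.exe", 1)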
nimWaitForWebContents
nimWaitForWebContents(pageid$,contents$,load_timeout)
Parameters
String
pageid$
String
contents$
Number
load_timeout
Returns
0: timeout, no match.
1: match.
Description
Waits for matching contents of the web page for (load_timeout) seconds.
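A usage sketch (the page identifier, contents, and timeout are illustrative):
Example
ret = nimWaitForWebContents("Welcome - Internet Explorer", "Login successful", 30)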
nimWaitForWebContentsEx
nimWaitForWebContentsEx(pageid$,contents$,content_fail$,load_timeout)
Parameters
String
pageid$
String
contents$
String
content_fail$
The failure pattern that is applied when the specified contents fail to match.
Number
load_timeout
Returns
0: timeout, no match.
1: match.
Description
Waits up to (load_timeout) seconds for the contents of the web page to match; also applies a failure match when the contents fail to match.
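A usage sketch (all argument values are illustrative):
Example
ret = nimWaitForWebContentsEx("Welcome - Internet Explorer", "Login successful", "Invalid password", 30)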
nimWaitForWindow
nimWaitForWindow(winid$,load_timeout)
Parameters
String
winid$
Number
load_timeout
The number of seconds to wait for the specified window to appear before timeout.
Returns
0: timeout, no match
1: match
Description
Waits for a window matching (winid$) to appear within (load_timeout) seconds.
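A usage sketch (the window title and timeout are illustrative):
Example
ret = nimWaitForWindow("Notepad", 20)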
nimWaitForWindowText
nimWaitForWindowText(winid$,textstr$,load_timeout)
Parameters
String
winid$
String
textstr$
Number
load_timeout
Returns
0: timeout, no match
1: match
Description
Waits for a window matching (winid$) containing the text string (textstr$) to appear within (load_timeout) seconds.
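A usage sketch (the window title, text string, and timeout are illustrative):
Example
ret = nimWaitForWindowText("Notepad", "Done", 20)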
nimAlarm
nimAlarm(severity,msg$,supp$,subsys$)
Parameters
Number
severity
String
msg$
String
supp$
Suppression ID.
String
subsys$
The id of the subsystem, identifying the module that the alarm is related to.
Returns
0 = OK
Description
Sends an alarm message containing the severity level, message text, suppression ID, and subsystem ID.
Example
nimAlarm(5,"critical alarm","script-name","E2E-appmon")
nimAlarmSimple
nimAlarmSimple(severity,msg$)
Parameters
Number
severity
String
msg$
Returns
0 = OK.
Description
Sends an alarm message containing the severity level and message text. The rest is provided by the global variables suppression_id$ and subsystem$.
Example
nimAlarmSimple(5,"critical alarm")
nimInit
nimInit()
Parameters
None
Returns
0 = OK
Description
Initializes the Nimsoft SDK components. Must be run if using QoS.
nimEnd
nimEnd()
Parameters
None.
Returns
0 = OK
Description
Unloads the Nimsoft components and releases memory.
nimSetVariable
nimSetVariable(variable$, value$)
Parameters
String
variable$
String
value$
Returns
0 = OK
Description
Sets the global variable named (variable$) to a new value. Allows you to override the suppression-id or subsystem.
Example
nimSetVariable("suppression-id","script")
nimQoSSendTimer
nimQoSSendTimer(target$)
Parameters
String
target$
Returns
0 = OK.
Description
Sends the recorded timer for the specified target.
Example
nimQoSSendTimer("citrix login")
nimQoSSendNull
nimQoSSendNull(target$)
Parameters
String
target$
Returns
0 = OK
Description
Sends a NULL (invalid data) for the specified target.
Example
nimQoSSendNull("citrix login")
nimQoSSendValue
nimQoSSendValue(target$,value)
Parameters
String
target$
Number
value
Returns
0 = OK.
Description
Sends the recorded value (when not using timers) to the specified target.
Example
nimQoSSendValue("xxx",69)
nimQoSSendValueStdev
nimQoSSendValueStdev(target$,value$,stdev$)
Parameters
String
target$
String
value$
String
stdev$
Returns
0 = OK.
Description
Sends the value and standard deviation (when not using timers).
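A usage sketch (the target, value, and standard deviation are illustrative; note that the values are passed as strings):
Example
nimQoSSendValueStdev("transaction time", "250", "12")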
nimQoSStart
nimQoSStart()
Parameters
None
Returns
Nothing
Description
Starts the QoS timer.
nimQoSStop
nimQoSStop()
Parameters
None
Returns
Nothing
Description
Stops the QoS timer.
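The timer functions are typically used together with nimQoSSendTimer; a sketch (the browser action and target name are illustrative):
Example
nimQoSStart()
StartBrowser("IE", "about:blank", 3)
nimQoSStop()
nimQoSSendTimer("page load")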
nimQoSReset
nimQoSReset()
Parameters
None
Returns
Nothing
Description
Resets the QoS timer.
nimQoSGetTimer
nimQoSGetTimer()
Parameters
None
Returns
Timer in milliseconds
Description
Returns the stored (last) QoS timer. If the nimQoSStop function has not been called, the time between the nimQoSStart call and the
current time is returned.
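A usage sketch (read the elapsed time in milliseconds without stopping the timer):
Example
nimQoSStart()
StartBrowser("IE", "about:blank", 3)
elapsed = nimQoSGetTimer()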
nimInitWithMax
nimInitWithMax(qos_name$, source$, description$, long_unit$, short_unit$, max_value)
Parameters
String
qos_name$
String
source$
String
description$
String
long_unit$
String
short_unit$
The unit abbreviation, for example, ms.
Int
max_value
Returns
0 = OK.
Description
This function allows you to set various QoS parameters: the QoS name, source, description, units, and a maximum value.
Example
nimInitWithMax("QOS_E2E_EXECUTION", "QOS_source1", description$, "Milliseconds", "ms", 100)
nimSetCi
nimSetCi(type$, name$, remotename$, Metric$)
This function adds device and metric information to alarms or quality of service messages that are sent afterwards.
Parameters
String
type$
The CI type Id
String
name$
String
remotename$
If the object is remote, the remotename is the hostname of the object owner. An empty string denotes the local host.
String
Metric$
The metric identifies the required property of the monitored object. For example, response time, usage or other
performance parameters of the application.
Example
nimSetCi("3.21", "e2eappmon", "", "3.21:1")
nimInit()
This API must be called before nimInit(). Setting nimSetCi before nimInit ensures that the same Device ID/Metric ID are generated for all alarms and QoS.
To modify the Metric ID of an alarm, call this API with the modified parameters before the nimAlarmSimple API.
nimLogin
nimLogin(username$, password$)
Parameters
String
username$
String
password$
Returns
None
Description
This function allows you to log in to the CA UIM system where the probe is installed.
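A usage sketch (the credentials are placeholders):
Example
nimLogin("monitor_user", "secret")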
nimActivateTotRule
nimActivateTotRule(time$,windows$,autoClear$,clearTime$)
Parameters
Number
time$
The time duration, in seconds, for which a metric must remain over the threshold before an alarm is sent.
Number
windows$
The duration, in seconds, of the sliding window in which metrics are monitored for threshold violations.
Number
autoClear$
The state that defines whether the auto-clear functionality is enabled. The value must be either 0 or 1.
String
clearTime$
The time, in seconds, used in the auto-clear timer. If no alarms are sent in the set time period, the alarm is
automatically cleared.
Returns
None
Description
This function activates the Time Over Threshold (TOT) rule in the probe from your script.
Example
nimActivateTotRule(120,600,0,0)
nimDeactivateTotRule
nimDeactivateTotRule(time$,windows$,autoClear$,clearTime$)
Parameters
Number
time$
The time duration, in seconds, for which a metric must remain over the threshold before an alarm is sent.
Number
windows$
The duration, in seconds, of the sliding window in which metrics are monitored for threshold violations.
Number
autoClear$
The state that defines whether the auto-clear functionality is enabled. The value must be either 0 or 1.
String
clearTime$
The time, in seconds, used in the auto-clear timer. If no alarms are sent in the set time period, the alarm is
automatically cleared.
Returns
None
Description
This function deactivates the TOT rule in the probe from your script.
Example
nimDeactivateTotRule(180,1200,0,0)
e2e_appmon Troubleshooting
This section contains some troubleshooting points for the e2e_appmon probe.
Solution:
Instead of deploying the e2e_appmon probe, deploy the e2e_appmon_dev (1.91 or later) probe. The NimRecorder is deployed automatically.
OR
For the e2e_appmon probe (standard edition), install the NimRecorder manually.
Follow these steps:
1. Go to Start > All Programs > Nimsoft Monitoring > E2E Scripting and select the Uninstall NimRecorder.
The Script Executor is removed.
2. Go to [Nimsoft Installation drive] > Nimsoft > probes > Application > e2e_appmon > Install folder.
3. Double-click the nimrecorder.msi.
Note: The name of the .msi installer file is based on the NimRecorder version. For example, it is nimrecorder51.msi for installing the NimRecorder 5.1.
Send an email to info@wintask.com for troubleshooting your script-related issues and for support on NimRecorder 5.1.
e2e_appmon Metrics
The following section describes the metrics that can be configured for the E2E Application Response Monitoring (e2e_appmon) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the E2E Application Response Monitoring (e2e_appmon) probe.
Monitor Name         Units         Description   Version
QOS_E2E_EXECUTION    Milliseconds                v1.0
Alert Metrics Default Settings
The following table lists the default alert metric settings.
Alert Metric    Warning Threshold   Warning Severity   Error Threshold   Error Severity
Disabled        None                None               None              Major
Killed          None                None               None              Minor
ReturnError     None                None               None              Minor
StartError      None                None               None              Minor
TimeBreached    None                None               None              Warning
This section describes the functions that must be added to your script file to generate custom metrics. Refer to the e2e_appmon API article for the
function descriptions.
1. nimLogin("username$","password$")
2. nimInit2()
Note: The nimInit2() function is used when generating custom QoS. Refer to the nimInit() function in the e2e_appmon API article for
the description.
3. nimQoSStart()
4. Start a web based application to monitor the execution time. For example, StartBrowser("IE", "about:blank",3)
5. nimQoSStop()
6. nimQoSSendTimer(qos$(step))
7. nimAlarm(severity$,qos$(step) + "user-defined message","user-defined script","e2e_appmon")
include "NimBUS-functions"
nimSetCi("3.21","e2eappmon_test","","3.21:6")
nimActivateTotRule(120,600,0,0)
Note: This is a necessary step; otherwise, the TOT alarm interval resets every time the script is run.
6. In another script that generates custom QoS and alarms, ensure that the nimLogin() function is executed at the beginning. Refer to the
attached ScriptA.
ScriptA.src
Important! Ensure that the parameters provided in the nimSetCi() function are the same in both scripts so that the same metric IDs
are generated for a custom metric.
7. From the probe GUI, create and set APPMON_RUN_TOT = 1 in the Variables section.
Note: If this variable is set or edited to 0, the probe does not generate TOT alarms.
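The steps above can be combined into a minimal script sketch (the credentials, CI parameters, and QoS target are placeholders, and the alarm call only illustrates the signature; the include name follows the earlier example):
include "NimBUS-functions"
nimLogin("username", "password")
nimSetCi("3.21", "e2eappmon_test", "", "3.21:6")
nimInit2()
nimActivateTotRule(120, 600, 0, 0)
nimQoSStart()
StartBrowser("IE", "about:blank", 3)
nimQoSStop()
nimQoSSendTimer("page load")
nimAlarm(5, "page load failed", "my-script", "e2e_appmon")
nimEnd()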
The e2e_appmon probe is a remote probe that is configured to monitor the response time and availability of client applications. Each monitoring profile
can execute multiple scripts. You can also define thresholds for generating alarms when the script execution time exceeds the specified limit.
QoS messages can also be configured to save the response time of the application.
The following diagram outlines the process to configure the e2e_appmon probe to monitor applications.
Contents
Verify Prerequisites
Specify a Directory
Create a Profile
Alarm Thresholds
Create a Deployable Script Package
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see E2E Application Response
Monitoring (e2e_appmon) Release Notes.
Specify a Directory
This section describes how to enable the probe to run the compiled NimRecorder scripts on the target system.
Follow these steps:
1. Open the e2e_appmon probe.
2. Specify the username and password for the target system where the probe runs the NimRecorder script. The script has the same access
level as the user.
3. Specify the path of the directory with the TaskExec.exe. This TaskExec.exe file is used for executing the compiled scripts on both 32-bit
and 64-bit platforms.
You can also click Browse to navigate to the executable file.
4. Specify the path of the directory with the compiled script files.
You can also click Browse to navigate to the directory.
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
Create a Profile
A monitoring profile specifies the NimRecorder script and its execution properties, which the probe runs on the target remote system. You can
create more than one profile and monitor the response of multiple applications.
Follow these steps:
1. Click the Options icon next to the Profiles node in the navigation pane.
2. Select the Add New Profile option.
3. Define the Profile Name and click Submit.
4. Update the sections mentioned below and click Save.
Run Properties
This section lets you specify the NimRecorder script and its execution properties.
1. Specify the compiled script file, which the profile runs.
2. Define the arguments for executing the scripts.
3. Define the maximum execution time of the script.
On Timeout
This section lets you configure the actions when the NimRecorder script execution time exceeds the limit.
1. Select Dump Screen on Timeout to capture the snapshot of the application when the script execution time exceeds the limit.
2. Select Kill Process on Timeout to terminate the script execution process.
On Error Return
This section lets you configure the alarm when the NimRecorder script generates an error after execution.
1. Select Expected Return Values=0 to set 0 as the return code of the script.
2. Select Dump Screen on Error to capture and save the snapshot of the application when the error occurs.
3. Select the Alarm Message that matches the alarm text and severity.
Cleanup on Timeout or Error Return
This section lets you specify another NimRecorder script in the Run .Rob File field, which runs when the profile script times out or returns an
error.
Source Used for Quality of Service and Alarm Messages
This section lets you define the source that is used in QoS and alarm messages.
1. Select Override With to enable the Source field for specifying the custom QoS and Alarm source.
2. Enter the Source to define the custom QoS and Alarm source.
The probe starts executing the script on the target system for generating alarms and QoS messages.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: Leave the Path/Dependent Files section blank when there are no dependent files.
7. Select the Validate Package Name option from the Actions drop-down list.
This option prevents you from creating a duplicate package with the same name and version.
OR
8. Select the Publish to Archive option from the Actions drop-down list for creating a package.
The probe creates a deployable package in the Archive directory and displays a success message on the probe GUI.
e2e_appmon Node
<Profile Name> Node
Variables Node
Script Node
e2e_appmon Node
The e2e_appmon node lets you view the probe information and configure the properties which are applicable to all monitoring profiles of the
probe.
Navigation: e2e_appmon
Set or modify the following values if needed:
e2e_appmon > Probe Information
This section provides information about the probe name, probe version, probe start time, and the probe vendor.
Note: This option aborts the login process on a slow processing system.
Note: The default relative paths of the Command and ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
The profile name node represents the actual monitoring profile of the probe. This node lets you configure the profile-specific properties for
monitoring and displays all the monitoring profiles.
Navigation: e2e_appmon > Robot Name > Profiles > profile name
Set or modify the following values if needed:
profile name > Profiles
This section lets you activate or deactivate the monitoring profile.
profile name > Run Properties
This section lets you configure the NimRecorder script and its execution properties.
Compiled Script (ROB File): specifies the compiled script file, which the profile runs. The path in the ROB Files Directory field in the Run
Properties section of the e2e_appmon node is used for fetching the list of compiled scripts.
Arguments: defines the parameters for executing the scripts.
Maximum Run Time (Seconds): defines the maximum execution time of the script. The probe generates an alarm when the execution
time exceeds this limit.
profile name > On Timeout
This section lets you configure the actions when the NimRecorder script execution time exceeds the limit.
Dump Screen on Timeout: captures the snapshot of the application when the timeout occurs.
Default: Not selected
Kill Process on Timeout: terminates the script execution process.
Default: Selected
profile name > On Error Return Alarm
This section lets you view the alarm details when the NimRecorder script generates an error after execution.
profile name > On Error Return
This section lets you configure the alarm when the NimRecorder script generates an error after execution.
Expected Return Values=0: expects 0 as the return code of the script; otherwise, the probe generates the alarm.
Default: Not selected
Dump Screen on Error: captures and saves the snapshot of the application when the error occurs.
Default: Not selected
Alarm Message: specifies the alarm message name, which identifies the alarm text and severity.
profile name > Cleanup on Timeout or Error Return
This section lets you specify another NimRecorder script, which runs when the profile script times out or returns an error. This script is used to
Note: The Alarm on Start Error, Alarm on Interval Breach, Alarm on Process Kill, and Alarm on Disable are configurable through
the e2e_appmon node. You can only view the alarm details at profile level.
The Variables node lets you define variables, which are used in multiple NimRecorder scripts. For example, you can put a global user name and
password in the variable value which can be used in multiple scripts.
Navigation: e2e_appmon > Robot Name > Variables
Set or modify the following values if needed:
Variables > Variables
This section lets you view the list of variables in the grid. Select any variable from the list and edit the variable value. You can also select the
Delete option of the grid to remove the variable from the list.
Note: Use the Options icon next to the Variables node in the navigation pane to add variables.
The Script node lets you create independent and deployable NimRecorder script packages. These packages can be deployed on the target robot
(similar to other probe packages) for monitoring. You can add multiple scripts and their dependent files to a deployable package.
Important! Refer to the e2e_appmon Script Considerations article, which provides information to create and deploy the script
packages.
Note: Use the Validate Package Name option from the Actions drop-down list and verify that the package name meets the
naming conventions.
Version: defines the script package version.
Default: 1.0
Description: defines a short description about the package functionality.
List of Scripts: lets you select and move a script from the Available list to the Selected list for creating the script package. The list is
available only when the probe is activated. You can add more than one script to the package file.
The script execution settings are inherited from the associated profile. If more than one profile is associated to a script, the script
execution settings are inherited from the first monitoring profile. If there is no monitoring profile, the default execution settings are used.
Script > Path/Dependent Files
This section lets you add the dependent files that you want to deploy on the target system with the script package. For example, the script
refers to a .dll file during execution. Use the New button to add more than one file to the list.
Rob File: specifies the ROB file for which you can define a dependent file.
Path/Dependent Files: defines the dependent file name and path. Use the browse button for navigating to the correct path.
Note: Use the Save option from the Actions drop-down list for saving the dependent file details.
Verify Prerequisites
Specify a Directory
Create a Profile
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see E2E Application Response
Monitoring (e2e_appmon) Release Notes.
Specify a Directory
This section describes how to enable the probe to run the compiled NimRecorder scripts on the target system.
Follow these steps:
1. Open the e2e_appmon probe configuration interface.
2. Specify the username and password for the target system where the probe runs the script. The script has the same access level as the
user.
3. Specify the path of the directory with the TaskExec.exe. This TaskExec.exe file is used for executing the compiled scripts on both 32-bit
and 64-bit platforms.
You can also click the Browse (...) button to navigate to the executable file.
4. Specify the path of the directory with the compiled script files.
You can also click Browse (...) to navigate to the directory.
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths and
configure the absolute paths manually.
Create a Profile
A monitoring profile specifies the NimRecorder script and its execution properties, which the probe runs on the target system. You can create
more than one profile and monitor the response of multiple applications.
Follow these steps:
1. Navigate to the Status tab.
2. Right-click in the tab and select Add profile from the context menu.
3. Enter a Name of the profile.
4. Select Active to activate the profile.
Specify Run Properties
This section lets you define the NimRecorder script and its execution properties.
Follow these steps:
1. Select a script to run, from the Compiled script (.rob file) field.
2. Define the argument to run the script.
3. Specify the Max. run time value (in seconds) for which the script is allowed to run.
4. Select Expect return value = 0 to check the return value from the script and determine whether the script executed successfully.
The On error return tab is enabled only if the Expect return value = 0 field is selected, and it displays the error that has occurred in the
script.
On Timeout
This section lets you configure the actions when the NimRecorder script execution time exceeds the limit.
Follow these steps:
1. Select Dump screen on timeout to save a snapshot of the application when the script execution time exceeds the limit.
2. You can also select Kill processes on timeout to terminate the processes associated with the script on timeout.
On Error Return Tab
This section lets you configure the alarm, when the NimRecorder script generates an error after execution.
Follow these steps:
1. Select Dump screen on error to save a snapshot of the application when the script does not return an expected result.
Note: The On error return tab is enabled only if Expect return value = 0 field is selected.
2. Enable Alarm message to generate alarms when the script returns an error.
3. Select a specific alarm message to be generated when the script returns an error.
Cleanup on timeout or error return
This section lets you specify another NimRecorder script, which runs when the profile script times out or returns an error.
Follow these steps:
1. Select a script in the Run .ROB file field, which runs when the script times out or returns an error.
2. Schedule the script run time in the Scheduling tab.
3. Select Send QoS on total run time to generate QoS data for the script run time.
A profile is created to start executing the script on the target system for generating alarms and QoS.
Setup Tab
Status Tab
Run Properties Tab
Scheduling Tab
Advanced Tab
Messages Tab
Variables Tab
Scripts Tab
Setup Tab
The Setup tab is used to configure general settings of the probe, which are common for all profiles.
The tab contains the following fields:
Run as user
Defines the login user name and password of the target computer where the probe runs the NimRecorder script.
User check to prevent scripts to be run from the wrong user
Prevents the NimRecorder script from being executed on the target system by unauthorized users.
Reset registry settings right after the user is logged on
Resets the registry settings after successful login of the user. Enable this option to make the target system less vulnerable to malicious
attacks over a network. Ensure the presence of the automatic logon settings in the registry.
Note: This option aborts the login process on a slow processing system.
Log Level
Sets the level of details that are written in the log file.
Note: Select a lower log level during normal operation to minimize disk consumption. You can increase the log level
while debugging.
Log Size
Defines the size of the log file for the probe for writing the log messages.
Default: 100 KB
Note: When the specified size is reached, the content of the file is cleared.
Run properties
Default run interval (seconds)
Defines the frequency at which the script runs.
Default max run time (seconds)
Defines the time duration the script is allowed to use on one run. This value can be overridden for each of the profiles that are defined
under the Status tab.
Command
Defines the path of the directory where the TaskExec.exe file is located. The default location of the TaskExec.exe file is \program
files (x86)\Nimsoft\e2e_scripting\bin. This TaskExec.exe file executes the compiled NimRecorder scripts on both 32-bit and 64-bit
platforms.
.ROB File Directory
Defines the path of the directory where the compiled NimRecorder script files are stored.
Default: \program files (x86)\Nimsoft\e2e_scripting\scripts
Note: The default relative paths of the Command and .ROB File Directory fields do not work. Remove these default paths
and configure the absolute paths manually.
Suspend
Suspends the execution of NimRecorder scripts. However, it is still possible to retrieve the last executed status and view the screenshots
of the script.
Status Tab
The Status tab is used to configure monitoring profiles for the e2e_appmon probe. This tab also displays a list of defined monitoring profiles and
allows you to edit and delete the profiles. Right-click in the window to add, edit, or delete the profiles. Right-click a profile under the Status tab
and select the View screendumps option to view the screen dump for the profile.
The tab contains the following fields:
Name
Displays the name of the profile.
ROB file
Displays the name of the NimRecorder script (a ROB file) run by the probe using different profiles.
Last started
Displays the last time that the probe was started.
Running
Displays the running state of the NimRecorder script. Yes means running and blank means not running.
Time used (last run)
Displays the time duration used in last execution of the NimRecorder script.
Return code (last run)
Displays the status code returned after last execution of the NimRecorder script.
Times started
Displays the number of times the NimRecorder script has been executed since the e2e_appmon probe was last activated.
Times killed
Displays the number of times the NimRecorder script has been killed since the e2e_appmon probe was last activated.
Times failed to start
Displays the number of times the NimRecorder script has failed to start since the e2e_appmon probe was last activated.
Max. run time
Displays the maximum time for running a profile.
The Profile properties dialog appears on selecting the Add profile or Edit profile option. The Profile properties dialog contains the following
field:
Name
Displays the name of the profile.
The Profile properties dialog contains the following sub tabs:
Run properties
Scheduling
Advanced
Run Properties Tab
The Run properties tab is used to specify the NimRecorder script for the profile. This tab is also used to configure the runtime environment for
the script for handling timeout and error situations.
The Run properties tab contains the following fields:
Compiled script (.rob file)
Specifies a NimRecorder script, a ROB file, from the drop-down list. You can select the files from the directory as specified in the ROB
file directory field of the Setup tab.
Arguments
Defines the parameter required to run the script.
Max. run time
Defines the maximum time for which the NimRecorder script is allowed to run.
Note: This value overrides the default value that is specified in the Setup tab.
Note: The NimRecorder script itself is configured to send QoS data using the developer version of the probe.
Scheduling Tab
The Scheduling tab is used to schedule the run time of the NimRecorder script. This tab also allows you to configure when the script can or
cannot run.
The Scheduling tab contains the following fields:
Run Interval
Specifies the interval between two consecutive executions of each profile. The interval can either be on every probe interval or be at a
specified time interval.
Only run on
Provides the following options for restricting the NimRecorder script execution:
In time ranges
Specifies a comma-separated list of time ranges. For example, 10:05-11:30, 12:34-16:00
Weekdays
Specifies a comma-separated list of weekdays or range of weekdays. For example, mon, thu-sat
Days of months
Specifies a comma-separated list of month dates. For example, 2-5,14-16,21
Do not run on
Defines a comma-separated list of dates (in day.month format) when the script does not run. For example, 5.1, 9.1
Advanced Tab
The Advanced tab allows you to select the source of the QoS and Alarm messages.
Note: This machine name appears as the source in QoS and alarm messages.
Override with
Defines a custom hostname, which appears as the source in QoS and alarm messages.
Messages Tab
The Messages tab is used to maintain a pool of alarm messages. These alarm messages are used across the monitoring profiles of the
e2e_appmon probe. By default, there are five alarm messages. You can add, edit, and delete messages to this message pool.
In the drop-down lists of the Alarm message setup grid, you can specify the alarm message to be issued for four different alarm conditions:
Alarm on start error
Alarm on interval breach
Alarm on process kill
Disable after (specific number of) errors and send a message
Alarm on unexpected returned value
The following options appear on right-clicking the message list:
Add message: This option enables you to define new alarm messages.
Message properties: This option enables you to edit one of the existing alarm messages.
Remove messages: This option enables you to delete alarm messages.
On selecting the Add message or Message properties option, the Message properties dialog appears.
The Message properties dialog contains the following fields:
Name
Defines a unique name of the message. This name is used to refer to the message from the profiles.
Default for
Specifies the alarm situations to be selected as the default alarm message for a specific type of alarm message.
Text
Defines the alarm message text. The following variables can be used:
$profile: profile name.
$failed: number of consecutive failures.
$sec: seconds delay on interval breach
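The substitution of these message variables can be sketched as follows; `expand_message` is a hypothetical helper written for illustration, not part of the probe:

```python
import re

# Variables supported in e2e_appmon alarm message text (from the list above):
#   $profile - profile name, $failed - consecutive failures,
#   $sec - seconds delay on interval breach
def expand_message(text, values):
    """Replace $name tokens in an alarm message with values from a dict.

    Unknown tokens are left unchanged rather than raising an error.
    """
    return re.sub(r"\$(\w+)",
                  lambda m: str(values.get(m.group(1), m.group(0))),
                  text)

msg = expand_message(
    "Profile $profile breached its interval by $sec seconds "
    "($failed consecutive failures)",
    {"profile": "checkout_flow", "failed": 3, "sec": 42},
)
```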
The Variables tab lets you define variables that are used in the NimRecorder script. For example, if you use a password in the script and want to prevent it from being presented in raw form, define the password as a variable and select the Encrypt option.
The Variables tab contains the following field:
Quality of Service Variables
The probe sends QoS messages on the NimRecorder script run time. The probe also exports the QoS definitions to the script, enabling the script to send its own QoS data.
Three options Add variable, Variable properties, and Remove variable appear when you right-click the Variables list. On selecting the Add
variable or Variable properties option, the Variable properties dialog displays. The Variable properties dialog is used to add or modify the
variable properties.
The Variable properties dialog contains the following fields:
Variable name
Defines the variable name, which is unique for each variable.
Crypted
Encrypts the variable value and prevents it from being displayed in human-readable format.
Variable value
Defines a value to be assigned to the variable.
Scripts Tab
The Scripts tab allows you to create independent and deployable NimRecorder script packages. These packages can be deployed on the target
robot (similar to other probe packages) for monitoring. You can add more than one NimRecorder script and their dependent files to one
deployable package.
Important! Refer to the e2e_appmon Script Considerations article, which provides information to create and deploy the script
packages.
same directory in one attempt. This field remembers the last navigation path.
Add Files
Allows you to add the path of dependent files in the Paths list-box.
Paths List Box
Displays the path list of dependent files to be included while generating the package. You can right-click any of the dependent files and select Delete to remove the selected file from the list.
Publish to Archive
Publishes the script package to the Archive directory of the primary hub.
More Information:
For complete instructions about deploying, accessing, and using the CA ecoMeter probe within the CA ecoMeter for UIM environment,
go to CA ecoMeter for UIM.
More information:
email_response (Email Response Monitoring) Release Notes
email_response Metrics
Configure a Node
This procedure explains the process of configuring a particular section of a node. Each section within a node allows you to configure the
properties of the probe for monitoring the Internet mail response.
Follow these steps:
1. Select the appropriate navigation path.
2. Update the field information and click Save.
The specified section of the probe is configured. The probe is now ready to monitor the Internet mail response.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Manage Profiles
You can add a monitoring profile which is displayed as a child node under the Profiles node. You can then configure the profile to monitor the
Internet mail response.
Follow these steps:
1. Click the Options icon beside the Profiles node.
2. Click the Add New Profile option.
3. Update the field information and click Submit.
The profile is saved and you can configure the profile properties to monitor the Internet mail response.
email_response Node
<Profile Name> Node
Profile-<Profile Name> Node
email_response Node
In the email_response node, you can view the details of the Email Response probe and can configure the log properties. You can also view the
details of the alarm messages that are defined on the Email Response probe and create a monitoring profile.
Navigation > email_response
Set or modify the following values based on your requirements:
email_response > Probe Information
This section allows you to view the name of the probe, probe version, start time of the probe and the vendor who created the probe.
email_response > Setup
This section allows you to configure the log properties of the probe.
Log Level: Specifies the level of detail that is written to the log file.
Default: 0-Fatal
Log Size (KB): Specifies the size of the file in which the internal log messages of the Email Response probe are saved.
Protocol debug information to log file: Provides the option to save the protocol debug information in the log file.
Default: Not Selected
email_response > Messages
This section allows you to view the alarm messages that are defined on the Email Response probe. You can also view the properties of
the alarm messages.
Text: Identifies the content of the alarm message.
Level: Indicates the level of alarm which is raised.
Default for: Indicates the default message for the message name.
Subsystem: Identifies the subsystem id of the alarms generated by the Email Response probe.
i18nToken: Identifies the predefined alarms.
email_response > Options > Add new Profile
This section allows you to create and activate a monitoring profile.
Profile Name: Defines the name of the new profile.
Host Name: Defines the name of the mail server to which the test mail is directed.
Port: Specifies the port number that is used by the profile.
Note: This node is referred to as the profile name node throughout this document, as it is user-configurable.
This node allows you to view and activate the details of the monitoring profile.
Navigation > email_response Node > Profile-profile name
Note: The profile appearing as a child node under the email_response node is user-configurable. Hence, the node is referred to as
Profile-profile name node throughout this document.
Set or modify the following values based on your requirements:
Profile-profile name > Profile Information
This section allows you to view and configure the profile information.
Profile Name: Defines the name of the monitoring profile.
Host Name: Defines the name of the host to which the test mail is directed.
The Profile properties dialog appears when you double-click a profile, or when you right-click in the profile list and select Add profile or Edit profile.
Name: The name of the profile.
Active: Select the check box to activate the profile.
Mailbox definition
Host type: Either the imap or the pop3 (the probe uses one of these protocols when reading mails).
Host name: The name of the mail server to which the test mail will be directed.
User name: Login name with a valid mail account on the mail server.
Password: Password for a user with a valid mail account on the mail server.
Note: If the user logon name is different from an alias defined in Exchange General, the logon is denied.
Security: Provides three options for communication between the probe and the server:
Try negotiate: The probe asks what kind of encrypting services the server supports. If the server supports TLS, this will be used.
Otherwise no encryption will be used.
Force SSL: This option allows SSL communication between the probe and the server only. If SSL is not supported, the probe will
not communicate with the server at all.
No TLS: This option allows no TLS communication between the probe and the server, even if the server supports TLS.
Don't Validate Certificate: Select this check box if you do not want to validate the certificate.
There are four tabs, which are Setup, Alarm messages, Quality of Service messages, and SMTP setup.
Setup
The fields in the Setup tab are explained below:
Target User: The login user's mail account on the mail server.
Return address (from field) on send: Specify the e-mail address of the sender of the test mail.
Mail Intervals: You must specify the interval for:
Read interval: How often the probe tries to read mail from the mail-server.
Send interval: How often the probe sends a mail to the mail-server.
Consider lost after: Maximum allowed roundtrip time (from when the mail was sent until it was received again) before the mail is considered lost.
Bounce Messages: When this option is checked, the probe returns mail messages sent by other mail probes.
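The roundtrip bookkeeping implied by the Read interval, Send interval, and Consider lost after fields can be sketched as follows; all names here are illustrative, not the probe's actual implementation:

```python
# The probe records when each test mail was sent, computes the roundtrip time
# when the mail is read back, and considers the mail lost once the
# "Consider lost after" limit has expired without the mail being received.
def classify_mail(sent_at, received_at, now, consider_lost_after):
    """Return ('received', roundtrip_seconds), ('lost', None), or ('pending', None).

    sent_at, received_at, now: timestamps in seconds; received_at may be None.
    consider_lost_after: maximum allowed roundtrip time in seconds.
    """
    if received_at is not None:
        return ("received", received_at - sent_at)
    if now - sent_at > consider_lost_after:
        return ("lost", None)
    return ("pending", None)
```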
Alarm Messages
Quality of Service Messages
The fields in the Quality of Service Messages tab are explained below:
Mail send time: When this option is checked, a QoS message with the mail send time (ms) is generated each time a mail-message is
sent.
Mail roundtrip time: When this option is checked, a QoS message with the mail roundtrip time (ms) is generated on each mail sent.
SMTP Setup
email_response Metrics
The following table describes the QoS metrics that can be configured using the Email Response Monitoring (email_response) probe.
Monitor Name                     Units         Description                            Version
QOS_NIM_INTERNET_MAIL_ROUNDTRIP  Milliseconds  The roundtrip time of the test mail    1.0
QOS_NIM_INTERNET_MAIL_SEND       Milliseconds  The time taken to send the test mail   1.0
Note: The emailgtw probe does not generate any QoS metrics. Therefore, there are no probe checkpoint metrics to be configured for
this probe.
More information
emailgtw (Email Gateway Monitoring) Release Notes
emailgtw AC Configuration
The Email Gateway Monitoring (emailgtw) probe converts alarms into emails according to the configurations defined in the Alarm Server (nas). At
the specified interval, the probe sends these emails to the recipients defined in the profiles.
The following diagram outlines the process to configure the probe to send emails for alarms.
Contents
Verify Prerequisites
Configure General Properties
Manage Server Credentials
Create a Profile
Set Outlook Data File for Online Mode
Verify Prerequisites
Verify that the required hardware and software information is available before you configure the probe. For more information, see emailgtw (Email Gateway Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size (KB): specifies the maximum size of the probe log file in kilobytes. When this size is reached, new log file entries are added
and the older entries are deleted.
Default: 100
Report Interval: specifies the interval after which the probe checks whether an alarm report file exists. If a file is found and the Report Recipient option is selected in the profile, it is sent to the recipients defined in the profile.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
2. Go to the Email Message section and specify the following field values:
From Address: defines the email address from which the alarm notification is sent to the specified recipients.
Default: emailgtw@nimsoft.com
Subject: defines the subject of the alarm notification email. You can use only 100 characters in this field.
Note: To increase the subject length, you can add a subject_length parameter in the raw configure section of the probe.
The value must be an integer and the length can only be increased up to 998 characters. If the subject length exceeds 998
characters, then the message is rejected.
(Optional) Message Format: specifies the formatting option for the alarm notification email.
Default: HTML
Template: defines the format of the emails. The template is template.html if HTML is selected in the Message Format field. The
template changes to template.txt if Text is selected in the Message Format field. Both these template files are located in the <CA
UIM Installation Directory>/Probes/gateways/emailgtw directory.
Default: template.html
(Optional) Group Recipients: sends a single group email to all the recipients in the group. All the recipients appear in the To line in
the email.
Default: Not selected
Email on Assignment: converts assigned alarms into emails and sends to the email addresses defined in the profiles.
Default: Not selected
Note: If, for any reason, the emailgtw probe is not available, it cannot receive and handle the assigned alarms. To avoid missing these alarms during this period, perform the following tasks on the hub probe:
a. Set up a queue under the Queues tab.
b. Type attach and enter the subject as EMAIL, alarm_assign.
If the emailgtw probe and the hub probe are on different robots, the attach queue must be created on the hub of the
primary robot.
Locale: converts Japanese or Simplified Chinese text present in emails into readable format. The emails with Japanese or Simplified
Chinese text display correctly only on the Microsoft Outlook client.
Default: None
3. (Optional) Go to the Alarm Settings section to configure the following settings for situations when the email server is inaccessible.
Send Alarm: sends an alarm if the probe fails to access the email server.
Default: Selected
Subsystem: defines the subsystem originating the alarm.
Default: 1.1.12, which translates to Mail in the Subsystems section of the nas probe.
Severity: defines the severity level of the alarm.
Default: major
4. (Optional) Go to the Backup Email section to send a blind carbon copy (BCC) of all the alarm emails. Specify the following field values:
Send Backup Email: enables you to monitor the emails that are sent by the emailgtw probe. All the emails that are sent are copied
(BCC) to this email address.
Default: Not selected
Backup Email Address: defines the email address to which the emails are sent.
Default: Disabled
Note: SMTP server selection does not support the IPv6 format. You can create a connection to the email server using the host name.
Note: The Linux robot where the probe is deployed must have an OpenSSL certificate installed to use the TLS functionality to connect to the SMTP server.
5. Click Actions > Test Server Settings to verify the email server response.
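The subject-length rule described in step 2 (a default limit of 100 characters, raisable through the subject_length raw-configure key up to the documented hard ceiling of 998 characters) can be sketched as follows; `check_subject` is a hypothetical helper, not part of the probe:

```python
def check_subject(subject, subject_length=100):
    """Validate an alarm email subject against the configured limit.

    subject_length mirrors the raw-configure key of the same name.
    The documented ceiling is 998 characters; messages whose subject
    exceeds that length are rejected.
    """
    limit = min(subject_length, 998)
    if len(subject) > limit:
        raise ValueError(f"subject exceeds {limit} characters")
    return subject
```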
Configure Exchange Server
You can use an Exchange server on a Windows robot to send emails. You can create a MAPI profile to enable and run the probe using the Exchange server. A test user must already exist in the Exchange server before you configure the probe. The data file must be in online mode in the Outlook settings. For more information, see the Set Data File for Online Mode section.
You can use either of the following options:
Pre-configured profile: Use this option when you want to use the profile that is created with the Outlook configuration on your system.
User and server: Use this option when you want to use some other user credentials.
Follow these steps:
1. Navigate to emailgtw > Mail Server node.
2. Select Exchange from the Server Type drop-down.
3. Select either Pre-configured profile or User and server as the Config Type.
4. Specify the following details of the selected configuration:
Profile Name or Server name: specifies the pre-configured Outlook profile name when you select Pre-configured profile as the Config Type. The field changes to Server Name and requires the Exchange server name when you select User and server as the Config Type.
Domain Name: defines the domain name of the Exchange server.
Username: defines the username for the Exchange server authentication.
Password: defines the password for the Exchange server authentication.
5. Click Actions > Test Server Settings to verify the mailbox availability.
When the probe starts, the probe connects to the defined Exchange server with the given user credentials. These MAPI profiles are always removed when the probe stops or is deactivated.
Create a Profile
Create one or more profiles to define the recipients. The probe sends the emails converted from alarms to the email addresses defined in the profiles, at the specified interval.
Follow these steps:
1. Navigate to the emailgtw > Profiles node.
2. Click the Options (icon) next to the Profiles node in the navigation pane.
The Add Profile dialog displays.
3. Specify the following field values:
Profile Name: name of the profile.
Email Address(es): defines the email address of the recipient. You can specify multiple recipients, separated with a comma, if the Group Recipients option is selected in the emailgtw > Email Message section. If there are multiple recipients in a profile, each email is sent to all recipients in the To field.
Note: For an Exchange server, the probe supports more than 1024 characters if the Group Recipients check box is selected in the Email Message section. However, for an SMTP server, the probe supports up to 1024 characters.
Report Recipients: sends the alarm report as an email at regular intervals, as defined in the Report Interval field in emailgtw node.
Default: Not selected
Note: The auto-operator in the NAS sends a report email from the emailgtw probe containing all alarms when the following
conditions are met. The email is sent to the recipient defined in the profile at the specified Report interval.
The recipient field in the auto-operator in the NAS is blank.
The Report Recipient option is selected for the profile.
(Optional) Subject: specifies the subject of the emails. The subject defined here overrides the Subject defined in the emailgtw node.
(Optional) HTML Template: specifies the template file that defines the email format. The template defined here overrides the Template defined in the emailgtw node.
4. Click Submit to create the profile.
Note: To delete a profile, click the Options (icon) on the Profile Name node and select Delete Profile, and Save the configuration.
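The recipient-list rules from step 3 can be sketched as follows; `validate_recipients` is a hypothetical helper that applies the documented limits, not part of the probe:

```python
def validate_recipients(addresses, server_type="smtp", group_recipients=False):
    """Check a comma-separated recipient list against the documented limits.

    Per the documentation: multiple recipients require the Group Recipients
    option, and an SMTP server accepts at most 1024 characters in the field
    (an Exchange server with Group Recipients can exceed that).
    """
    recipients = [a.strip() for a in addresses.split(",") if a.strip()]
    if len(recipients) > 1 and not group_recipients:
        raise ValueError("multiple recipients require the Group Recipients option")
    if server_type == "smtp" and len(addresses) > 1024:
        raise ValueError("SMTP recipient field is limited to 1024 characters")
    return recipients
```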
emailgtw IM Configuration
The Email Gateway Monitoring (emailgtw) probe converts alarms into emails according to the configurations defined in the Alarm Server (nas). At
the specified interval, the probe sends these emails to the recipients defined in the profiles.
The following diagram outlines the process to configure the probe to send emails for alarms.
Contents
Verify Prerequisites
Configure General Properties
Manage Server Credentials
Create a Profile
Create or Edit a Template
Set Outlook Data File for Online Mode
Verify Prerequisites
Verify that the required hardware and software information is available before you configure the probe. For more information, see emailgtw (Email Gateway Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size: specifies the maximum size of the probe log file in kilobytes. When this size is reached, new log file entries are added and
the older entries are deleted.
Default: 100 KB
Report Interval: specifies the interval after which the probe checks whether an alarm report file exists. If a file is found and the Report Recipient option is selected in the profile, it is sent to the recipients defined in the profile.
Default: 300
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Email on Assignment: converts assigned alarms into emails and sends to the email addresses defined in the profiles.
Default: Not selected
Note: If, for any reason, the emailgtw probe is not available, it cannot receive and handle the assigned alarms. To avoid missing these alarms during this period, perform the following tasks on the hub probe:
a. Set up a queue under the Queues tab.
b. Type attach and enter the subject as EMAIL, alarm_assign.
If the emailgtw probe and the hub probe are on different robots, the attach queue must be created on the hub of the
primary robot.
(Optional) Use HTML Format: allows you to specify HTML format for the emails that are sent. If you do not select this option, the emails are formatted as plain text.
Default: Selected
(Optional) Group Recipients: sends a single group email to all the recipients in the group.
Default: Not selected
2. Specify the following field values of the emails:
From: specifies the email address from where the probe sends the email.
Default: emailgtw@nimsoft.com
Subject: specifies the subject of the alarm notification emails. You can only enter 100 characters in the subject.
Note: To increase the subject length, you can add a subject_length parameter in the raw configure section of the probe.
The value must be an integer and the length can only be increased up to 998 characters. If the subject length exceeds 998
characters, then the message is rejected.
Template: specifies the format in which the email messages are formatted - either the template.html file or the template.txt file.
These files are located in the <CA UIM Installation Directory>/Probes/gateways/emailgtw directory.
This template is the default template for all profiles unless another template is specified in the Profile properties dialog. You can
create a new template or edit the existing template. For more information, see the Create or Edit a Template section.
3. (Optional) Select Backup Email Settings to send a blind carbon copy (BCC) of all the alarm emails to the specified Backup Email
Address. This functionality enables you to monitor the emails that are sent by the probe.
Default: Not selected
4. (Optional) Select Alarm Settings to configure the following settings for situations where the email server is inaccessible:
Subsystem: defines the subsystem originating the alarm.
Default: 1.1.12, which translates to Mail in the Subsystems section of the nas probe.
Severity: defines the severity level of the alarm.
Default: major
5. (Optional) Select the Locale value to convert Japanese or Simplified Chinese text present in emails into readable format. The emails with
Japanese or Simplified Chinese text display correctly only on the Microsoft Outlook client.
Default: None
Note: The Linux robot where the probe is deployed must have an OpenSSL certificate installed to use the TLS functionality to connect to the SMTP server.
Note: This profile must exist and be configured in Outlook before it is used in the probe. To find the profile name, navigate to the Control Panel > Mail dialog.
Define the name of the Exchange server if you select User and server.
Domain Name: name of the Exchange server domain.
Name: a valid user name for the Exchange server.
Create a Profile
Create one or more profiles to define the recipients. The probe sends the emails converted from alarms to the email addresses defined in the profiles, at the specified interval.
Follow these steps:
1. Click the Add profile button in the toolbar.
The Add Profile dialog displays.
2. Specify the following field values:
Name: name of the profile.
Email: defines the email address of the recipient. You can specify multiple recipients, separated with a comma, if the Group
Recipients option is selected in the Properties > General tab. If there are multiple recipients in a profile, each email is sent to all
recipients in the To field.
Note: For an Exchange server, the probe supports more than 1024 characters if the Group Recipients check box is selected in the Email Message section. However, for an SMTP server, the probe supports up to 1024 characters.
Subject: specifies the subject of the emails. The subject defined here overrides the Subject defined in the Properties > General tab.
Report recipient: sends the alarm report as an email at regular intervals, as defined in the Report Interval field in emailgtw node.
Default: Not selected
Note: The auto-operator in the NAS sends a report email from the emailgtw probe containing all alarms when the following
conditions are met. The email is sent to the recipient defined in the profile at the specified Report interval.
The recipient field in the auto-operator in the NAS is blank.
The Report Recipient option is selected for the profile.
HTML Template: specifies the template file that defines the email format. The template defined here overrides the Template defined
in the Properties > General tab.
Variable Expansion
You can edit the default template.html or you can create specific templates for a profile. In the template editor or in the Subject field of the Properties dialog, press the '$' key. A drop-down appears with a list of available variables. The following variables are available for use in the subject and HTML templates:
$level: the severity level of the alarm.
$level_exp: the name of that severity level, for example, critical.
$level_col: the color code of that severity level, for example, critical = #FF0000.
$prevlevel: the previous severity level of the alarm.
$prevlevel_exp: the name of that severity level, for example, critical.
$prevlevel_col: the color code of that severity level, for example, critical = #FF0000.
$subsys: the alarm subsystem.
$message: the message text.
$nimid: the message ID.
$nimts: the message timestamp, when it was originally sent.
$nimts_exp: the readable form of $nimts.
$source: the IP address of the system that sent the alarm.
$hub: the Hub the robot belongs to.
$robot: the name of the Robot that sent the alarm.
$origin: the system name of the hub in which the robot is present.
$domain: the name of the Domain the Robot is in.
$prid: the probe ID.
$hostname: the hostname of the system that sent the alarm.
$hostname_strip: the hostname of the system that sent the alarm without the domain portion (if present).
$sid: the subsystem ID in the NAS.
$supp_key: the suppression key (often contains the checkpoint).
$arrival: the timestamp when the NAS received the alarm.
$arrival_exp: the human readable form of $arrival.
$aots: the timestamp when the NAS AO last checked the alarm.
$aots_exp: the human readable form of $aots.
$supptime: the timestamp of the last suppression of this message.
$supptime_exp: the readable form of $supptime.
$suppcount: the number of times this alarm has been suppressed.
$exported_by: the address of the NAS which exported this message.
$exported_type: the type of export.
$assigned_at: the timestamp when the alarm was assigned to a user.
$assigned_at_exp: the readable form of $assigned_at.
$assigned_to: the user which the alarm was assigned to.
$assigned_by: the user that assigned the alarm.
$ao_argument: the arguments set by the NAS auto-operator profile.
$tz_offset: sending system timezone offset in seconds compared to UTC.
$nimts_tzexp: the readable form of $nimts, offset for the sending robot's timezone, giving the local time on the sending robot.
$arrival_tzexp: the readable form of $arrival, offset for the sending robot's timezone, giving the local time on the sending robot.
$aots_tzexp: the readable form of $aots, offset for the sending robot's timezone, giving the local time on the sending robot.
$supptime_tzexp: the readable form of $supptime, offset for the sending robot's timezone, giving the local time on the sending robot.
$assigned_at_tzexp: the readable form of $assigned_at, offset for the sending robot's timezone, giving the local time on the sending robot.
$user_tag1, $user_tag2: user-defined tags that are used to define identification properties for computers in CA UIM. The tags are defined under Setup > Misc in the configuration dialog of the controller probe.
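A minimal illustration of this kind of variable expansion, using Python's string.Template; the probe's own substitution engine may differ, and the severity name and color tables below are an assumed subset for the example:

```python
import string

# Illustrative severity mappings; only critical = #FF0000 is stated in the
# documentation above, the rest are assumptions for the example.
SEVERITY_NAMES = {5: "critical", 4: "major", 3: "minor", 2: "warning", 1: "information"}
SEVERITY_COLORS = {5: "#FF0000", 4: "#FF8000"}

def render(template_text, alarm):
    """Expand $variables in a subject or template against an alarm record."""
    values = dict(alarm)
    values["level_exp"] = SEVERITY_NAMES.get(alarm.get("level"), "unknown")
    values["level_col"] = SEVERITY_COLORS.get(alarm.get("level"), "#000000")
    # safe_substitute leaves unknown $tokens untouched instead of raising
    return string.Template(template_text).safe_substitute(values)

line = render("[$level_exp] $message on $robot",
              {"level": 5, "message": "disk full", "robot": "web01"})
```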
Contents
Verify Prerequisites
Configure General Properties
Manage Server Credentials
Create a Profile
Set Outlook Data File for Online Mode
Verify Prerequisites
Verify that the required hardware and software information is available before you configure the probe. For more information, see emailgtw (Email Gateway Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size (KB): specifies the maximum size of the probe log file in kilobytes. When this size is reached, new log file entries are added
and the older entries are deleted.
Default: 100
Report Interval: specifies the interval after which the probe checks whether an alarm report file exists. If a file is found and the Report Recipient option is selected in the profile, it is sent to the recipients defined in the profile.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
2. Go to the Email Message section and specify the following field values:
From Address: defines the email address from which the alarm notification is sent to the specified recipients.
Default: emailgtw@nimsoft.com
Subject: defines the subject of the alarm notification email. You can use only 100 characters in this field.
Note: To increase the subject length, you can add a subject_length parameter in the raw configure section of the probe.
The value must be an integer and the length can only be increased up to 998 characters. If the subject length exceeds 998
characters, then the message is rejected.
(Optional) Message Format: specifies the formatting option for the alarm notification email.
Default: HTML
Template: defines the format of the emails. The template is template.html if HTML is selected in the Message Format field. The
template changes to template.txt if Text is selected in the Message Format field. Both these template files are located in the <CA
UIM Installation Directory>/Probes/gateways/emailgtw directory.
Default: template.html
(Optional) Group Recipients: sends a single group email to all the recipients in the group. All the recipients appear in the To line in
the email.
Default: Not selected
Email on Assignment: converts assigned alarms into emails and sends to the email addresses defined in the profiles.
Default: Not selected
Note: If, for any reason, the emailgtw probe is not available, it cannot receive and handle the assigned alarms. To avoid missing these alarms during this period, perform the following tasks on the hub probe:
a. Set up a queue under the Queues tab.
b. Type attach and enter the subject as EMAIL, alarm_assign.
If the emailgtw probe and the hub probe are on different robots, the attach queue must be created on the hub of the
primary robot.
Locale: converts Japanese or Simplified Chinese text present in emails into readable format. The emails with Japanese or Simplified
Chinese text display correctly only on the Microsoft Outlook client.
Default: None
3. (Optional) Go to the Alarm Settings section to configure the following settings for situations when the email server is inaccessible.
Send Alarm: sends an alarm if the probe fails to access the email server.
Default: Selected
Subsystem: defines the subsystem originating the alarm.
Default: 1.1.12, which translates to Mail in the Subsystems section of the nas probe.
Severity: defines the severity level of the alarm.
Default: major
4. (Optional) Go to the Backup Email section to send a blind carbon copy (BCC) of all the alarm emails. Specify the following field values:
Send Backup Email: enables you to monitor the emails that are sent by the emailgtw probe. All the emails that are sent are copied
(BCC) to this email address.
Default: Not selected
Backup Email Address: defines the email address to which the emails are sent.
Default: Disabled
Use this procedure if you want to use an SMTP server to send emails.
Note: SMTP server selection does not support the IPv6 format. You can create a connection to the email server using the host name.
Note: The Linux robot where the probe is deployed must have an OpenSSL certificate installed to use the TLS functionality to connect to the SMTP server.
5. Click Actions > Test Server Settings to verify the email server response.
Configure Exchange Server
You can use an Exchange server on a Windows robot to send emails. You can create a MAPI profile to enable and run the probe using the
Exchange server. A test user must already exist in the Exchange server before you configure the probe. The data file must be in online mode in
the Outlook settings. For more information, see the Set Data File for Online Mode section.
You can use either of the following options:
Pre-configured profile: Use this option when you want to use the profile that is created with the Outlook configuration on your system.
User and server: Use this option when you want to use some other user credentials.
Follow these steps:
1. Navigate to emailgtw > Mail Server node.
2. Select Exchange from the Server Type drop-down.
3. Select either Pre-configured profile or User and server as the Config Type.
4. Specify the following details of the selected configuration:
Profile Name or Server name: specifies the pre-configured Outlook profile name when you select Pre-configured profile as the Config Type.
The field changes to Server Name and requires the Exchange server name when you select User and server as the Config Type.
Domain Name: defines the domain name of the Exchange server.
Username: defines the username for the Exchange server authentication.
Password: defines the password for the Exchange server authentication.
5. Click Actions > Test Server Settings to verify the mailbox availability.
When the probe starts, the probe connects to the defined Exchange server with the given user credentials. These MAPI profiles are ALWAYS
removed when the probe stops or deactivates.
Create a Profile
Create one or more profiles to define the recipients. The probe sends the alarm converted emails to the email addresses defined in the profiles, at
the specified interval.
Follow these steps:
1. Navigate to the emailgtw > Profiles node.
2. Click the Options (icon) next to the Profiles node in the navigation pane.
The Add Profile dialog displays.
3. Specify the following field values:
Profile Name: name of the profile.
Email Address(es): defines the email address of the recipient. You can specify multiple recipients, separated with a comma, if the Group Recipients option is selected in the emailgtw > Email Message section. If there are multiple recipients in a profile, each email is
sent to all recipients in the To field.
Note: For Exchange server, the probe supports more than 1024 characters if Group Recipients check box is selected in
the Email Message section. However, for SMTP server, the probe supports up to 1024 characters.
Report Recipients: sends the alarm report as an email at regular intervals, as defined in the Report Interval field in emailgtw node.
Default: Not selected
Note: The auto-operator in the NAS sends a report email from the emailgtw probe containing all alarms when the following
conditions are met. The email is sent to the recipient defined in the profile at the specified Report interval.
(Optional) Subject: specifies the subject of the emails. The subject defined here overrides the Subject defined in the emailgtw node.
(Optional) HTML Template: specifies the template file that defines the email format. The template defined here overrides the Template defined in the emailgtw node.
4. Click Submit to create the profile.
Note: To delete a profile, click the Options (icon) on the Profile Name node and select Delete Profile, and Save the configuration.
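The recipient-length rule in the note above can be sketched as a small check. This is an illustrative sketch only; the function name and structure are assumptions, not the probe's internals.

```python
# Sketch of the recipient-length rule; names are illustrative.
SMTP_RECIPIENT_LIMIT = 1024  # characters, per the note for SMTP servers

def recipients_accepted(addresses: str, server_type: str,
                        group_recipients: bool = False) -> bool:
    """Return True if a comma-separated recipient string is acceptable."""
    if server_type == "exchange" and group_recipients:
        # Exchange supports more than 1024 characters with Group Recipients.
        return True
    return len(addresses) <= SMTP_RECIPIENT_LIMIT
```

For example, a 2000-character recipient string passes only for an Exchange server with Group Recipients selected.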
The Email Gateway Monitoring (emailgtw) probe converts alarms into emails according to the configurations defined in the Alarm Server (nas). At
the specified interval, the probe sends these emails to the recipients defined in the profiles.
The following diagram outlines the process to configure the probe to send emails for alarms.
Contents
Verify Prerequisites
Configure General Properties
Manage Server Credentials
Create a Profile
Create or Edit a Template
Set Outlook Data File for Online Mode
Verify Prerequisites
Verify that the required hardware and software information is available before you configure the probe. For more information, see emailgtw (Email
Gateway Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size: specifies the maximum size of the probe log file in kilobytes. When this size is reached, new log file entries are added and
the older entries are deleted.
Default: 100 Kb
Report Interval: specifies the interval after which the probe checks whether an alarm report file exists. If a file is found and the Report Recipient option is selected in the profile, it is sent to the recipients defined in the profile.
Default: 300
Note: Reduce this interval to generate alarms more frequently. However, a shorter interval can also increase the system load.
Email on Assignment: converts assigned alarms into emails and sends to the email addresses defined in the profiles.
Default: Not selected
Note: If the emailgtw probe is unavailable for any reason, it cannot receive and handle the assigned alarms. To avoid
missing these alarms during this period, perform the following tasks on the hub probe:
a. Set up a queue under the Queues tab.
b. Set the queue type to attach and enter the subject as EMAIL, alarm_assign.
If the emailgtw probe and the hub probe are on different robots, the attach queue must be created on the hub of the
primary robot.
(Optional) Use HTML Format: allows you to specify HTML format for the emails that are sent. If you do not select this option, the
emails are sent in plain-text (.txt) format.
Default: Selected
(Optional) Group Recipients: sends a single group email to all the recipients in the group.
Default: Not selected
2. Specify the following field values of the emails:
From: specifies the email address from where the probe sends the email.
Default: emailgtw@nimsoft.com
Subject: specifies the subject of the alarm notification emails. You can enter up to 100 characters in the subject.
Note: To increase the subject length, you can add a subject_length parameter in the raw configure section of the probe.
The value must be an integer and the length can only be increased up to 998 characters. If the subject length exceeds 998
characters, then the message is rejected.
Template: specifies the format in which the email messages are formatted - either the template.html file or the template.txt file.
These files are located in the <CA UIM Installation Directory>/Probes/gateways/emailgtw directory.
This template is the default template for all profiles unless another template is specified in the Profile properties dialog. You can
create a new template or edit the existing template. For more information, see the Create or Edit a Template section.
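The subject-length rule in the note above (a default of 100 characters, configurable up to a hard limit of 998) can be sketched as follows. The function name and structure are illustrative assumptions, not the probe's actual code.

```python
DEFAULT_SUBJECT_LENGTH = 100   # default limit noted above
MAX_SUBJECT_LENGTH = 998       # hard limit; longer subjects are rejected

def apply_subject_limit(subject: str,
                        subject_length: int = DEFAULT_SUBJECT_LENGTH) -> str:
    """Enforce the subject-length rule described in the note above."""
    if len(subject) > MAX_SUBJECT_LENGTH:
        raise ValueError("message rejected: subject exceeds 998 characters")
    # subject_length can be raised via raw configure, but only up to 998.
    return subject[:min(subject_length, MAX_SUBJECT_LENGTH)]
```

With the defaults, a 150-character subject is truncated to 100 characters; raising subject_length in raw configure lifts the cap up to 998.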
3. (Optional) Select Backup Email Settings to send a blind carbon copy (BCC) of all the alarm emails to the specified Backup Email
Address. This functionality enables you to monitor the emails that are sent by the probe.
Default: Not selected
4. (Optional) Select Alarm Settings to configure the following settings for situations where the email server is inaccessible:
Subsystem: defines the subsystem originating the alarm.
Default: 1.1.12, which translates to Mail in the Subsystems section of the nas probe.
Severity: defines the severity level of the alarm.
Default: major
5. (Optional) Select the Locale value to convert Japanese or Simplified Chinese text present in emails into readable format. The emails with
Japanese or Simplified Chinese text display correctly only on the Microsoft Outlook client.
Default: None
send emails. The emailgtw probe supports SMTP and Exchange servers to send emails on a Windows robot. However, you can configure only
one server in the probe at a time. For robots on other operating systems, you can only use the SMTP server.
Configure SMTP Server
Use this procedure if you want to use an SMTP server to send emails.
Follow these steps:
1. Open the Properties dialog.
2. Select SMTP in the Server Selection box.
3. Navigate to the SMTP tab and specify the following email server details.
Primary Mail Server: defines the main SMTP server that is used to send emails. Append ":<portnumber>" after the hostname or
IP address if you want to specify a non-default port number for the SMTP server.
Example: smtp.yourserver.com:26
Username and Password: specifies the login credentials for the SMTP server authentication.
Test button: allows you to verify the email server response. A green indicator means OK and a red indicator indicates that the
server did not respond.
Secondary Mail Server: defines the secondary SMTP server to be used when the primary SMTP server fails.
4. (Optional) Select Ignore TLS if you do not want the probe to attempt a Transport Layer Security (TLS) connection with the primary and
secondary email server. This feature is required because some email servers announce TLS capability even if it is not present, due to a
missing certificate.
Note: The Linux robot where the probe is deployed must have OpenSSL certificate installed to use the TLS functionality to
connect to the SMTP server.
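The steps above (host:port parsing for the Primary Mail Server field, an optional TLS upgrade, and the Ignore TLS fallback) can be sketched with Python's standard smtplib. This is a minimal sketch under stated assumptions, not the probe's implementation.

```python
import smtplib

def parse_mail_server(server: str, default_port: int = 25):
    """Split the 'hostname:port' form used in the Primary Mail Server field."""
    host, sep, port = server.partition(":")
    return host, int(port) if sep else default_port

def send_alarm_mail(server, sender, recipients, message,
                    user=None, password=None, ignore_tls=False):
    """Send one alarm email, upgrading to TLS when the server offers
    STARTTLS and Ignore TLS is not set (illustrative sketch only)."""
    host, port = parse_mail_server(server)
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.ehlo()
        # Some servers announce STARTTLS without a usable certificate;
        # the Ignore TLS option skips the upgrade attempt entirely.
        if not ignore_tls and smtp.has_extn("starttls"):
            smtp.starttls()
            smtp.ehlo()
        if user and password:
            smtp.login(user, password)
        smtp.sendmail(sender, recipients, message)
```

For example, parse_mail_server("smtp.yourserver.com:26") yields the host with port 26, while omitting the suffix falls back to the default SMTP port 25.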
You can use an Exchange server on a Windows robot to send emails. You can create a MAPI profile to enable and run the probe using the
Exchange server. A test user must already exist in the Exchange server before you configure the probe. The data file must be in online mode in
the Outlook settings. For more information, see the Set Outlook Data File for Online Mode section. You can use either of the following options:
Pre-configured profile: Use this option when you want to use the profile that is created with the Outlook configuration on your system.
User and Server: Use this option when you want to use some other user credentials.
Follow these steps:
1. Open the Properties dialog.
2. Select Exchange in the Server Selection box.
3. Navigate to the Exchange tab.
4. Select either Pre-configured profile or User and server option.
5. Specify the following details of the selected configuration:
Profile Name or Server name:
Define the pre-configured Outlook profile name if you select Pre-configured profile.
Note: This profile must exist and be configured in Outlook before it is used in the probe. To find the profile name, navigate to
the Control Panel > Mail dialog.
Define the name of the Exchange server if you select User and server.
Domain Name: name of the Exchange server domain.
Name: a valid user name of the Exchange server.
Password: a valid password for the specified user.
6. Click the Test button to verify the email server response. A green indicator means OK and a red indicator indicates that the server did not
respond.
When the probe starts, the probe connects to the defined Exchange server with the provided user credentials. These MAPI profiles are ALWAYS
removed when the probe stops or deactivates.
Create a Profile
Create one or more profiles to define the recipients. The probe sends the alarm converted emails to the email addresses defined in the profiles, at
the specified interval.
Follow these steps:
1. Click the Add profile button in the toolbar.
The Add Profile dialog displays.
2. Specify the following field values:
Name: name of the profile.
Email: defines the email address of the recipient. You can specify multiple recipients, separated with a comma, if the Group
Recipients option is selected in the Properties > General tab. If there are multiple recipients in a profile, each email is sent to all
recipients in the To field.
Note: For Exchange server, the probe supports more than 1024 characters if Group Recipients check box is selected in
the Email Message section. However, for SMTP server, the probe supports up to 1024 characters.
Subject: specifies the subject of the emails. The subject defined here overrides the Subject defined in the Properties > General tab.
Report recipient: sends the alarm report as an email at regular intervals, as defined in the Report Interval field in emailgtw node.
Default: Not selected
Note: The auto-operator in the NAS sends a report email from the emailgtw probe containing all alarms when the following
conditions are met. The email is sent to the recipient defined in the profile at the specified Report interval.
The recipient field in the auto-operator in the NAS is blank.
The Report Recipient option is selected for the profile.
HTML Template: specifies the template file that defines the email format. The template defined here overrides the Template defined
in the Properties > General tab.
You can edit the default template.html or you can create specific templates for a profile. In the template editor or in the Subject field of the Properties dialog, press the '$' key. A drop-down appears with a list of available variables. The following variables are available for use in the subject
and html templates:
Note: The probe only supports basic and NTLM (NT LAN Manager) authentication.
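The '$' variable expansion described above behaves much like Python's string.Template substitution. The alarm field names below are hypothetical examples, not the probe's actual variable list.

```python
from string import Template

# Hypothetical alarm fields; the real variable names come from the
# '$' drop-down in the template editor.
alarm = {"severity": "major", "hostname": "server01", "message": "disk full"}

subject = Template("[$severity] Alarm on $hostname: $message")
print(subject.substitute(alarm))  # [major] Alarm on server01: disk full
```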
More Information:
ews_response (Microsoft Exchange Server Response Monitoring) Release Notes
Contents
Verify Prerequisites
Configure Global Values
Add Server Details
Create and Configure Profiles
Alarm Thresholds
Verify Prerequisites
Verify that required hardware and software is available and any installation consideration is met before you configure the probe. For more
information, see ews_response (Microsoft Exchange Server Response Monitoring) Release Notes.
Note: The probe deletes the email after scanning the inbox of the user to save the inbox storage space. Each test email is
deleted as soon as it is found and the QoS value is calculated.
4. Select Bounce Nimsoft Test Mail from other sources back to Sender to enable the Exchange Client to return incoming test emails to
the sender. You can select this option if you want to monitor the round-trip time between two independent user accounts. This provides a
more realistic response time. The test email is sent from an inbox to itself to calculate the round-trip time.
Default: Not Selected
Note: Test email messages can be received by a user account from another mailbox that is monitored by a separate instance
of the probe. These emails are sent back if this checkbox is selected.
5. Specify the number of messages that are scanned in Stop reading inbox when other than Nimsoft Test Mails Found before
the probe stops scanning the inbox. An active user can receive many non-test emails. This setting allows you to stop the probe from
scanning the inbox if it does not find the actual test email, thus reducing server load.
Default: 30
6. Select Publish Alarms in the Send Alarm on Unexpected Mail section to generate alarms if emails from other sources appear. An
active user can have other email messages in the inbox. If this option is selected, the alarm notifies you that the account has received a non-test email.
Default: Not Selected
Note: The probe deletes the email after scanning the inbox of the user to save the inbox storage space. Each test email is
deleted as soon as it is found and the QoS value is calculated.
Select Bounce Nimsoft Test Mail from other sources back to Sender to enable the Exchange Client to return incoming test emails
to the sender. You can select this option if you want to monitor the round-trip time between two independent user accounts. This
process provides a more realistic response time. The test email is sent from an inbox to itself to calculate the round-trip time.
Default: Not Selected
Note: Test email messages can be received by a user account from another mailbox that is monitored by a separate
instance of the probe. These emails are sent back if this checkbox is selected.
Specify the number of messages that are scanned in Advanced Inbox Mails before the probe stops scanning the inbox. An active
user can receive many non-test emails. This setting allows you to stop the probe from scanning the inbox if it does not find the actual
test email, thus reducing server load.
Default: 30
Select Inbox Send Alarm to generate alarms if emails from other sources appear. An active user can have other email messages in
the inbox. If this option is selected, the alarm notifies you that the account has received a non-test email.
Default: Not Selected
8. Click Submit.
The user is created under the ews_response node.
5. Specify the target email address where the test email is sent in the Profile General Parameters section of the Profile Name node.
The profile is created and configured. You can configure the QoS Messages on mail sent and round-trip time monitors in the Profile
Name node.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ews_response Node
User-<User Name> Node
<Server Name> Node
<User Name> Node
<Profile Name> Node
ews_response Node
This section contains the configuration details specific to the ews_response probe. In this node, you can view the probe information and can set
up the global properties of the probe.
Navigation: ews_response
Set or modify the following values as required.
ews_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ews_response > General Configuration
This section lets you configure the global properties of the probe.
Log Level: specifies the level of details that are written to the log file.
Default: 0 - Fatal
Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when debugging.
Check Every Second: specifies the interval between each cycle when the inbox is read. The inbox is only read when there are
outstanding emails. You can reduce this interval to generate alarms faster, but a shorter interval increases the load on the Exchange
Server. You can also increase this interval to reduce the load on the Exchange Server, but alarms are then generated later.
Default: 15
Note: The probe deletes the email after scanning the inbox of the user to save the inbox storage space. Each test email is
deleted as soon as it is found and the QoS value is calculated.
Bounce Nimsoft Test Mail from other sources back to Sender: enables the Exchange client response to return the incoming test email to
the sender. You can select this option if you want to monitor the round-trip time between two independent user accounts. This provides a
more realistic response time. The test email is sent from an inbox to itself to calculate the round-trip time.
Default: Not Selected
Note: Test emails can be received by a user account from another mailbox that is monitored by a separate instance of the
probe. These emails are sent back if this checkbox is selected.
Stop reading inbox when other than Nimsoft Test Mails Found: specifies the number of emails that are scanned before the probe stops
scanning the inbox. An active user can receive many non-test emails. This setting allows you to stop the probe from scanning the
inbox if it does not find the actual test email, thus reducing server load.
Default: 30
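The scan-limit behavior described above can be sketched as a short loop that gives up after a fixed number of non-test messages. The function and mailbox shape are illustrative assumptions.

```python
def find_test_mail(inbox, limit=30):
    """Scan the inbox for the probe's own test mail, giving up after
    `limit` non-test messages to reduce server load (default 30 above)."""
    non_test_seen = 0
    for mail in inbox:
        if mail.get("is_test"):
            return mail
        non_test_seen += 1
        if non_test_seen >= limit:
            return None  # stop scanning; the test mail is considered lost
    return None
```

With the default limit of 30, a test mail behind 40 unrelated messages is never reached, which is the trade-off between server load and finding the mail.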
ews_response > Send Alarm on Unexpected Mail
This section lets you generate alarms if other emails are present in the mailbox. An active user can have other emails in the inbox. If this option
is selected, the alarm notifies you that the account has received a non-test email.
Default: Not Selected
ews_response > Message Pool
This section lets you view a particular alarm message in detail.
Message Name: identifies the unique name of the message. This name is used to refer to the message from the profiles.
Text: identifies the alarm message text. Use the following variables to include run-time information in the alarm message text:
profile
mail_to
used
threshold
diff
Level: identifies the alarm message severity.
Default for: identifies the default message name.
i18n Token: identifies the predefined alarms fetched from the database.
ews_response > Options (icon) > Add New User
This option opens the Add New User window. This window lets you add an Exchange Server user with a valid user name, password, and the
name of the Exchange Server to log in.
The fields in this window are as follows:
Active: activates or deactivates the Exchange Server user.
User Name: specifies the username of the user credentials to access the Exchange Server.
Server Name: specifies the name of the Exchange Server.
Password: specifies the password of the user credentials to access the Exchange Server.
Exchange Version: specifies the version of the Exchange Server.
Note: You cannot modify the user account in this section. A new user account with the specified changes is added to the probe
if changes are made in this node and the configuration is saved.
Note: The Mail Send Time QoS value (in milliseconds) is the time taken to connect to the server and execute the send
command.
Note: If the email is received after the Consider Mail lost after time, the QoS message for Mail round-trip time is null.
The round-trip time is calculated as (End Time timestamp minus Mail Send timestamp) divided by 1000, where:
End Time timestamp is when the sender receives the test email back from the receiver in Coordinated Universal Time (UTC).
Mail Send timestamp is the sent time mentioned in the test mail in UTC.
The Mail send time QoS is in milliseconds and the Mail round-trip time is in seconds, so the difference is divided by 1000 to convert
it into seconds.
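The round-trip calculation above reduces to one line of arithmetic, assuming both timestamps are UTC values in milliseconds:

```python
def round_trip_seconds(end_time_ms: int, mail_send_ms: int) -> float:
    """Round-trip time in seconds: (End Time - Mail Send) / 1000,
    with both UTC timestamps expressed in milliseconds."""
    return (end_time_ms - mail_send_ms) / 1000
```

For instance, a mail sent at 1700000000000 ms and received back at 1700000012500 ms gives a round-trip time of 12.5 seconds.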
Using Alarm Message: specifies the alarm email when the email round-trip time exceeds the threshold.
Default: MailTimeout
Mail Error Message: specifies the alarm email when the probe is unable to send email to the specified Exchange Server user.
Default: Mailerror
Consider Mail Lost after (Seconds): specifies the time before the email is considered as lost. The value is the sum of the interval and the
email lost threshold.
Default: 300
Note: The Mail Lost alarm is generated at the next check interval after the specified Consider Mail Lost after value.
Example: If the value of this field is 300 seconds, and the interval is set at 30 seconds, the alarm is generated at 300 plus the
remaining time for the next scan (0 to 30 seconds) or 300 to 330 seconds.
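The alarm window in the example above is simple arithmetic: the alarm fires on the first check after the Consider Mail Lost after value. A sketch, with illustrative names:

```python
def mail_lost_alarm_window(lost_after: int, check_interval: int):
    """Earliest and latest elapsed time (seconds) at which the Mail Lost
    alarm can fire: on the first check after `lost_after` has passed."""
    return lost_after, lost_after + check_interval
```

With the documented values (300-second threshold, 30-second check interval), the alarm fires between 300 and 330 seconds after the mail is sent.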
Lost Mail Message: specifies the alarm email used when the email is not delivered.
Default: MailLost
Accept undelivered Reply as OK: changes the status of undelivered emails to delivered.
Default: Selected
Profile Name > Options (icon) > Delete Profile
This section lets you delete a profile from an Exchange Server user account configured in the probe.
Verify Prerequisites
Configure Global Values
Add Exchange Server Details
Create and Configure Profiles
Verify Prerequisites
Verify that required hardware and software is available and any installation consideration is met before you configure the probe. For more
information, see ews_response (Microsoft Exchange Server Response Monitoring) Release Notes.
Note: The probe deletes the email after scanning the inbox of the user to save the inbox storage space. Each test email is
deleted as soon it is found and the QoS value is calculated.
4. Select Bounce Nimsoft Test Mail from other sources back to Sender to enable the Exchange Client to return incoming test emails to
the sender. You can select this option if you want to monitor the round-trip time between two independent user accounts. This provides a
more realistic response time. The test email is sent from an inbox to itself to calculate the round-trip time.
Default: Not Selected
Note: Test email messages can be received by a user account from another mailbox that is monitored by a separate instance
of the probe. These emails are sent back if this checkbox is selected.
5. Select Send Alarm on unexpected mail to generate alarms if emails from other sources appear. An active user can have other email
messages in the inbox. If this option is selected, the alarm notifies you that the account has received a non-test email.
Default: Not Selected
6. Specify the number of messages that are scanned in Stop reading inbox when other than Nimsoft Test Mails Found before
the probe stops scanning the inbox. An active user can receive many non-test emails. This setting allows you to stop the probe from
scanning the inbox if it does not find the actual test email, thus reducing server load.
Default: 30
Note: You can also select an existing user and click Copy User to create a New User with the same properties. You must give
a different profile name while copying a user.
Exchange Version: select the Exchange Server to be used from the drop-down list.
3. Create a monitoring profile under the Profiles tab, where you can define the details of the user to whom the test mail is sent for monitoring.
Refer to Create Monitoring Profile for more information.
4. Click the Advanced tab.
5. Select Use Global to use the settings from the Setup tab. You can also select Override to enter user-specific values.
6. Click OK to save the User properties.
Notes:
If the mail is received after Consider Mail Lost after Seconds, the QoS message for Roundtrip Time Threshold is null.
Round-trip time is the time taken by a test email to be received back by the user mailbox that sent the email.
Note: The Mail Lost alarm is generated at the next check interval after the specified value.
Example: If the value of this field is 300 seconds, and the check interval is set at 30 seconds, the alarm is generated at 300
plus the remaining time for the next check (0 to 30 seconds) or 300 to 330 seconds.
Consider Mail lost after: define the time (in seconds) before the mail is considered lost.
Lost mail message: select the alarm message to be used when the mail is not returned.
9. Click OK.
The profile is created and configured.
Setup Tab
Status Tab
User Properties Window
Profile Properties Window
Message Pool Tab
Message Properties
Setup Tab
The Setup tab defines the general properties for the probe.
Note: The probe deletes the email after scanning the inbox of the user to save the inbox storage space. Each test email is
deleted as soon as it is found and the QoS value is calculated.
Note: Test emails can be received by a user account from another mailbox that is monitored by a separate instance of the
probe. These emails are sent back if this checkbox is selected.
Note: If an active Exchange Server user receives more email messages than the specified number, the probe is unable to find its own last
message and considers it lost.
Status Tab
The Status tab lists all the monitoring hosts configured in the probe. You can create an entry for each email user you want to send
email from, with a valid user name, password, and the name of the Exchange Server to log in to. For each of these Exchange Server users, one or more
monitoring profiles can be defined. These monitoring profiles define how the test email is sent and received and also specify the alarm and QoS
properties.
You can right-click in the Status tab to display the following options:
New User: creates a new user profile for an Exchange Server. You can click it to open the User Properties window for a new user.
Edit User: updates an existing user profile for an Exchange Server.
Delete: removes the selected user profile for an Exchange Server.
Copy User: copies the existing user profile for an Exchange Server to create a new user.
User Properties Window
The window has the following fields, a Profiles tab, and an Advanced tab.
Active: activates or deactivates the Exchange Server user. Deactivating a user deactivates all the profiles for the user.
Name: defines a valid user name for the Exchange Server user that is preceded by domain name. For example, domain\username
Password: defines the password for the user account.
Server name: defines the name of Exchange Server to be used.
Profiles Tab
The Profiles tab lists all monitoring profiles that are defined for the Exchange Server user. Right-clicking in the list lets you add, edit, or delete
profiles.
Note: The Active check box in the User properties dialog must be enabled for the context menu to appear.
You can right-click in the Profiles tab to display the following options:
New Profile: creates a new monitoring profile for the Exchange Server user. Click it to open the Profile Properties window for a new
profile.
Edit Profile: updates an existing monitoring profile.
Delete profile: removes the selected monitoring profile.
Profile Properties Window
The profile properties describe to whom the test email must be sent, the interval between each test message and the alarm and QoS
properties. The email messages can be sent to:
Own mailbox <self>
To an Exchange Server user whose mailbox is monitored by another instance of the probe, with 'bounce' enabled (on the Setup tab).
To an unknown Exchange Server user on a known email server for returning the email.
To an Exchange Server user for whom you have created another way of returning the emails to the mailbox being monitored.
Name
Defines the name of the profile
Active
Activates and runs the profile.
Mail to
Defines the target email address for the test email. The default value is <self>, which means the Exchange Server user mailbox. The <self>
email address is constructed using the domain name and username in the Name field of the User properties dialog (refer to Create New
User). For example, domain\username becomes username@domain.com.
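The &lt;self&gt; address construction described above can be sketched as a small helper. The '.com' suffix follows the documentation's example; the function is illustrative, not the probe's code.

```python
def self_address(account: str) -> str:
    r"""Build the <self> address from a 'domain\username' account name.
    The '.com' suffix follows the documentation's example."""
    domain, _, user = account.partition("\\")
    return f"{user}@{domain}.com"
```

For example, the account name domain\username yields username@domain.com.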
Send mail every
Specifies how often a test email is sent.
Quality of Service messages on
Mail send time
Lets you generate a QoS message on each email sent.
Note: The Mail Send Time QoS value (in milliseconds) is the time taken to connect to the server and execute the send
command.
Note: If the email is received after Consider Mail lost after time, the QoS message for Mail roundtrip time is null.
The round trip time is calculated as (End Time timestamp minus Mail Send timestamp) divided by 1000, where:
End Time timestamp is when the sender receives the test email back from the receiver in UTC.
Mail Send timestamp is the sent time mentioned in the test mail in UTC.
The Mail send time QoS is in milliseconds and the Mail roundtrip time is in seconds, so the difference is divided by 1000 to
convert it into seconds.
The Message properties dialog lets you define new messages and modify existing message details.
Name
Identifies the unique name of the message. This name is used to refer to the message from the profiles. This field remains disabled
while editing an existing message.
Alarm situation
Specifies the default message for a particular alarm situation. You must leave this field empty if another message is set as the default.
Text
Level
Specifies the severity level assigned to alarm message.
Subsystem
Specifies the subsystem ID of the alarms, a string or an ID that is managed by the NAS (Alarm Server).
Verifying Prerequisites
Install Exchange Web Services (EWS) Java API
Configure a Node
Manage Users
Delete User
Manage Profiles
Delete Profile
Alarm Thresholds
Verifying Prerequisites
The ews_response probe requires access to the following prerequisites:
User account access to the Microsoft Exchange Server to monitor the connection and send test mails.
EWS Java API for activating the probe.
Ensure the .jar file is accurate and its size is 905 kb.
Configure a Node
This procedure provides information to configure a particular section within a node. Each section within the node lets you configure the properties
of the probe for monitoring Microsoft Exchange Server performance.
Follow these steps:
1. Select the appropriate navigation path.
2. Update the field information and click Save.
The specified section of the probe is configured.
The probe is now ready to monitor the Microsoft Exchange Server.
Manage Users
The Server Administrator must add and manage users for monitoring the Exchange Server.
Follow these steps:
1. Click the Options icon next to the ews_response node in the navigation pane.
2. Click the Add New User option.
3. Update the field information and click Submit.
The user is created under the ews_response node.
Delete User
You can delete an existing user when you no longer want it.
Follow these steps:
1. Click the Options icon next to the User-user name node that you want to delete.
2. Click the Delete User option.
The user is deleted.
Manage Profiles
The Server Administrator must create and configure monitoring profiles for monitoring the Exchange Server.
Follow these steps:
1. Click the Options icon next to the ews_response node in the navigation pane.
2. Click the Add New Profile option.
3. Update the field information and click Submit.
The new profile is created under the user name node. Using the profile, you can define how the test mail message is sent and returned
and also specify the alarm and QoS properties.
Delete Profile
You can delete an existing profile when you no longer want it.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Click the Delete Profile option.
The profile is deleted.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ews_response Node
User-<User Name> Node
<Server Name> Node
<User Name> Node
<Profile Name> Node
The Microsoft Exchange Server Response Monitoring probe is configured for monitoring the Exchange Server by defining one or more users. You
can add profiles for each user. These monitoring profiles define how the test mail message is sent and returned. These profiles also specify the
alarm and QoS properties.
ews_response Node
This section contains the configuration details specific to the Microsoft Exchange Server Response Monitoring probe. In this node, you can view
the probe information and can configure the setup properties.
Navigation: ews_response
Set or modify the following values as required.
ews_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ews_response > General Configuration
This section lets you configure the properties of the Microsoft Exchange Server Response Monitoring.
Log Level: specifies the level of detail that is written to the log file.
Default: 0 - Fatal
Check Every Second: specifies how often the inbox is read. The inbox is only read when there are outstanding mail messages.
Default: 15
Note: The probe deletes the mail after scanning the inbox of the user to save inbox storage space.
Bounce Nimsoft Test Mail from other sources back to Sender: enables the Exchange client response to return incoming test mail to the
sender.
Default: Not Selected
Stop reading inbox when other than Nimsoft Test Mails Found: specifies the number of messages that are scanned.
Default: 30
ews_response > Send Alarm on Unexpected Mail
This section lets you generate alarms if other mail messages appear. If the test user is an active user, other mail messages can appear in
the inbox.
Note: The server name is user-configurable and is referred to as server name in this document.
Note: The user name is user-configurable and is referred to as user name in this document.
Navigation: ews_response > User-user name > server name > user name
Set or modify the following values that are based on your requirement:
user name > User Properties
This section lets you edit the properties of the user, such as the name, password, and exchange server to log in.
user name > Advanced Inbox Settings
This section lets you set the advanced inbox settings parameters, such as inbox check time, inbox send alarm, and advanced inbox alarms.
Inbox Check Time: specifies how often the probe checks the inbox of the user mailbox.
Note: The profile name is user-configurable and is referred to as the profile name in this document.
Navigation: ews_response > User-user name > server name > user name > profile name
Set or modify the following values that are based on your requirement:
profile name > Profile General Parameters
This section lets you configure the general properties of the profile.
Name: defines the name of the profile.
Active: activates or deactivates the profile.
Default: Selected
Mail to: defines the target email address where the email is directed.
Default: <self>
profile name > Quality of Service Messages on Mail Sent Time
This section lets you configure the alarm and QoS when an email is sent.
Send mail every second: specifies the interval after which an email is sent.
Default: 300
profile name > Quality of Service Messages on Mail RTT
This section lets you configure the alarm and QoS on email roundtrip time.
Send Alarm: generates alarm if mail roundtrip time exceeds the specified threshold.
Default: Not Selected
Roundtrip Time Threshold (Seconds): specifies the maximum allowed mail roundtrip time.
Default: 120
Note: If the mail is received after Consider Mail Lost after Seconds, the QoS message for Roundtrip Time Threshold is null.
Using Alarm Message: specifies the alarm message when the mail roundtrip time exceeds the threshold.
Default: MailTimeout
Mail Error Message: specifies the alarm message when unable to send mail to the specified user.
Default: Mailerror
Consider Mail Lost after Seconds: specifies the time (in seconds) before the mail is considered as lost.
Default: 300
Lost Mail Message: specifies the alarm message to be used when the mail is not delivered.
Default: MailLost
Accept undelivered Reply as OK: treats undelivered email replies as delivered.
Default: Selected
Installation Notes
Prerequisites
Install Exchange Web Services (EWS) Java API
Monitoring System Requirements
How to Enable and Run the Probe
Configure Users
Create New User
Copy Existing User
Installation Notes
When installing the probe, a wizard assists you through the initial probe configuration.
Note: While creating monitoring profiles, the probe assumes that the test mail that is sent returns to the test user's mailbox. This can be done
in several ways, for instance:
Send mail to the test user directly.
Send mail to a non-existing user on a mail server that issues an 'Unknown user' return mail.
Send mail to another test user whose mailbox is being monitored by another instance of this probe.
Prerequisites
The ews_response probe requires access to the following prerequisites:
User account access to Microsoft Exchange Server Address to monitor connection and send test mails.
EWS Java API for activating the probe.
Configure Users
You can configure the ews_response probe by providing the details of the Microsoft Exchange server to be monitored, along with the user name
and password. The probe uses these credentials to access the server, send test emails, and monitor the server response time (performance) for
generating alarms and QoS.
Note: The server name must specify the protocol of the Exchange web service. For example, http://www.example.com or https://
www.example.com. If no protocol is specified, https:// is used by default.
Exchange Version
Specifies the Exchange Server to be used from the drop-down list.
2. Specify the user details under the Profiles tab, where you can define the details of the user to whom the test mail is sent for monitoring.
3. Click the Advanced tab to override the global settings that are defined under the Setup tab.
4. Click OK to save the User properties.
Note: Similarly, you can select the Edit User and Delete User options from the Users list to modify and remove the selected user
details.
Note: If an active user gets mail exceeding the specified number, the probe is unable to find its own last message and
considers it lost.
For each of these users, one or more monitoring profiles can be defined. These monitoring profiles define how the test mail is sent and received
and also specify the alarm and QoS properties.
Right-clicking in the Users list displays the following options:
New User: creates a new user.
Edit User: updates an existing user.
Delete: removes the selected user.
Copy User: copies the existing user details to create a new user.
The Profiles Tab
The Profiles tab lists all monitoring profiles that are defined for the user. Right-clicking in the list lets you add, edit, or delete profiles.
Note: The Active check box in User properties dialog must be enabled for the context menu to appear.
The profile properties describe to whom the test mail must be sent, the interval between each test message and the alarm and QoS properties.
The mail messages can be sent to:
Own mailbox <self>
To a user whose mailbox is monitored by another instance of the probe, with 'bounce' enabled (on the Setup tab).
To an unknown user on a known mail server, so that the server returns the mail.
To a user for whom you have created another way of returning the mails to the monitored mailbox.
Name
Defines the name of the profile.
Active
Activates and runs the profile.
Mail to
Defines the target mail address for the test mail. The default value is <self>, which means the user mailbox. The <self> mail address is
constructed using the domain name and username in the Name field of the User properties dialog (see Create New User). For example,
domain\username becomes username@domain.com.
Send mail every
Specifies how often a test mail is sent.
Quality of Service messages on
Mail send time
Lets you generate a QoS message on each mail sent. The value is the time taken to connect to the server and execute the send command.
Note: If the mail is received after Consider Mail lost after time, the QoS message for Mail roundtrip time is null.
The Setup tab of the probe GUI defines the global Inbox and Advanced settings for the probe. By default, these settings are for all defined
users. The Advanced tab on the User properties dialog allows you to override these settings for a user.
This tab contains the list of alarm messages defined. These messages can be selected to be sent on error conditions (when defined thresholds
are breached) in specific profiles.
The following commands are available when you right-click in the message list:
Add message: Creates a new message using the Message properties dialog.
Message properties: Lets you edit the selected message using the Message properties dialog.
Remove message: Deletes the selected message. You will be asked to confirm the deletion.
Message Properties
The Message properties dialog lets you define new messages and modify existing message details.
Name
Identifies the message, which is referenced from the profiles. Each message must be given a unique name. This field remains disabled
while editing an existing message.
Alarm situation
Specifies the default message for a particular alarm situation. You must leave this field empty if another message is to be the default.
Text
Supports variable expansion for the following variables.
Profile
Defines the profile name for sending the test mail.
Mail_to
Defines the recipient of the test mail.
Used
Measures the mail roundtrip time.
Threshold
Defines the alarm threshold for the profile.
Diff
Defines the difference between the alarm threshold and the mail roundtrip time measured.
Note: Not all variables are available for each alarm. Use only the variable types that are supported by the definition of the
alarm.
Level
Specifies the severity level assigned to the alarm message.
Subsystem
Specifies the subsystem ID of the alarms, either as a string or as an ID managed by the NAS (Alarm Server).
ews_response Metrics
This section contains the QoS metrics for the Microsoft Exchange Server Response (ews_response) probe.
Monitor Name                   Units          Description    Version
QOS_Exchange_Mail_Send         Milliseconds                  v1.1
QOS_Exchange_Mail_Roundtrip    Seconds                       v1.1
database connection information must be configured. An existing database is assumed. You can use the same database as used for
QoS data.
An exchange_monitor_reports probe installed on your IIS.
More information:
exchange_monitor (Microsoft Exchange Monitoring) Release Notes
Note: The DAG profiles are only visible for Exchange Server 2010 and 2013.
The following diagram outlines the process to configure the exchange_monitor probe:
Contents
Verify Prerequisites
Configure a Profile
Alarm Thresholds
Create File Monitoring Profile
View DAG Details
Monitor Mailbox Growth
Collect Reports
Verify Prerequisites
Verify that required hardware and software is available and pre-configuration requirements are met before you configure the probe. For more
information, see exchange_monitor (Microsoft Exchange Monitoring) Release Notes.
Configure a Profile
You can configure a profile to retrieve and use the values that the four probes monitor.
Follow these steps:
Notes:
By default, monitoring is enabled for all types of profiles. You can deselect a monitoring type if you do not want to monitor it.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: The DAG profiles will only be visible for Exchange Server 2010.
When you click the DAG node, the details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, the
server names pertaining to the DAG, and the list of servers that are currently active.
You can click a particular DAG to view details of all servers that are present in the current domain. The number of active databases, passive
databases, mounted databases, and non-mounted databases, and the status of the server (Up or Down), are
displayed.
You can click a server to view details of database copies residing on that server.
Click a particular database to view details of that database, such as server name, database name, database copy status, mailbox size, content
index state, copy queue length, replay queue length, and last inspected log.
You can view status of all the exchange servers of all DAGs in a domain.
Follow these steps:
1. Click the exchange_monitor node.
2. Select the DAG monitoring enabled checkbox to view DAG details.
Note: Deploy the exchange_monitor probe on every exchange node of DAG Set up.
3. Select the DAG check interval (Seconds) to specify the time interval at which you want to set DAG monitoring.
Note: For DAG monitoring, the check interval should be above 300 seconds, depending on the number of DAG nodes. The
default check interval is 600 seconds, which gives good probe performance for DAG monitoring.
4. Click the DAG node.
The details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, server names pertaining to
DAG, and list of servers that are currently active.
5. Click the desired DAG to view details of all servers that are present in the current domain. The number of active databases, passive
databases, mounted databases, and non-mounted databases, and the status of the server (Up or Down), are
displayed.
6. Click the desired server to view details of database copies residing on that server.
7. Click the database to view the details of the database, such as server name, database name, database copy status, mailbox size,
content index state, copy queue length, replay queue length, and last inspected log.
Note: For Exchange Server, Exchange cmd-let identification, HTTP identification, LDAP identification and WMI identification are
required for Mailbox Growth. The data collector for exchange mailbox server role requires Microsoft .NET Framework v2.0 and
Microsoft Powershell v1.0.
Collect Reports
You can configure the probe to collect reports on the data that is stored in the database of the Exchange server. To enable this section, log in
using HTTP and WMI on the Exchange server, or LDAP on the Active Directory server.
Follow these steps:
1. Click the Setup > Report Information node.
2. Select the Gather report information check box.
3. Select the Database size from the drop-down options.
Note: Click the Force Update of Report Information option from the Actions drop-down list to update the report immediately.
Access Protocols
Log in using any of the protocols (LDAP, HTTP, or WMI) to gather the data from the Exchange server and generate a report.
Follow these steps:
1. Click the Setup > Report Information > Protocol Name Information node.
2. Enter the field details like server name, base domain, username, password, and other protocol-specific details for the desired Protocol:
HTTP Information
LDAP Information
3. Click the Protocol Name Test option from the Actions drop-down list to test the probe connection to the selected protocol.
Note: You can create a new instance of LDAP Information by clicking the Options (icon) next to the node.
exchange_monitor Node
<DAG> Node
File Monitoring Node
<File Profile Name> Node
Profile Node
<Profile Type> Node
Profile Group node
Setup Node
Report Information
Status Node
exchange_monitor Node
The exchange_monitor node allows you to view the probe information.
Navigation: exchange_monitor Node
Event monitoring enabled: enables the monitoring of the profiles that are based on data from the NTevl probe.
Default: Selected
Event check interval (Seconds): indicates the time interval (in seconds) between each event monitoring cycle.
Default: 60
DAG monitoring enabled: enables the monitoring of the Database Availability Group (DAG) server. You must also enable the DAG
checkpoints.
Note: This feature is available only for Exchange server 2010 and higher.
DAG check interval (Seconds): indicates the time interval (in seconds) between each DAG monitoring process.
Default: 600
Browse DAG Script File: defines the path of the CheckDatabaseRedundancy.ps1 script. The script is required for
DatabaseRedundancyCount checkpoint, which is only supported on Exchange Server 2010.
File/Dir Monitoring: enables the file or directory monitoring.
User Name: defines the user name for accessing the files if they are present on another network.
Password: defines the password for accessing the files if they are present on another network.
Check Interval (Seconds): specifies the time interval between each file monitoring process.
Default: 30
Mail box size (KB)
Check mail growth: Indicates that the mail box growth on the exchange server is checked.
Check Interval: Indicates the time interval after which the probe checks the mail box growth on the exchange server.
Growth Limit on each iteration: Indicates the maximum limit for mail box growth in an iteration.
Iterations of growth before size alarm: Indicates the number of iterations in which the mailbox growth exceeds the growth limit before a
size alarm is issued.
Check only mailboxes with size greater than: Indicates a conditional value for checking the mail box size.
Mailbox items
Growth Limit on each iteration: Indicates the maximum limit for mail box growth in an iteration.
Iterations of growth before size alarm: Indicates the number of iterations in which the mailbox growth exceeds the growth limit before a
size alarm is issued.
Check only mailboxes with size greater than: Indicates a conditional value for checking the mail box size.
Message > Message definitions
Message Text: identifies the alarm message text that is issued on the message alarm.
Severity: specifies the level of severity of the alarm.
Message Token: selects the alarm message that is issued if the specified threshold value is breached.
Subsystem: identifies the subsystem ID of the alarm that defines the source of the alarm.
<DAG> Node
Navigation: exchange_monitor > DAG
DAG stands for Database Availability Group. You can view the details of the entire DAG setup by using the DAG node.
When you click the DAG node, the details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, the
server names pertaining to the DAG, and the list of servers that are currently active.
Mailbox Server Name: displays the name of the Mailbox Server being monitored.
Database Name: displays the name of the database on the server.
Database file path: displays the file path of the required database.
Mailbox database copy status: displays the status of Mailbox database.
Mailbox size: displays the size of the mailbox to be monitored on the server.
Content index state: displays the status of the content index.
Copy queue length: displays the length of the copy queue.
Replay Queue length: displays the length of the replay queue.
Last inspected log: displays the date and time when the log was last inspected.
Note: You can click the Browse button to browse for a directory/location where the monitoring profile is scanned.
Note: The User and Password fields appear when \\<IP Address> is entered in the Directory text box, where the system's IP
address is entered to access the shared folder.
Profile Node
Navigation: exchange_monitor Node > Profile Node
The profile node contains the child nodes for probe types for which the profiles are monitored. These profile types are those that are available in
the probe configuration file.
Note: The monitors differ for each probe, and the node is referred to as the probe-group name node in this document.
This section represents the monitors for profiles of ntevl, perfmon, processes, and ntservices probe. You can activate various monitors of the
profiles of each of these probes. You can view and configure the monitors of the profile.
Setup Node
This node represents the monitoring properties, such as generating reports on the growth of your mailbox and logging in to the Exchange server
using the LDAP, HTTP, or WMI protocol.
Navigation: exchange_monitor Node > Setup Node
Report Information
This node lets you configure the probe for collecting reports on the data that is stored in the database of the Exchange server. To enable this
section, you must log in using the HTTP, LDAP, or WMI protocol.
Navigation: exchange_monitor > Setup > Report Information
Gather report information: makes all the gathered information available for the exchange monitor reports.
Filter stores and database for other servers: enables the probe to locate and report only the monitored databases and stores on the
Exchange Server.
Database size: Select to display the database size of either/both EDB and SLV files.
Note: Click the Force Update of Report Information option from the Actions drop-down to generate the report
instantaneously.
LDAP Information:
Navigation: exchange_monitor > Setup > LDAP Information
AdServer: specifies the Active Directory (AD) server.
User: defines the user name of adserver.
Password: specifies the password of the user.
Base Domain: defines the domain of the users.
Config DN: defines the distinguished name of the adserver.
Encrypted Connections (SSL): uses SSL encryption to connect to the LDAP server.
Non Standard Ldap Port: uses user-defined port to establish a connection to the adserver.
Normal Port: specifies the standard LDAP port (Default: 389) to connect to the adserver.
Secure Port: specifies the SSL secure port (Default: 636) being used for the probe to connect to the adserver.
LDAP TimeOut: indicates the desired timeout for LDAP connection attempts to the adserver.
Reuse LDAP Connections: uses the already established connection to the adserver.
Status Node
This node represents the status of all the monitors for all the active profiles.
Navigation: exchange_monitor > Status
This section displays the details of all the status of monitors in a tabular form.
You can set or modify the following fields as needed:
Name: indicates the monitor name.
Unit: indicates the unit of the measured value.
Group: indicates the logical group to which this monitor belongs. A logical group defines the measurement type.
Type: indicates the probe from which this monitor is retrieved. This can be perfmon, ntservices, processes, or ntevl.
Value: indicates the latest measured value.
At: indicates the time, day, and the month at which the value is measured.
Compare Type: defines the operator for measuring the value.
Message on Alarm: indicates the alarm message text.
Notes:
The DAG profiles are only visible for Exchange Server 2010 and 2013.
Configuration values and profiles are stored in the following locations: the /setup, /perfmon, /processes, /services, and /ntevl sections.
If you are running the Exchange server in a cluster, the sections are prefixed with /evs/server-<virtual_server_name>. For
example, a virtual server named exch-grp-01 results in a /setup section that looks like /evs/server-exch-grp-01/setup. This
allows configuration settings to follow each virtual server as it moves around. This applies to the five sections mentioned
above, except for two keys in /setup: the log level key and the log file key, which always remain in the /setup section and never
move with the virtual server.
The following diagram outlines the process to configure the exchange_monitor probe:
Contents
Verify Prerequisites
Configure a Profile
Configure a Profile for File Monitoring
View DAG Details
Monitor Mailbox Growth
Collect Reports
Access Protocols
Verify Prerequisites
Verify that required hardware and software is available and the pre-configuration requirements are met before you configure the probe. For more
information, see exchange_monitor (Microsoft Exchange Monitoring) Release Notes.
Configure a Profile
You can configure the monitoring profiles with the exchange_monitor probe to monitor the required Exchange server.
Follow these steps:
1. Right-click on the Status tab and select Profile properties.
2. Enter the Profile Name.
3. Select Active to activate the profile.
4. Enter a short Description of the monitoring profile.
5. In the Value properties tab, select Calculate Average Based on to calculate the average of the QoS value for the number of samples
specified.
6. Enter the sample number on which the average has to be calculated.
7. Select a threshold operator for the alarm, which defines the value of the alarm limit.
8. Define the Alarm limit, which is the threshold value for generating an alarm.
If this value is breached, an alarm is generated.
9. Select a Message the probe generates when the threshold value is breached.
10. Select a Clear Message the probe generates if the threshold is not breached.
11. Click OK.
A monitoring profile is configured.
Scan Directory
You can specify the details of the directory that the probe monitors.
Follow these steps:
1. Enter the Name of the directory where the files are scanned.
2. Browse for a directory or a location where the monitoring profile is scanned.
3. Select a specific Pattern of files where monitoring profiles are searched.
For more information on regular expressions, see the exchange_monitor Regular Expressions article.
4. Select Recurse into subdirectories to search for profiles in the subdirectories.
5. Select a Pattern where monitoring profiles are searched.
For more information on regular expressions, see the exchange_monitor Regular Expressions article.
Note: The field appears only when Recurse into subdirectories is selected.
Alarm Message
You can specify the alarm message that is generated, if the defined threshold is breached when the probe is monitoring a file.
Number of files
Specify the following details in the Number of Files section to define the alarm and QoS details.
1. Select the operator and enter the number of files to be monitored.
2. Select the Message ID, which is the alarm message the probe generates if the threshold is breached.
3. Select a clear message from the drop-down.
Size of file
Specify the following details to define the size of the monitored files.
1. Select the threshold operator for the number and size of files to be monitored.
2. Enter the number of files to be monitored.
3.
Note: The DAG profiles are only visible for Exchange Server 2010 and 2013.
Note: The details of all the DAGs present in the current domain are displayed. You can view the name of the DAG and the server
names pertaining to the DAG, and also the list of servers that are currently active.
3. Click a required DAG to view details of all servers that are present in the current domain.
4. Click a required server to view details of database copies residing on that server.
5. Click a required database to view its details.
6. Click OK.
7. You can view the DAG files of the required server.
2.
Iterations of growth before size alarm is issued: indicates the number of iterations in which the mailbox growth exceeds the growth limit.
After the specified number of iterations, a mailbox size alarm is issued.
Check only mailboxes with size greater than: indicates a conditional value for checking the mailbox size.
Similarly, enter the field details in the Mailbox items section.
3. Click OK.
Mailbox growth monitoring is configured.
Collect Reports
You can configure the probe to collect the reports on the data that is stored in database of the exchange server.
Follow these steps:
1. Click the Setup > Report Information section.
2. Select the Gather report information check box.
3. Select the Database size from the drop-down options.
Note: Click the Force Update of Report Information to generate the report instantly.
Access Protocols
You can configure the probe to collect reports on the data that is stored in the database of the Exchange server through access protocols.
Follow these steps:
1. Click the Setup > Report Information > Protocol Name Information .
2. Specify the details for the desired protocol:
The username of the account.
The domain for which the report has to be created.
The password for the account.
3. Click the Test button to test the connection.
A report is generated.
The Infrastructure Manager automatically downloads and installs the configuration interface, when the probe is deployed on a robot.
The Setup Tab
The Setup tab contains three sub-tabs and specifies the general properties for the probe.
Note: The Mailbox growth tab is enabled only when Gather report information is selected in the Report Information tab.
Performance monitoring enabled: enables monitoring of exchange_monitor profiles available on the perfmon probe.
The Performance check interval value is read from the perfmon probe.
Process monitoring enabled: enables monitoring of exchange_monitor profiles available on the processes probe.
The Process check interval value can be specified in seconds.
Service monitoring enabled: enables monitoring of exchange_monitor profiles available on the services probe.
The Service check interval value can be specified in seconds.
Event monitoring enabled: enables monitoring of exchange_monitor profiles available on the ntevl probe.
The Event check interval value can be specified in seconds.
DAG monitoring enabled: enables monitoring of DAG.
Note: Enable the DAG-related counters because they are not activated by default. This feature can be enabled only on
Exchange Server 2010.
Browse DAG script file: specifies path of the CheckDatabaseRedundancy.ps1 script file. This script is required for
DatabaseRedundancyCount counter.
If this script is not present, you can download it from:
http://gallery.technet.microsoft.com/office/8833b4db-8016-47e5-b747-951d28faafe7
Note: The file is also available on the exchange 2013 servers by default. Check in the folder <installation
directory>\Microsoft\Exchange Server\V15\Scripts on the system, where the Exchange Server is installed.
File/Dir Monitoring: enables the file and directory monitoring for all profiles. Click this option to enable the user name, password, and
check interval fields.
User name: defines the user name to access folders that are shared on the network as well as locally. This user name overrides the user
name in the File Monitoring tab.
Password: defines the password to access folders that are shared on the network as well as locally. This password overrides
the password in the File Monitoring tab.
Check interval (seconds): selects or specifies the interval (in seconds) between each time the probe checks the Exchange server for
file/directory monitoring.
Report information
For detailed login information and the parameters defined in this tab, see the corresponding section.
Gather report information: instructs the probe to make all gathered information available for the exchange_monitor_reports product.
Default: Disabled.
Note: To enable this option, specify the HTTP, WMI, and LDAP identifications successfully (test them by clicking the corresponding test buttons until the indicators turn green).
Force update of report information: fetches the most current Exchange information.
Default: Not Selected
Filter stores and databases for other servers: instructs the probe to find and report only the databases and stores on the Exchange Server that is to be monitored. This ensures that databases and stores on other Exchange Servers, but in the same domain, are excluded.
Default: Selected
Database Size: the Exchange Server 2003 database consists of an EDB file and an SLV file. This option reports the database size either as the sum of these two files (Summary) or as the size of one of the two files.
WMI Identification (2003 Server)/Exchange cmdlet Identification (2007 Server): fetches the report information using WMI on Exchange Server 2003 or Exchange cmdlets on Exchange Server 2007. The following login information must be specified:
User
Domain
Password
Click the Test button to test the WMI/Exchange cmd-let login parameters specified.
LDAP identification: the report information is gathered from Active Directory (AD) using LDAP. The following LDAP login information must be specified:
AD Server
User
Note: Specify the username in the format <user name>@<domain name> or as the LDAP DN (Distinguished Name) of the user, for example: CN=Koert Struijk,OU=Users,OU=Nimsoft,DC=nimsoft,DC=no.
Note: Select the Gather report information option on the Report information tab to enable the Mailbox tab.
Check interval: specifies the interval (in seconds) between each time the probe checks the Exchange server for mailbox growth.
Mailbox size in Kilobytes: this parameter requires that the Check mailbox growth option is selected.
Growth limit on each iteration: specifies the maximum allowed mailbox growth limit.
Only mailboxes that grow by more than the specified number are considered when counting the number of times the size of a mailbox has increased (see Iterations of growth before size alarm is issued).
Iterations of growth before size alarm is issued: the mailbox sizes are checked at the specified check interval.
If the size of a mailbox has increased for the number of consecutive iterations specified here, a mailbox size alarm is issued.
Check only mailboxes with size greater than: checks only the mailboxes with a size greater than the specified size.
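The growth-check options above combine into a simple rule: alarm only after a mailbox has grown by more than the limit for the specified number of consecutive check intervals, ignoring mailboxes below the minimum size. A minimal sketch of that logic (function and variable names are illustrative, not the probe's internal implementation):

```python
# Sketch of the "iterations of growth before size alarm" logic.
# All names are illustrative; the probe's internals are not published.

def update_growth_state(state, mailbox, new_size_kb,
                        growth_limit_kb, iterations_before_alarm,
                        min_size_kb=0):
    """Return True if a mailbox-size alarm should be issued this interval."""
    last_size, streak = state.get(mailbox, (new_size_kb, 0))
    grew = (new_size_kb - last_size) > growth_limit_kb
    if new_size_kb <= min_size_kb:
        streak = 0               # mailbox too small, ignored entirely
    elif grew:
        streak += 1              # one more consecutive growth iteration
    else:
        streak = 0               # growth streak broken, start over
    state[mailbox] = (new_size_kb, streak)
    return streak >= iterations_before_alarm
```

Feeding the helper one size sample per check interval reproduces the documented behavior: two samples that each grow past the limit (with the limit set to 10 KB and iterations set to 2) trigger the alarm on the second growth; a sample that grows less than the limit resets the count.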
The Status Tab
This tab lists the status of all checkpoints for all active profiles.
Note: The colored icon displays the severity level of the current measured values for profiles with a defined alarm threshold.
Group: the checkpoints are segregated into different logical groups, describing the kind of measurement the profile performs (such as
memory, disk, processor).
Type: describes the type of checkpoint, depending on the probe from which the checkpoint is retrieved:
Perf: checkpoints from the perfmon probe
Service: checkpoints from the services probe
Event: checkpoints from the ntevl probe
Process: checkpoints from the processes probe
Value: the most current value measured
Unit: the unit of the measured value, such as MB, %, and Mb/s
At: the time (<month> <day> <time>) the current value was measured.
Alarm limit: the defined threshold value for the profile. A breach of this value will result in an alarm message.
The Message Pool Tab
This tab contains the message pool, which is a list of predefined alarm messages. The messages can be referred to in specific profiles.
Right-click in the message list for the following commands:
Add Message: creates a new message and opens the Message properties window.
Message properties: edits the selected Message.
Remove message: deletes the selected Message. You will be asked to confirm the deletion.
Message properties
Name: specifies the unique name of the message. This name is used to refer to the message from the profiles.
Alarm situation: the message can be made the default message for a particular alarm situation. Leave this field empty if there is another
default message.
Text: specifies the alarm message text. Supports variable expansion for the following variables:
Group
Profile
Value
Unit
Limit
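Variable expansion of this kind can be sketched as a tiny template expander. The $name token syntax and the field names below are assumptions for illustration; consult the probe documentation for the exact tokens it supports.

```python
import re

# Minimal sketch of alarm-message variable expansion.
# The $name token syntax is an assumption, not the probe's documented format.

def expand_message(template, fields):
    """Replace $group, $profile, $value, $unit, $limit with actual values.

    Unknown variables are left untouched.
    """
    return re.sub(r"\$(\w+)",
                  lambda m: str(fields.get(m.group(1), m.group(0))),
                  template)

msg = expand_message(
    "$profile: $group value $value $unit breached limit $limit",
    {"group": "Memory", "profile": "Available MB",
     "value": 512, "unit": "MB", "limit": 1024},
)
```

With the sample fields above, msg expands to "Available MB: Memory value 512 MB breached limit 1024".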
Level: specifies the severity level assigned to the alarm message.
Subsystem: specifies the subsystem ID of generated alarms, either a string or an ID that is managed by the NAS.
The Profile Setup Tab
This tab lists all the monitoring profiles that are available in the probe configuration file. You can use the following types of monitoring profiles with
the exchange_monitor probe:
Service: To monitor the exchange services running on a system.
Process: To monitor exchange processes running on a system.
Perf: To monitor the performance counters that are related to exchange.
Ntevl: To monitor the events that are related to exchange.
The File Monitoring Tab
The File Monitoring tab consists of the profiles, which monitor the files present on the local or on a network location. This tab monitors the
profiles based on patterns selected in the profile and sends configured alarms and QoS.
The fields in this dialog are explained below:
Name: indicates the name of the monitoring profile.
Active: activates the profile.
Note: Active profiles appear in the profile list under the Profile Setup tab.
User name: the required user name to access a shared folder on the network.
Note: This field appears when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
Password: the required password to access a shared folder on the network.
Note: This field appears when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
Alarm Messages: The fields in the Number of Files tab are described here:
Expect this value: Selects the operator and the number of files.
Message ID: Selects the Message ID, populated in the drop-down list from the Message Pool tab.
Message clear: Selects the Message clear, which is populated in the drop-down from the Message Pool tab.
The following fields are available in the Space used by Files tab:
Expect this Value: selects the operator, the size and number of files.
Message ID: selects the Message ID, populated in the drop-down list from the Message Pool tab.
Message clear: selects the clear message, which is populated in the drop-down list from the Message Pool tab.
The following fields are available in the Size of File tab:
Watch Size of: selects the size of the file from the following options:
Smallest Files
Largest Files
Individual Files
Quality of Service Messages
Number of Matching Profiles: sends a QoS for the number of matching profiles.
Space Used by Matching Files in Kilobytes: sends a QoS for the space used by the matching files, in kilobytes.
The DAG Tab
DAG stands for Database Availability Group. You can view the details of the entire DAG setup by using the DAG tab.
When you click the DAG tab, the details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, the server names pertaining to the DAG, and the list of servers that are currently active.
A DAG displays the details of all servers present in the current domain. It also shows the number of active databases, the number of passive databases, the number of mounted databases, the number of non-mounted databases, and the status of each server (Up or Down).
To view details of database copies residing on a particular server, click on that server.
You can view the server name, database name, database copy status, mailbox size, content index state, copy queue length, replay queue length,
and last inspected log.
Click a database to view its details.
Note: For DAG monitoring, the check interval should be above 300 seconds, depending on the number of nodes in the DAG. The default check interval of 600 seconds gives optimal probe performance for DAG monitoring.
Deploy DAG
Exclude Instances
A monitoring profile can have multiple instances. If you do not want the probe to monitor an instance within a profile, you can exclude the
instance.
The Exclude Instances button appears in the Profile Properties dialog when the following conditions are met while the probe reads the instance
and object keys from the configuration file:
The object is for a perf profile.
The value of the profile instance key is instance = <all>.
This user must be authorized to log into your Active Directory using the LDAP protocol.
This user requires read access to the configurationNamingContext, and specifically the administrative group entry in your Active Directory, where Exchange servers store most information.
To add these access rights to a user, open the Exchange System Manager, right-click the administrative group that contains the server(s) you want to access, and choose Delegate control. Add your LDAP user to the Exchange View Only Administrator role.
User name syntaxes in Active Directory (AD).
Four known syntaxes are valid when authenticating against Active Directory using the LDAP protocol (simple binds).
1. Full DN, e.g.
CN=Ben Talk,OU=West,OU=Users,DC=domain,DC=com
2. NT Account Name (e.g. NIMCLUS\e1john)
3. User Principal Name (UPN), in the userPrincipalName attribute of the user in Active Directory.
4. Plain username (e.g. user), in the displayName attribute in Active Directory.
For options 2 and 3, right-click the user in the MMC snap-in Active Directory Users and Computers, click Properties, and choose the Account tab.
For option 1, you need another tool, such as ADSIEDIT.MSC, LDP, or any other LDAP browser tool. To our knowledge, it is not possible to view the distinguishedName property of an object through the MMC snap-in "Active Directory Users and Computers".
It is recommended to use the NT Account Name or UPN syntax.
Active Directory enforces that a user's sAMAccountName property is unique within an Active Directory forest. The domain NetBIOS name is also enforced to be unique.
Note: UPN is not guaranteed to be unique in Active Directory forest. If two or more users have the same UPN, none of them will be
able to authenticate using the UPN name.
The same applies to the plain username: it needs to be unique within a forest.
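The four syntaxes above can be distinguished mechanically. A rough classifier as a sketch (the regex patterns are simplified assumptions, not a full RFC 4514 DN parser):

```python
import re

# Rough classifier for the four Active Directory user-name syntaxes.
# Patterns are simplified assumptions, not a complete DN/UPN grammar.

def classify_ad_username(name):
    if re.match(r"^\s*CN=", name, re.IGNORECASE):
        return "full-dn"        # e.g. CN=Ben Talk,OU=West,DC=domain,DC=com
    if re.match(r"^[^\\@]+\\[^\\@]+$", name):
        return "nt-account"     # e.g. NIMCLUS\e1john
    if re.match(r"^[^@\\]+@[^@\\]+$", name):
        return "upn"            # e.g. user@domain.com
    return "plain"              # e.g. user (matched via displayName)
```

Such a check is only cosmetic; Active Directory itself decides which bind names it accepts, so the recommendation to prefer the NT Account Name or UPN syntax still stands.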
WMI User
The probe executes the "report collector" program as a background process in the context of this user (that is, "Run as"). This user context is used when connecting to the Exchange shares and running WMI queries.
The user must be a domain user.
If UAC is enabled on the Exchange Server and the user is not a domain administrative account:
The user must be a member of the default domain group named Domain Users (or have equivalent rights).
The user must be a member of the domain security group named "Exchange Servers".
The user must be added to the local Administrators group on the machine running the Exchange server.
The user must also be delegated control of the "Exchange View Only Administrator" role.
Note: If you run your Exchange servers in a cluster, you must add the WMI monitoring account to the local Administrators group of all nodes that can be owners of your virtual Exchange servers.
You must reboot your Exchange server for these changes to take effect.
HTTP (WebDAV) user permissions.
This user is used when contacting the Exchange server to query for public folder information with the WebDAV (HTTP) protocol against IIS (Outlook Web Access (OWA) also uses WebDAV).
The user must be a domain user.
The user must be a member of the default domain group named Domain Users (or have equivalent rights).
The user requires at least these client permissions on the folders in the folder hierarchy: Read, List Contents, and Read Property.
The HTTP user is used with the WebDAV protocol over HTTP to talk to IIS, where the public folders are hosted and reachable. Data for the "public folder owner" report is gathered here.
The actual list of public folders is retrieved through WMI.
WMI - the exchange_monitor probe starts a background process in the context of the WMI user (that is, Run as). This user context is used when accessing the WMI namespaces on the server running the Exchange server.
In the WMI namespace root\MicrosoftExchangeV2, Exchange Server 2003 exposes several classes. The data collector retrieves instances of the following WMI classes:
Exchange_PublicFolder - instances of public folders, and public folder replication information (to which stores the folders are replicated)
Exchange_Server - instance of the server
Exchange_Mailbox - instances of mailboxes (sizes, deleted items, item count)
Exchange_MessageTrackingEntry - message tracking entries
In the WMI namespace root\CIMV2, instances of the following classes are read:
Win32_NTLogEvent - whitespace information
LDAP - or Active Directory - is queried to read additional information about the Exchange Server configuration. From the configurationNamingContext, the probe reads Exchange server information and the structure of storage groups, stores (both public and mailbox), and database quotas.
One or more domains can be queried for distribution lists and their members.
Queries are made to retrieve the users who belong to the storage groups found in the configurationNamingContext: user mailboxes, mail addresses, delegates, memberOf entries, account control, and password last set.
All of this is then tied together before it is stored, ready to be retrieved by the exchange_monitor_backend probe.
Note: The exchange_monitor_backend probe has reached its end of life and is no longer supported.
The report collector also tries to access the database files directly (the .edb files) and read the physical size of the files. The information about where the physical files are stored is part of the information retrieved from the configurationNamingContext.
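Reading the physical size of the database files amounts to a file-size lookup on the paths retrieved from the configurationNamingContext. A minimal sketch (the helper name and sample file names are hypothetical):

```python
import os

# Sketch: sum the physical on-disk size of Exchange database files.
# File paths would come from configurationNamingContext; names here are hypothetical.

def database_size_bytes(paths):
    """Return the total size of the database files that actually exist on disk.

    Missing files (e.g. on a server the collector cannot reach) are skipped.
    """
    return sum(os.path.getsize(p) for p in paths if os.path.isfile(p))
```

Skipping missing files mirrors the fact that the collector can only read sizes for files it can actually access.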
The reports are divided into 10 sub-reports from the exchange_monitor point of view. The data collected is fixed and cannot be changed. The reports are:
Contents
Verify Prerequisites
Configure Exchange Server Path
Configure Profile
Alarm Thresholds
Create File Profile
View DAG Details
Monitor Mailbox Growth
Collect Reports
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see exchange_monitor (Microsoft
Exchange Monitoring) Release Notes.
The following probes must be installed on the Exchange server:
Note: The four probes must be deployed on either the Exchange server or the local hub archive. These probes are automatically installed from the archive; otherwise, the exchange_monitor probe asks you to install them manually.
Note: Similarly, define the Microsoft Exchange Server 2013 installation directory path as the exchange_2013_path key value.
Important: The installation path must end with a forward slash (/).
Configure Profile
Install the probe on each Exchange Server to be monitored. The probe detects the version of the server and activates the profiles accordingly.
The probe retrieves and uses values that the following probes monitor and creates monitoring profiles. You can use the following types of
monitoring profiles with the probe:
NTServices
Processes
Perfmon
NTevl
Note: For the Exchange Server, the Exchange cmdlet, HTTP, LDAP, and WMI identifications are required for Mailbox Growth monitoring and the exchange_backend probe. The data collector for the Exchange mailbox server role requires Microsoft .NET Framework v2.0 and Microsoft PowerShell v1.0.
Note: Monitoring is enabled for all kinds of profiles by default. Deselect the desired monitoring type if you do not want to
monitor it.
2. Set the check interval to indicate the time interval (in seconds) between each profile monitoring process.
3. Click the required monitoring group node (navigation Profile > Profile Type > Profile Group).
You can view the list of all monitors that are defined on the profile in the selected group. The monitors are grouped into different logical
groups, describing what kind of measurement the profile performs. You can also configure the alarm messages that are generated.
The monitoring groups are described as follows:
Event
Monitors from the ntevl probe
Perf
Monitors from the perfmon probe
Cluster
Disk
Information Store
Memory
Message Transfer
Network
Processor
SMTP
Process
Monitors from the processes probe
Service
Monitors from the services probe
4. Select Publish Data and Publish Alarm to generate a QoS when the specified parameter of the profile is breached.
5. Select Calculate Average based on to calculate the average of the QoS value for the number of samples specified.
6. Specify a measured value in the Samples field on which the average has to be calculated.
7. Select a Compare Type value to define how the measured value is compared with the threshold value.
8. Define the Alarm limit, which is the threshold value for generating an alarm.
9. Select an alarm message and alarm clear message in the Message on alarm and Message clear fields.
10. Click the Save button to save the configuration.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Note: Click the Browse button to browse for a directory/location where the monitoring profile is scanned.
Pattern: describes the pattern of files that the monitoring profile searches for. Refer to the exchange_monitor Regular Expressions article for regex details.
Recurse into subdirectories: searches for files in the subdirectories.
Exclude directories Pattern: excludes directories matching the pattern from monitoring. The field appears only when Recurse into subdirectories is selected. Refer to the exchange_monitor Regular Expressions article for regex details.
User: specifies the user name that is required to access a shared folder on the network.
Password: specifies the password that is required to access a shared folder on the network.
Note: The User and Password fields appear when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
8. Enter the following field information in the Space and Size of File section to define the space and size used by files.
Used Space Operator: selects the operator for the used-space threshold.
Used Space Value: specifies the space used by the files.
File space Message ID: selects the Message ID that is generated when the threshold is breached for file space. The messages are
populated in the drop-down from the Message Node.
File space Message Clear: selects the file space clear alarm message, which is populated in the drop-down from the Message Node.
File Size Operator: selects the operator for the file size.
File Size Value: selects the size of the file
File Size Unit: selects the unit of the file size
File size Message ID: selects the Message ID that is generated when the threshold is breached for file size.
File size Message Clear: selects clear alarm message for file size.
Watch Size of: selects the size of the file from the following options:
Smallest Files
Largest Files
Individual Files
Note: Deploy the exchange_monitor probe on every Exchange node of the DAG setup.
3. Select the DAG check interval (Seconds) to specify the time interval at which you want DAG monitoring to run.
Note: For DAG monitoring, the check interval should be above 300 seconds, depending on the number of nodes in the DAG. The default check interval of 600 seconds gives good probe performance for DAG monitoring.
4. Click the DAG node.
The details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, server names pertaining to
DAG, and list of servers that are currently active.
5. Click a desired DAG to view details of all servers that are present in the current domain. The number of active databases, the number of passive databases, the number of mounted databases, the number of non-mounted databases, and the status of the server (Up or Down) are displayed.
6. Click a desired server to view details of database copies residing on that server.
7. Click a database to view its details, such as server name, database name, database copy status, mailbox size, content index state, copy queue length, replay queue length, and last inspected log.
Collect Reports
You can configure the probe to collect reports on the data that is stored in the database of the Exchange server. To enable this section, log in using HTTP and WMI on the Exchange server, or LDAP on the Active Directory server.
Follow these steps:
1.
Note: Click the Force Update of Report Information option from the Actions drop-down list to update the report immediately.
Access Protocols
Log in using any of the protocols (LDAP, HTTP, and WMI) to gather data from the Exchange server and generate a report.
Follow these steps to access a protocol:
1. Click the Setup > Report Information > Protocol Name Information node.
2. Enter the field details, such as server name, base domain, username, password, and other protocol-specific details, for the desired protocol:
HTTP Information
LDAP Information
WMI Information
3. Click the Protocol Name Test option from the Actions drop-down list to test the probe connection to the selected protocol.
Note: You can create a new instance of LDAP Information by clicking the Options (icon) next to the node.
exchange_monitor Node
<DAG> Node
File Monitoring Node
<File Profile Name> Node
Message Node
Profile Node
<Profile Type> Node
Profile Group node
Setup Node
Mail Growth
Report Information
Protocol Name Node
Status Node
exchange_monitor Node
The exchange_monitor node allows you to view the probe information.
Navigation: exchange_monitor Node
Set or modify the following values as required:
exchange_monitor > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
exchange_monitor > General Setup
This section lets you configure the probe monitoring properties and enable the performance counters.
Log Level: specifies the detail level of the log file.
Default: 0-Fatal
Log Size (KB)
Default: 100
Perfmon Samples: specifies the number of samples for Perfmon counters.
Default: 1
Performance monitoring enabled: enables the monitoring of the profiles that are based on data from the Perfmon probe.
Default: Selected
Performance check interval (Seconds): indicates the time interval (in seconds) between each profile monitoring process.
Default: 60
Process monitoring enabled: enables the monitoring of the profiles that are based on data from the Processes probe.
Process check interval (Seconds)
Default: 180
Service monitoring enabled: enables the monitoring of the profiles that are based on data from the NTServices probe.
Service check interval (Seconds)
Default: 180
Event monitoring enabled: enables the monitoring of the profiles that are based on data from the NTevl probe.
Event check interval (Seconds)
Default: 60
DAG monitoring enabled: enables the monitoring of the Database Availability Group (DAG) server. You must also enable the DAG
checkpoints.
Note: This feature is available only for Exchange server 2010 and higher.
<DAG> Node
Navigation: exchange_monitor > DAG
DAG stands for Database Availability Group. You can view the details of the entire DAG setup by using the DAG node.
When you click the DAG node, the details of all the DAGs present in the current domain are displayed. You can view the name of the DAG,
server names pertaining to DAG, and list of servers that are currently active.
You can click a particular DAG to view details of all servers that are present in the current domain. The number of active databases, the number of passive databases, the number of mounted databases, the number of non-mounted databases, and the status of the server (Up or Down) are displayed.
You can click a server to view details of database copies residing on that server.
Click a particular database to view details of that database, such as server name, database name, database copy status, mailbox size, content
index state, copy queue length, replay queue length, and last inspected log.
Note: For DAG monitoring, the check interval should be above 300 seconds, depending on the number of nodes in the DAG. The default check interval of 600 seconds gives optimal probe performance for DAG monitoring.
Deploy the exchange_monitor probe on every Exchange node of the DAG setup.
Enable the DAG Monitoring by selecting the DAG monitoring enabled check box.
The cluster probe is not required for DAG monitoring.
Deploy the exchange_monitor probe on each Exchange server. The probe monitors the Exchange server on which it is deployed, as well as the database copies residing on that server. You can view the status of all the Exchange servers of all DAGs in a domain. The following diagram represents the DAG architecture:
Note: You can click the Browse... button to browse for a directory/location where the monitoring profile is scanned.
Pattern
Shows the pattern of files that the monitoring profile searches for.
Recurse into subdirectories
Searches for profiles in the subdirectories.
Exclude directories Pattern
Excludes directories matching the pattern from monitoring. The field appears only when Recurse into subdirectories is selected.
User
Specifies the user name that is required to access a shared folder on the network.
Password
Displays the password required to access a shared folder on the network.
Note: The User and Password fields appear when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
Message Node
This node lets you view the list of alarm messages that are available on the Exchange Server Monitor probe.
Navigation: exchange_monitor Node > Message Node
Set or modify the following values as required:
Message > Message definitions
This section allows you to view the details of alarm messages of the Exchange Server Monitor probe.
Message Text: Identifies the alarm message text that is issued on the message alarm.
Severity: specifies the level of severity of the alarm.
Message Token: selects the alarm message that is issued if the specified threshold value is breached.
Subsystem: identifies the subsystem ID of the alarm that defines the source of the alarm.
Profile Node
Navigation: exchange_monitor Node > Profile Node
The profile node contains the child nodes for probe types for which the profiles are monitored. These profile types are those that are available in
the probe configuration file.
Note: The monitors differ per probe, and this node is referred to as the probe-group name node in this document.
This section represents the monitors for profiles of the ntevl, perfmon, processes, and ntservices probes. You can activate various monitors of the profiles of each of these probes, and view and configure the monitors of a profile.
Set or modify the following values as required:
Profile Group > <Monitor Name>
This section allows you to view and configure the QoS properties of the profile.
Setup Node
This node represents monitoring properties such as generating reports on the growth of your mailbox and logging in to the Exchange server using the LDAP, HTTP, or WMI protocol.
Navigation: exchange_monitor Node > Setup Node
Mail Growth
This node represents the properties of the probe for monitoring the growth of mailboxes on the Exchange server.
Note: The fields in this node are disabled as this functionality depends on group policies and user privileges.
Report Information
This node lets you configure the probe to collect reports on the data that is stored in the database of the Exchange server. To enable this section, you must log in using the HTTP, LDAP, or WMI protocol.
Navigation: exchange_monitor > Setup > Report Information
Set or modify the following values as required:
Note: Click the Force Update of Report Information option from the Actions drop-down to generate the report instantaneously.
Note: This node is referred to as the protocol name node, as it represents three different protocols.
Status Node
This node represents the status of all the monitors for all the active profiles.
Navigation: exchange_monitor > Status
Set or modify the following values as required:
Status > Status Viewer
This section displays the status of all monitors in tabular form.
Name: indicates the monitor name.
Unit: indicates the unit of the measured value.
Group: indicates the logical group to which this monitor belongs. A logical group defines the measurement type.
Type: indicates the probe from which this monitor is retrieved. This can be perfmon, ntservices, processes, or ntevl.
Value: indicates the latest measured value.
At: indicates the time, day, and the month at which the value is measured.
Compare Type: defines the operator for measuring the value.
Message on Alarm: indicates the alarm message text.
Note: Configuration values and profiles are stored in the following locations: the /setup, /perfmon, /processes, /services, and /ntevl sections. If you are running the Exchange server in a cluster, the sections are prefixed with /evs/server-<virtual_server_name>. For example, a virtual server named exch-grp-01 results in a /setup section that looks like /evs/server-exch-grp-01/setup. This allows configuration settings to follow each virtual server as it moves around. This applies to the five sections mentioned above, except for two keys, the log level key and the log file key, which always remain in the /setup section and never move with the virtual server.
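The section-naming rule in the note above can be sketched as a small helper. This is a sketch of the documented path rule only; the function name and the exact key names for the log level and log file settings are assumptions for illustration.

```python
# Sketch of the configuration-section naming rule for clustered Exchange servers.
# Only the /evs/server-<name>/ prefix rule comes from the documentation;
# the key names "loglevel"/"logfile" are assumed for illustration.

ALWAYS_GLOBAL_KEYS = {"loglevel", "logfile"}   # these keys never move

def config_section(section, virtual_server=None, key=None):
    """Return the section path where a key is stored for a given virtual server."""
    if key in ALWAYS_GLOBAL_KEYS and section == "setup":
        return "/setup"                        # log keys always stay in /setup
    if virtual_server:
        return "/evs/server-%s/%s" % (virtual_server, section)
    return "/" + section
```

For the example in the note, config_section("setup", "exch-grp-01") yields "/evs/server-exch-grp-01/setup", while the log-level key stays in "/setup" regardless of the virtual server.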
The following diagram shows the steps to configure the exchange monitor probe:
Contents
Verify Prerequisites
Configure Exchange Server Path
Configure a Profile
Configure a Profile for File Monitoring
View DAG Details
Monitor Mailbox Growth
Collect Reports
Access Protocols
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see exchange_monitor (Microsoft
Exchange Monitoring) Release Notes.
2.
For Exchange 2010, define the installation directory path as the exchange_2010_path key value.
For Exchange 2013, define the installation directory path as the exchange_2013_path key value.
Important: The installation path must end with a forward slash (/).
Configure a Profile
You can use the following types of monitoring profiles with the exchange_monitor probe to monitor the required exchange_server:
Service: To monitor services running on a system.
Process: To monitor processes running on a system.
Perf: To monitor the performance counters that are related to exchange.
Ntevl: To monitor the events that are related to exchange.
Follow these steps:
1. Right click on the Status tab and select Profile properties.
2. Enter the Profile Name.
3. Select Active to activate the profile.
4. Enter a short Description of the monitoring profile.
5. In the Value properties tab, select Calculate Average Based on to calculate the average of the QoS value for the number of sample
specified.
6. Enter the sample number on which the average has to be calculated.
7. Select a threshold operator for the alarm, which defines the value of the alarm limit.
8. Define the Alarm limit, which is the threshold value for generating alarm.
If this value is breached, an alarm is generated.
9. Select a Message the probe generates when the threshold value is breached.
10. Select a Clear Message the probe generates if the threshold is not breached.
11. Click OK.
Scan Directory
You can specify the details of the directory to be monitored by the probe.
Follow these steps:
1. Enter the Name of the directory where the files are scanned.
2. Browse for a directory or a location where the monitoring profile is scanned.
3. Select a specific Pattern of files where monitoring profiles are searched.
Refer to the exchange_monitor Regular Expressions article for regex details.
4. Select Recurse into subdirectories to search for profiles in the subdirectories.
5. Select an Exclude directories Pattern to exclude directories that are not monitored.
Refer to the exchange_monitor Regular Expressions article for regex details.
Note: The field appears only when Recurse into subdirectories is selected.
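The directory scan described in the steps above can be sketched as follows. This is an assumption-laden illustration (the probe's actual matching rules are those in the exchange_monitor Regular Expressions article), not the probe's implementation.

```python
import re
from pathlib import Path

def scan_directory(directory, pattern, recurse=False):
    """Return files under `directory` whose names match the regex `pattern`.

    `recurse` mirrors the "Recurse into subdirectories" option: when set,
    subdirectories are searched as well.
    """
    rx = re.compile(pattern)
    paths = Path(directory).rglob("*") if recurse else Path(directory).glob("*")
    return sorted(str(p) for p in paths if p.is_file() and rx.search(p.name))
```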
Alarm Message
You can specify the alarm message that is generated if the defined threshold is breached while the probe is monitoring a file.
Number of files:
1. Select the operator and enter the number of files to be monitored.
2. Select the Message ID, which is the alarm message the probe generates if the threshold is breached.
3. Select a clear message from the drop-down.
Space used by files
1. Select the threshold operator for the number and size of files to be monitored.
2. Enter the number of files to be monitored.
3. Select the message ID which is generated when the threshold is breached.
4. Select the clear alarm message.
Size of file
1. Select the threshold operator for the number and size of files to be monitored.
2. Enter the number of files to be monitored.
3. Select the message ID which is generated when the threshold is breached.
4. Select the clear alarm message.
5. Select the size of the file to be monitored from the following options:
Smallest Files
Largest Files
Individual Files
2.
Note: The details of all the DAGs present in the current domain are displayed. You can view the name and server names pertaining to each DAG, as well as the list of servers that are currently active.
3. Click a required DAG to view details of all servers that are present in the current domain.
4. Click a required server to view details of database copies residing on that server.
5. Click a required database to view its details.
6. Click OK.
7. You can view the DAG files of the required server.
Collect Reports
You can configure the probe to collect reports on the data that is stored in the database of the Exchange server.
Follow these steps:
1. Click the Setup > Report Information section.
2. Select the Gather report information check box.
3. Select the Database size from the drop-down options.
Note: Click the Force Update of Report Information to generate the report instantly.
Access Protocols
You can configure the probe to collect reports on the data that is stored in the database of the Exchange server through access protocols.
Follow these steps:
1. Click the Setup > Report Information > Protocol Name Information .
2. Enter the field details for the desired Protocol:
HTTP Information
User: Enter the username of the account.
Domain: Enter the domain for which the report has to be created.
Password: Enter the password for the account.
Similarly, you can configure details for LDAP Identification and HTTP Identification for public folders.
3. Click the Test button to test the connection.
A report is generated.
The Setup tab contains three sub-tabs and specifies the general properties for the probe.
Note: Two tabs, Report information and Mailbox growth, do not apply and are deactivated in this version.
Note: Enable DAG-related counters, as they are not activated by default. This feature can only be enabled on Exchange Server 2010.
Browse DAG script file
Specifies the path of the CheckDatabaseRedundancy.ps1 script file. This script is required for the DatabaseRedundancyCount counter.
If this script is not present, you can download it from:
http://gallery.technet.microsoft.com/office/8833b4db-8016-47e5-b747-951d28faafe7
Note: The file is also available on Exchange 2013 servers by default. Check the folder <installation directory>\Microsoft\Exchange Server\V15\Scripts on the system where the Exchange Server is installed.
File/Dir Monitoring
Enables the file and directory monitoring for all profiles. Click this option to enable the user name, password, and check interval fields.
User name
Specifies the user name used to access folders that are shared on the network as well as locally. This user name overrides the user name in the File Monitoring tab.
Password
Specifies the password used to access folders that are shared on the network as well as locally. This password overrides the password in the File Monitoring tab.
Check interval (seconds)
Specifies the interval (in seconds) between each time the probe checks the Exchange server for file/directory monitoring.
Report information
For detailed login information and the parameters defined in this tab, see the corresponding section.
Note: Specify the HTTP, WMI, and LDAP identifications successfully (test them by clicking the corresponding test buttons until you get green indicators) to enable this option.
LDAP Test
Tests the specified LDAP login parameters.
If your mailbox users and distribution groups are spread across multiple domains, specify one Active Directory domain controller for each domain using the Manage AD Servers dialog.
Manage AD Servers
Specifies the LDAP login information for one AD Server.
Click this button to open the Manage AD Servers dialog. It enables you to manage several AD servers.
You can add servers and specify the LDAP login information (as described above) for each of them:
AD Server
User
Password
Base DN (LDAP BaseDN field)
Advanced (button)
Opens the LDAP connection properties dialog.
Connection type
Encrypts the communication between the probe and the Active Directory domain controller.
Non-standard LDAP port
Allows you to change the LDAP port numbers if you do not want to use the standard ports.
LDAP search timeout
Specifies the maximum number of seconds the probe is allowed to use on each LDAP search.
Default: 30 seconds.
HTTP identification
The report information is gathered from the Exchange server, using WebDAV.
Note: Select Gather report information on the Report information tab to enable this option and the Mailbox tab.
Check interval
Specifies the interval (in seconds) between each time the probe checks the Exchange server for mailbox growth.
Mailbox size in Kilobytes
This parameter requires that the Check mailbox growth option is selected.
Growth limit on each iteration
Specifies the maximum allowed mailbox growth limit.
Only mailboxes that grow by more than the specified amount are considered when counting the number of times the size of a mailbox has increased (see Iterations of growth before size alarm is issued).
Iterations of growth before size alarm is issued
The mailbox sizes are checked at the specified check interval.
If the size of a mailbox has increased consecutively the number of times specified here, a mailbox size alarm is issued.
Check only mailboxes with size greater than
Checks only the mailboxes with a size greater than the specified size.
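The mailbox-growth rule described by these settings can be sketched as follows. Function and parameter names are illustrative assumptions, not the probe's configuration keys.

```python
def growth_alarm(sizes_kb, growth_limit_kb, iterations, min_size_kb=0):
    """Return True when a mailbox-size alarm should be issued.

    sizes_kb: mailbox size sampled at each check interval.
    An alarm is issued once the mailbox has grown by more than
    growth_limit_kb for `iterations` consecutive checks, considering
    only mailboxes larger than min_size_kb.
    """
    consecutive = 0
    for prev, cur in zip(sizes_kb, sizes_kb[1:]):
        if cur <= min_size_kb:
            consecutive = 0          # too small to be checked
            continue
        if cur - prev > growth_limit_kb:
            consecutive += 1         # one more iteration of growth
            if consecutive >= iterations:
                return True
        else:
            consecutive = 0          # growth streak broken
    return False
```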
The Status Tab
This tab lists the status of all checkpoints for all active profiles.
Name
The name of the checkpoint
Note: The colored icon displays the severity level of the current measured values for profiles with a defined alarm threshold.
Group
The checkpoints are segregated into different logical groups, describing the kind of measurement the profile performs (such as memory,
disk, processor).
Type
Describes the type of checkpoint, depending on the probe from which the checkpoint is retrieved:
Perf
Checkpoints from the perfmon probe
Service
Checkpoints from the services probe
Event
Checkpoints from the ntevl probe
Process
Checkpoints from the processes probe
Value
The most current value measured
Unit
The unit of the measured value, such as MB, %, and Mb/s
At
The time (<month> <day> <time>) the current value was measured.
Alarm limit
The defined threshold value for the profile. A breach of this value will result in an alarm message.
The Message Pool Tab
This tab contains the message pool, which is a list of predefined alarm messages. The messages can be referred to in specific profiles.
Right-click in the message list for the following commands:
Add Message
Creates a new message and opens the Message properties window.
Message properties
Edits the selected message.
Remove message
Deletes the selected message. You are asked to confirm the deletion.
Message properties
Name
Specifies the unique name of the message. This name is used to refer to the message from the profiles.
Alarm situation
The message can be made the default message for a particular alarm situation. Leave this field empty if there is another default
message.
Text
Specifies the alarm message text. Supports variable expansion for the following variables:
Group
Profile
Value
Unit
Limit
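The variable expansion for these fields can be illustrated with a small sketch. The `$name` placeholder style and the helper below are assumptions for illustration; the probe's actual expansion syntax may differ.

```python
import string

def expand_message(text, **fields):
    """Expand $group, $profile, $value, $unit, and $limit placeholders
    in an alarm message text; unknown placeholders are left untouched."""
    return string.Template(text).safe_substitute(fields)
```

For example, a message template could be expanded at alarm time like this: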
Level
The severity level assigned to the alarm message.
Subsystem
The subsystem ID of the generated alarms; a string or ID that is managed by the NAS.
The Profile Setup Tab
This tab lists all the monitoring profiles that are available in the probe configuration file. You can use the following types of monitoring profiles with
the exchange_monitor probe:
Service: To monitor services running on a system.
Process: To monitor processes running on a system.
Perf: To monitor the performance counters that are related to Exchange.
Ntevl: To monitor the events that are related to Exchange.
The File Monitoring Tab
The File Monitoring tab consists of the profiles that monitor files present locally or on a network location. This tab monitors the profiles based on the patterns selected in the profile and sends the configured alarms and QoS.
Note: Active profiles appear in the profile list under the Profile Setup tab.
Description
Description for the monitoring profile.
Scan Directory
Directory
Name of the directory where the files are scanned for monitoring profiles.
Browse
Browses for a directory/location where the monitoring profile is scanned.
Pattern
Pattern of files where monitoring profiles are searched.
Recurse into subdirectories
Searches for profiles in the subdirectories.
Pattern
Pattern of files where monitoring profiles are searched.
User
The user name required to access a folder shared on the network.
Note: This field appears when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
Password
The password required to access a shared folder on the network.
Note: This field appears when \\IP Address is entered in the Directory text box, where the system's IP address is entered to access the shared folder.
Alarm Messages
The fields in the Number of Files tab are described here:
Expect this value: Selects the operator and the number of files.
Message ID: Selects the Message ID, populated in the drop-down list from the Message Pool tab.
Message clear: Selects the Message clear, which is populated in the drop-down from the Message Pool tab.
The following fields are in the Space used by Files tab:
Expect this Value: Selects the operator, the size and number of files.
Message ID: Selects the Message ID, populated in the drop-down list from the Message Pool tab.
Message clear: Selects the Message, populated in the drop-down list from the Message Pool tab.
The following fields are in the Size of File tab:
Watch Size of: Selects the size of the file from the following options:
Smallest Files
Largest Files
Individual Files
Quality of Service Messages
Number of Matching Profiles
Sends a QoS message when the number of matching profiles matches the value specified in the Expect this Value field.
DAG stands for Database Availability Group. You can view the details of the entire DAG setup by using the DAG tab.
When you click the DAG tab, the details of all the DAGs present in the current domain are displayed. You can view the name of the DAG, the server names pertaining to the DAG, and the list of servers that are currently active.
A DAG displays the details of all servers present in the current domain. It also shows the number of active databases, the number of passive databases, the number of mounted databases, the number of non-mounted databases, and the status of each server (Up or Down).
To view details of database copies residing on a particular server, click on that server.
You can view the server name, database name, database copy status, mailbox size, content index state, copy queue length, replay queue length,
and last inspected log.
Click the database to view its details.
Note: For DAG monitoring, the check interval should be above 300 seconds, depending on the number of DAG nodes. The default check interval is 600 seconds, for optimal performance of the probe during DAG monitoring.
Deploy DAG
Exclude Instances
A monitoring profile can have multiple instances. If you do not want the probe to monitor an instance within a profile, you can exclude the
instance.
The Exclude Instances button appears in the Profile Properties dialog when the following conditions are met while the probe reads the instance
and object keys from the configuration file:
The object is for a perf profile.
The value of the profile instance key is instance = <all>.
Four known syntaxes are valid when authenticating against Active Directory using the LDAP protocol with simple binds:
1. Full DN, e.g.
CN=Ben Talk,OU=West,OU=Users,DC=domain,DC=com
2. NT Account Name (e.g. NIMCLUS\e1john)
3. User Principal Name (UPN), stored in the userPrincipalName attribute of the user in Active Directory.
4. Plain username (e.g. user), stored in the displayName attribute in Active Directory.
For options 2 and 3, right-click the user in the MMC snap-in Active Directory Users and Computers, click Properties, and choose the Account tab.
For option 1, you need to use another tool, such as ADSIEDIT.MSC, LDP, or any other LDAP browser tool. To our knowledge, it is not possible to view the distinguishedName property of an object through the MMC snap-in "Active Directory Users and Computers".
It is recommended to use the NT Account Name or UPN syntax.
Active Directory enforces that a user's sAMAccountName property is unique within an Active Directory forest. The domain NETBIOS name is also enforced to be unique.
Note: UPN is not guaranteed to be unique in an Active Directory forest. If two or more users have the same UPN, none of them will be able to authenticate using the UPN name.
The same goes for the plain username; it needs to be unique within a forest.
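The four bind-name syntaxes above can be summarized in a short sketch. The Full DN value reuses the document's example; the UPN and plain-username values are hypothetical placeholders, and the helper function is illustrative only.

```python
def full_dn(cn, ous, dcs):
    """Assemble a Full DN (syntax 1) from its CN, OU, and DC components."""
    parts = [f"CN={cn}"] + [f"OU={ou}" for ou in ous] + [f"DC={dc}" for dc in dcs]
    return ",".join(parts)

# The four simple-bind name syntaxes:
bind_names = {
    "full_dn": full_dn("Ben Talk", ["West", "Users"], ["domain", "com"]),
    "nt_account": r"NIMCLUS\e1john",     # DOMAIN\sAMAccountName (syntax 2)
    "upn": "e1john@domain.com",          # userPrincipalName (syntax 3, hypothetical value)
    "plain": "e1john",                   # plain username (syntax 4, hypothetical value)
}
```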
WMI User
The probe executes the "report collector" program as a background process in the context of this user (that is, "Run as"). This user context is used when connecting to the Exchange shares and when issuing WMI queries.
The user must be a domain user.
The user must be a member of the default domain group named Domain Users (or have equivalent rights).
The user must be a member of the domain security group named "Exchange Servers".
The user must be added to the "local administrators" group on the machine running the Exchange server.
The user must also be delegated control of the "Exchange View Only administrator" role.
Note: If you run your Exchange servers in a cluster, you must add the WMI monitoring account to the local administrators group of all nodes that can be owners of your virtual Exchange servers.
You must reboot your Exchange server for this change to take effect.
HTTP (WebDAV) user permissions
These permissions are required when contacting the Exchange server to query for public folder information with the WebDAV (HTTP) protocol against IIS (Outlook Web Access (OWA) also uses WebDAV).
The user must be a domain user.
The user must be a member of the default domain group named Domain Users (or have equivalent rights).
The user requires at least these client permissions on the folders in the folder hierarchy: Read permissions, List Contents, and Read Property.
The HTTP user is used in the WebDAV protocol over HTTP to talk to IIS, where the public folders are hosted and reachable. Data for the "public folder owner" report is gathered here.
The actual list of public folders is retrieved through WMI.
WMI - the exchange_monitor probe starts a background process in the context of the WMI user (that is, a "Run as" command). This user context is used when accessing the WMI namespaces on the server running the Exchange server.
In the WMI namespace root\MicrosoftExchangeV2, Exchange Server 2003 exposes several classes. The data collector retrieves instances of the following WMI classes:
Exchange_PublicFolder - instances of public folders, and public folder replication information (to which stores are the folders replicated)
Exchange_Server - instance of the server
Exchange_Mailbox - instances of mailboxes (sizes, deleted items, item count)
Exchange_MessageTrackingEntry - message tracking entries
In the WMI namespace root\CIMV2, instances of the following classes are read:
Win32_NTLogEvent - whitespace information
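The classes listed above correspond to straightforward WQL queries. The sketch below only collects the query text per namespace; actually executing them requires a WMI client on a Windows host, which is assumed and not shown here.

```python
# WQL queries for the WMI classes the data collector reads, keyed by namespace.
WMI_QUERIES = {
    r"root\MicrosoftExchangeV2": [
        "SELECT * FROM Exchange_PublicFolder",           # public folders + replication info
        "SELECT * FROM Exchange_Server",                 # the server instance
        "SELECT * FROM Exchange_Mailbox",                # mailbox sizes, deleted items, counts
        "SELECT * FROM Exchange_MessageTrackingEntry",   # message tracking entries
    ],
    r"root\CIMV2": [
        "SELECT * FROM Win32_NTLogEvent",                # whitespace information
    ],
}
```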
LDAP - or Active Directory - is queried to read additional information about the Exchange Server configuration. From the configurationNamingContext, the probe reads Exchange server information and the structure of storage groups, stores (both public and mailbox), and database quotas.
One or more domains can be queried for distribution lists and their members.
Queries are made to retrieve the users who belong to the storage groups found in the configurationNamingContext: user mailboxes, mail addresses, delegates, memberOf entries, account control, and password-last-set information.
It is then all tied together before it is stored and ready to be retrieved from the exchange_monitor_backend probe.
The report collector also tries to access the database files directly (the .edb files) and read the physical size of the files. The information about where the physical files are stored is part of the information retrieved from the configurationNamingContext.
The reports are divided into 10 sub-reports from the exchange_monitor point of view. The data collected is fixed and cannot be changed. They are:
Traffic_summary, updated when data is older than 24 hours or at midnight.
Public_folders, updated when data is older than 24 hours or at midnight.
Domains, updated when data is older than 24 hours or at midnight.
Servers, updated when data is older than 24 hours or at midnight.
Mailboxes, updated when data is older than 10 minutes if you have enabled "monitor mailbox growth"; otherwise, when it is older than 60 minutes.
Display_names, updated when mailboxes are updated.
Partners, updated when data is older than 24 hours or at midnight.
Mailbox_activity, updated when data is older than 24 hours or at midnight.
Groups, updated when mailboxes are updated.
Databases, updated when data is older than 24 hours or at midnight.
The exchange_monitor_backend probe queries and checks the timestamps of all these reports. If the exchange_monitor probe being queried has newer information on any of the reports, the backend collects it and inserts it into the SQL database. The exchange_monitor_backend operates on the same aging policy as the exchange_monitor probe. If it has all the data it needs from a server, it does not query that server again for 1 hour, before the mailbox data is considered old. This means that if the exchange_monitor cannot, for some reason, read (for example) the message tracking logs, the backend does not get an accurate timestamp and keeps querying the exchange_monitor probe until it provides new data, once the exchange_monitor_backend's data for the traffic-related reports is older than 24 hours.
It is important to set up the report collection and ensure that all the reports receive data; otherwise, your probes will use more network bandwidth than necessary.
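The aging rule used by most of the sub-reports above ("older than 24 hours or at midnight") can be sketched as a small staleness check. This is an illustrative reading of the rule, not the probes' actual code.

```python
from datetime import datetime, timedelta

def report_is_stale(last_update, now, max_age_hours=24):
    """A report is refreshed when its data is older than max_age_hours,
    or when midnight has passed since the last update."""
    if now - last_update > timedelta(hours=max_age_hours):
        return True
    # Midnight rollover: the calendar date changed since the last update.
    return now.date() > last_update.date()
```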
Notes:
From exchange_monitor version 5.0 onwards, only server-specific counters are loaded in the Profile Setup tab.
Exchange Server 2003 is not supported with the probe version 5.2 or later.
Contents:
DAG Counters
Performance Counters - Support for Exchange Server Versions
Services Counters - Support for Exchange Server Versions
DAG Counters
The following DAG counters are supported on both DAG Exchange Server 2010 and DAG Exchange Server 2013, except where noted:
Number of database copies whose state is not "mounted" on the current Mailbox server.
CopyQueueLength-DAG
Shows the number of transaction log files waiting to be copied to the passive copy log file folder. A copy isn't considered complete until it has been checked for corruption.
ReplayQueueLength-DAG
Shows the number of transaction log files waiting to be replayed into the passive copy.
Database copies status-Failed
The mailbox database copy is in a Failed state because it isn't suspended, and it isn't able to copy or replay log files. While in a Failed state and not suspended, the system will periodically check whether the problem that caused the copy status to change to Failed has been resolved. After the system has detected that the problem is resolved, and barring no other issues, the copy status will automatically change to Healthy.
Database copies status-Seeding
The mailbox database copy is being seeded, the content index for the mailbox database copy is being seeded, or both are being seeded. Upon successful completion of seeding, the copy status should change to Initializing.
Database copies status-SeedingSource
The mailbox database copy is being used as a source for a database copy seeding operation.
Database copies status-Healthy
The mailbox database copy is successfully copying and replaying log files, or it has successfully copied and replayed all available log files.
Database copies status-ServiceDown
The Microsoft Exchange Replication service isn't available or running on the server that hosts the mailbox database copy.
Database copies status-Initializing
The mailbox database copy will be in an Initializing state when a database copy has been created, when the Microsoft Exchange Replication service is starting or has just been started, and during transitions from Suspended, ServiceDown, Failed, Seeding, SinglePageRestore, LostWrite, or Disconnected to another state. While in this state, the system is verifying that the database and log stream are in a consistent state. In most cases, the copy status will remain in the Initializing state for about 15 seconds, but in all cases, it should generally not be in this state for longer than 30 seconds.
Database copies status-Resynchronizing
The mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies. The copy status will remain in this state until any divergence is detected and resolved.
Database copies status-Mounted
The active copy is online and accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounted.
Database copies status-Dismounted
The active copy is offline and not accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounted.
Database copies status-Mounting
The active copy is coming online and not yet accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounting.
Database copies status-Dismounting
The active copy is going offline and terminating client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounting.
Database copies status-DisconnectedAndHealthy
The mailbox database copy is no longer connected to the active database copy, and it was in the Healthy state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.
Database copies status-DisconnectedAndResynchronizing
The mailbox database copy is no longer connected to the active database copy, and it was in the Resynchronizing state when the loss of connection occurred. This state represents the database copy with respect to connectivity to its source database copy. It may be reported during DAG network failures between the source copy and the target database copy.
Database copies status-FailedAndSuspended
The Failed and Suspended states have been set simultaneously by the system because a failure was detected, and because resolution of the failure explicitly requires administrator intervention. An example is if the system detects unrecoverable divergence between the active mailbox database and a database copy. Unlike the Failed state, the system won't periodically check whether the problem has been resolved, and automatically recover. Instead, an administrator must intervene to resolve the underlying cause of the failure before the database copy can be transitioned to a healthy state.
Database copies status-SinglePageRestore
This state indicates that a single page restore operation is occurring on the mailbox database copy.
LastInspectedLogTime
The modification time of the last log that was successfully validated by the Mailbox server hosting the database copy.
ContentIndexState
Indicates the current state of the content index for a database copy.
ActivationSuspended
DatabaseSize
DatabaseRedundancyCount
Count of redundancy of replicated mailbox databases. Both active and passive copies are counted when determining redundancy. (Supported on DAG Exchange Server 2010 only.)
clusterservice
Verifies that the Cluster service is running and reachable on the local Exchange server.
ReplayService
Verifies that the Replay service is running and reachable on the local Exchange server.
ActiveManager
Verifies that the instance of Active Manager running on the local Exchange server is in a valid role (primary, secondary, or stand-alone).
TasksRpcListener
Verifies that the tasks remote procedure call (RPC) server is running and reachable on the local Exchange server.
TcpListener
Verifies that the TCP log copy listener is running and reachable on the local Exchange server.
DagMembersUp
Verifies that all DAG members are available, running, and reachable.
ClusterNetwork
Verifies that all cluster-managed networks on the local Exchange server are available.
QuorumGroup
Verifies that the default cluster group (quorum group) is in a healthy and online state.
FileShareQuorum
Verifies that the witness server and witness directory and share configured for the DAG are reachable.
DBLogCopyKeepingUp
Verifies that log copying and inspection by the passive copies of databases on the local Exchange server are able to keep up with log generation activity on the active copy.
DBLogReplayKeepingUp
Verifies that replay activity for the passive copies of databases on the local Exchange server is able to keep up with log copying and inspection activity.
The database availability group is experiencing problems that may cause it to fail due to stale security information.
One or more networks supporting the database availability group are not operating properly on this server.
Note: You have to activate the DAG counters by enabling them individually from the Profile Setup tab. All the DAG counters are supported on DAG Exchange Server 2010 and DAG Exchange Server 2013, except the DatabaseRedundancyCount counter, which is not supported on DAG Exchange Server 2013.
Performance Counters - Support for Exchange Server Versions
The following counters list support for Exchange Server 2003 / 2007 / 2010 / 2013, in that order:
User Count: Yes / No / Yes / No
Available Megabytes: Yes / Yes / No / No
Processor Time: Yes / Yes / Yes / Yes
Current Bandwidth: Yes / Yes / No / No
Dumpster size: No / Yes / Yes / No
Categorization Count: No / Yes / No / No
Active Conversions: No / Yes / No / No
User Time: No / No / Yes / Yes
Privileged time: No / No / Yes / Yes
Available Mbytes: No / No / Yes / Yes
Cache Bytes: No / No / Yes / Yes
Committed Bytes: No / No / Yes / Yes
Private Bytes: No / No / Yes / Yes
Virtual bytes: No / No / Yes / Yes
Working Set: No / No / Yes / Yes
Handle Count: No / No / Yes / Yes
DOTNET - Time in GC: No / No / Yes / Yes
Client Reported Failed RPCs for Server too busy Error per second: No / No / Yes / No
Slow QP Threads: No / No / Yes / No
Events in Queue: No / No / Yes / Yes
CopyQueueLength: No / No / Yes / Yes
Application Restarts: No / No / Yes / Yes
Requests Queued: No / No / Yes / No
Current Connections: No / No / Yes / Yes
Inbound LocalDeliveryCallsPerSecond: No / No / Yes / No
Inbound MessageDeliveryAttemptsPerSecond: No / No / Yes / No
Database Mounted: No / No / Yes / Yes
Inbound: LocalDeliveryCallsPerSecond-2013: No / No / No / Yes
Inbound: MessageDeliveryAttemptsPerSecond-2013: No / No / No / Yes
DatabaseMounted: No / No / Yes / Yes
Percent Availability: No / No / No / Yes
DNS Errors: No / No / No / Yes
Connection Failures: No / No / No / Yes
Protocol Errors: No / No / No / Yes
Inbound_LocalDeliveryCallsPerSecond: No / No / No / Yes
Inbound_MessageDeliveryAttemptsPerSecond: No / No / No / Yes
Inbound_Recipients_Delivered_Per_Second: No / No / No / Yes
MessagesFailedToRoute: No / No / No / Yes
Percent Availability: No / No / No / Yes
Connections Created/sec: No / No / No / Yes
Connections Created/sec: No / No / No / Yes
Recipients sent: No / No / No / Yes
DNS Errors: No / No / No / Yes
Connection Failures: No / No / No / Yes
Exchange Server
2003
Exchange Server
2007
Exchange Server
2010
Exchange Server
2013
HTTP SSL
No
Yes
No
No
No
Yes
No
No
No
Yes
Yes
Yes
No
Yes
Yes
No
No
No
Yes
No
No
Yes
Yes
Yes
No
Yes
Yes
No
No
No
No
Yes
No
Yes
Yes
Yes
No
Yes
Yes
No
No
No
Yes
No
No
No
No
Yes
Yes
Yes
Yes
No
No
No
No
Yes
No
No
No
Yes
Yes
Yes
Yes
Yes
No
Yes
Yes
No
No
Yes
Yes
Yes
No
No
Yes
Yes
No
No
No
Yes
No
Yes
Yes
Yes
Yes
No
No
No
Yes
Yes
Yes
No
No
No
No
Yes
No
No
No
Yes
No
No
Yes
No
No
Yes
Yes
Yes
Yes
No
No
No
No
No
Yes
Yes
No
No
No
Yes
No
No
No
Yes
No
Yes
Yes
No
No
No
Yes
Yes
No
Yes
Yes
Yes
Yes
Yes
Yes
No
No
No
Yes
Yes
No
Yes
Yes
Yes
No
Yes
Yes
Yes
No
No
No
Yes
No
No
No
Yes
No
Yes
Yes
No
SMTP
Yes
No
No
No
No
Yes
No
No
exchange_monitor Troubleshooting
This article contains troubleshooting information for the exchange_monitor probe.
Profile Setup Tab not displaying all the metrics
Symptom
On some Exchange 2013 servers, the probe does not display the Client Access role checkpoints.
Solution
Do the following:
1. Go to the Raw Configure section of the exchange_monitor probe.
2. From the left pane, select register.
3. Under 2013 > paths > 2k13_path_cas, set the value of the path key from Software\Microsoft\ExchangeServer\v15\ClientAccessRole to Software\Microsoft\ExchangeServer\v15\FrontendTransportRole.
4. Restart the probe.
All the metrics are now visible under the Profile Setup tab.
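For administrators scripting this change, the path substitution in step 3 can be sketched as follows. This is a hypothetical helper, not the probe's own tooling: the function name is ours, and the supported route remains Raw Configure.

```python
# Hypothetical sketch: apply the step-3 path fix to raw configuration text.
OLD_PATH = r"Software\Microsoft\ExchangeServer\v15\ClientAccessRole"
NEW_PATH = r"Software\Microsoft\ExchangeServer\v15\FrontendTransportRole"

def fix_cas_path(cfg_text: str) -> str:
    """Return cfg_text with the Client Access role path replaced."""
    return cfg_text.replace(OLD_PATH, NEW_PATH)
```

Write the updated text back through Raw Configure (or the probe's configuration file) and then restart the probe, as in step 4.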
exchange_monitor Metrics
The following table describes the checkpoint metrics that can be configured using the Microsoft Exchange Monitoring (exchange_monitor) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
Monitor Name
Units
Description
QOS_EXCHANGE_MEMORY_AVAILABLE_MBYTES
MB
Available MBytes
QOS_EXCHANGE_MEMORY_POOL_PAGED_BYTES
MB
QOS_EXCHANGE_MEMORY_CACHE_BYTES
MB
Cache Bytes
QOS_EXCHANGE_MEMORY_COMMITED_BYTES
MB
Committed Bytes
QOS_EXCHANGE_MEMORY_COMMITED_BYTES_IN_USE
Pct
QOS_EXCHANGE_MEMORY_TRANSITION_PAGES_REPURPOSED_PER_SECOND
Count
QOS_EXCHANGE_MEMORY_PAGE_READS_PER_SECOND
Count
QOS_EXCHANGE_MEMORY_PAGES_INPUT_PER_SECOND
Count
QOS_EXCHANGE_MEMORY_PAGES_OUTPUT_PER_SECOND
Count
QOS_EXCHANGE_MEMORY_PRIVATE_BYTES
MB
Private Bytes
QOS_EXCHANGE_MEMORY_VIRTUAL_BYTES
MB
Virtual Bytes
QOS_EXCHANGE_MEMORY_WORKING_SET
MB
Working Set
QOS_EXCHANGE_MEMORY_HANDLE_COUNT
Count
Handle Count
QOS_EXCHANGE_MEMORY_DOTNET-TIME_IN_GC
Pct
DOTNET-Time In GC
QOS_EXCHANGE_MEMORY_DOTNET-EXCEPTION_THROWN_PER_SEC
Count
DOTNET-Exception Thrown Per Sec
QOS_EXCHANGE_MEMORY_DOTNET-BYTES_IN_ALL_HEAPS
MB
QOS_EXCHANGE_PROCESSOR_USER_TIME
Pct
User Time
QOS_EXCHANGE_PROCESSOR_PRIVILEGED_TIME
Pct
Privileged Time
QOS_EXCHANGE_PROCESSOR_PROCESSOR_TIME_INSTANCE
Pct
QOS_EXCHANGE_PROCESSOR_PROCESSOR_QUEUE_LENGTH
Count
QOS_EXCHANGE_NETWORK_PACKETS_OUTBOUND_ERRORS
Count
QOS_EXCHANGE_NETWORK_TCPV4_CONNECTIONS_ESTABLISHED
Count
QOS_EXCHANGE_NETWORK_TCPV6_CONNECTION_FAILURES
Count
QOS_EXCHANGE_NETWORK_TCPV4_CONNECTIONS_RESET
Count
QOS_EXCHANGE_NETWORK_TCPV6_CONNECTION_RESET
Count
QOS_EXCHANGE_TRANS_ROLE_AVERAGE_DISK_SECONDS_PER_READ-TRANSPORT
ms
QOS_EXCHANGE_TRANS_ROLE_AVERAGE_DISK_SECONDS_PER_WRITE-TRANSPORT
ms
QOS_EXCHANGE_TRANS_ROLE_SUBMISSION_QUEUE_LENGTH
msgs
QOS_EXCHANGE_TRANS_ROLE_RETRY_NON-SMTP_DELIVERY_QUEUE_LENGTH
msgs
QOS_EXCHANGE_TRANS_ROLE_RETRY_REMOTE_DELIVERY_QUEUE_LENGTH
msgs
QOS_EXCHANGE_TRANS_ROLE_LARGEST_DELIVERY_QUEUE_LENGTH
msgs
QOS_EXCHANGE_TRANS_ROLE_POISON_QUEUE_LENGTH-TRANSPORT
msgs
QOS_EXCHANGE_TRANS_ROLE_INPUT-OUTPUT_LOG_WRITES_PER_SEC
logs/sec
QOS_EXCHANGE_TRANS_ROLE_INPUT-OUTPUT_LOG_READS_PER_SEC
logs/sec
QOS_EXCHANGE_TRANS_ROLE_LOG_GENERATION_CHECKPOINT_DEPTH-TRANSPORT
cnt
QOS_EXCHANGE_TRANS_ROLE_VERSION_BUCKETS_ALLOCATED
versions
QOS_EXCHANGE_TRANS_ROLE_INPUT-OUTPUT_DATABASE_READS_PER_SEC
rds/sec
QOS_EXCHANGE_TRANS_ROLE_INPUT-OUTPUT_DATABASE_WRITES_PER_SEC
wrts/sec
QOS_EXCHANGE_TRANS_ROLE_LOG_RECORD_STALLS_PER_SEC-TRANSPORT
logs/sec
QOS_EXCHANGE_TRANS_ROLE_LOG_THREADS_WAITING-TRANSPORT
thrds
QOS_EXCHANGE_TRANS_ROLE_TOTAL_AGENT_INVOCATIONS
invocations
QOS_EXCHANGE_TRANS_ROLE_MESSAGES_COMPLETED_DELIVERY_PER_SECOND
msgs/sec
QOS_EXCHANGE_TRANS_ROLE_INBOUND:_LOCALDELIVERYCALLSPERSECOND
atmpts/sec
Inbound: LocalDeliveryCallsPerSecond
QOS_EXCHANGE_TRANS_ROLE_OUTBOUND:_SUBMITTED_MAIL_ITEMS_PER_SECOND
Items/sec
QOS_EXCHANGE_TRANS_ROLE_AVERAGE_BYTES_PER_MESSAGE
byts/msg
QOS_EXCHANGE_TRANS_ROLE_MESSAGES_RECEIVED_PER_SEC-TRANSPORT
msgs/sec
QOS_EXCHANGE_TRANS_ROLE_MESSAGES_SENT_PER_SEC-TRANSPORT
msgs/sec
QOS_EXCHANGE_TRANS_ROLE_INBOUND:_MESSAGEDELIVERYATTEMPTSPERSECOND
atmpts/sec
Inbound: MessageDeliveryAttemptsPerSecond
QOS_EXCHANGE_TRANS_ROLE_INBOUND:_RECIPIENTS_DELIVERED_PER_SECOND
recipts/sec
QOS_EXCHANGE_TRANS_ROLE_AVERAGE_AGENT_PROCESSING_TIME_IN_SECONDS
msg/s
QOS_EXCHANGE_TRANS_ROLE_ACTIVE_MAILBOX_DELIVERY_QUEUE_LENGTH-TRANSPORT
items
QOS_EXCHANGE_TRANS_ROLE_ACTIVE_REMOTE_DELIVERY_QUEUE_LENGTH-TRANSPORT
items
QOS_EXCHANGE_TRANS_ROLE_AGGREGATE_DELIVERY_QUEUE_LENGTH_(ALL_QUEUES)-TRANSPORT
items
QOS_EXCHANGE_TRANS_ROLE_ACTIVE_NON-SMTP_DELIVERY_QUEUE_LENGTH-TRANSPORT
items
QOS_EXCHANGE_TRANS_ROLE_RETRY_MAILBOX_DELIVERY_QUEUE_LENGTH-TRANSPORT
msgs
QOS_EXCHANGE_TRANS_ROLE_UNREACHABLE_QUEUE_LENGTH-TRANSPORT
msgs
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_READS_(ATTACHED)_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_WRITES_(ATTACHED)_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_PAGE_FAULT_STALLS_PER_SEC
psg/s
QOS_EXCHANGE_MAILBOX_ROLE_LOG_WRITES_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_LOG_RECORD_STALLS_PER_SEC
rcrds/s
QOS_EXCHANGE_MAILBOX_ROLE_LOG_THREADS_WAITING
thrds
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_READS_(RECOVERY)_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_WRITES_(RECOVERY)_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_LOG_READS_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_RPC_REQUESTS_-_MAILBOX
reqs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_AVERAGED_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_RPC_AVERAGE_LATENCY_-_MAILBOX
ms
QOS_EXCHANGE_MAILBOX_ROLE_RPC_AVERAGE_LATENCY_-_CLIENT
ms
QOS_EXCHANGE_MAILBOX_ROLE_CLIENT_REPORTED_FAILED_RPCS_FOR_SERVER_TOO_BUSY_ERROR_PER_SEC
errs/s
QOS_EXCHANGE_MAILBOX_ROLE_CLIENT_REPORTED_FAILED_RPCS_FOR_SERVER_TOO_BUSY_ERROR
errs
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_QUEUED_FOR_SUBMISSION_MAILBOX
msgs
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_QUEUED_FOR_SUBMISSION_PUBLIC
msgs
QOS_EXCHANGE_MAILBOX_ROLE_LOG_GENERATION_CHECKPOINT_DEPTH
cnt
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_PAGE_FAULT_STALLS_PER_SEC_-_INFORMATION_STORE
pgs/s
QOS_EXCHANGE_MAILBOX_ROLE_LOG_RECORD_STALLS_PER_SEC_-_INFORMATION_STORE
rcrds/s
QOS_EXCHANGE_MAILBOX_ROLE_LOG_THREADS_WAITING_-_INFORMATION_STORE
thrds
QOS_EXCHANGE_MAILBOX_ROLE_VERSION_BUCKETS_ALLOCATED_-_INFORMATION_STORE
bukts
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_READS_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_WRITES_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_CACHE_SIZE_-_INFORMATION_STORE
MB
QOS_EXCHANGE_MAILBOX_ROLE_DATABASE_CACHE_PERCENTAGE_HIT
QOS_EXCHANGE_MAILBOX_ROLE_LOG_BYTES_WRITE_PER_SEC
bytes/s
QOS_EXCHANGE_MAILBOX_ROLE_SLOW_FINDROW_RATE
rate
QOS_EXCHANGE_MAILBOX_ROLE_SEARCH_TASK_RATE
tasks/s
QOS_EXCHANGE_MAILBOX_ROLE_SLOW_QP_THREADS
thrds
Slow QP Threads
QOS_EXCHANGE_MAILBOX_ROLE_SLOW_SEARCH_THREADS
thrds
QOS_EXCHANGE_MAILBOX_ROLE_PROCESSOR_TIME_-_MS_EXCHANGE_SEARCH_SERVICE
QOS_EXCHANGE_MAILBOX_ROLE_PROCESSOR_TIME_-_MSFTEFD_PROCESS
QOS_EXCHANGE_MAILBOX_ROLE_RECENT_AVERAGE_LATENCY_OF_RPCS_USED_TO_OBTAIN_CONTENT
ms
QOS_EXCHANGE_MAILBOX_ROLE_AVERAGE_DOCUMENT_INDEXING_TIME
ms
QOS_EXCHANGE_MAILBOX_ROLE_FULL_CRAWL_MODE_STATUS
crawl
QOS_EXCHANGE_MAILBOX_ROLE_PROCESSOR_TIME_-_MAILBOXASSISTANTS
QOS_EXCHANGE_MAILBOX_ROLE_EVENTS_IN_QUEUE
evnts
Events In Queue
QOS_EXCHANGE_MAILBOX_ROLE_AVERAGE_EVENT_PROCESSING_TIME_IN_SECONDS
QOS_EXCHANGE_MAILBOX_ROLE_AVERAGE_RESOURCE_BOOKING_PROCESSING_TIME
QOS_EXCHANGE_MAILBOX_ROLE_REQUESTS_FAILED_-__RESOURCE_BOOKING
reqs
QOS_EXCHANGE_MAILBOX_ROLE_AVERAGE_CALENDAR_ATTENDANT_PROCESSING_TIME
QOS_EXCHANGE_MAILBOX_ROLE_REQUESTS_FAILED_-_CALENDAR_ATTENDANT
reqs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_LATENCY_AVERAGE_-_STORE_INTERFACE
ms
QOS_EXCHANGE_MAILBOX_ROLE_ROP_REQUESTS_OUTSTANDING
reqs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_REQUESTS_OUTSTANDING
reqs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_REQUESTS_OUTSTANDING_INSTANCE
reqs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_REQUESTS_SENT_PER_SEC
reqs/s
QOS_EXCHANGE_MAILBOX_ROLE_RPC_SLOW_REQUESTS_LATENCY_AVERAGE
ms
QOS_EXCHANGE_MAILBOX_ROLE_RPC_REQUESTS_FAILED_PERCENTAGE
QOS_EXCHANGE_MAILBOX_ROLE_RPC_SLOW_REQUESTS_PERCENTAGE
QOS_EXCHANGE_MAILBOX_ROLE_SUCCESSFUL_SUBMISSIONS_PER_SECOND
sbms/s
QOS_EXCHANGE_MAILBOX_ROLE_HUB_SERVERS_IN_RETRY
srvrs
QOS_EXCHANGE_MAILBOX_ROLE_FAILED_SUBMISSIONS_PER_SECOND
sbms/s
QOS_EXCHANGE_MAILBOX_ROLE_TEMPORARY_SUBMISSION_FAILURES_PER_SEC
sbms/s
QOS_EXCHANGE_MAILBOX_ROLE_COPYQUEUELENGTH
files
CopyQueueLength
QOS_EXCHANGE_MAILBOX_ROLE_REPLAYQUEUELENGTH
files
ReplayQueueLength
QOS_EXCHANGE_MAILBOX_ROLE_SEEDING_FINISHED_PERCENTAGE
QOS_EXCHANGE_MAILBOX_ROLE_RPC_OPERATIONS_PER_SEC
oper/s
QOS_EXCHANGE_MAILBOX_ROLE_RPC_CLIENT_BACKOFF_PER_SEC
oper/s
QOS_EXCHANGE_MAILBOX_ROLE_FAILED_CLIENT_RPCS_FOR_SERVER_TOO_BUSY_PER_SEC
errs/s
QOS_EXCHANGE_MAILBOX_ROLE_FAILED_CLIENT_RPCS_FOR_SERVER_TOO_BUSY
errs
QOS_EXCHANGE_MAILBOX_ROLE_RPC_OPERATIONS_PER_SEC_-_MSEXCHANGEIS_CLIENT
oper/s
QOS_EXCHANGE_MAILBOX_ROLE_JET_LOG_RECORDS_PER_SEC
recs/s
QOS_EXCHANGE_MAILBOX_ROLE_JET_PAGES_READ_PER_SEC
pgs/s
QOS_EXCHANGE_MAILBOX_ROLE_DIRECTORY_ACCESS_LDAP_READS_PER_SEC
rd/s
QOS_EXCHANGE_MAILBOX_ROLE_DIRECTORY_ACCESS_LDAP_SEARCHES_PER_SEC
srch/s
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_DELIVERED_PER_SEC_-_MAILBOX
msg/s
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_SENT_PER_SEC_-_TOTAL
msg/s
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_SUBMITTED_PER_SEC
msg/s
QOS_EXCHANGE_MAILBOX_ROLE_REPLICATION_RECEIVE_QUEUE_SIZE
msg/s
QOS_EXCHANGE_MAILBOX_ROLE_MAILBOXES_PROCESSED_PER_SEC
mlbox/s
QOS_EXCHANGE_MAILBOX_ROLE_EVENTS_POLLED_PER_SEC
evnts/s
QOS_EXCHANGE_CAS_ROLE_AVERAGE_SEARCH_TIME
ms
QOS_EXCHANGE_CAS_ROLE_APPLICATION_RESTARTS
count
Application Restarts
QOS_EXCHANGE_CAS_ROLE_WORKER_PROCESS_RESTARTS
count
QOS_EXCHANGE_CAS_ROLE_REQUEST_WAIT_TIME
count
QOS_EXCHANGE_CAS_ROLE_REQUESTS_IN_APPLICATION_QUEUE
requests
QOS_EXCHANGE_CAS_ROLE_AVERAGE_TIME_TO_PROCESS_A_FREE_BUSY_REQUEST
sec
QOS_EXCHANGE_CAS_ROLE_SYNC_COMMANDS_PENDING
commands
QOS_EXCHANGE_CAS_ROLE_REQUESTS_QUEUED
requests
Requests Queued
QOS_EXCHANGE_CAS_ROLE_NUMBER_OF_FAILED_BACK-END_CONNECTION_ATTEMPTS_PER_SECOND
conn/sec
QOS_EXCHANGE_CAS_ROLE_CURRENT_NUMBER_OF_INCOMING_RPC_OVER_HTTP_CONNECTIONS
RPC
QOS_EXCHANGE_CAS_ROLE_CURRENT_NUMBER_OF_UNIQUE_USERS
users
QOS_EXCHANGE_CAS_ROLE_RPC_-_HTTP_REQUESTS_PER_SECOND
req/sec
QOS_EXCHANGE_CAS_ROLE_RPC_AVERAGED_LATENCY_-_RPCCLIENTACCESS
ms
QOS_EXCHANGE_CAS_ROLE_RPC_OPERATIONS_PER_SEC_-_RPCCLIENTACCESS
/sec
QOS_EXCHANGE_CAS_ROLE_RPC_REQUESTS_-_RPCCLIENTACCESS
requests
QOS_EXCHANGE_CAS_ROLE_NSPI_RPC_BROWSE_REQUESTS_AVERAGE_LATENCY
ms
QOS_EXCHANGE_CAS_ROLE_NSPI_RPC_REQUESTS_AVERAGE_LATENCY
ms
QOS_EXCHANGE_CAS_ROLE_REFERRAL_RPC_REQUESTS_AVERAGE_LATENCY
ms
QOS_EXCHANGE_CAS_ROLE_OUTBOUND_PROXY_REQUESTS_FOR_AVERAGE_RESPONSE_TIME
ms
QOS_EXCHANGE_CAS_ROLE_REQUESTS_AVERAGE_RESPONSE_TIME
ms
QOS_EXCHANGE_CAS_ROLE_DOWNLOAD_TASK_QUEUED
tasks
QOS_EXCHANGE_CAS_ROLE_DOWNLOAD_TASKS_COMPLETED
tasks
QOS_EXCHANGE_CAS_ROLE_PING_COMMANDS_PENDING
commands
QOS_EXCHANGE_CAS_ROLE_AVAILABILITY_REQUESTS_IN_SECONDS
reqs/sec
QOS_EXCHANGE_CAS_ROLE_CURRENT_UNIQUE_USERS
users
QOS_EXCHANGE_CAS_ROLE_AUTODISCOVER_SERVICE_REQUESTS_PER_SEC
reqs/sec
QOS_EXCHANGE_CAS_ROLE_CURRENT_CONNECTIONS
conn
Current Connections
QOS_EXCHANGE_CAS_ROLE_CONNECTION_ATTEMPTS_PER_SEC
conn/sec
QOS_EXCHANGE_CAS_ROLE_ISAPI_EXTENSION_REQUESTS_PER_SEC
reqs/sec
QOS_EXCHANGE_CAS_ROLE_OTHER_REQUEST_METHODS_PER_SEC
reqs/sec
QOS_EXCHANGE_MSEXCHANGE_LDAP_SEARCHES_PER_SECOND
count
QOS_EXCHANGE_MSEXCHANGE_LDAP_READ_TIME_CONTROLLERS
ms
QOS_EXCHANGE_MSEXCHANGE_LDAP_SEARCH_TIME_CONTROLLERS
ms
QOS_EXCHANGE_MSEXCHANGE_LDAP_READ_TIME_PROCESSES
ms
QOS_EXCHANGE_MSEXCHANGE_LDAP_SEARCH_TIME_PROCESSES
ms
QOS_EXCHANGE_MSEXCHANGE_LDAP_SEARCHES_TIMED_OUT_PER_MINUTE
count
QOS_EXCHANGE_MSEXCHANGE_LONG_RUNNING_LDAP_OPERATIONS_PER_MINUTE
count
QOS_EXCHANGE_DAG_NUM_ACTIVE_DB
db
QOS_EXCHANGE_DAG_NUM_PASSIVE_DB
db
QOS_EXCHANGE_DAG_NUM_MOUNTED_DB
db
QOS_EXCHANGE_DAG_NUM_NON_MOUNTED_DB
db
QOS_EXCHANGE_DAG_DB_COPY_QUEUE_LENGTH
files
QOS_EXCHANGE_DAG_DB_REPLAY_QUEUE_LENGTH
files
QOS_EXCHANGE_DAG_DB_COPY_STATUS_FAILED
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_SEEDING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_SEEDING_SOURCE
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_SUSPENDED
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_HEALTHY
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_SERVICE_DOWN
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_INITIALIZING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_RESYNCHRONIZING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_MOUNTED
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_DISMOUNTED
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_MOUNTING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_DISMOUNTING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_DISCONNECTED_AND_HEALTHY
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_DISCONNECTED_AND_RESYNCHRONIZING
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_FAILED_AND_SUSPENDED
bool
QOS_EXCHANGE_DAG_DB_COPY_STATUS_SINGLE_PAGE_RESTORE
bool
QOS_EXCHANGE_DAG_DB_LAST_INSPECTED_LOG_TIME
timestamp
QOS_EXCHANGE_DAG_DB_CONTENT_INDEX_STATE
state
QOS_EXCHANGE_DAG_DB_ACTIVATION_SUSPENDED
bool
QOS_EXCHANGE_DAG_DB_COPY_SIZE
GB
QOS_EXCHANGE_DAG_DB_REDUNDANCY_COUNT
Db
QOS_EXCHANGE_DAG_CLUSTER_SERVICE_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_REPLAY_SERVICE_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_ACTIVE_MANAGER_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_TASKS_RPC_LISTENER_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_TASKS_RPC_LISTENER_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_DAG_MEMBERS_UP_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_CLUSTER_NETWORK_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_QUORUM_GROUP_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_FILE_SHARE_QUORUM_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_DB_LOG_COPY_KEEPING_UP_HEALTH_STATUS
state
QOS_EXCHANGE_DAG_DB_LOG_REPLAY_KEEPING_UP_HEALTH_STATUS
state
QOS_EXCHANGE_MAILBOX_ROLE_INPUT-OUTPUT_LOG_READ_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_DATABASEMOUNTED
bool
QOS_EXCHANGE_MAILBOX_ROLE_AGE_OF_THE_LAST_NOTIFICATION_INDEXED
sec
QOS_EXCHANGE_MAILBOX_ROLE_INPUT_OUTPUT_LOG_WRITES_AVERAGE_LATENCY
ms
QOS_EXCHANGE_MAILBOX_ROLE_TIME_SINCE_LAST_NOTIFICATION_WAS_INDEXED
sec
QOS_EXCHANGE_MAILBOX_ROLE_EXCHANGE_SEARCH_ZERO_RESULT_QUERY
bool
QOS_EXCHANGE_MAILBOX_ROLE_LOGICAL_DISK_PERCENTAGE_FREE_SPACE
percentage
QOS_EXCHANGE_TRANS_ROLE_INBOUND_LOCALDELIVERYCALLSPERSECOND2013
atmpts/sec
QOS_EXCHANGE_TRANS_ROLE_INBOUND_MESSAGEDELIVERYATTEMPTSPERSECOND-2013
recipts/sec
QOS_EXCHANGE_MAILBOX_ROLE_MESSAGES_DELIVERED_PER_SEC_-_STORE
msg/s
QOS_EXCHANGE_MAILBOX_ROLE_MAILBOX_SEARCHES_PER_SEC_-_STORE
msg/s
QOS_EXCHANGE_CAS_ROLE_NUMBER_OF_FAILED_BACK-END_CONNECTION_ATTEMPTS_PER_SECOND-2013
conn/sec
QOS_EXCHANGE_ANTI_MALWARE_ANTI-MALWARE_AGENT_MESSAGES_SCANNED
messages
QOS_EXCHANGE_ANTI_MALWARE_ANTI-MALWARE_AGENT_MESSAGES_SCANNED_PER_SECOND
messages
QOS_EXCHANGE_ANTI_MALWARE_ANTI-MALWARE_AGENT_MESSAGES_CONTAINING_MALWARE
messages
QOS_EXCHANGE_ANTI_MALWARE_ANTI-MALWARE_AGENT_MESSAGE_BLOCKED
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_ALERATABLE_FAILURE_DSNS_WITHIN_THE_LAST_HOUR
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_RESOLVER_FAILURES
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_ORGANIZATION_MAILBOX_FAILURES
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_ROUTING_FAILURES
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_PERCENT_AVAILABILITY
percent
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_PERCENT_FAILURES_DUE_TO_MAXINBOUNDCONNECTIONLIMIT
percent
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_PERCENT_FAILURES_DUE_TO_WLID_DOWN
percent
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_PERCENT_FAILURES_DUE_TO_BACK_PRESSURE
percent
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_FAILURES_DUE_TO_MAXIMUM_LOCAL_LOOP_COUNT
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_CONNECTIONS_CREATED_PER_SECOND
connections per second
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_MESSAGES_RECEIVED_PER_SECOND
messages per second
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_MESSAGES_BYTES_RECEIVED_PER_SECOND
messages per second
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_MESSAGES_SENT_PER_SECOND
messages per second
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_MESSAGE_BYTES_SENT_PER_SECOND
messages per second
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_DNS_ERRORS
DNS
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_CONNECTION_FAILURES
messages
QOS_EXCHANGE_TRANS_ROLE_MESSAGESFAILEDTOROUTE
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_RESOLVER_FAILURES
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_ORGANIZATION_MAILBOX_RESOLVER_FAILURES
messages
QOS_EXCHANGE_DELIVERY_HEALTH_MONITOR_HUB_SELECTION_ROUTING_FAILURES
messages
QOS_EXCHANGE_TRANS_ROLE_PERCENTAGE_PROXY_SETUP_FAILURES
percent
QOS_EXCHANGE_TRANS_ROLE_TOTAL_PROXY_USER_LOOKUP_FAILURES
messages
QOS_EXCHANGE_TRANS_ROLE_TOTAL_PROXY_DNS_LOOKUP_FAILURES
messages
QOS_EXCHANGE_TRANS_ROLE_TOTAL_PROXY_CONNECTION_FAILURES
messages
QOS_EXCHANGE_TRANS_ROLE_TOTAL_BYTES_PROXIED
messages
QOS_EXCHANGE_TRANS_ROLE_CONNECTIONS_CREATED_PER_SECOND
connections per second
Connections Created/sec is the rate at which connections to the
SMTP server are established.
QOS_EXCHANGE_TRANS_ROLE_RECIPIENTS_SENT
messages
Monitoring
Profiles
Type
Error
Threshold
Error
Severity
Description
Available Mbytes
Performance
100
Major
Available Megabytes
Performance
10
Major
Performance
----
Major
Performance
----
Major
Performance
15000
Major
Performance
Major
Average Disk Queue Length is the average number of both read and
write requests that were queued for the selected disk during the
sample interval.
Performance
Major
Performance
0.05
Major
Performance
0.05
Major
Performance
Major
Performance
----
Major
This is the average queue time calculated from the Work Queue
Length and Messages per Second.
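The relationship described above, queue time derived from Work Queue Length and Messages per Second, can be illustrated with a minimal sketch. The function name is ours, not the probe's:

```python
def average_queue_time(work_queue_length: float, messages_per_second: float) -> float:
    # Average time a queued message waits = items in queue / processing rate.
    # Guard against a zero rate, which would otherwise divide by zero.
    if messages_per_second <= 0:
        return float("inf")
    return work_queue_length / messages_per_second
```

For example, 120 queued messages processed at 40 messages per second gives an average queue time of 3 seconds.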
Performance
----
Major
Performance
----
Major
Performance
----
Major
Cache Bytes
Performance
----
Major
Cache bytes is the current size, in bytes, of the file system cache.
By default, the cache uses up to 50% of available physical memory.
The counter value is the sum of Memory\System Cache Resident
Bytes, Memory\System Driver Resident Bytes, Memory\System
Code Resident Bytes, and Memory\Pool Paged Resident Bytes.
Categorization Count
Performance
----
Major
Committed Bytes
Performance
----
Major
Performance
----
Major
Performance
----
Major
This counter takes the MTA Work Queue Length counter to a lower
level and breaks it out into all the different work queues within the
MTA. If a large queue is building in the MTA this will pinpoint the
exact connection that is responsible. Informational for drill down. One
instance of this profile will monitor each Message Transfer Agent
Connection.
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Current Bandwidth
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Current User Count is the number of users who are currently logged
on to Outlook Web Access. This value monitors the number of
active user sessions, so that users are only removed from this count
after they log off or their session times out.
Performance
----
Major
DNS Queries per second is the number of DNS queries per second
performed by the Sender Id agent.
Performance
----
Major
Bytes in all Heaps is the sum of four other counters: Gen 0 Heap
Size, Gen 1 Heap Size, Gen 2 Heap Size, and Large Object Heap
Size. This counter indicates the current memory allocated in bytes
on the GC Heaps.
Performance
----
Major
DOTNET - Time In GC
Performance
----
Major
Performance
----
Major
Delete Rate is the rate at which items are deleted from the
Transport Dumpster on this server.
Performance
----
Major
Insert Rate is the rate at which items are inserted into the Transport
Dumpster on this server.
Performance
----
Major
Item Count is the total number of mail items that are currently in the
Transport Dumpster on this server.
Dumpster size
Performance
----
Major
Item Size is the total size (in bytes) of mail items that are currently in
the Transport Dumpster on this server.
Performance
----
Major
Performance
----
Major
The total elapsed time that this process has been running.
Performance
----
Major
The total elapsed time that this process has been running.
Performance
----
Major
The total elapsed time that this process has been running.
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Indicates user activity. One instance of this profile will monitor each
Mailbox Store.
Performance
----
Major
Indicates user activity. One instance of this profile will monitor each
Public Folder Store.
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Shows the rate, in incidents per second, at which bytes were sent
over each network adapter. The counted bytes include framing
characters. Bytes Sent/sec is a subset of Bytes Total/sec. One
instance of this profile will monitor each Network Interface.
Performance
----
Major
Shows the rate, in incidents per second, at which bytes were sent
and received on the network interface, including framing characters.
Bytes Total/sec is the sum of the values of Bytes Received/sec and
Bytes Sent/sec. One instance of this profile will monitor each
Network Interface.
Performance
----
Major
Performance
Major
Performance
----
Major
Performance
----
Major
This will show how often your users are opening messages. Peak
load may show this coinciding with other system behavior.
Performance
----
Major
This will show how often your users are opening messages. Peak
load may show this coinciding with other system behavior. One
instance of this profile will monitor each Mailbox Store.
Performance
----
Major
This will show how often your users are opening messages. Peak
load may show this coinciding with other system behavior. One
instance of this profile will monitor each Public Folder Store.
Performance
----
Major
This will show how often your users are opening messages. Peak
load may show this coinciding with other system behavior.
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
50
Major
Pages per Second is the rate at which pages are read from or
written to disk to resolve hard page faults. This counter is a primary
indicator of the kinds of faults that cause system-wide delays.
Performance
----
Major
Performance
60
Major
% Disk Time is the percentage of elapsed time that the selected disk
drive was busy servicing read or write requests.
Performance
----
Major
Performance
100
Major
Processor Time
Performance
80
Major
Performance
75
Major
Performance
75
Major
Performance
75
Major
Performance
75
Major
Performance
75
Major
Performance
75
Major
Performance
----
Major
Performance
----
Major
Performance
Major
This is the queue of all messages destined inbound for the IS. As
with the Send Queue this should also stay at non-zero during
normal operating conditions.
Performance
Major
This is the queue of all messages destined inbound for the IS. As
with the Send Queue this should also stay at non-zero during
normal operating conditions. One instance of this profile will monitor
each Mailbox Store.
Performance
Major
This is the queue of all messages destined inbound for the IS. As
with the Send Queue this should also stay at non-zero during
normal operating conditions. One instance of this profile will monitor
each Public Folder Store.
Performance
Major
This is the queue of all messages destined inbound for the IS. As
with the Send Queue this should also stay at non-zero during
normal operating conditions.
Performance
----
Major
Performance
----
Major
Performance
1000
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
----
Major
Performance
Major
Performance
Major
Performance
Major
Performance
Major
Performance
----
Major
Performance
----
Major
User Count
Performance
----
Major
Performance
----
Major
Virtual Megabytes is the current size of the virtual address space the
Information Store is using. Use of virtual address space does not
necessarily imply corresponding use of either disk or main memory
pages. Virtual space is finite, and using too much of it can limit the
process's ability to load libraries.
Performance
150
Major
Performance
Major
Performance
32
Major
Performance
Major
This is the most important counter because it gives the total of all
queues. This single counter can tell an administrator at a glance the
overall MTA health. This is the queue length for the whole of the
MTA. It covers both inbound and outbound messages for the
Information Store, the Directory and any connectors that route
through the MTA. If this counter is above zero for a sustained
amount of time it may indicate a communications problem with one
of the Exchange components, a connector or a remote Exchange
MTA.
Performance
200
Major
Indicates the average time, in ms, to read data from a log file.
Specific to log replay and database recovery operations.
Note: This counter will work for DAG also.
DatabaseMounted
Performance
Major
Performance
172800
Major
Performance
10
Major
Indicates the average time, in ms, to write a log buffer to the active
log file.
Note: This counter will work for DAG also.
Performance
3600
Major
Indicates the time in seconds since last notification was indexed for
content indexing in passive Database.
Note: This counter will work for DAG also.
Performance
Major
Indicates that more than one hundred search queries have returned
zero results. This may indicate that a corruption or other problem
affects the content indexing catalog.
Note: This counter will work for DAG also.
Performance
35
Major
Performance
----
Major
Performance
-----
Major
Performance
-----
Major
Performance
-----
Major
Performance
-----
Major
Performance
-----
Major
Performance
-----
Major
Performance
------
Major
Inbound: MessageDeliveryAttemptsPerSecond-2013
Performance
-------
Major
Shows the number of attempts for delivering transport mail items per
second. Determines current load. Compare values to historical
baselines.
Inbound: LocalDeliveryCallsPerSecond-2013
Performance
------
Major
EdgeSync
Process
----
Major
EdgeTransport
Process
----
Major
File Distribution
Process
----
Major
Inetinfo
Process
----
Major
Information Store
Process
----
Major
Lsass
Process
----
Major
Mail Submission
Process
----
Major
Mailbox Assistants
Process
----
Major
Process
----
Major
Replication
Process
----
Major
Search Indexer
Process
----
Major
Service Host
Process
----
Major
System Attendant
Process
----
Major
Process
10
Major
LDAP Searches timed out per minute shows the number of LDAP
searches that returned LDAP_Timeout during the last minute.
Process
50
Major
LDAP Read Time shows the time (in ms) to send an LDAP read
request to the specified domain controller and receive a response.
Process
50
Major
LDAP Search Time shows the time (in ms) to send an LDAP search
request and receive a response.
Process
50
Major
Process
----
Major
Shows the average time that elapsed while waiting for a search to
complete.
Process
1000
Major
Shows the average time, in ms, that NSPI requests took to complete
during the sampling period.
Process
1000
Major
Process
100
Major
Process
10
Major
Transport
Process
----
Major
Process
----
Major
HTTP SSL
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
Service
----
Major
SMTP
Service
----
Major
Service
----
Major
MSExchange event
NT Event
Log
----
From
Event
Database
Availability
Group
----
----
Database
Availability
Group
----
----
Database
Availability
Group
-----
----
Database
Availability
Group
----
-----
CopyQueueLength-DAG
Database
Availability
Group
Minor
ReplayQueueLength-DAG
Database
Availability
Group
Minor
Database
Availability
Group
Minor
Database
Availability
Group
Minor
The mailbox database copy is being seeded, the content index for
the mailbox database copy is being seeded, or both are being
seeded. Upon successful completion of seeding, the copy status
should change to Initializing.
Database
Availability
Group
Minor
Database
Availability
Group
Minor
Database
Availability
Group
Minor
Database
Availability
Group
Minor
Database
Availability
Group
Minor
Database
Availability
Group
Minor
The mailbox database copy and its log files are being compared
with the active copy of the database to check for any divergence
between the two copies. The copy status will remain in this state
until any divergence is detected and resolved.
Monitor | QoS group | Error threshold | Severity | Description
 | Database Availability Group | | Minor | The active copy is online and accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounted.
 | Database Availability Group | | Minor | The active copy is offline and not accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Dismounted.
 | Database Availability Group | | Minor | The active copy is coming online and not yet accepting client connections. Only the active copy of the mailbox database copy can have a copy status of Mounting.
Database copies status-DisconnectedAndHealthy | Database Availability Group | | Minor |
Database copies status-DisconnectedAndResynchronizing | Database Availability Group | | Minor |
 | Database Availability Group | 30 | Minor | The modification time of the last log that was successfully validated by the Mailbox server hosting the database copy.
ContentIndexState | Database Availability Group | failed | Minor | Indicates the current state of the content index for a database copy.
ActivationSuspended | Database Availability Group | | Minor |
DatabaseSize | Database Availability Group | 10 | Minor |
DatabaseRedundancyCount | Database Availability Group | | Minor |
Clusterservice | Database Availability Group | failed | Critical |
ReplayService | Database Availability Group | failed | Critical |
ActiveManager | Database Availability Group | failed | Critical |
TasksRpcListener | Database Availability Group | failed | Critical | Verifies that the tasks remote procedure call (RPC) server is running and reachable on the local Exchange server.
TcpListener | Database Availability Group | failed | Critical | Verifies that the TCP log copy listener is running and reachable on the local Exchange server.
DagMembersUp | Database Availability Group | failed | Critical |
Cluster Network | Database Availability Group | failed | Critical |
QuorumGroup | Database Availability Group | failed | Critical |
FileShareQuorum | Database Availability Group | failed | Critical | Verifies that the witness server and witness directory and share configured for the DAG are reachable.
DBLogCopyKeepingUp | Database Availability Group | failed | Critical |
DBLogReplayKeepingUp | Database Availability Group | failed | Critical |
 | Database Availability Group | | Major | Alertable failure DSNs within the last hour is the number of alertable failure delivery status notifications (DSNs) generated within the last hour. Some DSNs, such as Recipient Not Found, are filtered out because they are expected to occur and should not generate an alert.
Percent Availability | Database Availability Group | | Major |
DNS Errors | Database Availability Group | | Major |
Connection Failures | Database Availability Group | | Major |
Protocol Errors | Database Availability Group | | Major |
Inbound_LocalDeliveryCallsPerSecond | Database Availability Group | | Major |
Inbound_MessageDeliveryAttemptsPerSecond | Database Availability Group | | Major |
Inbound_Recipients_Delivered_Per_Second | Database Availability Group | | Major |
MessagesFailedToRoute | Database Availability Group | | Major |
 | Database Availability Group | | Major | The total number of user lookup failures (during the sliding time window) while trying to set up a proxy session.
 | Database Availability Group | | Major | The total number of DNS lookup failures (during the sliding time window) while trying to set up a proxy session.
Percentage Availability | Database Availability Group | | Major |
Connections Created/sec | Database Availability Group | | Major |
Connection Failures | Database Availability Group | | Major |
Regular expression | Type of Regular Expression | Explanation
* or ? | Standard | The * wildcard matches multiple characters; the ? wildcard matches exactly one character.
*a | Custom | Matches all files with extensions that end with the letter a. For a file without an extension, the last letter of the filename must be a.
*.txt | Standard | Matches all files in the directory with the txt extension.
*.t* | Custom | Matches all files in the directory with an extension that starts with a t.
?.* | Standard | Matches all files in the directory with a single-character filename and of any extension.
NewFile?.* | Custom | Matches all files in the directory with a letter or number after NewFile and of any extension. Examples: NewFile1, NewFilex.
[a] | Standard | Matches all files in the directory with the letter a in the filename or the extension.
?[a-c]section.* | Custom | Matches all files in the directory that start with the letters a, b, or c, followed by the word section in the filename, and of any extension. Example: bsection.xlsx
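Python's fnmatch module implements the same * and ? semantics as the Standard patterns above, so it is a convenient way to try a pattern against sample filenames before configuring it. This is a sketch for illustration only; the probe's Custom pattern variants may behave differently from fnmatch.

```python
from fnmatch import fnmatch

# (filename, pattern) pairs drawn from the table above.
samples = [
    ("notes.txt", "*.txt"),          # extension is exactly txt
    ("notes.tmp", "*.t*"),           # extension starts with t
    ("a.log", "?.*"),                # single-character filename, any extension
    ("NewFile1.csv", "NewFile?.*"),  # exactly one character after NewFile
    ("report.doc", "*.txt"),         # no match: extension is not txt
]

for name, pattern in samples:
    print(f"{name!r} vs {pattern!r}: {fnmatch(name, pattern)}")
```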
Regular expression | Type of Regular Expression | Explanation
%B | Standard | Matches all directories with the full name of a month. Example: March
31%B | Custom | Matches all directories with the number 31 followed by the full name of a month. Example: 31March
%B %Y | Standard | Matches all directories with the full name of a month followed by a year. Example: January 2005
%y/%m/%d | Custom | Matches directory paths with a two-digit year, a two-digit month, and a two-digit day, separated by slashes. Example: 05/01/31
More Information:
fetchmsg (Fetch System Messages on iSeries) Release Notes
Contents
Verify Prerequisites
Set Up General Properties
Modify Queues In Predefined Set
Create Monitoring Profile for Messages
Configure Profile Properties
Using Regular Expressions
Alarm Thresholds
Create and Configure an Exclude Profile
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see fetchmsg (Fetch System
Messages on iSeries) Release Notes.
5. Specify the initial size of the user space where the retrieved messages are temporarily stored. The size automatically increases to contain
the requested messages.
Default: 204800
6. Specify the number of messages retrieved in one read operation in Messages To Read. The probe repeats the retrieve operation after
scanning messages at the next interval, as needed. The value must be larger than the maximum number of messages you expect during
one check interval.
Default: 10000
7. Specify the minimum severity level of the messages that are retrieved for monitoring. The value must be between 0 and 99, where 99 is the
most severe. You can also override this setting during profile configuration.
Default: 0
8. Select Include Messages Without Message ID (Impromptu Messages) to select messages without a message ID. Both users and
applications can create messages without message ID.
Default: Not Selected
9. Select the message queues that are monitored from the Message Queues To Read drop-down list.
Note: If you select Predefined Set, the queues specified in Raw Configure are monitored. For more information, see the Modify Queues in Predefined Set section.
10. Specify the queue to retrieve the messages in Messages From Queue if Predefined Set is selected.
11. Click Save to save the probe configuration.
Note: Until the monitored system creates the specified queue with the message, an active profile generates an alarm indicating that the
message is not found. When the specified message is found, the profile generates the configured alarms, as applicable.
Type: select to specify the type of message in the associated When Matching field. You can select the type of message from the
drop-down list.
When Matching: specify the message type if Type is selected. You can select an existing message type from the list of types or
specify a new type. You can use regular expressions to specify the new value. For more information, see the Using Regular
Expressions section.
Text: select to specify the message text in the associated When Matching field.
Help Text (Description): select to specify additional information about the message queue in the associated When Matching field.
When Matching: specify the message text and description depending on the associated field. You can use regular expressions to
specify the value. For more information, see the Using Regular Expressions section.
Is Unanswered: select this checkbox to generate an alarm if a message is not answered.
Job Name: select to specify the job name of the generated message in the associated When Matching field.
User Profile Name: select to specify the job owner of the generated message in the associated When Matching field.
When Matching: specify the job name and the user profile name depending on the associated field. You can use regular expressions
to specify the value. For more information, see the Using Regular Expressions section.
Only If Not Matched By Other Profile: select to make the probe use a message only if the message does not match any
other profile.
Look In Queue: select the message queue from the drop-down list for monitoring. This field is only available if Predefined Set is
selected in the fetchmsg node.
7. Click Save to save the configuration and start monitoring.
Regular expression | Type | Explanation
[A-Z] | Standard (PCRE) | Matches any single uppercase letter from A to Z.
\d* | Custom | Matches zero or more digits.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
fetchmsg Node
All Messages Node
Excludes Node
<Exclude Profile Name> Node
Profiles Node
<Profile Name> Node
fetchmsg Node
The fetchmsg node allows you to view the probe and alarm message details, and also configure the log properties. You can also select the
message queues that you want to monitor.
Navigation: fetchmsg
Set or modify the following values as required:
fetchmsg > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
fetchmsg > General
This section allows you to configure the general setup properties of the probe. You can configure the log properties and control which messages
to fetch.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0 - Fatal
Log Size: specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Default: 100 KB
Check Interval In Seconds (Perform Check Each): specifies the time interval at which the probe retrieves messages from the IBM
iSeries message queues. Reduce this interval to increase the frequency of alarms.
Default: 60
User space size: specifies the initial size of the user space where the retrieved messages are temporarily stored. The size automatically
increases to contain the requested messages.
Default: 204800
Messages to read: specifies the number of messages retrieved in one operation. The probe repeats the retrieve operation after scanning
messages at the next interval, as needed. The value must be larger than the maximum number of messages you expect during one
check interval.
Default: 10000
Only messages with severity >=: specifies the minimum severity level of the messages that are retrieved for monitoring. The value
must be between 0 and 99, where 99 is the most severe. You can also override this setting during profile configuration.
Default: 0
Include Messages Without Message ID (Impromptu Messages): enables you to select messages without message ID. Both users and
applications can create messages without message ID.
Default: Not Selected
Message Queues To Read: specifies the message queues that are monitored.
The available options are as follows:
qSysOpr Only: Only the qSysOpr queue is monitored. This is the default selection.
Predefined Set: The queues specified in Raw Configuration are monitored. For more information, see the Modify Queues in
Predefined Set section in the v1.5 fetchmsg AC Configuration article.
Messages From Queue: specifies the queue to retrieve the messages if Predefined Set is selected in Message Queues To Read.
Default: qSysOpr and qSysMsg
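Taken together, the selection settings above act as a filter over the fetched messages. The following is a minimal sketch of that logic; the dictionary field names are illustrative, not the probe's internal names.

```python
def select_messages(messages, min_severity=0, include_impromptu=False, queues=("qSysOpr",)):
    """Keep only messages that satisfy the probe's general selection settings."""
    selected = []
    for msg in messages:
        if msg["queue"] not in queues:
            continue                      # queue is not in Message Queues To Read
        if msg["severity"] < min_severity:
            continue                      # below the severity floor
        if not msg.get("id") and not include_impromptu:
            continue                      # impromptu messages are skipped unless enabled
        selected.append(msg)
    return selected

msgs = [
    {"queue": "qSysOpr", "severity": 40, "id": "CPF1234"},
    {"queue": "qSysOpr", "severity": 10, "id": "CPF9999"},   # too low
    {"queue": "qSysOpr", "severity": 99, "id": ""},          # impromptu
    {"queue": "qSysMsg", "severity": 80, "id": "CPF0001"},   # queue not selected
]
print(select_messages(msgs, min_severity=30))  # only the CPF1234 message
```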
fetchmsg > Alarm Messages
This section allows you to view the alarm messages.
Name: indicates the name of the alarm message.
Text: indicates the alarm message text.
Level: indicates the alarm severity.
Subsystem: indicates the subsystem_id.
Default: indicates the default message for each alarm situation (such as default, error, and answer_limit).
Excludes Node
The Excludes node is used to create a profile to exclude a message from monitoring by the probe.
Navigation: fetchmsg > Excludes
Excludes > Options (icon) > Add Exclude
This section lets you specify the profiles for messages that the probe excludes from monitoring.
Add Exclude Window
This window allows you to specify the parameters to create an exclude profile. The fields in the window are as follows:
Active: activates the profile, when selected.
User Profile Name: defines the name of the exclude profile.
Id: indicates the seven-character message identifier which identifies the message in the associated When Matching field.
When Matching: specifies a message identifier if Id is selected. You can use regular expressions to specify the value. For more
information, see the Using Regular Expressions section in the v1.5 fetchmsg AC Configuration article.
Severity: indicates the maximum numeric severity of the message that is excluded in the associated When (<=) field.
When (<=): specifies the maximum numeric severity of the message that is excluded if Severity is selected. The value can be between 0
and 99, where 99 is the most severe.
Type: indicates the type of message in the associated When Matching field. You can select the type of message from the drop-down
list.
Text: indicates the message text in the associated When Matching field.
Help Text (Description): indicates additional information about the message queue, which is specified in the associated When Matching field.
Job Name: indicates the job name of the generated message in the associated When Matching field.
User Profile Name: indicates the job owner of the generated message in the associated When Matching field.
When Matching: specifies the message type, message text, description, job name, and user profile name depending on the associated
field. You can use regular expressions to specify the value. For more information, see the Using Regular Expressions section in the v1.5 fetchmsg AC Configuration article.
Queue Name: selects the message queue that is monitored from the drop-down list. The options are only available if Predefined Set is
selected in the fetchmsg node.
Note: The fields in this node are the same as the fields in the Add Exclude window of the Excludes node.
Excludes > Exclude Profile Name > Options (icon) > Delete
This option allows you to delete the exclude profile.
Profiles Node
This node is used to create a monitoring profile. You can create multiple monitoring profiles with different criteria to monitor the messages. All
monitoring profiles are displayed under this node.
Profiles > Options (icon) > Add Profile
This section allows you to create profiles in the fetchmsg probe.
Add Profile Window
This window allows you to specify a name and create a monitoring profile.
<Profile Name> Node
The Profile Name node is used to define the monitoring criteria of the messages and generate alarms. The monitoring profile sets up the
requirements and alerts the user when something unexpected occurs.
Navigation: fetchmsg > Profiles > Profile Name
Set or modify the following values as required:
Profile Name > Edit Profile
This section enables you to edit the monitoring profile.
Active: activates the profile, when selected.
User Profile Name: indicates the profile name.
Profile Name > Send Alarm
This section enables you to send an alarm message when an alarm condition arises.
Use Alarm Message: allows you to select the alarm message when an alarm condition arises. The default message is used if an alarm
message is not specified.
Profile Name > Answer Message
This section enables you to send a reply to unanswered messages.
Using the Following Reply: defines the text which is sent as a reply to the unanswered messages.
Answer Limit: specifies the maximum number of answer messages that can be sent.
Send Alarm on Answer Limit Breach Using Alarm Message: enables you to configure the alarm when the answer limit is breached.
Default: Not Selected
Answer Message: specifies the text of the answer message.
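The answer settings above can be sketched as a reply loop with a cap: the probe replies to unanswered messages until the answer limit is reached, at which point an alarm can be raised. The function and field names below are illustrative assumptions, not the probe's internals.

```python
def answer_unanswered(messages, reply, answer_limit):
    """Reply to unanswered messages up to answer_limit; report whether the limit was breached."""
    answered = 0
    for msg in messages:
        if msg.get("answered"):
            continue
        if answered >= answer_limit:
            return answered, True   # limit breached: remaining messages stay unanswered
        msg["reply"] = reply
        msg["answered"] = True
        answered += 1
    return answered, False

msgs = [{"answered": False} for _ in range(3)]
print(answer_unanswered(msgs, "G", answer_limit=2))  # (2, True): third message hit the limit
```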
Profile Name > Alarm Parameter
This section lets you define the suppression key which is used for alarms that are sent by this profile.
Suppression Key: specifies the suppression key to send only one instance of a repeated alarm.
Default: $profile/$key
Time Frame: specifies the time frame of the message, which prevents the generation of multiple alarms for duplicate messages. You can
select an existing time frame from the list or specify a new time frame.
Message Count Operator: allows you to select a logical operator to be used with the Message Count Value field. The operator allows
you to define conditions for the threshold to generate alarms. For example, select <= to specify the maximum value.
Message Count Value: specifies the threshold value of the number of messages. The profile generates an alarm if the message count
within the specified time frame breaches the specified message count condition.
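The suppression key, time frame, and message-count condition above combine into a sliding-window check: events sharing a suppression key are counted within the time frame, and an alarm fires when the count no longer satisfies the configured operator and value. This is an illustrative sketch of that mechanism, not the probe's implementation.

```python
import operator
from collections import defaultdict, deque

OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq, "<": operator.lt, ">": operator.gt}

class CountAlarm:
    """Sliding time-frame message counter keyed by suppression key (illustrative)."""
    def __init__(self, time_frame_s, op, value):
        self.time_frame_s = time_frame_s
        self.check = OPS[op]
        self.value = value
        self.events = defaultdict(deque)  # suppression key -> timestamps in the window

    def record(self, key, now):
        q = self.events[key]
        q.append(now)
        while q and now - q[0] > self.time_frame_s:
            q.popleft()                   # drop events outside the time frame
        # Breach: the count no longer satisfies the configured condition.
        return not self.check(len(q), self.value)

alarm = CountAlarm(time_frame_s=900, op="<=", value=2)  # at most 2 messages per 15 min
print([alarm.record("profileA/CPF1234", t) for t in (0, 10, 20)])  # [False, False, True]
```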
Profile Name > Message Recognition
This section lets you set the criteria that the profile uses to retrieve messages.
Id: indicates the seven-character message identifier which identifies the message in the associated When matching field.
When matching: specifies a message identifier if Id is selected. You can use regular expressions to specify the value. For more
information, see the Using Regular Expressions section in the v1.5 fetchmsg AC Configuration article.
Severity: indicates the minimum severity level of the messages that are retrieved for monitoring in the associated When (>=) field.
When (>=): specifies the minimum numeric severity of the alarm if Severity is selected. The value can be between 0 and 99, where 99 is
the most severe.
Type: indicates the type of message in the associated When Matching field. You can select the type of message from the drop-down
list.
When Matching: specifies the message type if Type is selected. You can select an existing message type from the list of types or
specify a new type. You can use regular expressions to specify the new value. For more information, see the Using Regular
Expressions section in the v1.5 fetchmsg AC Configuration article.
Text: indicates the message text in the associated When Matching field.
Help Text (Description): indicates additional information about the message queue, which is specified in the associated When Matching field.
When Matching: specifies the message text and description depending on the associated field. You can use regular expressions to
specify the value. For more information, see the Using Regular Expressions section in the v1.5 fetchmsg AC Configuration article.
Is Unanswered: select this checkbox to generate an alarm if a message is not answered.
Job Name: indicates the job name of the generated message in the associated When Matching field.
User Profile Name: indicates the job owner of the generated message in the associated When Matching field.
When Matching: specifies the job name and the user profile name depending on the associated field. You can use regular expressions
to specify the value. For more information, see the Using Regular Expressions section in the v1.5 fetchmsg AC Configuration article.
Only If Not Matched By Other Profile: enables you to select a message only if the message does not match any other profile.
Look In Queue: selects the message queue from the drop-down list for monitoring. This field is only available if Predefined Set is
selected in the fetchmsg node.
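Putting the exclude profiles and monitoring profiles together, the per-message decision can be sketched as follows. The structure is illustrative: matching criteria are reduced to simple predicates, and the exclusive flag stands in for Only If Not Matched By Other Profile.

```python
def dispatch(message, excludes, profiles):
    """Return the names of monitoring profiles that apply to a message (sketch)."""
    if any(rule(message) for rule in excludes):
        return []                       # excluded messages are never monitored
    matched = [name for name, rule, _ in profiles if rule(message)]
    result = []
    for name, _rule, exclusive in profiles:
        # An exclusive profile keeps the message only when no other profile matched.
        if name in matched and not (exclusive and len(matched) > 1):
            result.append(name)
    return result

excludes = [lambda m: m["id"] == "CPF9898"]
profiles = [
    ("critical", lambda m: m["severity"] >= 90, False),
    ("fallback", lambda m: True, True),          # only if not matched elsewhere
]
print(dispatch({"id": "CPF1234", "severity": 95}, excludes, profiles))  # ['critical']
print(dispatch({"id": "CPF0001", "severity": 10}, excludes, profiles))  # ['fallback']
```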
Contents
Verify Prerequisites
Setup General Properties
Modify Queues In Predefined Set
Create Monitoring Profile for Messages
Configure Profile Properties
Create and Configure an Exclude Profile
Using Regular Expressions
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see fetchmsg (Fetch System
Messages on iSeries) Release Notes.
3. Specify the initial size of the user space where the retrieved messages are temporarily stored. The size automatically increases to contain
the requested messages.
Default: 204800
4. Specify the number of messages retrieved in one operation in Messages to read. The probe repeats the retrieve operation after
scanning messages at the next interval, as needed. The value must be larger than the maximum number of messages you expect during
one check interval.
Default: 10000
5. Specify the minimum severity level of the messages that are retrieved for monitoring. The value must be between 0 and 99, where 99 is the
most severe. You can also override this setting during profile configuration.
Default: 0
6. Select Include messages without message ID (impromptu messages) to select messages without a message ID. Both users and
applications can create messages without message ID.
Default: Not Selected
7. Select the level of details that are written to the log file in the Log Level field. Log as little as possible during normal operation to minimize
disk consumption, and increase the amount of detail when debugging.
Default: 0-Fatal
8. Specify the maximum size of the probe log file, in kilobytes. When this size is reached, new log file entries are added and the older
entries are deleted.
Default: 100 KB
9. Select the message queues that are monitored from the Message queues to read drop-down list.
Note: If you select Predefined Set, the queues specified in Raw Configuration are monitored. For more information, see the Modify Queues in Predefined Set section.
Note: Until the monitored system creates the specified queue with the message, an active profile generates an alarm indicating that the
message is not found. When the specified message is found, the profile generates the configured alarms, as applicable.
7. Set or modify the following values in the Answer message section:
Using the following reply: define the text which is sent as a reply to the unanswered messages.
Answer limit: specify the maximum number of answer messages that can be sent. The Send alarm on answer limit breach using
alarm message field is enabled.
Send alarm on answer limit breach using alarm message: select to send an alarm when the answer limit is breached. Select the
alarm from the drop-down list.
Default: Not Selected
8. Set or modify the following values for the messages in the Alarm parameter section:
Suppression key: specify the suppression key to send only one instance of a repeated alarm.
Default: $profile/$key
Time frame: specify the time frame of the message, which prevents the generation of multiple alarms for duplicate messages. You
can select an existing time frame from the list or specify a new time frame. The format for the time frame is <duration><unit>.
Example: 15min
Message count: select a logical operator and define the threshold value of the number of messages. The profile generates an alarm
if the message count within the specified time frame breaches the specified message count condition. For example, select <= to
specify the maximum value.
9. Click Save to save the configuration and start monitoring.
Using Regular Expressions
Pattern matching in the probe requires meta characters. The probe supports Perl Compatible Regular Expressions (PCRE), which are enclosed within forward slashes (/). For
example, the expression /[0-9A-C]/ matches any character in the range 0 to 9 or A to C in the target string.
You can also use simple text with wildcard operators for matching the target string. For example, the expression *test* matches the text test in
the target string. The * and ? symbols are wildcard operators: the * represents multiple characters, while the ? represents exactly one character.
The probe matches the patterns to select the applicable messages for monitoring.
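The two pattern styles described above can be told apart by the enclosing forward slashes. The sketch below shows one way to dispatch between them, with Python's re engine standing in for PCRE; the function name and translation details are assumptions for illustration.

```python
import re

def build_matcher(pattern):
    """Return a predicate for either a /PCRE/ pattern or a simple wildcard pattern."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        rx = re.compile(pattern[1:-1])   # PCRE-style pattern between slashes
        return lambda text: rx.search(text) is not None
    # Wildcard form: * matches any run of characters, ? matches exactly one.
    translated = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    rx = re.compile(translated)
    return lambda text: rx.fullmatch(text) is not None

assert build_matcher("/[0-9A-C]/")("line B7")     # 'B' is in the class 0-9, A-C
assert build_matcher("*test*")("self test run")   # test appears in the target string
assert not build_matcher("*test*")("production")
```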
Profile Window
The following fields in the Profile window support regular expressions:
When matching (Type)
When matching (Text)
When matching (Help Text)
When matching (Job Name)
When matching (User Profile Name)
The probe monitors the messages that fulfill the criteria for all these fields.
Exclude Message Window
The following fields in the Exclude Message window support regular expressions:
When matching (Id)
When matching (Type)
When matching (Text)
When matching (Help Text)
When matching (Job Name)
When matching (User Profile Name)
The probe excludes the messages that fulfill the criteria for all these fields from monitoring.
Sample Regular Expressions
The following table lists some examples of regex and pattern matching for the probe.
Regular expression | Type | Explanation
[A-Z] | Standard (PCRE) | Matches any single uppercase letter from A to Z.
\d* | Custom | Matches zero or more digits.
Setup Tab
General Tab
Messages Tab
Profiles Tab
Profile Window
Setup Tab
The Setup tab allows you to view the probe and alarm message details, and also configure the log properties.
General Tab
The General tab allows you to configure the general setup properties of the probe. You can configure the log properties and control which
messages to retrieve. The fields in this tab are as follows:
Check Interval: specifies the time interval at which the probe retrieves messages from the IBM iSeries message queues. Reduce this
interval to increase the frequency of alarms.
Default: 60
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 0 - Fatal
Log Size: specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Default: 100 KB
User space size: specifies the initial size of the user space where the retrieved messages are temporarily stored. The size automatically
increases to contain the requested messages.
Default: 204800
Messages to read: specifies the number of messages retrieved in one operation. The probe repeats the retrieve operation after scanning
messages at the next interval, as needed. The value must be larger than the maximum number of messages you expect during one
check interval.
Default: 10000
Only messages with severity >=: specifies the minimum severity level of the messages that are retrieved for monitoring. The value
must be between 0 and 99, where 99 is the most severe. You can also override this setting during profile configuration.
Default: 0
Include Messages Without Message ID (Impromptu Messages): enables you to select messages without message ID. Both users and
applications can create messages without message ID.
Default: Not Selected
Message Queues To Read: specifies the message queues that are monitored.
The available options are as follows:
qSysOpr Only: Only the qSysOpr queue is monitored. This is the default selection.
Predefined Set: The queues specified in Raw Configuration are monitored. For more information, see the Modify Queues in
Predefined Set section in the v1.5 fetchmsg AC Configuration article.
Messages Tab
The Messages tab allows you to view, modify, or delete alarm messages.
Messages in this tab have the following details:
Name: indicates the name of the alarm message.
Text: indicates the alarm message text.
Level: indicates the alarm severity.
Subsystem: indicates the subsystem_id.
Default: indicates the default message for each alarm situation (such as default, error, and answer_limit).
Variable Expansion in Messages
You can use the following variables in messages using the $ symbol:
$profile
$id
$key
$text
$description
$job_name
$user_profile_name
$job_number
$type
$severity
$date
$time
$error
$answer
$answer_limit
$queue
Note: One of the alarm messages can be designated as the default message and is used for profiles in which no alarm message is
specified. You can also specify a message tagged as error which is the default message for error situations.
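This kind of variable expansion can be sketched with Python's string.Template, which uses the same $name syntax. The variable names below come from the list above; the sample message text and field values are illustrative, and the probe's exact expansion rules may differ.

```python
from string import Template

message = Template("Profile $profile: message $id from job $job_name (severity $severity)")
fields = {"profile": "OperatorMsgs", "id": "CPF1234", "job_name": "QBATCH", "severity": "40"}
print(message.substitute(fields))
# Profile OperatorMsgs: message CPF1234 from job QBATCH (severity 40)
```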
Profiles Tab
The Profiles tab allows you to create monitoring profiles. You can create multiple monitoring profiles with different criteria to monitor the
messages. All monitoring profiles are displayed in this tab.
Profile Window
The Profile window is used to define or modify the monitoring criteria of the messages and generate alarms. The monitoring profile sets up the
requirements and alerts the user when something unexpected occurs.
The fields in this window are as follows:
Name: indicates the profile name.
Active: activates the profile, when selected.
The Profile window has the following tabs:
Message recognition
Actions
Message recognition Tab
This tab allows you to set the criteria that the profile uses to retrieve messages. The fields in the tab are as follows:
Id: indicates the seven-character message identifier which identifies the message in the associated When matching field.
When matching: specifies a message identifier if Id is selected. You can use regular expressions to specify the value. For more
information, see the Using Regular Expressions section in the v1.5 fetchmsg IM Configuration article.
Severity: indicates the minimum severity level of the messages that are retrieved for monitoring in the associated When (>=) field.
When (>=): specifies the minimum numeric severity of the message that is retrieved if Severity is selected. The value can be between 0
and 99, where 99 is the most severe.
Type: indicates the type of message in the associated When matching field. You can select the type of message from the drop-down
list.
When matching: specifies the message type if Type is selected. You can select an existing message type from the list of types or
specify a new type. You can use regular expressions to specify the new value. For more information, see the Using Regular
Expressions section in the v1.5 fetchmsg IM Configuration article.
Text: indicates the message text in the associated When matching field.
Help text (description): indicates additional information about the message queue, which is specified in the associated When matching field.
When matching: specifies the message text and description depending on the associated field. You can use regular expressions to
specify the value. For more information, see the Using Regular Expressions section in the v1.5 fetchmsg IM Configuration article.
Is unanswered: select this checkbox to generate an alarm if a message is not answered.
Job name: indicates the job name of the generated message in the associated When matching field.
User profile name: indicates the job owner of the generated message in the associated When matching field.
When matching: specifies the job name and the user profile name depending on the associated field. You can use regular expressions
to specify the value. For more information, see the Using Regular Expressions section in the v1.5 fetchmsg IM Configuration article.
Only if not matched by other profile: enables you to select a message only if the message does not match any other profile.
Look in queue: selects the message queue from the drop-down list for monitoring. This field is only available if Predefined Set is
selected in the Setup tab.
Actions Tab
This tab enables you to configure an alarm message when an alarm condition arises. The fields in this tab are as follows:
Send alarm: enables alarms for the profile.
Use Alarm Message: allows you to select the alarm message when an alarm condition arises. The default message is used if an
alarm message is not specified.
Answer message: enables answers to messages.
Using the following reply: defines the text which is sent as a reply to the unanswered messages.
Answer limit: specifies the maximum number of answer messages that can be sent. The Send alarm on answer limit breach using
alarm message field is enabled.
Send alarm on answer limit breach using alarm message: enables you to configure the alarm when the answer limit is
breached. Select the alarm from the drop-down list.
Default: Not Selected
Alarm parameter
This section allows you to set up thresholds for the alarms.
Suppression key: specifies the suppression key to send only one instance of a repeated alarm.
Default: $profile/$key
Time frame: specifies the time frame of the message, which prevents the generation of multiple alarms for duplicate messages. You
can select an existing time frame from the list or specify a new time frame.
Message count: selects a logical operator and defines the threshold value of the number of messages. The profile generates an
alarm if the message count within the specified time frame breaches the specified message count condition. For example, select <= to
specify the maximum value.
Answered: indicates whether a response for the message was sent or not.
Job Name: indicates the job name of the generated message.
User Profile Name: indicates the job owner of the generated message.
Job Number: indicates the job number of the generated message.
The Create profile option when you right-click a message enables you to create a monitoring profile. The probe creates the profile and
automatically populates details of the message.
Exclude Tab
The Exclude tab is used to create a profile to exclude a message from monitoring by the probe.
Exclude Message Window
This window allows you to specify the parameters to create an exclude profile. The fields in the window are as follows:
Name: defines the name of the exclude profile.
Active: activates the profile, when selected.
Id: indicates the seven-character message identifier which identifies the message in the associated When matching field.
When matching: specifies a message identifier if Id is selected. You can use regular expressions to specify the value. For more
information, see the Using Regular Expressions section in the v1.5 fetchmsg IM Configuration article.
Severity: indicates the maximum numeric severity of the message that is excluded in the associated When (<=) field.
When (<=): specifies the maximum numeric severity of the message that is excluded if Severity is selected. The value can be between 0 and
99, where 99 is the most severe.
Type: indicates the type of message in the associated When matching field. You can select the type of message from the drop-down
list.
Text: indicates the message text in the associated When matching field.
Help text (description): indicates additional information about the message queue and is specified in the associated When matching field.
Job name: indicates the job name of the generated message in the associated When matching field.
User profile name: indicates the job owner of the generated message in the associated When matching field.
When matching: specifies the message type, message text, description, job name, and user profile name depending on the associated
field. You can use regular expressions to specify the value. For more information, see the Using Regular Expressions section in the v1.
5 fetchmsg IM Configuration article.
Queue name: selects the excluded message queue from the drop-down list. This field is only available if Predefined Set is selected in
the Setup tab.
fetchmsg Metrics
The Fetch System Messages on iSeries (fetchmsg) probe includes a set of alarm messages that can be generated for different situations.
Error | Severity | Description
MsgAlarm | Major |
MsgError | Major |
MsgLimit | Major | The probe did not answer messages as it breached the answer limit.
More information:
file_adapter Release Notes
Configure A Node
Manage Profiles
Delete Profile
Alarm Thresholds
Configure A Node
This node provides the information to configure a section within a node.
Each section within the node enables you to configure the properties of the File Adapter probe.
Manage Profiles
You can add a monitoring profile which is displayed as a child node under the file_adapter node.
Follow these steps:
1. Click Options beside the Profiles node.
2. Click Add Profile.
3. Specify profile details in the Add Profile dialog and click Submit.
Refer to the General Configuration section of the profile name node for field descriptions.
The profile details are saved.
Delete Profile
If you no longer want the probe to monitor the log messages, you can delete the profile.
Follow these steps:
1. Click Options beside the profile name node.
2. Click Delete.
The profile is deleted.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Datetime format
Data File Format
file_adapter Node
<profile name> Node
Profiles Node
The File Adapter probe is configured to generate QoS data from third party applications and generate alarms when errors are found. You can
create profiles and groups to search for files. The profiles are configured to set the search criteria for files.
Datetime format
The File Adapter probe uses Datetime format for searching specific files in the directory. The Datetime format is used in one of the columns of the
file.
The following formatting codes are available:
%a: indicates the Abbreviated weekday name.
%A: indicates the Full weekday name.
%b: indicates the Abbreviated month name.
%B: indicates the Full month name.
%d: indicates the Day of the month as the decimal number (01 - 31).
%H: indicates the Hour in 24-hour format (00 - 23).
%I: indicates the Hour in 12-hour format (01 - 12).
%j: indicates the Day of the year as the decimal number (001 - 366).
%m: indicates the Month as the decimal number (01 - 12).
%M: indicates the Minute as the decimal number (00 - 59).
%p: indicates the Current time zone as A.M./P.M. indicator for 12-hour clock.
%S: indicates the Second as the decimal number (00 - 59).
%w: indicates the Weekday as the decimal number (0 - 6; Sunday is 0).
%y: indicates Year without century, as decimal number (00 - 99).
%Y: indicates the Year with century, as decimal number.
%%: indicates the Percent sign.
For example, "%Y/%m/%d %H:%M:%S" translates to "2005/11/30 15:40:14".
Note: "%I" indicates the hour, and am/pm setting (%p) must be placed after %I in the date format.
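These formatting codes follow the C strftime convention, so the example pattern can be checked with, for instance, Python's datetime module (an illustration only, not part of the probe):

```python
from datetime import datetime

# Parse a timestamp using the example pattern from the text.
pattern = "%Y/%m/%d %H:%M:%S"
ts = datetime.strptime("2005/11/30 15:40:14", pattern)
print(ts.year, ts.month, ts.day)  # 2005 11 30

# Round-trip: formatting the parsed value reproduces the string.
assert ts.strftime(pattern) == "2005/11/30 15:40:14"

# Per the note above, %p must follow %I for a 12-hour clock.
twelve_hour = ts.strftime("%I:%M %p")
print(twelve_hour)  # e.g. "03:40 PM" (the %p text is locale-dependent)
```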
file_adapter Node
This node lets you view the probe information and configure the log properties.
Navigation: file_adapter
Set or modify the following values as required:
file_adapter > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
file_adapter > General Configuration
This section lets you configure the log properties of File Adapter probe.
Log Level: Defines the detail level of the log file.
Note: This node is referred to as profile name node in the document as it is user-configurable.
This section lets you configure the alarm properties of the File Adapter probe. Alarms are generated when threshold values are breached.
Enable Alarm if the File is Missing or Rejected: Indicates that an alarm is issued if the probe does not find the file in the directory after
trying for a specified number of times.
Default: Selected
Retry File Import: Specifies the number of times the probe searches for the file in the directory as per the input criteria.
Before Sending Alarm With Severity: Specifies the alarm severity level.
Send Alarm With Severity: Specifies the message issued with the specified severity when the processing time exceeds the input
value.
Subsystem ID: Defines the subsystem ID issued with alarm messages.
Default: 1.1.5
Reject File if the Number of Valid Samples are below: Specifies the threshold for valid samples in the file. The file is rejected when the
threshold is breached.
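The Retry File Import / Before Sending Alarm behaviour described above can be sketched as a simple retry loop. The function name and return shape are hypothetical, for illustration only:

```python
def check_file_with_retry(find_file, max_retries):
    """Look for the input file up to max_retries times; return a
    (found, send_alarm) pair. A sketch of the retry-then-alarm
    behaviour, not the probe's actual code."""
    for _ in range(max_retries):
        if find_file():
            return True, False
    # Still missing after all attempts: raise the configured alarm.
    return False, True

# Simulate a file that appears on the third check.
attempts = iter([False, False, True])
found, alarm = check_file_with_retry(lambda: next(attempts), max_retries=3)
print(found, alarm)  # True False
```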
profile name > QoS
This section lets you view and configure the QoS properties of the File Adapter probe.
QoS Name: Indicates the QoS name.
Publish Data: Triggers the generation of QoS.
Default: Selected
Allow Asynchronous at Start of New File: Indicates that the data in the file is not recorded at regular intervals.
profile name > Script
This section lets you configure the script to check the file.
Command: Defines the command script the probe uses to check the file.
Abort File Import When Preprocess Terminates With One of These Return Codes: Defines the return codes for the predefined process
script. If one of these return codes is issued, the file is not processed.
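The preprocess check can be sketched as follows: run the configured command and abort the import if its exit code is in the configured list. This is an illustrative sketch under assumed names, not the probe's invocation mechanism:

```python
import subprocess

def should_abort_import(command, abort_codes):
    """Run the preprocess command and report whether the file
    import should be aborted (sketch only)."""
    result = subprocess.run(command, shell=True)
    return result.returncode in abort_codes

# Abort the import if the check script exits with code 1 or 2.
abort = should_abort_import("exit 0", abort_codes={1, 2})
print(abort)  # False: return code 0 is not in the abort list
```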
Profiles Node
This node lets you create profiles to collect data and export it to files. The File Adapter probe processes the files for QoS data based on the
criteria that the profile defines. The new profiles are displayed as child nodes.
New Group
If you want to create a new group for your profiles, you can do so by right-clicking in the left pane and selecting New, or you can click the New
group button in the Toolbar. The group will appear as a new node in the left pane, and will by default be named "New Group". Right-click the
group and select Rename to give it a name of your own choice.
QoS Definitions tab
Name
Group: The name of the logical group that the new QoS will belong to, in the form QOS_xxx.
Description
Unit
Unit abbreviation
Flag: QoS Data is asynchronous. Check this option if the data type is asynchronous (the data in the file is NOT recorded at regular intervals).
Do NOT tick this option if the data type is synchronous, that is, the data in the file is recorded at regular intervals as specified by the Sample
Interval parameter on the QoS tab of the profile dialog (the sample interval is the expected interval in seconds between the samples in the
data series).
Configure profiles
Clicking the Create A New Profile button opens the Profile dialog.
You can also right-click in the left pane and select New Profile.
The Profile Properties dialog pops up, enabling you to define a new monitoring profile. The dialog contains a field for general parameters and six
tabs:
Input, describing the criteria for the data file to be monitored.
Column, describing the format of the file to be monitored
Output, describing how to handle the file when processed.
Alarms, describing the conditions that will trigger an alarm
QoS, defining the QoS messages to be sent.
Script, allowing you to run a script checking the file before the file is processed by the probe.
Profiles: Here you can create profiles defining the files to be monitored, the QoS messages to be sent on each of the rows in the file when
the file is found, and the alarms to be sent on error conditions. The parameters below marked with an asterisk (*) will be contained in the QoS
message.
Active: The profile is activated when this option is checked. New profiles are activated by default.
Profile Name
Group
Profile description
Frequency
Time: The time the probe checks the specified file (see also Frequency above and Day(s) below).
Day(s): The day(s) the probe checks the specified file at the specified time.
Examples: Clicking the Examples button helps you set the correct format (as an example). Use an example and edit the specifications in the
fields to match your needs.
Input tab
Import directory
The directory where the file should be located. The probe supports pattern matching on directories.
The name of the file, for example log.txt or *.log. The formatting codes described in the Datetime format section can be used.
The probe supports pattern matching on files.
Insert a user name and a password if necessary when a network drive is specified as the import directory.
This field is normally deactivated (greyed out), but is activated as soon as you type \\x in the Import
directory field.
When this option is selected, the probe skips input lines beginning with a comment symbol (for example, #).
Column tab
Datetime format
The format of the date and time specification. Select from the drop-down menu or type your own value (see
the Datetime format section).
Datetime in a format as specified here is expected to be found in one of the columns in the file (see Column
definitions below).
Column definitions
This section describes the expected column layout of the file.
Datetime:
The column containing the Datetime entry. This column is mandatory unless "current" has been selected as
Datetime format. Then the Datetime field is set to 0.
QoS:
The column containing the QoS. This column is mandatory.
Stdev (Standard deviation):
The column containing the Stdev. This parameter is optional.
Separator:
The separator dividing the rows in the file into two or more columns (e.g. comma, semicolon etc). Select the
character to be used as a separator between the columns in the file.
Select columns (button):
Clicking this button, you will be able to preview the columns and rows of the file monitored by the profile,
provided that a file is present in the specified directory.
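The column definitions above describe how one data-file row is split and interpreted. A minimal sketch, assuming semicolon-separated rows with the Datetime in column 0 and the QoS value in column 1 (the column numbers and format are profile settings, not fixed):

```python
from datetime import datetime

def parse_row(line, separator=";", datetime_col=0, qos_col=1,
              datetime_format="%Y/%m/%d %H:%M:%S"):
    """Split one data-file row into columns and extract the
    Datetime and QoS values (illustrative sketch)."""
    cols = line.rstrip("\n").split(separator)
    ts = datetime.strptime(cols[datetime_col], datetime_format)
    value = float(cols[qos_col])
    return ts, value

ts, value = parse_row("2005/11/30 15:40:14;42.5")
print(ts.hour, value)  # 15 42.5
```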
Output tab
After a file has been processed, it will be deleted or moved to a specific directory. The destination directory is
specified here.
If this option is checked, the file will be moved to one of the directories specified below, depending on the
result of the import.
If the option is NOT checked, the file will be automatically deleted.
After the probe has processed the file, the file is stored in this directory if the import was successful (the file
contained the expected columns) or another criteria was met (e.g. <x samples).
After the probe has processed the file, the file is stored in this directory if the import failed (the file did NOT
contain the expected columns).
The files will be kept in the directories specified above for the number of days specified here.
Specifying 90 days means that history (displayed in the main window of the probe GUI) will be kept for 90
days, and that any files older than 90 days will be deleted.
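The post-processing behaviour described above (move to a success or failure directory, or delete, then purge files older than the retention period) can be sketched as follows. Function and parameter names are hypothetical, for illustration only:

```python
import os
import shutil
import time

def finish_file(path, ok, ok_dir, failed_dir, keep_days, move=True):
    """Move a processed file to the success or failure directory
    (or delete it), then purge files older than keep_days.
    A sketch of the Output tab behaviour, not the probe's code."""
    if move:
        dest = ok_dir if ok else failed_dir
        os.makedirs(dest, exist_ok=True)
        shutil.move(path, os.path.join(dest, os.path.basename(path)))
    else:
        os.remove(path)
    # Delete any retained files older than the keep_days window.
    cutoff = time.time() - keep_days * 86400
    for d in (ok_dir, failed_dir):
        if not os.path.isdir(d):
            continue
        for name in os.listdir(d):
            full = os.path.join(d, name)
            if os.path.getmtime(full) < cutoff:
                os.remove(full)
```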
Alarms tab
The probe attempts to find the file specified under the Input tab. If not found at the first attempt, the probe tries
again. When the number of attempts exceeds the number of times specified in the Retry file
import field, an alarm message is issued.
The maximum number of attempts to find the file before sending an alarm message.
If the file is still not found after the maximum number of attempts, an alarm message with the selected
severity level is sent.
This field lets you select an alarm message with the selected severity to be sent if the input value specified is
exceeded.
Subsystem ID
The file will be rejected if at least one of the lines contains QoS columns with invalid format.
Rejects the file if the number of valid samples found in the file is below the specified number.
QoS tab
QoS
The name of the QoS, in the form QOS_xxx, to be sent for each row in a file when the file is found. The
pull-down menu lets you select one of the QoS definitions created on the general tab; otherwise you can type
the name of a QoS of your own choice.
Sample Interval
The expected interval in seconds between the samples in the data series. Note that this field is greyed out if
the selected QoS type is defined to be of asynchronous type.
Maximum Value
Set the maximum sample value. Note that this field will be greyed out unless the selected QoS type is defined
to have a maximum value.
With this option selected, the probe allows the first row of a new file to be asynchronous.
Asynchronous means that data in the file is NOT recorded at regular intervals.
Script tab
Command
Here you can (optionally) insert a command script for checking the file before the file is processed by the
probe.
Here you can enter one return code or a list of comma-separated return codes. If the preprocess script
returns with one of these codes, the input file will not be processed.
file_adapter Metrics
The following table describes the checkpoint metrics that can be configured using the File Adapter (file_adapter) probe.
Monitor Name | Units | Description | Version
QOS_CSV_TEST | qos | CSV File | v1.0
QOS_FILEADAPTER | Value | Fileadapter variable | v1.0
QOS_FTP_RESPONSE | Seconds |  | v1.0
QOS_GO_CPU | Percent |  | v1.0
QOS_TEST | TS | Test | v1.0
More information:
fsmounts (File Systems Mount Monitoring) Release Notes
fsmounts Node
The fsmounts node contains configuration details specific to the Filesystem Mounts Monitoring probe. This node lets you:
View the probe information
Configure the general properties of the probe
Select mount points that the fsmounts probe ignores.
View the file systems that the fsmounts probe discovers.
Navigation: fsmounts
Set or modify the following values as required:
fsmounts > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
Note: The CA Unified Infrastructure Management Alarm Server probe manages the sub system ID.
3. Select or enter the mount points, devices, and file system types to be ignored by the probe in the Ignore tab.
4. You can edit the messages that are sent by the probe in the Messages tab. Refer to Editing Messages for more information.
Editing Messages
You can modify alarm messages.
Follow these steps:
1. Right-click in the Setup > Messages tab and select Edit.
The Alarm Message window is displayed.
Defines the general setup parameters, like check interval and logging properties.
Check Interval
Interval at which the probe checks the computer hosting the probe for file systems changes. The default value is 60 seconds.
Log Level
Sets the level of detail written to the probe's log file. We recommend logging as little as possible during normal operation to minimize
disk consumption.
Log size
Sets the size of the probe's log file to which probe-internal log messages are written. The default size is 100 KB.
When this size is reached, the contents of the file are cleared.
Ignore Tab
Defines the mount points, devices, and file system types to be ignored by the probe.
ext3                      defaults
LABEL=/boot               /boot     ext3    defaults
tmpfs                     /dev/shm  tmpfs   defaults
devpts                    /dev/pts  devpts  gid=5,mode=620
sysfs                     /sys      sysfs   defaults
proc                      /proc     proc    defaults
/dev/VolGroup00/LogVol01  swap      swap    defaults
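The Ignore tab's effect can be sketched as a filter over the discovered mount entries: any entry whose mount point, device, or file system type appears in an ignore list is skipped. The function and tuple layout below are hypothetical, for illustration only:

```python
def filter_mounts(mounts, ignore_points=(), ignore_devices=(), ignore_fstypes=()):
    """Return the mount entries the probe would keep, dropping any
    whose mount point, device, or file system type is ignored
    (a sketch of the Ignore tab behaviour, not the probe's code)."""
    kept = []
    for device, point, fstype in mounts:
        if (point in ignore_points or device in ignore_devices
                or fstype in ignore_fstypes):
            continue
        kept.append((device, point, fstype))
    return kept

mounts = [
    ("LABEL=/boot", "/boot", "ext3"),
    ("proc", "/proc", "proc"),
    ("sysfs", "/sys", "sysfs"),
]
print(filter_mounts(mounts, ignore_fstypes=("proc", "sysfs")))
# [('LABEL=/boot', '/boot', 'ext3')]
```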
Messages Tab
Lists the error messages issued in the different error situations that may occur. You can also modify the messages.
Alarm Message Window
The Alarm message properties dialog enables you to modify a message. Variables may be used.
The fields in the Alarm message dialog box are explained below:
Name
Identification name of the alarm message.
Text
This is the text of the alarm message.
Level
The severity of the alarm (clear, information, warning, minor, major or critical).
Subsystem
The ID of the subsystem that is the source of this alarm. The subsystem ID is managed by the nas.
Ignored
Indicates if the file system is ignored, based on the settings on the Setup > Ignore tab.
Mounted
Indicates if the file system is mounted or not.
Alarm
Any alarm message issued based on the condition of the file system.
Note: Ensure that you have valid credentials for your Google Apps domain account to let the google_apps probe access the
domain.
Important! The Google Apps Monitoring probe is now available through the Admin Console GUI only and not through the Infrastructure
Manager GUI. Upgrade from a previous version to version 2.00 is not supported.
More information:
google_apps (Google Apps Monitoring) Release Notes
Configure a Node
Alarm Thresholds
Add New Profile
Delete Profile
Configure a Node
This procedure provides the information to configure a section within a node.
Each section within the node lets you configure the properties of the probe for connecting to the Google cloud and monitoring various Google
applications.
Follow these steps:
1. Navigate to the section within a node that you want to configure.
2. Update the field information and click Save.
The specified section of the probe is configured.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Delete Profile
You can delete a profile if you do not want the probe to collect the Google applications or domain statistics.
Follow these steps:
1. Click the Options icon next to the domain name node that you want to delete.
2. Select Delete Profile.
3. Click Save.
The monitoring profile is deleted from the resource.
google_apps Node
Google Apps Service Status Node
App Status
<Domain Name> Node
Daily Statistics Node
Contact Operations
Document Operations
Test Mail
google_apps Node
This node lets you view the probe information and configure the log file settings. You can also add a monitoring profile so that the probe can
connect to the Google Cloud and can collect the domain statistics.
Navigation: google_apps
Set or modify the following values as required:
App Status
This node lets you view the status of all the Google Apps. The probe fetches the application statistics and lists them in a tabular form. You can
configure the probe for generating QoS data for any specific application.
The status information is represented through the following codes:
0: App Normal
1: App Information Available
2: App Service Disruption
3: App Service Outage
4: App Status Unknown
Navigation: google_apps > Google Apps Service Status > App Status
Set or modify the following values as required:
App Status > App status
This section consists of a table that lists all the Google Applications. You can select any application and can configure its properties that
are listed below the table.
Refer to the How to Configure Alarm Thresholds topic for configuring the static and dynamic thresholds.
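The status codes listed above map directly to labels, as in this small sketch (the function name is hypothetical; the codes and labels are as documented):

```python
# Mapping of the documented status codes to their meanings.
APP_STATUS = {
    0: "App Normal",
    1: "App Information Available",
    2: "App Service Disruption",
    3: "App Service Outage",
    4: "App Status Unknown",
}

def describe_status(code):
    """Translate a numeric app status into its label; undocumented
    codes fall back to 'App Status Unknown' (illustration only)."""
    return APP_STATUS.get(code, APP_STATUS[4])

print(describe_status(2))   # App Service Disruption
print(describe_status(99))  # App Status Unknown
```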
Note: This node is referred to as domain name node in the document and it represents the domain monitoring profile.
Note: The monitors of the Google App domain profile are visible in a tabular form. You can select any one monitor in the table
and can configure its properties.
Refer to the How to Configure Alarm Thresholds topic for configuring the static and dynamic thresholds.
Contact Operations
This node lets you view the list of monitors available for monitoring the operations that you perform on the contacts of your Google account. You
can let the probe generate QoS data for any of the available monitors.
Navigation: google_apps > domain name > Contact Operations
Set or modify the following values as required:
Contact Operations > Monitors
This section lets you configure the monitors of operations that are performed on the contacts of the Google App domain for generating
QoS.
Note: The monitors of the user operations are visible in a tabular form. You can select any one monitor in the table and can
configure its properties.
Refer to the How to Configure Alarm Thresholds topic for configuring the static and dynamic thresholds.
Document Operations
This node lets you view the list of monitors available for monitoring the operations that you perform on the documents of your Google account.
You can let the probe generate QoS data for any of the available monitors.
Navigation: google_apps > domain name > Document Operations
Set or modify the following values as required:
Document Operations > Monitors
This section lets you configure the monitors of operations that are performed on the documents of the Google App domain for generating
QoS.
Note: The monitors of the user operations are visible in a tabular form. You can select any one monitor in the table and can
configure its properties.
Refer to the How to Configure Alarm Thresholds topic for configuring the static and dynamic thresholds.
Test Mail
This node lets you configure the probe to monitor the time it takes for the probe to send an email using the Google SMTP server.
Navigation: google_apps > domain name > Test Mail
Set or modify the following values as required:
Test Mail > Mail Configuration
This section lets you configure the email settings for sending a test email over the SMTP server.
Active: activates the mail configurations of the probe.
Default: Not Selected
SMTP Host: defines the domain name or an IP address of a Google SMTP server.
SMTP Port: defines the SMTP port number that is used when the probe sends an email through SSL.
Mail Recipient: defines any valid email address to which the probe can send a test email.
Mail Subject: defines the subject of the test email that the probe sends to the specified email recipient.
Note: The Test Email option in the Actions drop-down sends the configured test email.
Note: The monitors of the Google App domain profile are visible in a tabular form. You can select any one monitor in the table
and can configure its properties.
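The Test Mail check described above times how long a test email takes to send. A minimal sketch with the sender injected so it runs without a real SMTP server (the probe itself would use the configured Google SMTP host and port; all names here are hypothetical):

```python
import time

def measure_send_latency(send_mail, recipient, subject):
    """Time a test-mail send and return the latency in
    milliseconds (illustrative sketch)."""
    start = time.monotonic()
    send_mail(recipient, subject)
    return (time.monotonic() - start) * 1000.0

# A stand-in sender for illustration; a real one might wrap smtplib.
def fake_send(recipient, subject):
    time.sleep(0.01)  # simulate ~10 ms of SMTP round-trip

latency_ms = measure_send_latency(fake_send, "ops@example.com", "UIM test mail")
print(f"{latency_ms:.1f} ms")
```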
google_apps Metrics
The following table describes the metrics that can be configured using the Google Apps (google_apps) probe.
QoS Metrics

Monitor Name | Metric Name | Units | Version
QOS_GOOGLE_APPS_APP_STATUS | App Status | Status | 2.0
QOS_GOOGLE_APPS_DOMAIN_AVG_STORAGE_BYTES_PER_ACCOUNT | Average Bytes Per Account | Bytes | 2.0
QOS_GOOGLE_APPS_ACTIVE_ACCOUNTS | Number of Active Accounts | Count | 2.0
QOS_GOOGLE_APPS_ACTIVE_ACCOUNTS_ON_GIVEN_DAY | Number of Active Accounts on a given day | Count | 2.0
QOS_GOOGLE_APPS_EMAIL_ACCOUNTS_ACCESSED | Number of Accounts Accessed on a day | Count | 2.0
QOS_GOOGLE_APPS_EMAIL_ACCOUNTS_ACCESSED_VIA_POP | Number of Accounts Accessed on a day using POP | Count | 2.0
QOS_GOOGLE_APPS_EMAIL_ACCOUNTS_ACCESSED_VIA_WEB | Number of Accounts Accessed on a day using web | Count | 2.0
QOS_GOOGLE_APPS_IDLE_ACCOUNTS_OVER_30DAYS | Number of Idle Accounts | Count | 2.0
QOS_GOOGLE_APPS_LARGE_MAIL_BOXES | Number of Large Mailboxes | Count | 2.0
QOS_GOOGLE_APPS_MEDIUM_MAIL_BOXES | Number of Medium Mailboxes | Count | 2.0
QOS_GOOGLE_APPS_SMALL_MAIL_BOXES | Number of Small Mailboxes | Count | 2.0
QOS_GOOGLE_APPS_USED_STORAGE_BYTES | Total Bytes Used | Bytes | 2.0
QOS_GOOGLE_APPS_USED_STORAGE_PERCENT_QUOTA | Quota Used | Percent | 2.0
QOS_GOOGLE_APPS_CONTACT_CREATE_LATENCY | Contact Create Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_CONTACT_DELETE_LATENCY | Contact Delete Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_CONTACT_FIND_LATENCY | Contact Find Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_DOCUMENT_CREATE_LATENCY | Document Create Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_DOCUMENT_DELETE_LATENCY | Document Delete Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_DOCUMENT_UPDATE_LATENCY | Document Update Latency | Milliseconds | 2.0
QOS_GOOGLE_APPS_MAIL_SEND_LATENCY | Message Send Latency | Milliseconds | 2.0
ha (High Availability)
Continual operation of your UIM monitoring infrastructure requires minimal downtime for the primary hub. One way to achieve this is to install UIM
Server on a Microsoft cluster, which provides a true high-availability solution. Another solution is to configure a secondary hub to take over if the
primary hub goes down. This solution, which is based on the High Availability (HA) probe, ensures the essential functions of the primary hub
continue. One limitation, however, is that UMP is unavailable during failover.
HA Capabilities
The primary hub performs the following essential functions:
Routes all QoS data to the database through the use of the data_engine probe.
Sends alarm messages via the NAS probe.
Hosts the Admin Console web application, made available through the service_host probe and admin_console package.
Enables UMP's dashboard_engine to locate key UIM components, such as data_engine and NAS.
Routes discovery data to the database through the use of the discovery_server and nis_server probes.
During failover, all functions of the primary hub can be temporarily assumed by an HA hub, EXCEPT for communication with UMP, which initiates
communication with the primary hub. In most deployments, this limitation is acceptable because:
Restoring the primary hub is a top priority, and access to UMP is not needed for this task.
The QoS message flow to the database and alarm notification via email continue without interruption.
If Admin Console is configured on the HA hub and Infrastructure Manager is installed on another system, administrators can continue to
manage the UIM infrastructure during failover.
The best way to ensure that UMP can always access the data is to install UIM Server on a Microsoft cluster. If the cluster's active node fails, the
secondary node takes over and primary hub functionality continues with very limited interruption. Data flow, alarms, component management, and
data viewing all continue as normal. For more information, see the article Installing in an Active/Passive Microsoft Cluster on the CA UIM
Documentation Space.
In this configuration:
The HA probe monitors the primary hub's state (active or non-active) from the HA hub.
Secondary hubs and robots that have the primary hub as their parent send data to the primary hub.
Tunnels and queues are configured from the remote secondary hub to the primary hub and to the HA hub. The queue to the secondary
hub (shown with a dashed line) is inactive.
On the primary hub:
The data_engine probe sends QoS data to the database.
The NAS probe has auto-operator and nis_bridge enabled, allowing the probe to email alarm notifications and send alarm data to the
database.
Remote monitoring probes (such as RSP) collect data.
The data_engine on the primary hub provides a database connection string to UMP components, which allows UMP to connect to the
database.
Admin Console is hosted on the primary hub. On Windows systems, Infrastructure Manager is also installed on the primary hub.
In this configuration:
The HA probe lost contact with the primary hub and initiated a failover. The probe will initiate a failback when contact with the primary hub
is reestablished.
All connections to the primary hub are unavailable. Any remote monitoring done from the primary hub (such as through the RSP probe)
ceases.
On the HA secondary hub, the HA probe activates:
The data_engine probe, which now sends QoS data to the database.
Queues for remote hubs.
The NAS probe. If the HA probe's NAS AO option is enabled, the auto-operator allows the secondary NAS to email notifications when
thresholds are met. However, the messages are not stored in the database. They are stored locally with the secondary NAS.
The service_host probe, which runs the Admin Console web app.
Remote monitoring probes (such as RSP). This ensures that the secondary hub can continue the remote monitoring that was being
done by the primary hub.
Robots that were managed by the primary hub and local secondary hubs now send data to the HA hub.
The UMP server cannot send requests to the HA hub, and thus receives no data.
The database does NOT have current alarm information because the nis-bridge on the secondary NAS is disabled. The information is not
lost; it is stored locally by the secondary NAS, which will send the replicated information to the primary NAS when failback occurs.
Admin Console is available if service_host and admin_console are deployed on the HA hub. It can be accessed at
http://<HA_hub_IP_address>:<adminconsole_port>/adminconsole.
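The failover/failback behaviour described above can be sketched as a small state machine: when contact with the primary hub is lost, the configured probes and queues are enabled on the secondary hub in order; when contact returns, they are disabled again. All names below are hypothetical, for illustration only (the ha probe's internals are not shown in this document):

```python
class HAController:
    """Minimal failover/failback sketch for a secondary (HA) hub."""

    def __init__(self, probes_up, queues_up):
        self.probes_up = probes_up    # probes to enable on failover
        self.queues_up = queues_up    # queues to enable on failover
        self.failed_over = False
        self.active = []

    def on_heartbeat(self, primary_alive):
        """Process one heartbeat check; return the failover state."""
        if not primary_alive and not self.failed_over:
            # Failover: activation order matters (see Raw Configuration).
            self.active = list(self.probes_up) + list(self.queues_up)
            self.failed_over = True
        elif primary_alive and self.failed_over:
            # Failback: hand control back to the primary hub.
            self.active = []
            self.failed_over = False
        return self.failed_over

ha = HAController(["data_engine", "nas"], ["qos_queue"])
print(ha.on_heartbeat(primary_alive=False))  # True -> failed over
print(ha.active)  # ['data_engine', 'nas', 'qos_queue']
print(ha.on_heartbeat(primary_alive=True))   # False -> failback
```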
More information:
ha (High Availability) Release Notes
v1.4 ha AC Configuration
The Admin Console ha configuration GUI lets you configure the ha probe to provide high availability for the primary hub. This article covers the
following topics:
Important! You cannot change the order that probes or queues are enabled/disabled on the secondary hub in the ha probe Admin
Console GUI. If a specific order is required on failover, it must be configured using the Raw Configuration menu. For more information,
see Change the Probe and Queue Enablement/Disablement.
You can select the probes that will be activated on the secondary hub after failover. Any probes that are activated on the secondary hub must be
configured on the system hosting the ha probe and the secondary hub.
Follow these steps:
1. Click the Probes to Enable folder.
2. Select the Probe that you want to enable from the Probe Name drop-down list.
3. Click the Actions button, select Add Probe.
probes_down folder- Contains the probes to disable on the secondary hub.
queues_up folder- Contains the queues to enable on the secondary hub.
queues_down folder- Contains the queues to disable on the secondary hub.
Example: Change the order that the data_engine and service_host probes activate on failover
This example switches the service_host and data_engine probe deployment order. The data_engine probe should be deployed before the
service_host probe to avoid logging invalid errors. If the probes appear out of order in the Raw Configure menu:
4. The new log values are saved.
Note: If the NIS-bridge is enabled, the nas log contains database constraint violation errors. These errors do not cause
database issues, but result in error messages until the secondary nas can successfully import data into the NIS database.
Important! Do not include the nas probe and associated queues in the HA probe configuration for probes and queues to
enable. The secondary nas is always on.
2. Enable replication on the nas probe on the primary and secondary hubs.
3. Enable Auto-operator on the primary nas.
4. Enable NIS-bridge on the primary nas.
The secondary nas communicates with the primary nas, and its information is synchronized with the primary through the replication
mechanism. The primary nas sends the information to the NIS database.
HA
Advanced
Probes to Disable
Probes to Enable
Queues to Disable
Queues to Enable
To access the ha configuration interface, select the hub robot in the Admin Console navigation pane. In the Probes list, click the arrow to the left
of the ha probe and select Configure.
HA
Navigation: HA
This section lets you view probe information and change the general configuration options.
Probe Information
This section displays the probe name, start time, version, and vendor.
General Configuration
This section lets you modify the log file settings and change the Primary Hub Address.
Log level
The level of detail that is written to the log.
Log size (KB)
The maximum size of the log file before it rolls over.
Primary Hub Address
The address of the monitored primary hub.
Alarm Messages
This section displays the following read-only information about each message received by the HA probe.
Message ID
The identification name of the alarm message.
Message Text
The text of the alarm message. Variables are used in the message.
Severity
The severity of the alarm (clear, information, warning, minor, major, or critical).
Message Token
The token value of the message.
Subsystem
The ID of the subsystem where the alarms originated. The value is always 1.2.3.8.
i18n Token
The token value for internationalization.
Advanced
v1.4 ha IM Configuration
Contents
The GUI
The Setup tab
The Configure tab
The Options tab
The Messages tab
The Status tab
The GUI
The ha probe is configured by double-clicking the probe in Infrastructure Manager, which brings up the GUI (configuration tool).
Log level
Select the level of detail to be written to the probe's log file. We recommend logging as little as possible during normal operation to
minimize disk consumption.
Log size
Enter the size of the probe's log file, where probe-internal log messages are written. The default size is 100 KB.
When this size is reached, the contents of the file are saved to a log backup file, the log file is cleared and new log entries are added to
the log file. The backup file is overwritten each time the log file reaches the maximum size.
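The rollover behavior described above can be sketched in a few lines. This is an illustrative model only (function name and parameters are assumptions, not the probe's implementation):

```python
import os

def append_with_rollover(log_path, line, max_bytes=100 * 1024):
    """Append a line to the log; when the file reaches max_bytes, move it
    to a single backup file (overwriting any previous backup) and start a
    fresh log. Illustrative sketch only, not the probe's actual code."""
    if os.path.exists(log_path) and os.path.getsize(log_path) >= max_bytes:
        os.replace(log_path, log_path + ".bak")  # backup overwritten each time
    with open(log_path, "a") as f:
        f.write(line + "\n")
```

Because only one backup file is kept, at most two files' worth of log history exists at any time.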
Primary hub
The ha probe uses a secondary hub to monitor the primary hub. If the primary hub becomes unavailable, the secondary hub automatically
takes over.
The drop-down menu lets you select the name of the primary hub (the hub to be monitored).
Queues to enable
Select the queues to be enabled at failover. These queues must exist on the computer hosting the ha probe and the secondary hub.
Queues to disable
Select the queues to be disabled at failover. These queues must exist on the computer hosting the ha probe and the secondary hub.
Probes to enable
Select the probes to be enabled (activated) at failover. These probes must be configured on the computer hosting the ha probe and the
secondary hub.
Note: You should verify that the probes are properly configured and activate successfully prior to failover. If the probes are not
properly configured, they may not start when failover occurs.
Probes to disable
Select the probes to be disabled (deactivated) at failover. These probes must be configured on the computer hosting the ha probe and
the secondary hub.
Heartbeat interval
The interval at which the ha probe checks that the primary hub is available and operative. The default value is 10 seconds.
Failover Wait time
How long the ha probe waits for a response from the primary hub before it instructs the secondary hub to take over. The default wait time
is 30 seconds.
Failback Wait time
How long the ha probe waits after communication with the primary hub has been re-established before it begins failback, allowing time for
all probes, tunnels, and queues configured on the primary hub to be ready. The default wait time is 30 seconds.
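The interplay of these three timers can be sketched as a simple state machine. This is an illustrative model of the behavior described above, not the ha probe's actual implementation; all names are assumptions:

```python
class FailoverTimer:
    """Illustrative model of the ha probe's failover/failback timers."""

    def __init__(self, heartbeat=10, failover_wait=30, failback_wait=30):
        self.heartbeat = heartbeat          # seconds between primary-hub checks
        self.failover_wait = failover_wait  # silence tolerated before failover
        self.failback_wait = failback_wait  # stability required before failback
        self.failed_over = False
        self._down_for = 0                  # seconds the primary has been silent
        self._up_for = 0                    # seconds the primary has been back

    def tick(self, primary_alive):
        """Advance one heartbeat; return "failover", "failback", or None."""
        if primary_alive:
            self._down_for = 0
            if self.failed_over:
                self._up_for += self.heartbeat
                if self._up_for >= self.failback_wait:
                    # primary stayed up for the full failback wait time
                    self.failed_over = False
                    self._up_for = 0
                    return "failback"
        else:
            self._up_for = 0
            if not self.failed_over:
                self._down_for += self.heartbeat
                if self._down_for >= self.failover_wait:
                    # no response for the full failover wait time
                    self.failed_over = True
                    self._down_for = 0
                    return "failover"
        return None
```

With the defaults, three consecutive missed 10-second heartbeats (30 seconds of silence) trigger failover, and 30 seconds of restored contact trigger failback.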
Reset nas Auto Operator on failover
Select this option to enable Auto-Operator on the nas probe running on the secondary hub at failover. If selected, the auto-operator setting in
the nas configuration is modified and the nas probe is restarted.
The Messages tab displays the alarm messages that will be issued for the different error situations that occur.
These messages may be modified by right-clicking the message in the list and selecting Edit. The Alarm message properties dialog opens,
enabling you to modify the message. Variables may be used.
Name
The identification name of the alarm message.
Text
The text of the alarm message. Variables can be used in the message.
Level
The severity of the alarm (clear, information, warning, minor, major, or critical).
Subsystem
The ID of the subsystem where the alarm originated. The subsystem ID is managed by the nas. This should be '1.2.3.8' for all messages
from this probe.
Green indicator
This means the ha probe is running and successfully connected (communicating) with the primary hub. The ha probe is not in failover
state.
Red indicator
This means the ha probe is in failover state because the connection to the primary hub has been lost.
Black indicator
This means that the ha probe is not running. In this state, if the primary hub goes down, the ha probe will not detect it and therefore
will not enable/disable probes and queues on the secondary hub accordingly.
ha Troubleshooting
Unable to read configuration...
Symptom:
I see Error: Unable to read configuration when I try to open the hub configuration GUI. I receive the error code MONS-021.
Solution:
Do the following:
Deploy the PPM probe to the hub. The PPM probe is required for GUI operation.
Viewing the Log File...
Advanced users may find it helpful to view the log file. Click the icon next to the hub probe, and adjust the
log file settings so that the log retains more data for troubleshooting.
ha Scenarios
The ha probe can be configured in a variety of ways. This article describes several different configuration scenarios.
Contents
Common Configuration
Simple Setup for Probe Enablement on Failover
Secondary nas with Replication
Important: The secondary data_engine cannot perform maintenance operations; these operations are reserved for the primary data_engine.
Always configure the data_engine so that it is off on the secondary hub prior to failover.
Common Configuration
A common configuration is for the nas probe on both the primary hub and secondary hub (hub where the ha probe is running) to be running with
replication enabled. In this type of configuration, the primary nas would have both auto-operator and nis-bridge enabled. When the primary hub
goes down and failover is initiated, then the auto-operator is enabled on secondary nas probe based upon whether the 'NAS AO' checkbox is
enabled in the 'Options' tab of the ha probe configuration. Enabling auto-operator allows the secondary nas to send notifications when thresholds
are met, but because the nis-bridge on the secondary nas remains disabled the information is not stored in the NIS database. The information will
be stored locally with the secondary nas.
After the primary hub (and thus the primary nas) comes back online the information will be synchronized with the primary nas and will then be
stored in the NIS database by the primary nas probe. Additionally, after the primary hub comes back online, auto-operator on the secondary nas
will be automatically disabled if it had been enabled during failover.
During the time the primary hub is down, the NIS database will be out of date with respect to current alarm information. The information is not
lost; it is stored locally by the secondary nas until the primary hub comes back online, at which point the primary nas sends the replicated
information to the NIS database.
Note: If the NIS-bridge is enabled, the nas log contains database constraint violation errors. These errors do not cause
database issues, but result in error messages until the secondary NAS can successfully import data into the NIS database.
To keep alarms, event messages, and count identifiers in sync, most customers enable and configure nas replication.
In situations where the nas probe must continue to collect information after a failover, you can configure a high-availability configuration where the
nas probe is activated on the secondary hub and stores data locally until failback. After the primary hub comes back online, the primary nas sends
the replicated information to the NIS database.
Example Procedure for Setting up nas replication
Follow these steps:
1. Deploy a nas probe on the secondary hub.
Important! Do not include the nas probe and associated queues in the HA probe configuration for probes and queues to
enable. The secondary nas is always on.
2. Enable replication on the nas probe on the primary and secondary hubs.
3. Enable Auto-operator on the primary nas.
4. Enable NIS-bridge on the primary nas.
The secondary nas communicates with the primary nas and its information is synchronized with the primary through the replication
mechanism. The primary nas sends the information to the NIS database.
More information:
hadoop_monitor (Hadoop Monitoring) Release Notes
hadoop_monitor AC Configuration
Contents
Configuration Overview
Configuring Probe Setup
Configuring Monitoring
Alarm Thresholds
Configuration Overview
At a high level, configuring the probe consists of the following actions:
1. Configuring the probe setup.
2. Adding monitors to the appropriate system components and configuring monitor data, including alarm thresholds.
After you install the probe, you must complete the Probe Setup configuration settings. If these settings are incomplete, the probe will not publish
the Hadoop network topology. You only need to configure the settings on the probe that is running the Hadoop NameNode process. In a Hadoop
HA Cluster environment, configure the probe on the currently active NameNode process. See hadoop_monitor AC GUI Reference for information about configuring the
probe setup using the probe GUI.
Configuring Monitoring
You can add and configure monitors to the components and processes shown in the navigation pane. Click a node in the navigation pane to see
any associated monitors for that component. You can configure the QoS measurements you want to collect data for, and any alarms or events
you want by modifying the appropriate fields.
Add/Configure a Monitor
Follow these steps:
1. Go to hadoop_monitor > profile name > resource name in the navigation pane.
2. Click a component name. The available monitors appear in a table on the right side of the screen. It might be necessary to expand the
node in the navigation pane to view the monitors and QoS metrics.
3. Select the monitor you want to modify in the table.
4. Configure the monitor settings in the fields below the table.
5. Click Save at the top of the screen.
When the new configuration is loaded, a Success dialog appears.
6. Click OK.
The navigation pane is updated with the new configuration.
Important! Alarm threshold settings are dependent on the baseline_engine probe. If you do not have the correct version of
baseline_engine configured, you will not see the additional threshold options.
For more information about the basic Hadoop Monitoring GUI properties, see Admin Console Probe GUI.
hadoop_monitor Node
After installing the probe, you must configure the probe setup. Setup must be configured only on the probe which is running the Hadoop
NameNode process. For a Hadoop HA Cluster, you must configure the probe on the currently Active NameNode process. If this step is not
performed, the probe will not publish the network.
Navigation: hadoop_monitor probe
hadoop_monitor > Probe Information
This section displays read-only information about the probe.
hadoop_monitor > Probe Setup
Probe setup may differ based upon whether or not BDIM Integration is in place.
No BDIM Integration
For this setup, the only required field is the Cluster name. This cluster name must match the cluster name configured in your Cloudera or
Hortonworks systems. If you are an Apache Hadoop customer, you may set the cluster name to any value.
Fields to know:
Log Level
(Optional) Specifies the amount of log information you would like to collect for this probe.
Default: 3 - Info (Recommended)
Cluster name
Specifies the name of the Hadoop cluster you wish to monitor.
Collect System Metrics
(Optional) Specifies whether to collect system metrics.
HDFS poll interval in seconds
(Optional) Specifies how often to poll the Hadoop Distributed File System (HDFS) for data.
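As a rough sketch, these setup fields correspond to keys in the probe's configuration. All section and key names below, and the interval value, are assumptions for illustration, not the actual hadoop_monitor configuration keys:

```
<setup>
   loglevel = 3                  # Log Level (default: 3 - Info)
   cluster_name = my_cluster     # must match the Cloudera/Hortonworks cluster name
   collect_system_metrics = yes  # optional
   hdfs_poll_interval = 60       # seconds between HDFS polls (value assumed)
</setup>
```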
BDIM Integration in place
If BDIM Integration is in place, the only required fields are the BDIM Account ID and the option to Publish Messages to BDIM.
Fields to know:
Log Level
(Optional) Specifies the amount of log information you would like to collect for this probe.
Default: 3 - Info (Recommended)
BDIM Account ID
Specifies the account ID used to access BDIM.
Cluster name
(Optional) Specifies the name of the Hadoop cluster you wish to monitor.
Collect System Metrics
(Optional) Specifies whether to collect system metrics.
Publish Messages to BDIM
Specifies whether to publish alarm messages to BDIM.
HDFS poll interval in seconds
(Optional) Specifies how often to poll the Hadoop Distributed File System (HDFS) for data.
Profile Node
The Hadoop Monitoring probe includes a local resource profile. You can modify some settings for this resource profile and verify the resource
information.
Navigation: hadoop_monitor > profile name
profile name > Actions > Verify Information
Use this option to validate the resource profile information.
profile name > Resource Setup
Fields to know:
Identifier
The identifier for the local resource profile.
Active
Select this checkbox to activate monitoring of the resource.
Interval (secs)
The time to wait for the connection to be established.
Alarm Message
The alarm message to be sent if the resource does not respond.
Hostname Node
The hostname node is a container for the probe inventory that can be configured for monitoring. This typically includes the cluster node and
the system node. Because this node is only a container, no configuration information is displayed in the details pane for this node.
Navigation: hadoop_monitor > profile > hostname
Cluster Node
The cluster node contains all of the Hadoop system services that are discovered by the Hadoop Monitoring probe when the probe configuration is
set up. The services that are discovered may vary depending upon the configuration of your Hadoop system. The Hadoop Monitoring probe is
capable of discovering the following services:
HDFS
Hadoop Distributed File System
Roles - an empty node used to organize the following HDFS services:
namenode
keeps the directory tree of all files in the file system, and tracks where data is stored within the file system.
datanode
stores data in the HDFS.
secondarynamenode
performs periodic checkpoints of the namespace and helps minimize the size of the log containing changes to the HDFS stored at the
namenode.
YARN
Resource management platform responsible for managing cluster resources and using them to schedule applications.
Container
provides processing capacity for YARN resources.
Roles - an empty node used to organize the following YARN services:
resourcemanager
manages applications and schedules resources for the applications.
nodemanager
monitors container resource usage.
apphistoryserver
stores and retrieves historic information for applications.
MAPREDUCE
Distributes work around the cluster.
Roles - an empty node used to organize the following MAPREDUCE services:
jobhistoryserver
stores and retrieves historic information about jobs executed by MAPREDUCE.
HBase
Hadoop storage manager.
Roles - an empty node used to organize the following HBase services:
hbasemaster
coordinates the HBase cluster and is responsible for administrative operations.
regionserver
responsible for handling a subset of an HBase table's data.
thrift
cross platform development framework and API used by HBase.
REST
REST server bundled with HBase.
OOZIE
Workflow scheduler system to manage Hadoop jobs.
HIVE
Data warehouse that facilitates querying and managing large datasets residing in distributed storage.
Roles - an empty node used to organize the following HIVE services:
hiveserver2
server interface that enables remote clients to execute queries against Hive and retrieve the results.
metastore
stores the metadata for Hive tables and partitions and provides clients access to the data.
Navigation: hadoop_monitor > profile > hostname > cluster
Click the cluster node to add/configure monitors to the cluster. To add/configure monitors for services on the cluster, expand the cluster node,
and navigate through the hierarchy as needed. Select each node you wish to monitor. In the details pane, select the monitors you wish to
configure and add to that node, and modify settings as appropriate.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Metric Type Id
The unique ID of the QoS metric.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value measured will be used.
Delta Value (Current - Previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second -- The delta value calculated from the samples measured within a second will be used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to enable
this setting.
Example:
>= 90 means an alarm condition if the measured value is at or above 90.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message pool.
Configure dynamic alarm thresholds following the instructions found in Hadoop Monitoring (hadoop_monitor) Configuration v1.0.
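The Value Definition options and threshold operators above can be sketched as follows. This is an illustrative model only; the function names and the operator set are assumptions, not the probe's API:

```python
def derive_value(samples, definition, interval=1.0, n=3):
    """Compute the value used for QoS and alarming from raw samples."""
    if definition == "current":           # most recent measurement
        return samples[-1]
    if definition == "delta":             # current - previous
        return samples[-1] - samples[-2]
    if definition == "delta_per_second":  # delta scaled by the sample interval
        return (samples[-1] - samples[-2]) / interval
    if definition == "average":           # mean of the last n samples
        recent = samples[-n:]
        return sum(recent) / len(recent)
    raise ValueError("unknown value definition: " + definition)

def breaches_threshold(value, operator, threshold):
    """Evaluate the alarm condition, e.g. operator '>=' with threshold 90."""
    checks = {
        ">=": value >= threshold,
        "<=": value <= threshold,
        "=":  value == threshold,
    }
    return checks[operator]
```

For example, for samples [80, 85, 95] taken 10 seconds apart, the delta-per-second value is (95 - 85) / 10 = 1.0, while the current value 95 breaches a >= 90 threshold.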
System Node
In addition to clusters, you can add monitors to the Hadoop system components and services. The Hadoop Monitoring probe can monitor the
following system components and services:
CPU
Memory
Network - this node is a container node for the following network services:
eth0
Io
TCP
StorageVolumes - this node is a container node for system directories:
/
/boot
/home
Navigation: hadoop_monitor > profile > hostname > System
Click the system node to add/configure monitors for the system. To add/configure monitors for components or services on the system, expand the
system node, and navigate through the hierarchy as needed. Select each node you wish to monitor. In the details pane, select the monitors you
wish to configure and add to that node, and modify settings as appropriate.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Metric Type Id
The unique ID of the QoS metric.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value measured will be used.
Delta Value (Current - Previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second -- The delta value calculated from the samples measured within a second will be used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to enable
this setting.
Example:
>= 90 means an alarm condition if the measured value is at or above 90.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message pool.
Configure dynamic alarm thresholds following the instructions found in Hadoop Monitoring (hadoop_monitor) Configuration v1.0.
hadoop_monitor Metrics
The following tables list the metrics you can collect with the Hadoop Monitoring (hadoop_monitor) probe.
Contents
Cluster Level
Node Level
Process Level Metrics
Hadoop Service Level Metrics
Job Level Metrics
Job and Task Level Metrics
Cluster Level
The metrics in this table are configured in the Cluster node in the Hadoop Monitoring probe GUI.
Metric
Units
Description
QOS_HADOOP_CLUSTER_VERSION
Float
Version of Hadoop
Node Level
The metrics in these tables are configured in the System nodes in the Hadoop Monitoring probe GUI.
Network Metrics
Metric
Units
Description
QOS_HADOOP_SYS_NET_IF_SPEED
Bytes/Second
QOS_HADOOP_SYS_NET_RX_BYTES
Bytes
QOS_HADOOP_SYS_NET_TX_BYTES
Bytes
QOS_HADOOP_SYS_NET_RX_PACKETS
Count
QOS_HADOOP_SYS_NET_RX_DROPPED
Bytes
QOS_HADOOP_SYS_NET_RX_OVERRUNS
Count
QOS_HADOOP_SYS_NET_RX_ERRORS
Count
QOS_HADOOP_SYS_NET_RX_FRAMES
Count
QOS_HADOOP_SYS_NET_TX_COLLISIONS
Count
QOS_HADOOP_SYS_NET_TX_PACKETS
Count
QOS_HADOOP_SYS_NET_TX_DROPPED
Count
QOS_HADOOP_SYS_NET_TX_ERRORS
Count
QOS_HADOOP_NET_TX_CARRIER
Bytes
QOS_HADOOP_SYS_NET_TX_OVERRUNS
Bytes
TCP Metrics
Metric
Units
Description
QOS_HADOOP_SYS_TCP_ACTIVE_OPENS
Count
QOS_HADOOP_SYS_TCP_ESTAB_RESETS
Count
QOS_HADOOP_SYS_TCP_CURR_ESTAB
Count
QOS_HADOOP_SYS_TCP_ATTEMPT_FAILS
Count
QOS_HADOOP_SYS_TCP_PASSIVE_OPENS
Count
QOS_HADOOP_SYS_TCP_OUT_RSTS
Count
QOS_HADOOP_SYS_TCP_RETRANS_SEGS
Count
QOS_HADOOP_SYS_TCP_IN_SEGS
Count
QOS_HADOOP_SYS_TCP_IN_ERRS
Count
QOS_HADOOP_SYS_TCP_OUT_SEGS
Count
Memory Metrics
Metric
Units
Description
QOS_HADOOP_SYS_MEM_FREE
Bytes
QOS_HADOOP_SYS_MEM_USED
Bytes
QOS_HADOOP_SYS_MEM_USED_PERCENT
Percent
QOS_HADOOP_SYS_MEM_SWAP_FREE
Bytes
QOS_HADOOP_SYS_MEM_SWAP_USED
Bytes
QOS_HADOOP_SYS_MEM_SWAP_USED_SWAP_PERCENT
Percent
I/O Metrics
Metric
Units
Description
QOS_HADOOP_SYS_IO_DISK_READ_BYTES
Bytes
QOS_HADOOP_SYS_IO_DISK_READS
Count
QOS_HADOOP_SYS_IO_DISK_SERVICE_TIME
Milliseconds
The average service time for I/O requests that were issued
QOS_HADOOP_SYS_IO_USE_PERCENT
Percent
QOS_HADOOP_SYS_IO_DISK_WRITE_BYTES
Bytes
QOS_HADOOP_SYS_IO_DISK_WRITES
Count
QOS_HADOOP_SYS_IO_TOTAL
Kilobytes
QOS_HADOOP_SYS_IO_FREE
Kilobytes
QOS_HADOOP_SYS_IO_USED
Kilobytes
CPU Metrics
Metric
Units
Description
QOS_HADOOP_SYS_CPU_IDLE
Percent
QOS_HADOOP_SYS_CPU_WAIT
Percent
QOS_HADOOP_SYS_CPU_IRQ
Percent
QOS_HADOOP_SYS_CPU_NICE
Percent
QOS_HADOOP_SYS_CPU_SOFT_IRQ
Percent
QOS_HADOOP_SYS_CPU_STOLEN
Percent
QOS_HADOOP_SYS_CPU_SYS
Percent
QOS_HADOOP_SYS_CPU_USER
Percent
Process Memory Metrics
Metric
Units
Description
QOS_HADOOP_PROC_SYS_MEM_MAJOR_FAULTS
Count
QOS_HADOOP_PROC_MEM_MINOR_FAULTS
Count
QOS_HADOOP_PROC_MEM_PAGE_FAULTS
Count
QOS_HADOOP_PROC_MEM_SIZE
Bytes
QOS_HADOOP_PROC_MEM_RESIDENT
Bytes
QOS_HADOOP_PROC_MEM_SHARE
Bytes
Process I/O Metrics
Metric
Units
Description
QOS_HADOOP_PROC_IO_FD
Count
Process CPU Metrics
Metric
Units
Description
QOS_HADOOP_PROC_CPU_SYS
Milliseconds
QOS_HADOOP_PROC_CPU_SYS_PERCENT
Percent
QOS_HADOOP_PROC_CPU_TOTAL
Milliseconds
QOS_HADOOP_PROC_CPU_PERCENT
Percent
QOS_HADOOP_PROC_CPU_USER
Milliseconds
QOS_HADOOP_PROC_CPU_USER_PERCENT
Percent
Process ID Metrics
Metric
Units
Description
QOS_HADOOP_PROC_ID
Integer
JVM Metrics
Metric
Units
Description
QOS_HADOOP_JVM_GC_COUNT
Count
QOS_HADOOP_JVM_GC_TIME
Milliseconds
QOS_HADOOP_JVM_HEAP_CMTD
Megabytes
QOS_HADOOP_JVM_MEMHEAP_USED
Megabytes
QOS_HADOOP_JVM_MEMMAX_M
Megabytes
QOS_HADOOP_JVM_MEMNONHEAP_CMTD
Megabytes
QOS_HADOOP_JVM_MEMNONHEAP_USED
Megabytes
QOS_HADOOP_JVM_MEMHEAP_USED_PERCENT
Percent
QOS_HADOOP_JVM_THREADSTIME_WTNG
Count
QOS_HADOOP_JVM_THREADS_BLCKD
Count
QOS_HADOOP_JVM_THREADS_NEW
Count
Number of threads in a NEW state, which means they have not been started.
QOS_HADOOP_JVM_THREADS_RNBLE
Count
QOS_HADOOP_JVM_THREADS_TERMINATED
Count
QOS_HADOOP_JVM_THREADS_WTNG
Count
Namenode Metrics
The metrics in this table are configured in the Cluster node in the Hadoop Monitoring probe GUI.
Metric
Units
Description
QOS_HADOOP_CLUSTER_LIVE_NODES
Count
QOS_HADOOP_CLUSTER_DEAD_NODES
Count
QOS_HADOOP_CLUSTER_DECOM_NODES
Count
HDFS Metrics
Metric
Units
Description
QOS_HADOOP_HDFS_CAPACITY_TOTAL
Bytes
QOS_HADOOP_HDFS_SENDDATAPACKETTRANSFERNANOS
Float
QOS_HADOOP_HDFS_SNAPSHOTTABLEDIRECTORIES
Count
QOS_HADOOP_HDFS_SNAPSHOTS
Count
QOS_HADOOP_HDFS_BLOCKSTOTAL
Count
QOS_HADOOP_HDFS_PENDINGREPLICATIONBLOCKS
Count
QOS_HADOOP_HDFS_UNDERREPLICATEDBLOCKS
Count
QOS_HADOOP_HDFS_CORRUPTBLOCKS
Count
QOS_HADOOP_HDFS_PENDINGDELETIONBLOCKS
Count
QOS_HADOOP_HDFS_EXCESSBLOCKS
Count
QOS_HADOOP_HDFS_CAPACITYUSED
Bytes
QOS_HADOOP_HDFS_POSTPONEDMISREPLICATEDBLOCKS
Count
QOS_HADOOP_HDFS_PENDINGDATANODEMESSAGECOUNT
Count
QOS_HADOOP_HDFS_MILLISSINCELASTLOADEDEDITS
Milliseconds
QOS_HADOOP_HDFS_BLOCKCAPACITY
Bytes
Block capacity
QOS_HADOOP_HDFS_SENDDATAPACKETBLOCKEDONNETWORKNANOSAVGTIME
Float
QOS_HADOOP_HDFS_STALEDATANODES
Count
QOS_HADOOP_HDFS_CREATEFILEOPS
Count
QOS_HADOOP_HDFS_FILESCREATED
Count
QOS_HADOOP_HDFS_FILESAPPENDED
Count
QOS_HADOOP_HDFS_FILESRENAMED
Count
QOS_HADOOP_HDFS_CAPACITYREMAINING
Bytes
QOS_HADOOP_HDFS_GETLISTINGOPS
Count
QOS_HADOOP_HDFS_DELETEFILEOPS
Count
QOS_HADOOP_HDFS_FILESDELETED
Count
QOS_HADOOP_HDFS_FILESINFOOPS
Count
QOS_HADOOP_HDFS_FILESINGETLISTINGOPS
Count
QOS_HADOOP_HDFS_CREATESNAPSHOTOPS
Count
QOS_HADOOP_HDFS_DELETESNAPSHOTOPS
Count
QOS_HADOOP_HDFS_RENAMESNAPSHOTOPS
Count
QOS_HADOOP_HDFS_LISTSNAPSHOTTABLEDIROPS
Count
QOS_HADOOP_HDFS_TRANSACTIONSNUMOPS
Count
QOS_HADOOP_HDFS_CAPACITYUSED_PERCENT
Percent
QOS_HADOOP_HDFS_TRANSACTIONSAVGTIME
Float
QOS_HADOOP_HDFS_SYNCSNUMOPS
Count
QOS_HADOOP_HDFS_SYNCSAVGTIME
Float
QOS_HADOOP_HDFS_BLOCKREPORTNUMOPS
Count
QOS_HADOOP_HDFS_BLOCKREPORTAVGTIME
Float
QOS_HADOOP_HDFS_BYTESWRITTEN
Bytes
QOS_HADOOP_HDFS_BYTESREAD
Bytes
QOS_HADOOP_HDFS_BLOCKSWRITTEN
Count
QOS_HADOOP_HDFS_BLOCKSREAD
Count
QOS_HADOOP_HDFS_BLOCKSREPLICATED
Count
QOS_HADOOP_HDFS_CAPACITYUSEDNONDFS
Bytes
QOS_HADOOP_HDFS_BLOCKSREMOVED
Count
QOS_HADOOP_HDFS_BLOCKSVERIFIED
Count
QOS_HADOOP_HDFS_BLOCKVERIFICATIONFAILURES
Count
QOS_HADOOP_HDFS_READSFROMLOCALCLIENT
Count
QOS_HADOOP_HDFS_READSFROMREMOTECLIENT
Count
QOS_HADOOP_HDFS_WRITESFROMLOCALCLIENT
Count
QOS_HADOOP_HDFS_WRITESFROMREMOTECLIENT
Count
QOS_HADOOP_HDFS_BLOCKSGETLOCALPATHINFO
Count
QOS_HADOOP_HDFS_FSYNCCOUNT
Count
QOS_HADOOP_HDFS_READBLOCKOPNUMOPS
Count
QOS_HADOOP_HDFS_TotalLoad
Count
QOS_HADOOP_HDFS_READBLOCKOPAVGTIME
Float
QOS_HADOOP_HDFS_WRITEBLOCKOPNUMOPS
Count
QOS_HADOOP_HDFS_WRITEBLOCKOPAVGTIME
Float
QOS_HADOOP_HDFS_BLOCKCHECKSUMOPNUMOPS
Count
QOS_HADOOP_HDFS_BLOCKCHECKSUMOPAVGTIME
Float
QOS_HADOOP_HDFS_COPYBLOCKOPNUMOPS
Count
QOS_HADOOP_HDFS_COPYBLOCKOPAVGTIME
Float
QOS_HADOOP_HDFS_REPLACEBLOCKOPNUMOPS
Count
QOS_HADOOP_HDFS_REPLACEBLOCKOPAVGTIME
Count
QOS_HADOOP_HDFS_HEARTBEATSNUMOPS
Count
QOS_HADOOP_HDFS_NUMFAILEDVOLUMES
Count
QOS_HADOOP_HDFS_HEARTBEATSAVGTIME
Float
QOS_HADOOP_HDFS_BLOCKREPORTSNUMOPS
Count
QOS_HADOOP_HDFS_BLOCKREPORTSAVGTIME
Float
QOS_HADOOP_HDFS_PACKETACKROUNDTRIPTIMENANOSNUMOPS
Count
Number of PacketAckRoundTripTimeNanos operations
QOS_HADOOP_HDFS_PACKETACKROUNDTRIPTIMENANOSAVGTIME
Float
QOS_HADOOP_HDFS_FLUSHNANOSNUMOPS
Count
QOS_HADOOP_HDFS_FLUSHNANOSAVGTIME
Float
QOS_HADOOP_HDFS_FSYNCNANOSNUMOPS
Integer
QOS_HADOOP_HDFS_FSYNCNANOSAVGTIME
Float
QOS_HADOOP_HDFS_SENDDATAPACKETBLOCKEDONNETWORKNANOSNUMOPS
Count
Number of SendDataPacketBlockedOnNetworkNanos operations
QOS_HADOOP_HDFS_TOTALFILES
Count
QOS_HADOOP_HDFS_SENDDATAPACKETTRANSFERNANOSNUMOPS
Count
Number of SendDataPacketTransferNanos operations
HBASE Metrics
The metrics in this table are configured on the HBASE Node in the Hadoop Monitoring probe GUI.
Metric
Units
Description
QOS_HADOOP_HBASE_CLUSTERREQUESTS
Count
QOS_HADOOP_HBASE_AVERAGELOAD
Float
QOS_HADOOP_HBASE_NUMREGIONSERVERS
Count
QOS_HADOOP_HBASE_NUMDEADREGIONSERVERS
Count
QOS_HADOOP_HBASE_BULKASSIGN_NUM_OPS
Count
QOS_HADOOP_HBASE_BULKASSIGN_MIN
Bytes
QOS_HADOOP_HBASE_BULKASSIGN_MAX
Bytes
QOS_HADOOP_HBASE_BULKASSIGN_MEAN
Float
QOS_HADOOP_HBASE_BULKASSIGN_MEDIAN
Float
QOS_HADOOP_HBASE_ASSIGN_NUM_OPS
Count
QOS_HADOOP_HBASE_ASSIGN_MIN
Bytes
QOS_HADOOP_HBASE_ASSIGN_MAX
Bytes
QOS_HADOOP_HBASE_ASSIGN_MEAN
Float
QOS_HADOOP_HBASE_ASSIGN_MEDIAN
Float
QOS_HADOOP_HBASE_HLOGSPLITTIME_NUMOPS
Count
QOS_HADOOP_HBASE_HLOGSPLITTIME_MIN
Milliseconds
QOS_HADOOP_HBASE_HLOGSPLITTIME_MAX
Milliseconds
QOS_HADOOP_HBASE_HLOGSPLITTIME_MEAN
Float
QOS_HADOOP_HBASE_HLOGSPLITTIME_MEDIAN
Float
QOS_HADOOP_HBASE_METAHLOGSPLITTIME_NUM_OPS
Count
QOS_HADOOP_HBASE_METAHLOGSPLITTIME_MIN
Milliseconds
QOS_HADOOP_HBASE_METAHLOGSPLITTIME_MAX
Milliseconds
QOS_HADOOP_HBASE_METAHLOGSPLITTIME_MEAN
Float
QOS_HADOOP_HBASE_METAHLOGSPLITTIME_MEDIAN
Float
QOS_HADOOP_HBASE_METAHLOGSPLITSIZE_NUM_OPS
Count
QOS_HADOOP_HBASE_METAHLOGSPLITSIZE_MIN
Bytes
QOS_HADOOP_HBASE_METAHLOGSPLITSIZE_MAX
Bytes
QOS_HADOOP_HBASE_METAHLOGSPLITSIZE_MEAN
Float
QOS_HADOOP_HBASE_METAHLOGSPLITSIZE_MEDIAN
Float
QOS_HADOOP_HBASE_HLOGSPLITSIZE_NUM_OPS
Count
QOS_HADOOP_HBASE_HLOGSPLITSIZE_MIN
Bytes
QOS_HADOOP_HBASE_HLOGSPLITSIZE_MAX
Bytes
QOS_HADOOP_HBASE_HLOGSPLITSIZE_MEAN
Float
QOS_HADOOP_HBASE_HLOGSPLITSIZE_MEDIAN
Float
QOS_HADOOP_REGION_REGIONCOUNT
Count
QOS_HADOOP_REGION_STORECOUNT
Count
QOS_HADOOP_REGION_READREQUESTCOUNT
Count
QOS_HADOOP_REGION_WRITEREQUESTCOUNT
Count
QOS_HADOOP_REGION_TOTALREQUESTCOUNT
Count
QOS_HADOOP_REGION_SLOWDELETECOUNT
Count
QOS_HADOOP_REGION_SLOWINCREMENTCOUNT
Count
QOS_HADOOP_REGION_SLOWGETCOUNT
Count
QOS_HADOOP_REGION_SLOWAPPENDCOUNT
Count
QOS_HADOOP_REGION_SLOWPUTCOUNT
Count
QOS_HADOOP_REGION_CHECKMUTATEFAILEDCOUNT
Count
QOS_HADOOP_REGION_CHECKMUTATEPASSEDCOUNT
Count
QOS_HADOOP_REGION_STOREFILEINDEXSIZE
Kilobytes
QOS_HADOOP_REGION_STATICINDEXSIZE
Kilobytes
QOS_HADOOP_REGION_STATICBLOOMSIZE
Kilobytes
QOS_HADOOP_REGION_PERCENTFILESLOCAL
Percent
QOS_HADOOP_REGION_COMPACTIONQUEUELENGTH
Count
QOS_HADOOP_REGION_FLUSHQUEUELENGTH
Count
QOS_HADOOP_REGION_HLOGFILECOUNT
Count
QOS_HADOOP_REGION_HLOGFILESIZE
Kilobytes
QOS_HADOOP_REGION_STOREFILECOUNT
Count
QOS_HADOOP_REGION_MEMSTORESIZE
Kilobytes
QOS_HADOOP_REGION_STOREFILESIZE
Kilobytes
QOS_HADOOP_REGION_BLOCKCACHEFREESIZE
Bytes
QOS_HADOOP_REGION_BLOCKCACHECOUNT
Count
QOS_HADOOP_REGION_BLOCKCACHESIZE
Kilobytes
QOS_HADOOP_REGION_BLOCKCACHEHITCOUNT
Count
QOS_HADOOP_REGION_BLOCKCACHEMISSCOUNT
Count
QOS_HADOOP_REGION_BLOCKCACEEVICTIONCOUNT
Count
QOS_HADOOP_REGION_BLOCKCOUNTHITPERCENT
Percent
QOS_HADOOP_REGION_BLOCKCACHEEXPRESSHITPERCENT
Percent
QOS_HADOOP_REGION_APPEND_NUM_OPS
Count
QOS_HADOOP_REGION_APPEND_MIN
Milliseconds
QOS_HADOOP_REGION_APPEND_MAX
Milliseconds
QOS_HADOOP_REGION_APPEND_MEAN
Float
QOS_HADOOP_REGION_APPEND_MEDIAN
Float
QOS_HADOOP_REGION_DELETE_NUM_OPS
Count
QOS_HADOOP_REGION_DELETE_MIN
Milliseconds
QOS_HADOOP_REGION_DELETE_MAX
Milliseconds
QOS_HADOOP_REGION_DELETE_MEAN
Float
QOS_HADOOP_REGION_DELETE_MEDIAN
Float
QOS_HADOOP_REGION_MUTATE_NUM_OPS
Count
QOS_HADOOP_REGION_MUTATE_MIN
Milliseconds
QOS_HADOOP_REGION_MUTATE_MAX
Milliseconds
QOS_HADOOP_REGION_MUTATE_MEAN
Float
QOS_HADOOP_REGION_MUTATE_MEDIAN
Float
QOS_HADOOP_REGION_GET_NUM_OPS
Count
QOS_HADOOP_REGION_GET_MIN
Milliseconds
QOS_HADOOP_REGION_GET_MAX
Milliseconds
QOS_HADOOP_REGION_GET_MEAN
Float
QOS_HADOOP_REGION_GET_MEDIAN
Float
QOS_HADOOP_REGION_REPLAY_NUM_OPS
Count
QOS_HADOOP_REGION_REPLAY_MIN
Milliseconds
QOS_HADOOP_REGION_REPLAY_MAX
Milliseconds
QOS_HADOOP_REGION_REPLAY_MEAN
Float
QOS_HADOOP_REGION_REPLAY_MEDIAN
Float
QOS_HADOOP_REGION_INCREMENT_NUM_OPS
Count
QOS_HADOOP_REGION_INCREMENT_MIN
Milliseconds
QOS_HADOOP_REGION_INCREMENT_MAX
Milliseconds
QOS_HADOOP_REGION_INCREMENT_MEAN
Float
QOS_HADOOP_REGION_INCREMENT_MEDIAN
Float
Units
Description
QOS_HADOOP_YARN_RUNNING_GREATER_THAN_60
Count
QOS_HADOOP_YARN_RUNNING_GREATER_THAN_300
Count
QOS_HADOOP_YARN_RUNNING_GREATER_THAN_1440
Count
QOS_HADOOP_YARN_FAIRSHAREMB
Megabytes
QOS_HADOOP_YARN_FAIRSHAREVCORES
Count
QOS_HADOOP_YARN_MINSHAREMB
Megabytes
QOS_HADOOP_YARN_MINSHAREVCORES
Count
QOS_HADOOP_YARN_MAXSHAREMB
Megabytes
QOS_HADOOP_YARN_MAXSHAREVCORES
Count
QOS_HADOOP_YARN_APPSSUBMITTED
Count
QOS_HADOOP_YARN_CURRENLY_RUNNING
Count
QOS_HADOOP_YARN_APPSPENDING
Count
QOS_HADOOP_YARN_APPSCOMPLETED
Count
QOS_HADOOP_YARN_APPSKILLED
Count
QOS_HADOOP_YARN_APPSFAILED
Count
QOS_HADOOP_YARN_ALLOCATEDMB
Megabytes
QOS_HADOOP_YARN_ALLOCATEDVCORES
Count
QOS_HADOOP_YARN_ALLOCATEDCONTAINERS
Count
QOS_HADOOP_YARN_AGGREGATECONTAINERSALLOCATED
Count
QOS_HADOOP_YARN_AGGREGATECONTAINERSRELEASED
Count
QOS_HADOOP_YARN_AVAILABLEMB
Megabytes
QOS_HADOOP_YARN_AVAILABLEVCORES
Count
QOS_HADOOP_YARN_PENDINGMB
Megabytes
QOS_HADOOP_YARN_PENDINGVCORES
Count
QOS_HADOOP_YARN_PENDINGCONTAINERS
Count
QOS_HADOOP_YARN_RESERVEDMB
Megabytes
QOS_HADOOP_YARN_RESERVEDVCORES
Count
QOS_HADOOP_YARN_RESERVEDCONTAINERS
Count
QOS_HADOOP_YARN_ACTIVEUSERS
Count
Units
Description
QOS_HADOOP_YARN_ALLOCATEDCONTAINERS
Count
QOS_HADOOP_YARN_CONTAINERSCOMPLETED
Count
QOS_HADOOP_YARN_CONTAINERSFAILED
Count
QOS_HADOOP_YARN_CONTAINERSINITING
Count
Number of initializing containers
QOS_HADOOP_YARN_CONTAINERSKILLED
Count
QOS_HADOOP_YARN_CONTAINERSLAUNCHED
Count
QOS_HADOOP_YARN_CONTAINERSRUNNING
Count
QOS_HADOOP_YARN_ALLOCATEDGB
Gigabytes
QOS_HADOOP_YARN_AVAILABLEGB
Gigabytes
Units
Description
QOS_HADOOP_JOB_TOTAL_LAUNCHED_MAPS
Count
QOS_HADOOP_JOB_TOTAL_LAUNCHED_REDUCES
Count
QOS_HADOOP_JOB_TOTAL_DATA_LOCAL_MAPS
Count
QOS_HADOOP_JOB_SLOTS_MILLIS_MAPS
Milliseconds
Total time spent by all maps in occupied slots, averaged across all
runs
QOS_HADOOP_JOB_SLOTS_MILLIS_REDUCES
Milliseconds
Total time spent by all reduces in occupied slots, averaged across all
runs
QOS_HADOOP_JOB_MILLS_MAPS
Milliseconds
Total time spent by all map tasks, averaged across all runs
QOS_HADOOP_JOB_MILLIS_REDUCES
Milliseconds
Total time spent by all reduce tasks, averaged across all runs
QOS_HADOOP_JOB_VCORES_MILLIS_MAPS
Milliseconds
Total virtual core seconds taken by all map tasks, averaged across all
runs
QOS_HADOOP_JOB_VCORES_MILLIS_REDUCES
Milliseconds
Total virtual core seconds taken by reduce tasks, averaged across all
runs
QOS_HADOOP_JOB_MB_MILLIS_MAPS
Milliseconds
Total megabyte seconds taken by all map tasks, averaged across all
runs
QOS_HADOOP_JOB_MB_MILLIS_REDUCES
Milliseconds
Units
Description
QOS_HADOOP_JOB_FILE_BYTES_READ
Bytes
Number of non-HDFS bytes read from the filesystem, averaged across all runs
QOS_HADOOP_JOB_FILE_BYTES_WRITTEN
Bytes
Number of non-HDFS bytes written to the filesystem, averaged across all runs
QOS_HADOOP_JOB_HDFS_BYTES_READ
Bytes
Number of HDFS bytes read from the filesystem, averaged across all runs
QOS_HADOOP_JOB_HDFS_BYTES_WRITTEN
Bytes
Number of HDFS bytes written to the filesystem, averaged across all runs
Units
Description
QOS_HADOOP_JOB_FILE_READ_OPS
Count
QOS_HADOOP_JOB_FILE_LARGE_READ_OPS
Count
QOS_HADOOP_JOB_FILE_WRITE_OPS
Count
QOS_HADOOP_JOB_HDFS_READ_OPS
Count
QOS_HADOOP_JOB_HDFS_LARGE_READ_OPS
Count
QOS_HADOOP_JOB_HDFS_WRITE_OPS
Count
Units
Description
QOS_HADOOP_JOB_MAP_INPUT_RECORDS
Count
QOS_HADOOP_JOB_MAP_OUTPUT_RECORDS
Count
QOS_HADOOP_JOB_MAP_OUTPUT_BYTES
Bytes
QOS_HADOOP_JOB_MAP_OUTPUT_MATERIALIZED_BYTES
Bytes
QOS_HADOOP_JOB_SPLIT_RAW_BYTES
Bytes
QOS_HADOOP_JOB_COMBINE_INPUT_RECORDS
Count
QOS_HADOOP_JOB_COMBINE_OUTPUT_RECORDS
Count
QOS_HADOOP_JOB_REDUCE_INPUT_GROUPS
Count
QOS_HADOOP_JOB_REDUCE_SHUFFLE_BYTES
Bytes
QOS_HADOOP_JOB_REDUCE_INPUT_RECORDS
Count
QOS_HADOOP_JOB_REDUCE_OUTPUT_RECORDS
Count
QOS_HADOOP_JOB_SPILLED_RECORDS
Count
QOS_HADOOP_JOB_CPU_MILLISECONDS
Milliseconds
QOS_HADOOP_JOB_PHYSICAL_MEMORY_BYTES
Bytes
QOS_HADOOP_JOB_VIRTUAL_MEMORY_BYTES
Bytes
QOS_HADOOP_JOB_COMMITTED_HEAP_BYTES
Bytes
Units
Description
QOS_HADOOP_JOB_SHUFFLED_MAPS
Count
QOS_HADOOP_JOB_FAILED_SHUFFLE
Count
QOS_HADOOP_JOB_MERGED_MAP_OUTPUTS
Count
QOS_HADOOP_JOB_GC_TIME_MILLIS
Milliseconds
Units
Description
QOS_HADOOP_JOB_BYTES_READ
Bytes
QOS_HADOOP_JOB_BYTES_WRITTEN
Bytes
Units
Description
QOS_HADOOP_JOB_SHUFFLE_ERRORS_BAD_ID
Count
QOS_HADOOP_JOB_SHUFFLE_ERRORS_CONNECTION
Count
QOS_HADOOP_JOB_SHUFFLE_ERRORS_IO_ERROR
Count
QOS_HADOOP_JOB_SHUFFLE_ERRORS_WRONG_LENGTH
Count
QOS_HADOOP_JOB_SHUFFLE_ERRORS_WRONG_MAP
Count
QOS_HADOOP_JOB_SHUFFLE_ERRORS_WRONG_REDUCE
Count
hdb
The hdb probe provides a simple database service for managed probes. The robot uses this service for local data storage
and data trending. Collected data survives power outages.
The hdb probe is included in the robot installer and update packages, and deploys during robot installation or upgrade.
The hdb probe does not require configuration and does not have a configuration UI.
health_index
Collecting information about the health of your system over time provides comparison data that will help you detect a change in system behavior.
The health_index probe, which is deployed to the primary hub, collects alarms generated by monitoring probes from the UIM message bus. The
health index calculator uses these alarms to calculate a health score for computer systems and configuration items (CI Type IDs). Health index
information is displayed in CA Unified Service Manager or Infrastructure Manager.
More Information:
Note: Two data points are needed to display a health index chart in the Metrics tab of Unified Service Manager. By default, it can take
up to two hours (or double the health index calculation interval setting) to display the chart.
Max and Min show the values for the interval if the value falls above or below the trend.
Trend provides the linear trend-line value for the health scores generated over a period of time.
Alarm indicates whether an alarm was generated for the interval because the health score breached a user-defined threshold.
QoS message indicates whether a QoS message was generated for the interval. The name of the policy applied to the selected computer
system is also displayed for reference.
health_index Metrics
This article describes the QoS messages generated by the health_index probe.
QoS Metrics
The following QoS messages are generated when health scores breach user-configured health index alarm thresholds.
QoS Name
Units
Description
Version
QOS_HEALTH_INDEX
Health
number
from 1 to
100
Indicates the health score of a computer system. For example, the composite health score of disks,
memory, and CPU on a device. The QoS details for this type of message always contains 13:1.
1.0
QOS_HEALTH_INDEX
Health
number
from 1 to
100
Indicates the health score of a specific configuration item (CI). The QoS details for this type of message
contains ':0' appended to the metric type ID, for example 1.5:0 for the CPU Total Memory metric
configurable for the CDM probe.
1.0
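As the table above notes, the two QOS_HEALTH_INDEX variants are distinguished by the metric type ID in the QoS details: 13:1 for a composite computer-system score, and a ':0' suffix (for example 1.5:0) for a specific configuration item. A hypothetical consumer of these messages could classify them like this (the function and its return labels are illustrative, not part of the product):

```python
def classify_health_index(metric_type_id):
    """Classify a QOS_HEALTH_INDEX message by its metric type ID.

    Per the QoS details documented above: '13:1' marks a composite
    computer-system health score; a ':0' suffix (e.g. '1.5:0') marks a
    health score for a specific configuration item (CI)."""
    if metric_type_id == "13:1":
        return "computer_system"
    if metric_type_id.endswith(":0"):
        return "configuration_item"
    return "unknown"
```

For example, classify_health_index("1.5:0") reports a CI-level score, while classify_health_index("13:1") reports a device-level composite score.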
Note: The history probe does not generate any QoS metrics. Therefore, there are no probe checkpoint metrics to be configured for this
probe.
More information:
history (iSeries QHST Data Monitoring) Release Notes
Verify Prerequisites
Configure General Properties
Create a Monitoring Profile
Using Regular Expressions
Alarm Thresholds
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see history (iSeries QHST Data
Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size: specify the maximum size of the probe log file.
Default: 100 KB
3. Specify the time interval at which the jobs in the history log are scanned for messages. Reduce this interval to generate alarms more frequently.
Note: The alarm messages and suppression key are configured for every profile.
8. When matching: specify the job ID, job name, receiving program, sending program, job sending the log messages, alarm message text,
and alarm type. The probe matches the specified value in the history log. If the specified value is recognized, the probe raises an alarm.
The values can be new or you can select from the existing IDs. You can use regular expressions to specify the value. For more
information, see the Using Regular Expressions section.
9. Enable the alarm severity to indicate the severity code.
10. Specify the severity code. The probe raises an alarm when the specified severity is breached.
Standard
*a
Custom
a?
Standard
Match all two letter values that start with the letter a
*t*
Custom
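Standard patterns such as the a? example above use shell-style wildcards, where ? matches exactly one character and * matches any run of characters. Python's fnmatch module implements the same wildcard semantics, so the matching behavior is easy to check; this is an analogy to clarify the syntax, not the probe's own matcher:

```python
# Shell-style wildcard matching, as in the probe's Standard patterns:
# '?' matches exactly one character, '*' matches any run of characters.
# fnmatchcase is used to keep matching case-sensitive on every platform.
from fnmatch import fnmatchcase

assert fnmatchcase("ab", "a?")        # two-letter value starting with a
assert not fnmatchcase("abc", "a?")   # three letters: '?' is one char only
assert fnmatchcase("cat", "*t*")      # value containing the letter t
assert fnmatchcase("data", "*a")      # value ending with a
```

Custom patterns, by contrast, are full regular expressions and follow regular-expression rather than wildcard semantics.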
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
History Node
Profiles Node
<Profile Name> Node
Messages Node
History Node
The history node allows you to view the probe and alarm message details and configure the log properties.
Navigation: history
history > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
history > Setup Configuration
This section lets you configure the log properties and the log file size of the history probe.
Log Level: specifies the level of details that are written to the log file.
Default: 0-Fatal
Log Size (KB): specifies a maximum size of the probe log file.
Default: 100
Check Interval (Perform check each): specifies the time interval at which the job queue in the system is accessed.
Default: 600
Command on Interval: enables monitoring of updated or deleted commands in the retrieved history log file
Default: Not Selected
Update Command: specifies the command the probe will match in the retrieved history log file to indicate if the command has been
updated.
Cleanup Command: specifies the command the probe will match in the retrieved history log file to indicate if the command has been
deleted.
history > Alarm Messages
This section lets you view the alarm messages and their properties.
Name: indicates the message name which is used to refer to the alarm message.
Text: indicates the alarm message text.
Level: indicates the alarm severity.
Subsystem: indicates the subsystem ID that the probe generates.
Default: indicates whether the message is the default message for this alarm situation.
Profiles Node
This node is used to create a monitoring profile. You can create multiple monitoring profiles, each with a different criteria to monitor the log files.
The new profiles are displayed as child nodes.
Messages Node
This node displays a list of scanned messages from the latest log file.
Verify Prerequisites
Configure the General Properties
Create a Monitoring Profile
Using Regular Expressions
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see history (iSeries QHST Data
Monitoring) Release Notes.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Log size: specify the maximum size of the probe log file.
Default: 100 KB
5. Specify the maximum size of the probe log file.
Default: 100 KB
6. In the Alarm messages section, right-click if you want to create a new message.
7. Enter the following details to configure the properties of the alarm:
The alarm message name
The Alarm Message text, for example, $profile: ($type / $severity) $text (at $time)
The severity level of the alarm
The subsystem ID (SID) of alarms that the probe generates.
The default message for a particular alarm situation
If you want to set another message as default, leave this field empty.
8. Click OK.
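The $-prefixed variables in the alarm message text above ($profile, $type, $severity, $text, $time) are expanded with the matching message's values when the alarm is issued. Python's string.Template uses the same $name syntax, which makes the expansion easy to illustrate; the substituted values below are invented for the example, and this is a sketch rather than the probe's formatter:

```python
from string import Template

# The documented example alarm message text, with $-prefixed variables.
message = Template("$profile: ($type / $severity) $text (at $time)")

# Hypothetical values from a matched QHST message.
expanded = message.substitute(
    profile="QHST-errors",
    type="CPF1234",
    severity="40",
    text="Job ended abnormally",
    time="12:30:05",
)
```

After substitution, expanded reads "QHST-errors: (CPF1234 / 40) Job ended abnormally (at 12:30:05)".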
Delete
Delete the existing profile.
A confirmation message is displayed to confirm this operation.
Important! These values are checked against all QHST messages to determine if a match is found in the logs. All the
selected fields must match for the profile to send an alarm. Regular expressions can be used in all fields except Severity.
Note: To create a profile through the Messages tab, right click and select Create Profile.
Standard
*a
Custom
a?
Standard
Match all two letter values that start with the letter a
*t*
Custom
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Log size: Specifies the size of the probe log file to which the internal log messages of the probe are written. When this size is reached,
new log file entries are added and the older entries are deleted.
Default: 100 KB.
Alarm messages
Specifies the message, which is generated when a threshold is breached. You can create your own messages with the message text and
severity level you desire. The following options are available in the right-click menu for the message list.
New Alarm Window
You can create a new alarm message. The following variables are available for the alarm message:
Name
Specifies a name for the job.
Text
Specifies the Alarm Message text.
Variable expansion is supported for this field. The following variables are available in the message text:
profile
time
severity
text
type
Level
Specifies the severity level of the alarm.
Subsystem
The subsystem ID (SID) of alarms generated by this probe, given as a string or as an ID managed by the NAS.
Default Alarm Situation
The message can be made the default message for a particular alarm situation.
You must leave this field empty if another message is to be the default.
Edit: modifies fields of an existing alarm message.
Delete: removes the selected alarm message.
Message properties
These values are checked against all QHST messages read to determine if the profile matches the message. All checked fields must match for
the profile to match and an alarm to be sent.
Regular expressions can be used in all fields except Severity.
ID: enables the job ID.
Job: enables the job name.
RecvProg (Receiving program name): specifies the receiving program in the history log.
SendProg (Sending program name): specifies the sending program in the history log.
Severity: specifies the alarm severity code.
Text (Message text): specifies the message text.
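As stated above, all checked fields of a profile must match a QHST message for an alarm to be sent, and every field except Severity accepts a regular expression. That all-fields-must-match rule can be sketched as follows (the field names follow the list above; the function is illustrative, not the probe's implementation):

```python
import re

def profile_matches(profile, message):
    """A profile matches a QHST message only if every configured field
    pattern is found in the corresponding message field (logical AND).

    profile: field name -> regular expression to match
    message: field name -> value read from the history log"""
    return all(
        re.search(pattern, message.get(field, ""))
        for field, pattern in profile.items()
    )

# Hypothetical profile and QHST message for illustration.
profile = {"Job": r"^QBATCH", "Text": r"ended abnormally"}
msg = {"Job": "QBATCH01", "Text": "Job 123456 ended abnormally", "ID": "CPF1164"}
```

Here profile_matches(profile, msg) succeeds because both configured fields match; a profile requiring a different job name would not.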
Alarm Tab
Use message: specifies the alarm message that is used when a threshold is breached. The default message is used if no alarm
message is selected.
Suppression key: specifies the suppression key that NAS uses to determine which messages describe the same alarm situation.
Leave this field empty if you want the NAS to use the alarm message text.
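The suppression key lets the NAS treat repeated messages as one ongoing alarm situation; when the field is empty, the alarm message text itself serves as the key. A sketch of that deduplication idea, with invented alarm records (this is hypothetical and not NAS internals):

```python
def suppression_key(alarm):
    """Fall back to the message text when no explicit key is configured,
    mirroring the documented behavior of an empty Suppression key field."""
    return alarm.get("suppression_key") or alarm["text"]

def suppress(alarms):
    """Keep the first alarm per key and count subsequent repeats."""
    seen = {}
    for alarm in alarms:
        key = suppression_key(alarm)
        if key in seen:
            seen[key]["count"] += 1
        else:
            seen[key] = {**alarm, "count": 1}
    return list(seen.values())

# Two identical alarms collapse into one situation with a repeat count.
alarms = [{"text": "disk full"}, {"text": "disk full"},
          {"text": "cpu high", "suppression_key": "cpu-load"}]
deduped = suppress(alarms)
```

Choosing a stable key (rather than a message text that embeds timestamps) is what allows repeats to collapse into one alarm situation.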
Disks
More information:
hitachi (Hitachi Storage Monitoring) Release Notes
Verify Prerequisites
Add Profile
Define an Array for Monitoring
Add a Monitor
Add Monitors Manually
Alarm Thresholds
Verify Prerequisites
Verify that required hardware and software is available and pre-configuration requirements are met before you configure the probe. For more
information, see hitachi (Hitachi Storage Monitoring) Release Notes.
Add Profile
You can add a monitoring profile to connect to the SMI-S server. The hitachi probe also collects and stores data and information from the
monitored components.
Follow these steps:
1. Open the probe configuration interface.
2. Click the Options (icon) next to the hitachi node in the navigation pane.
3. Click Add New Profile.
4.
Note: The profile goes into pending state to retrieve the data for the specified host.
Add a Monitor
You can manually add monitors to the templates for the probe to retrieve the required alarms and QoS data.
3. Specify the following monitor properties:
The type and property of the component.
A description about the monitor.
The type of value to be compared with the threshold value for generating alarms. This value type is also used in the QoS messages.
Number of samples to use for the average value option in the Value Definition field.
The maximum number of samples is 4.
4. Select the checkbox to activate the fields for configuring Alarms and QoS.
5. Enable the Operator, Threshold, Unit, and Message ID fields for configuring the alarms. You can configure both high and low thresholds.
The low threshold generates a warning alarm and the high threshold generates an error alarm.
Note: By default, the high threshold is set to a default value or the current value and the low threshold is disabled.
6. Enable the QoS Name drop-down list, which specifies the QoS message that the monitor can generate.
7. Click OK and Apply to configure the settings.
8. Click OK to restart the probe when prompted.
The configuration properties of the monitor are saved for the probe to generate alarms and QoS messages.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
hitachi Node
<Resource IP Address> Node
<IP Address> Node
Arrays Node
<Array Name> Node
hitachi Node
The hitachi node displays the probe information and enables you to configure the log settings of the probe.
Navigation: hitachi
hitachi > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
hitachi > Probe Setup
This section enables you to configure the default log level for the probe.
Log Level: specifies the level of details that are written to the log file.
Default: 3 - Info
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Started
StartMode
Statistics
Status
Version
Navigation: hitachi > Resource IP Address > IP Address
IP Address > AdapterType
This section configures the AdapterType monitor of the Hitachi storage system.
Units: identifies a unit of the monitor. For example, % and Mbytes
Metric Type ID: identifies the unique identification number of the monitor.
Value Definition: specifies the value type that is compared with the threshold value and generates alarms. The specified value type is
also used in the QoS messages.
Select from the following values:
Current Value
Average Value Last Samples
Delta Value (Current - Previous)
Delta Per Second
Number of Samples: defines the number of samples to calculate the average value. This field appears only for those monitors where
the Value Definition field is enabled.
High Operator: specifies the operator to validate the alarm threshold value. For example, >= 90 means the monitor generates an alarm if
the measured value is more than 90.
High Threshold: defines the threshold value for comparing the actual value.
HighMessage Name: specifies the alarm message that is generated when the threshold value breaches.
Notes:
Similarly you can configure Low Operator, Low Threshold, and Low Message Name.
Typically, the low threshold generates a warning alarm and the high threshold generates an error alarm.
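The Value Definition options and threshold operators described above can be sketched in a few lines, using the documented ">= 90" example; this is an illustration of the documented semantics, not the probe's evaluator:

```python
def derive_value(samples, interval_sec, definition, n=4):
    """Derive the value compared against the threshold.

    samples: measured values, oldest to newest.
    definition: one of the documented Value Definition choices.
    n: number of samples for the average (the docs cap this at 4)."""
    if definition == "current":
        return samples[-1]                      # Current Value
    if definition == "average":                 # Average Value Last Samples
        window = samples[-min(n, 4):]
        return sum(window) / len(window)
    if definition == "delta":                   # Delta Value (Current - Previous)
        return samples[-1] - samples[-2]
    if definition == "delta_per_second":        # Delta Per Second
        return (samples[-1] - samples[-2]) / interval_sec
    raise ValueError(definition)

def breaches_high(value, threshold=90):
    # ">= 90 means the monitor generates an alarm if the measured value
    # is more than 90"; the high threshold yields an error alarm.
    return value >= threshold

samples = [70, 80, 88, 92]
```

With these samples, the current value (92) breaches the high threshold while the four-sample average (82.5) does not, which is why the value definition matters when tuning alarms.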
Arrays Node
The Arrays node is used to classify the list of arrays that are available in the Hitachi storage system.
Navigation: hitachi > Resource IP Address > IP Address > Arrays
The following diagram outlines the process to configure the hitachi probe to collect QoS data for Hitachi storage systems.
Verify Prerequisites
Create or Edit Resources
Define an array to be monitored
Add a Monitor
Add Monitors Manually
Use Templates
Use Auto Configurations
Verify Prerequisites
Verify that required hardware and software is available and pre-configuration requirements are met before you configure the probe. For more
information, see hitachi (Hitachi Storage Monitoring) Release Notes.
Add a Monitor
You can add monitors to the templates, to retrieve the required alarms and data. There are three ways to add monitors:
Add manually - Select monitors from the defined list
Use templates - Define reusable sets of monitors to use with auto configurations
Use Auto Configurations - Apply monitor templates to all components of a resource
Note: The Monitor Properties dialog lets you define the thresholds for generating alarms. You can also configure the QoS
messages by editing the monitor properties.
4.
Note: You can also right-click the monitor and select Edit from the menu.
Note: Initially the high threshold is set to a default value or the current value and the low threshold is not enabled.
6. Select the checkbox to enable the QoS Name drop-down list, which specifies the QoS message that the monitor can generate.
7. Click OK and Apply to configure the settings.
8. Click OK to restart the probe when prompted.
The configuration properties of the monitor take effect, and the probe generates alarms and QoS messages accordingly.
Use Templates
Templates are reusable sets of monitors with pre-configured monitoring properties. You can create templates and apply their monitors to auto
configurations by dragging-and-dropping them on the Auto Configurations node. The probe applies the template monitors to all relevant
components of a resource. The probe also displays all the monitors where the template is applied in the Auto Monitors node.
Create or Edit a Template
You can create a new template or edit the settings of an existing template. A template defines the monitors and their monitoring parameters,
which are applied to the resource.
Follow these steps:
1. Do one of the following actions:
Click the Create New Template icon in the toolbar.
Right-click the Templates node in the left pane, then select New Template from the menu.
To edit an existing template, right-click a template under the Templates node in the left pane, then select Edit from the menu.
2. Enter a name and description in the Template Properties dialog and click OK.
3. Add the monitors (checkpoints) by doing one of the following actions:
Drag-and-drop the monitors from the right pane onto the template node in the left pane.
Right-click on a monitor, then select Add to Template from the menu.
4. Edit the monitor properties, as required. Refer to Edit Monitor Properties for more information.
Click the Active and Enable Monitoring check boxes in the Monitor Properties dialog, to monitor the collected data.
5. Drag-and-drop the template monitor on the Auto Configurations node for applying the monitor to the resource.
6. Click Apply to configure the settings.
7. Click OK to restart the probe when prompted.
Important! Do not add multiple monitors or templates to the Auto Configurations node. This can overburden the system.
Note: You must click the Apply button and restart the probe to activate configuration changes.
The auto configurations feature is implemented through two sub nodes of the All Resources node in the left pane:
Auto Configurations Node: Lists the applied template monitors and individual monitors for the resource. The probe searches through the
resources and applies relevant monitors to its components.
Auto Monitors Node: Lists the auto monitors for the resource. The properties of these monitors are inherited from the applied template
monitors of the Auto Configurations node.
Add a Template Monitor to Auto Configurations
You can add a template monitor to the Auto Configurations node of a resource. This process applies the monitor with preconfigured monitoring
properties to all components of the resource.
Follow these steps:
1. Click the Templates node in the left pane.
The list of templates is displayed in the right pane.
2. Select the template you want to monitor from the right pane.
3. Drag-and-drop the selected template monitor from the right pane onto the Auto Configurations node.
4. Click the Auto Configurations node and verify that the template monitor is listed in the right pane.
5. Click Apply.
6. Click OK to restart the probe when prompted.
The template monitor is applied to all components of the resource.
Add a Monitor to Auto Configurations
You can add a single monitor to the Auto Configurations node of a resource to apply the monitor to all components of a resource.
Follow these steps:
1. Expand the All Resources node in the left pane and click on a component for listing the available monitors.
2. Drag-and-drop the monitor from right pane onto the Auto Configurations node.
3. Click the Auto Configurations node and verify that the monitor is listed.
4. Configure the monitor properties. Refer to Edit Monitor Properties for more information.
5. Click Apply.
6. Click OK to restart the probe when prompted.
The Resources node displays a list of Hitachi resources which are configured in the probe for monitoring. Each resource is an SMI-S provider that
discovers the Hitachi storage systems and provides a connection to them. The resource lets the probe collect and store data for the monitored
components. The Resources node also displays the connection status for each resource:
Resource IP Address
The System node for the Hitachi storage system has the following nodes:
Auto Configurations: configures the unmonitored devices automatically. You can add one or more checkpoints (or templates) to this
node using the drag-and-drop feature.
Auto Monitors: lists the auto monitors created for previously unmonitored devices that are based on the contents of the Auto
Configurations node.
All Monitors: lists all monitors either defined by an auto configuration or by an auto monitor.
Arrays: displays the array equipment hierarchy of a Hitachi storage system. You can view directors, disks, device pools, devices, and
other physical elements of the Hitachi storage system.
Templates Node
The Templates node displays a list of monitoring templates, which contain a list of monitors with their corresponding monitoring properties. You
can drag-and-drop a template monitor on a resource to apply the monitor properties and start monitoring the resource.
Note: When a monitor is selected, the Refresh menu item refreshes the display only and not the updated values. The new values are
retrieved after the poll interval of the selected resource.
Clicking the General Setup button opens the Probe Settings dialog, which contains the following items:
Log-level
This sets the level of details written to the log file. Log as little as possible during normal operation to minimize disk consumption.
0= Fatal
1= Error
2= Warn
3= Info
4= Debug
5= Trace
New Resource
You can click on the New Resource tab and update the following fields to add a new resource to the resource list.
Hostname or IP address: defines the host name or the IP address of the Hitachi storage system.
Port: defines the port number on which the SMI-S provider is listening.
Default: 443
Username: defines the user account for accessing the SMI-S provider.
Password: defines the password for the given username.
Alarm Message: specifies the alarm message to issue when the probe is unable to connect with the Hitachi storage system.
Default: ResourceCritical
Check Interval: specifies the time interval before the next log scans for new messages.
Default: 10
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Namespace: identifies the namespace that is supported by the Device Manager and the SMI-S version.
Use SSL: allows the probe to use HTTPS for connecting the Hitachi storage system.
Organizations: allows you to select all arrays that are to be monitored. Select test to update.
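The New Resource fields above boil down to an SMI-S endpoint plus credentials, a namespace, and a poll interval. A hypothetical helper that assembles those settings (the function, its field names, and the namespace placeholder are illustrative; the port, SSL, and check-interval defaults mirror the documented ones):

```python
def build_resource(host, port=443, use_ssl=True, username="", password="",
                   namespace="root/smis/current", check_interval=10):
    """Assemble connection settings for an SMI-S provider resource.

    The namespace default here is only a placeholder; use the namespace
    supported by your Device Manager and SMI-S version. check_interval
    is the documented default of 10; a shorter interval generates alarms
    more frequently but also increases system load."""
    scheme = "https" if use_ssl else "http"   # Use SSL -> HTTPS
    return {
        "url": f"{scheme}://{host}:{port}",
        "credentials": (username, password),
        "namespace": namespace,
        "check_interval": check_interval,
    }

# Hypothetical SMI-S provider host and account.
res = build_resource("10.0.0.5", username="smisadmin", password="secret")
```

The resulting settings map one-to-one onto the New Resource fields, which makes it clear that the probe talks to the SMI-S provider endpoint, not to the storage array directly.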
Message Pool Manager
You can open the Pool Manager by clicking the Message Pool Manager button in the Tool bar. You can view the list of alarm messages that are
available for the hitachi probe.
Message ID: identifies the alarm message.
Message Text Error: specifies the message content.
Message Text Ok: specifies the clear alarm message text.
Subsystem: specifies the alarm subsystem ID that defines the alarm source.
Token: identifies the predefined alarms.
Severity: specifies the alarm messages severity level.
The alarm messages for each alarm situation are stored in the Message Manager. Right-clicking in the list allows you to customize an alarm text,
and to create and remove messages.
Template Properties
You can create a new template by clicking on the Create New Template tab in the tool bar. The new template will be displayed under the
Templates node:
Name: defines a name of the template.
Description: gives a short description of the template.
hitachi Metrics
This section contains the QoS metrics for the Hitachi Storage Monitoring (hitachi) probe.
QoS Monitor
Unit
Description
Version
QOS_STORAGE_ARRAYGROUP_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_ARRAYGROUP_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_ARRAYGROUP_CAPACITY_FREE_PERCENT
Percent
v1.0
QOS_STORAGE_ARRAYGROUP_CAPACITY_USED_PERCENT
Percent
v1.0
QOS_STORAGE_ARRAYGROUP_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_ARRAYGROUP_CONSUMABLE_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_ARRAYGROUP_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_ARRAYGROUP_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_ARRAYGROUP_REMAINING_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_ARRAYGROUP_TOTAL_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_ARRAYGROUP__EXTENT_STRIPE_LENGTH
Count
v1.0
QOS_STORAGE_ARRAY_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_ARRAY_CAPACITY_FREE_PERCENT
Percent
v1.0
QOS_STORAGE_ARRAY_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_ARRAY_KBYTES_TRANSFERRED
KB
v1.0
QOS_STORAGE_ARRAY_KBYTES_TRANSFERRED_RATE
KB/Sec
v1.0
QOS_STORAGE_ARRAY_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_ARRAY_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_ARRAY_READ_HIT_IOS
Count
v1.0
QOS_STORAGE_ARRAY_READ_IOS
Count
v1.0
QOS_STORAGE_ARRAY_READ_IOS_RATE
Count/Sec
v1.0
QOS_STORAGE_ARRAY_REMAINING_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_ARRAY_TOTAL_IOS
Count
v1.0
QOS_STORAGE_ARRAY_TOTAL_IOS_RATE
Count/Sec
v1.0
QOS_STORAGE_ARRAY_TOTAL_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_ARRAY_WRITE_HIT_IOS
Count
v1.0
QOS_STORAGE_ARRAY_WRITE_IOS
Count
v1.0
QOS_STORAGE_ARRAY_WRITE_IOS_RATE
Count/Sec
v1.0
QOS_STORAGE_ARRAY__PERCENT_USED_CAPACITY
Percent
v1.0
QOS_STORAGE_ARRAY__SAMPLE_INTERVAL
Seconds
v1.0
QOS_STORAGE_COMPONENT_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_DISK_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_DISK_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_DISK_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_DISK_CONSUMABLE_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_DISK_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_DISK_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_EXTVOL__CAPACITY
Gigabytes
v1.0
QOS_STORAGE_EXTVOL__CONSUMABLE_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_LU__A_F_S_P_SPACE_CONSUMED
Gigabytes
v1.0
QOS_STORAGE_LU__REMAINING_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_LU__SAMPLE_INTERVAL
Seconds
v1.0
QOS_STORAGE_LU__S_S_EXTENT_STRIPE_LENGTH
v1.0
QOS_STORAGE_LU__S_S_USER_DATA_STRIPE_DEPTH
v1.0
QOS_STORAGE_LU__S_V_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_LU__S_V_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_LU__S_V_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_POOL_CAPACITY_FREE_PERCENT
Percent
v1.0
QOS_STORAGE_POOL_CAPACITY_USED_PERCENT
Percent
v1.0
QOS_STORAGE_POOL_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_POOL_REMAINING_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_POOL_TOTAL_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_PORT_KBYTES_TRANSFERRED
KB
v1.0
QOS_STORAGE_PORT_MAX_SPEED
Hz
v1.0
QOS_STORAGE_PORT_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_PORT_PORT_NUMBER
Port Number
v1.0
QOS_STORAGE_PORT_PORT_TYPE
Port Type
v1.0
QOS_STORAGE_PORT_SPEED
Hz
Port Speed
v1.0
QOS_STORAGE_PORT_TOTAL_IOS
Count
v1.0
QOS_STORAGE_PORT__SAMPLE_INTERVAL
Seconds
v1.0
QOS_STORAGE_RES_CONFIGURATION_EXECUTION_TIME
Seconds
v1.0
QOS_STORAGE_RES_DISCOVERY_EXECUTION_TIME
Seconds
v1.0
QOS_STORAGE_RES_STATS_EXECUTION_TIME
Seconds
v1.0
QOS_STORAGE_SP_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_THINPOOL__A_F_S_P_SPACE_CONSUMED
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__PERCENT_FREE_CAPACITY
Percent
v1.0
QOS_STORAGE_THINPOOL__PERCENT_SUBSCRIBED
Percent
v1.0
QOS_STORAGE_THINPOOL__PERCENT_USED_CAPACITY
Percent
v1.0
QOS_STORAGE_THINPOOL__REMAINING_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__SPACE_LIMIT
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__SUBSCRIBED_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__S_S_CHANGEABLE_TYPE
v1.0
QOS_STORAGE_THINPOOL__S_V_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_THINPOOL__S_V_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_THINPOOL__S_V_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_THINPOOL__S_V_USAGE
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__THIN_PROVISION_META_DATA_SPACE
Gigabytes
v1.0
QOS_STORAGE_THINPOOL__TOTAL_MANAGED_SPACE
Gigabytes
v1.0
QOS_STORAGE_VOL_BLOCK_SIZE
Bytes
v1.0
QOS_STORAGE_VOL_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_VOL_CONSUMABLE_BLOCKS
Blocks
v1.0
QOS_STORAGE_VOL_CONSUMABLE_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_VOL_CONSUMED_CAPACITY
Gigabytes
v1.0
QOS_STORAGE_VOL_KBYTES_TRANSFERRED
KB
v1.0
QOS_STORAGE_VOL_KBYTES_WRITTEN
KB
v1.0
QOS_STORAGE_VOL_NUMBER_OF_BLOCKS
Blocks
v1.0
QOS_STORAGE_VOL_OPERATIONAL_STATUS
v1.0
QOS_STORAGE_VOL_READ_HIT_IOS
Count
v1.0
QOS_STORAGE_VOL_READ_IOS
Count
v1.0
QOS_STORAGE_VOL_TOTAL_IOS
Count
v1.0
QOS_STORAGE_VOL_WRITE_HIT_IOS
Count
v1.0
QOS_STORAGE_VOL_WRITE_IOS
Count
v1.0
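Several of the capacity metrics above pair a block size with a block count; in the CIM model, a component's consumable capacity in bytes is its block size multiplied by its consumable block count. A minimal sketch of that conversion (an illustrative helper, not part of the probe):

```python
def consumable_capacity_gb(block_size_bytes, consumable_blocks):
    """Derive a gigabyte capacity figure, as reported by metrics such as
    QOS_STORAGE_DISK_CONSUMABLE_CAPACITY, from the raw block metrics
    (QOS_STORAGE_DISK_BLOCK_SIZE and QOS_STORAGE_DISK_CONSUMABLE_BLOCKS).
    Assumes 1 GB = 2**30 bytes."""
    return block_size_bytes * consumable_blocks / 2**30
```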
hitachi Troubleshooting
This article describes how to verify your configuration and communications, and how to examine the log file for detailed information.
To verify your setup, perform the following steps:
1. Verify the probe configuration.
2. Verify network communication.
User Name - Enter a valid user name. If no user was set up, use cimuser as the default user.
Password - Enter the correct password. If no user was set up, use password as the default password.
Note: If you are unable to collect performance data, make sure the firmware version of the Hitachi system is 7.0 or higher; upgrade the
firmware if necessary.
Error Messages
You can view error messages in the log dialog by right-clicking the probe icon and selecting View Log.
You can increase the level of logging from the Probe Setup section in the probe AC version or the General Setup in the probe IM version.
Note: If, after these steps, you still have difficulties, you may want to use a third-party tool such as CimNavigator (not affiliated with or
endorsed by CA) to view and manipulate the CIM objects on the SMI-S server. A tool of this kind may help you identify configuration
issues.
Use the CLI option only when data is not available through the SMI-S provider. You must start the Common Information Model (CIM) server on your
system to enable communication between the probe and the HP 3PAR storage system through the SMI-S provider. To run the CLI commands,
the probe makes an SSH connection to the 3PAR system. The Storage Management Initiative Specification (SMI-S) is an industry standard supported
by multiple storage vendors; it uses an object-oriented model based on the Common Information Model (CIM) to define the objects and services
that comprise a Storage Area Network (SAN). The CIM API uses the HTTP or HTTPS protocol with dedicated TCP ports (5988/5989 by default).
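For reference, the endpoint the probe connects to can be derived from the host and the Use SSL setting; a minimal sketch (the helper name and defaults are illustrative, not part of the probe):

```python
def smis_endpoint(host, use_ssl=False, port=None):
    """Build the CIM/SMI-S endpoint URL for a storage system.
    TCP 5988 (HTTP) and 5989 (HTTPS) are the dedicated default ports;
    pass an explicit port if the SMI-S provider listens elsewhere."""
    scheme = "https" if use_ssl else "http"
    if port is None:
        port = 5989 if use_ssl else 5988
    return "%s://%s:%d" % (scheme, host, port)
```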
More Information:
hp_3par (HP 3PAR Storage Monitoring) Release Notes
Contents
Verify Prerequisites
Naming Convention For Storage Objects
Alarm Thresholds
Verify Prerequisites
Verify that the required hardware and software are available and the preconfiguration requirements are met before you configure the probe. For more
information, see hp_3par (HP 3PAR Storage Monitoring) Release Notes.
6. Define the Username and Password of the user account to access the HP 3PAR system.
7. Select Active to activate the profile for monitoring. By default, the profile is active.
You can skip this step if you do not want the profile to start monitoring on creation.
8. Specify the time interval in seconds after which the probe collects data from the HP 3PAR system for the specific profile. Reduce this
interval to generate alarms sooner, which can increase the system load; increase this interval to generate alarms later and reduce the
system load.
Default: 600
9. Specify the Alarm Message to be generated when the profile is unable to retrieve monitoring data. For example, the profile does not
respond if there is a connection failure or inventory update failure.
Default: ResourceCritical
10. Select Use SSL to allow the probe to use HTTPS for connecting to the HP 3PAR system.
11. Click Submit.
12. Click Save to save the configuration.
The new monitoring profile is created and displayed under the hp_3par node. The monitoring categories for the storage objects are
displayed as nodes below the Profile Name node.
13. Verify the connection between the probe and the storage server through the Test Connection button under the Actions drop down.
A connection successful message is displayed if the connection is verified.
14. Save the configuration to start monitoring.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Template Editor
hp_3par Node
<Profile Name> Node
<Resource Name> Node
<Storage Object Name> Node
Template Editor
The Template Editor interface is used to create, modify, or delete templates that can be applied to the probe. The editor allows you to define
templates that can be applicable across multiple profiles. For more information, see hp_3par Apply Monitoring with Templates.
hp_3par Node
The hp_3par node contains configuration details specific to the probe. This node enables you to view the probe information and configure the log
properties of the probe.
Navigation: hp_3par
hp_3par > Configuring hp_3par Monitoring
This section displays the minimum configuration settings required to monitor a storage object using the hp_3par probe.
hp_3par > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
hp_3par > Probe Setup
This section enables you to configure the default log level and log size for the probe.
Log Level: specifies the level of details that are written to the log file. Log as little as possible during normal operation to minimize disk
consumption, and increase the amount of detail when debugging.
Default: 3-Info
Log Size: specifies the size of the log file to which the internal log messages of the probe are written, in kilobytes. When this size is
reached, new log file entries are added and the older entries are deleted.
Default: 1000
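The Log Size behavior described above (new entries are kept, the oldest entries are discarded once the limit is reached) can be modeled as a bounded buffer. The following is an illustrative sketch only, not the probe's actual implementation:

```python
from collections import deque

class BoundedLog:
    """Keeps the newest entries within a byte budget: when the limit is
    reached, the oldest entries are dropped as new ones are written."""
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = deque()
        self.size = 0

    def write(self, line):
        self.entries.append(line)
        self.size += len(line)
        # Drop the oldest entries until we are back under the limit.
        while self.size > self.max_bytes and len(self.entries) > 1:
            self.size -= len(self.entries.popleft())
```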
hp_3par > Options (icon) > Add New Profile
This button allows you to add a resource as a monitoring profile to the probe.
respond if there is a connection failure with the HP_3PAR storage system or inventory update failure.
Default: ResourceCritical
Use SSL: allows the probe to use HTTPS for connecting to the hp_3par storage system.
NameSpace: identifies the namespace that is supported by the Device Manager and the SMI-S version. This field is read-only.
Contents
Prerequisites
How to Monitor the HP 3PAR StoreServ
Alarm Thresholds
Prerequisites
Verify that required hardware, software, and information is available before you configure the probe. For more information, see hp_3par (HP
3PAR Storage Monitoring) Release Notes.
The following are the preconfiguration requirements for the HP 3PAR Storage Monitoring probe.
The CIM server is required as a data collector for interfacing with the HP 3PAR storage system using the SMI-S provider. You must start
the CIM server on your system to enable communication between the probe and the HP 3PAR storage system using the SMI-S provider.
To enable the CIM Server via the CLI, use startcim command:
# startcim
To disable the CIM Server via the CLI, use stopcim command:
# stopcim
To display the overall CIM Server status, use the showcim command:
# showcim
How to Monitor the HP 3PAR StoreServ
This section describes the minimum configuration settings required to configure the hp_3par probe to monitor the storage objects.
Follow these steps:
1. Open the probe configuration interface.
2. Click the Options (icon) next to the hp_3par node in the navigation pane.
3. Click the Add New Profile option.
4. Set or modify the following values in the Add New Profile window:
Hostname: specifies the host name or IP address of HP 3PAR storage system.
Port: specifies the port number on which the SMI-S provider is listening.
Username: defines the user account for accessing the HP 3PAR system.
Password: defines the password for the given username.
Active: activates the profile for monitoring. By default, the profile is active.
Interval (secs): specifies the time interval in seconds after which the probe collects the data from the HP 3PAR system for the specific
profile.
Default: 600
Alarm Message: specifies the alarm to be generated when the profile is not responding. For example, the profile does not respond if
there is a connection failure or inventory update failure.
Default: ResourceCritical
Use SSL: allows the probe to use HTTPS for connecting to the HP 3PAR system.
NameSpace: identifies the namespace that is supported by the Device Manager and the SMI-S version. This field is read-only.
5. Click Submit and click Save to save the configuration.
The new monitoring profile is created and displayed under the hp_3par node. The monitoring categories for the storage objects are
displayed as nodes below the Resource IP Address node.
6. Verify the connection between the probe and the storage server through the Test Connection button under the Actions drop down.
7. Configure the alarms and thresholds in the monitors for the desired storage object. For example, to configure the alarms and thresholds
in the monitors for the Physical Disks node:
a. Navigate to hp_3par > Resource IP Address > IP Address > Physical Disks > Disk Name > Monitors
b. The monitors are visible in a tabular form. Select any one monitor from the table, and configure its properties.
8. Save the configuration to start monitoring.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Template Editor
hp_3par Node
<Resource IP Address> Node
<IP Address> Node
<Storage Object Name> Node
Template Editor
The Template Editor interface is used to create, modify, or delete templates that can be applied to the probe. The editor allows you to define
templates that can be applicable across multiple profiles. For more information, see hp_3par Apply Monitoring with Templates.
hp_3par Node
The hp_3par node contains configuration details specific to the HP 3PAR Storage Monitoring probe. This node enables you to view the probe
information and configure the log properties of the probe.
Navigation: hp_3par
hp_3par > Configuring hp_3par Monitoring
This section describes the minimum configuration settings required to monitor a storage object using the hp_3par probe.
hp_3par > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
hp_3par > Probe Setup
This section enables you to configure the default log level and log size for the probe.
Log Level: specifies the detail level of the log file.
Default: 3-Info
Log Size: specifies the size of the log file in kilobytes.
Default: 1000
hp_3par > Options (icon) > Add New Profile
This button allows you to add a resource as a monitoring profile to the probe.
The Resource IP Address node displays the resource connection properties and enables you to update the properties. The resource properties
include IP address of the resource, port, and user authentication details.
Navigation: hp_3par > Resource IP Address
Set or modify the following values, if needed:
IP Address > Options (icon) > Delete Profile
This button allows you to delete the monitoring profile from the probe.
IP Address > Resource Setup
This section enables you to edit the resource properties, which are configured while creating a profile.
Hostname: specifies the host name or IP address of hp_3par storage system.
Port: specifies the port number on which the SMI-S provider is listening.
Default: 5988
Username: defines the user account for accessing the hp_3par system.
Password: defines the password for the given username.
Active: activates the profile for monitoring. By default, the profile is active.
Interval (secs): specifies the time interval (in seconds) after which the probe collects the data from the hp_3par system for the specific
profile.
Default: 600
Alarm Message: specifies the alarm message to be generated when the profile is not responding. For example, the profile does not
respond if there is a connection failure with the HP_3PAR storage system or inventory update failure.
Default: ResourceCritical
Use SSL: allows the probe to use HTTPS for connecting to the hp_3par storage system.
NameSpace: identifies the namespace that is supported by the Device Manager and the SMI-S version. This field is read-only.
Create Template
You can create a new template to configure multiple existing profiles with the same monitor configuration.
Follow these steps:
1. Open the probe configuration interface.
2. Click Template Editor.
3. The Template Editor page is displayed.
4. Click the Options (icon) next to the hp_3par probe node.
5. Click Create Template.
6. Specify a name and description for the template.
7. Enter the precedence value if you want to modify the default precedence setting.
For more information, see the Template Precedence Rules section.
8. Click Submit to create the template.
The <templateName> node is displayed followed by all the profile nodes of the probe.
9. (Optional) Click the <templateName> node to view the monitors and configure and include them in the template.
10. (Optional) Click the Options (icon) next to nodes representing multiple possible values (such as Physical Disks) to create filters.
You can also configure the default Auto Filters.
11. Create rules for filters, as needed.
For more information, see the Creating Filter Rules section.
12. Select Active to activate the template.
13. Select Save.
The template is created and applied to the probe at the next probe interval.
Note: Probe settings that are configured using templates are not available for individual configuration. You must clear the Active checkbox
to deactivate the template and unlock the required setting for modifications.
Note: The rules for setting the precedence value for filters are the same as the rules for setting precedence for templates.
Note: You must activate the template for the probe to apply the monitor configuration. When you change the template state to active,
the probe immediately applies all template configuration, including filters, rules, and monitors.
Format | Example | Explanation
Standard (PCRE) | [A-Z] |
Custom | \d* |
hp_3par Metrics
This article describes the metrics that can be configured using the HP 3PAR Storage Monitoring (hp_3par) probe.
The following table lists the Device metrics you can collect with the hp_3par probe. This table also lists the CIM element if the data collection
method is SMI-S or the CLI command if the data collection method is CLI.
Resource Entity: Device

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
IOPS | CLI | statpd | QOS_HP_3PAR_IOPS | Count | 1.0
Latency | CLI | statpd | QOS_HP_3PAR_LATENCY | ms | 1.0
Total Capacity | CLI | showsys | QOS_HP_3PAR_RAW_TOTAL_CAPACITY | GB | 1.0
Used Capacity | CLI | showsys | QOS_HP_3PAR_RAW_USED_CAPACITY | GB | 1.0
Free Capacity | CLI | showsys | QOS_HP_3PAR_RAW_FREE_CAPACITY | GB | 1.0
Free Capacity Percent | CLI | showsys | | | 1.0
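The Free Capacity Percent monitor is derived from the raw totals that showsys reports; the calculation, sketched as a hypothetical helper (not part of the probe):

```python
def free_capacity_percent(raw_free_gb, raw_total_gb):
    """Percentage of raw capacity that is still free, derived from the
    same showsys figures behind QOS_HP_3PAR_RAW_FREE_CAPACITY and
    QOS_HP_3PAR_RAW_TOTAL_CAPACITY."""
    return 100.0 * raw_free_gb / raw_total_gb
```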
Resource Entity: Physical Disks

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
Availability | SMI-S | CIM_DiskDrive | QOS_PHYSICAL_DISK_AVAILABILITY | State | 1.0
Total Capacity | SMI-S | CIM_DiskDrive | QOS_PHYSICAL_DISK_CAPACITY | GB | 1.0
Latency | CLI | | QOS_PHYSICAL_DISK_LATENCY | ms | 1.0
IOPS | CLI | statpd | | Count | 1.0
Utilization | SMI-S | CIM_DiskDrive | QOS_PHYSICAL_DISK_UTILIZATION | | 1.0
Resource Entity: Logical Disks (Raid Group)

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
Availability | CLI | showld | QOS_LOGICAL_DISK_AVAILABILITY | State | 1.0
| CLI | showld | QOS_LOGICAL_DISK_CAPACITY | GB | 1.0
Latency | CLI | showld | QOS_LOGICAL_DISK_LATENCY | ms | 1.0
IOPS | CLI | statld | QOS_LOGICAL_DISK_IOPS_RATE | Count | 1.0
Throughput | CLI | statld | QOS_LOGICAL_DISK_THROUGHPUT | KB/s | 1.0
Utilization | CLI | statld | QOS_LOGICAL_DISK_UTILIZATION | | 1.0
CPG Metrics
The following table lists the CPG metrics you can collect with the hp_3par probe. This table also lists the CIM element if the data collection
method is SMI-S or the CLI command if the data collection method is CLI.
Resource Entity: CPG (Pool)

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
Total Capacity | | | | GB | 1.0
Utilization | | | | | 1.0
Resource Entity: Virtual Volumes

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
Availability | SMI-S | SNIA_StorageVolume | QOS_VIRTUAL_VOLUME_AVAILABILITY | State | 1.0
Total Capacity | SMI-S | SNIA_StorageVolume | QOS_VIRTUAL_VOLUME_CAPACITY | GB | 1.0
Latency | CLI | | QOS_VIRTUAL_VOLUME_LATENCY | ms | 1.0
IOPS | CLI | statvv | | Count | 1.0
Utilization | CLI | showvv -s | QOS_VIRTUAL_VOLUME_UTILIZATION | | 1.0
Ports Metrics
The following table lists the Ports metrics you can collect with the hp_3par probe. This table also lists the CIM element if the data collection
method is SMI-S or the CLI command if the data collection method is CLI.
Resource Entity: Ports

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
Availability | | | QOS_PORT_AVAILABILITY | State | 1.0
Resource Entity: Controller

Monitor Name | API | CIM Element/Command | QoS Monitor | Unit | Version
CPU Utilization | CLI | statcpu -t | | | 1.0
You can configure the Auto-Operator functionality of the Alarm Server (NAS) for automatically assigning the alarm to the designated CA Unified
Infrastructure Management user. You can also assign the alarm manually from the IM to generate a new incident.
Note: The HP Service Manager Gateway probe initiates a new incident request when an alarm is assigned to the CA Unified
Infrastructure Management user. This username is configured in the HP Service Manager Gateway probe.
More information:
hpsmgtw (HP Service Manager Gateway) Release Notes
hpsmgtw Prerequisites
The prerequisites for the hpsmgtw probe are as follows:
The user credentials to access the HPSM application for working with incidents.
The UIM user to assign alarms and trigger creating an incident in the HPSM.
Note: We recommend that you create a separate user in Infrastructure Manager for assigning the alarms.
Configure a Node
Add Field Mapping Details
Delete Field Mapping Details
Configure the Offline Management Mode
Configure the subscribe_alarm_closure Key
Configure the subscribe_alarm_updates Key
Configure a Node
This procedure provides the information to configure a particular section within a node.
Each section within the node lets you configure the probe properties. These properties are used for generating incident in the HPSM application
for the CA Unified Infrastructure Management Probes alarm.
Follow these steps:
1. Select the appropriate navigation path.
2. Update the field information and click Save.
The specified section of the probe is configured.
The probe is now ready for generating incidents in the HPSM application.
Note: The Service Desk field list appears only if valid credentials are provided in the Server Configuration section of the hps
mgtw node.
4. Specify the Alarm Field or define the Default Value for the selected Service Desk Field.
5. Click Submit.
The mapping details are saved and displayed in the Mapping table of the Field Mapping node.
Note: You can map a field (an Alarm field or a Service Desk field) again for updating the field mapping details.
0 - The probe does not update the incident status.
1 - The probe updates the incident status.
By default, the value is 1.
4. Click Apply.
5. Close the raw configuration GUI.
6. Restart the probe for applying the changes.
The subscribe_alarm_closure key is configured.
Important! The subscribe_alarm_updates key does not update the incident status to Close or Resolve when the alarm is
acknowledged in the CA Unified Infrastructure Management server.
0 - The probe does not update the incident information.
1 - The probe updates the incident information.
By default, the value is 1.
4. Click Apply.
5. Close the raw configuration GUI.
6. Restart the probe for applying the changes.
hpsmgtw Node
<IP Address> Node
Field Mapping Node
Alarm Severity Mapping
HPSM Mandatory Fields Default Configuration
The HP Service Manager Gateway probe is configured to create incidents automatically or manually from CA Unified Infrastructure
Management alarms. The probe enables bi-directional synchronization of alarm and incident status.
The HP Service Manager Gateway probe creates an incident in the HPSM application when alarms are assigned to the CA Unified Infrastructure
Management user. The probe automatically acknowledges the corresponding alarms once these incidents are closed in the HPSM application.
hpsmgtw Node
The hpsmgtw node is used for establishing a connection between the probe and the HPSM application.
This section contains configuration details specific to the HP Service Manager Gateway probe.
Navigation: hpsmgtw
Set or modify the following values as required:
hpsmgtw > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
hpsmgtw > Server Configuration
This section lets you configure the URL of the HPSM application and user credentials for authorization.
Server URL: Defines the WSDL URL of the HPSM application for retrieving the Web Service description. This Web Service exposes
methods for performing necessary operations on the HPSM application.
Username: Defines the user name for logging in to the HPSM application. The username is case-sensitive.
Password: Defines the password for the HPSM user.
Note: Use the Test option from the Actions drop-down list for verifying the connectivity between the probe and the HPSM
application.
Note: The IP address node displays the actual IP of the HPSM application server.
closed incidents.
The sweeptime_before_lastrun key stores an integer value that denotes the time in minutes. The default value of the
sweeptime_before_lastrun key is 10 minutes; this key is configured by using the Raw Configure option under the setup section.
Default: 30
NAS Address: Defines the address of the local Nimsoft Alarm Server (NAS) in the /<Domain>/<Hub>/<Robot>/nas format where the
probe is deployed. The address is case-sensitive.
Date Format: Defines the date format that matches the date format of the HPSM platform. The closed incident retrieval timer uses this
format to build queries. The configuration file can use the following date formats:
MM/dd/yyyy HH:mm:ss
dd/MM/yyyy HH:mm:ss
Timezone: Specifies the time zone code for storing the time value.
Important! The time zone must be the same as the one configured in the HPSM application against the Username.
Incident Id Custom Field: Specifies the custom field of the alarm (custom_1 to custom_5) to save the incident id of the corresponding
incident.
On Cleared Alarm: Specifies the incident status when the alarm is acknowledged.
IP address > Connection Warning Alarm
This section lets you view the warning alarm properties when the probe is not able to connect with the HPSM application.
IP address > New Incident Warning Alarm
This section lets you view the warning alarm properties when the probe is not able to generate incidents on the HPSM platform.
Impact
Urgency
Critical
1 - Enterprise
1 - Critical
Major
2 - Site/Dept
2 - High
Minor
3 - Multiple Users
3 - Average
Warning
3 - Multiple Users
3 - Average
Information
4 - User
4 - Low
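The mapping above can be expressed as a simple lookup table. The following sketch shows how a script consuming UIM alarms might derive the HPSM fields; the function and table names are illustrative, not part of the probe:

```python
# Default mapping of UIM alarm severity to HPSM Impact and Urgency,
# as listed in the table above.
SEVERITY_MAP = {
    "critical":    ("1 - Enterprise",     "1 - Critical"),
    "major":       ("2 - Site/Dept",      "2 - High"),
    "minor":       ("3 - Multiple Users", "3 - Average"),
    "warning":     ("3 - Multiple Users", "3 - Average"),
    "information": ("4 - User",           "4 - Low"),
}

def hpsm_fields(severity):
    """Return the (Impact, Urgency) pair for a UIM alarm severity."""
    return SEVERITY_MAP[severity.lower()]
```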
You can use the Raw Configure option and update following key values under the urgency section:
critical
major
minor
warning
information
Important! Do not change mapping of the alarm severity field with the Impact and Urgency fields.
Field Name
Default Value
Create Incident
Assignment Group
Hardware
Service
Applications
Description
Area
hardware
Sub Area
missing or stolen
Impact
Severity
Urgency
Severity
Update Incident
Journal Updates
Close Incident
Closure Code
Automatically Closed
Solution
Fixed
Important! Do not change value of the Mandatory Fields key under the Setup section. Any change in this key that is not supported on
the HPSM platform crashes the probe.
hpsmgtw Metrics
This section contains the alert metric default settings for the HP Service Manager Gateway (hpsmgtw) probe.
Alert Metric
Warning
Threshold
Warning
Severity
Error
Threshold
Error
Severity
Description
Version
GTWConnectionWarning
Major
v1.0
GTWNewIncidentWarning
Major
v1.0
Note: Use the Raw Configure option for updating values of the Name, Severity, Token, msg_err, subsystem, suppression_id, and
msg_ok fields of the given messages.
hub
A CA UIM hub serves as the communication center for a group of robots. A hub binds robots into a logical group with the hub as the central
connection point. Hubs are typically set up based on location (such as a lab or building) or by service (such as development). A hub can connect
LDAP Requirements
UIM Server currently works with the following LDAP versions:
Novell eDirectory (TM) 8.8 SP1 (20114.57) and a Novell KDC (Key
Distribution Center) server
SUN Java Directory Server v5.2
Windows 2008 and Windows 2012 Active Directory
Note: If you have secondary hubs in your deployment, create queues between them and the primary hub. If secondary hubs are
separated from the primary hub by a firewall:
1. Create a static route to the hub.
2. Create tunnels and queues to connect the secondary hubs to the primary hub.
3. Delete the static route.
1. In Admin Console, expand the tunnel server hub, and select the hub robot.
2. Select the hub probe in the right pane, and click the green icon to expand the drop-down list.
3. Select Configure to open the hub Probe Configuration page.
4. In the left pane, select Advanced, Tunnel Settings to open the Tunnel Settings pane.
5. Select the Ignore Controller First Probe Port checkbox.
6. Enter the first port number to use in First Tunnel Port. The hub skips ports that are used by the controller and probes.
7. Click Save.
How port assignment works
For each additional tunnel, the tunnel server increments the port number, and assigns the port to the tunnel client.
The client keeps the port as long as the hub is running.
The server does not track disconnected clients. As long as any tunnel client is connected to the server, the number increments; a
previously used port that becomes available is ignored.
When there are no active clients, the counter resets.
If you plan to configure more than one tunnel, we recommend that you specify the first port. The hub skips ports that are used
by the controller and probes.
If the First Tunnel Port field is blank, the operating system assigns random ports.
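The assignment scheme described above can be modeled as a small counter. The following is an illustrative sketch, not the hub's actual implementation:

```python
class TunnelPortCounter:
    """Illustrative model of tunnel port assignment: each new client
    gets the next port after First Tunnel Port, ports in use by the
    controller or probes are skipped, and the counter resets only
    when no clients remain connected."""
    def __init__(self, first_port):
        self.first_port = first_port
        self.next_offset = 0
        self.active = set()

    def connect(self, client, used_ports=()):
        # Skip ports already used by the controller or probes.
        port = self.first_port + self.next_offset
        while port in used_ports:
            port += 1
        self.next_offset = port - self.first_port + 1
        self.active.add(client)
        return port

    def disconnect(self, client):
        self.active.discard(client)
        if not self.active:
            # When there are no active clients, the counter resets.
            self.next_offset = 0
```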
1. In Admin Console, expand the tunnel server hub, and select the hub robot.
2. Select the hub probe in the right pane, and click the green icon to expand the drop-down list.
3. Select Configure to open the hub Probe Configuration page.
4. In the left pane, select Tunnel to open the Tunnel Activation pane.
5. Select the Tunnel Active checkbox, and click Save.
6. Click OK to acknowledge the configuration change.
3. In the Common Name field, enter the IPv4 address, IPv6 address, or the DNS-resolvable host name of the tunnel client hub. Use regular
expressions to specify multiple tunnel client hubs, and create multiple client certificates.
4. Enter a password for the tunnel client certificate.
5. In Expiration Days, specify the number of days the tunnel client certificate is valid.
6. Click Actions, and select Create Client Certificate.
7. Click Reload to refresh the page.
8. Scroll down to the Certificate field below the Client Certificate List.
9. Select all the text in the Certificate field, including BEGIN CERTIFICATE and END CERTIFICATE, and copy the client certificate to the
clipboard. If you do not plan to configure the tunnel client immediately, save the certificate to a plain text file.
If you have a static route to a hub that is connected by a tunnel, delete the unsecured static route.
1. Expand the hub with the static route, and select the hub robot.
2. Select the hub probe in the right pane, and click the green icon to expand the drop-down list.
3. Select Configure to open the hub Probe Configuration page.
4. Select Name Services and delete the static route.
Note: Use Infrastructure Manager to create access lists. Tunnel access lists created with the Admin Console hub configuration UI can
have issues. Refer to Set Up Access List Rules in the hub IM Configuration article for details.
Note: In a high-volume environment, create separate queues for important subjects, such as alarm, or for subjects that create many
messages. Create one multiple-subject queue for all subjects that are not critical.
Subject | Used by
alarm | Alarm messages
alarm2 |
alarm_new |
alarm_update |
alarm_close | Message that is sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign | Message that is sent when a client assigns an alarm
alarm_stats | Statistical event messages, generated by the NAS probe, that contain the severity level summary information for all open alarms
audit |
probe_discovery | Device information
QOS_BASELINE |
QOS_DEFINITION |
QOS_MESSAGE |
Example Queue
The following example shows how to configure an attach and a get queue pair with the subject probe_discovery. The example demonstrates how
automated discovery data flows up from secondary hubs to the discovery server on the primary hub.
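On disk, such a pair amounts to an attach queue on each secondary hub and a matching get queue on the primary hub. The fragment below sketches what the hub.cfg definitions might look like; the section and key names are assumptions based on typical hub queue entries, so create the queues through the hub configuration UI rather than editing the file by hand:

```
# hub.cfg on each secondary hub: an attach queue that collects
# probe_discovery messages (illustrative key names)
<queues>
   <discovery_attach>
      active = yes
      type = attach
      subject = probe_discovery
   </discovery_attach>
</queues>

# hub.cfg on the primary hub: a get queue that pulls from the
# secondary hub's attach queue (illustrative key names)
<queues>
   <discovery_get>
      active = yes
      type = get
      remote_queue_name = discovery_attach
      address = /Domain/SecondaryHub/hub-robot/hub
   </discovery_get>
</queues>
```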
Important! We recommend that server hubs and robots reside behind a firewall, with secondary hubs connected by secure tunnels. In
this configuration, the secondary hubs and robots can use SSL compatibility mode or SSL Only mode.
At startup, the controller creates the UIM_HOME/robot/robot.pem file, which enables SSL communication. The file contains the key
that decodes the encrypted CA UIM messages.
Important! v7.70 of the CA UIM hub and robot have improved the robustness of SSL communication. Before, in a nontunneled domain,
hubs that are configured for unencrypted communication could decode and respond to encrypted messages. In a multiple hub domain,
upgrading to v7.70 breaks this type of communication. See Impact of Hub SSL Mode When Upgrading Nontunneled Hubs in the Hub
Release Notes.
The tunnels that are set up between hubs remain after you upgrade, and communication continues. We strongly recommend that you
connect all hubs with tunnels to ensure the security of communications.
option is false, the user tags are not copied again. The user can unset the user tags in the hub intentionally, and the tags are not rewritten.
After upgrading to v7.80, existing user tags in robot.cfg continue to propagate automatically. The user tags do not need to be manually
configured in hub.cfg.
hub.cfg. The user tags apply to the following alarms:
robot up
robot down
probe up
probe down
The following table compares the user tag behavior for hub v7.63, v7.70, and v7.80 for the robot alarms.
Consider a hub with user tags hub1 and hub2, and a robot with user tags rob1 and rob2. The robot can be local or remote.
User tags in robot alarms sent by the hub:
                      7.63   7.70   7.80 (Option OFF)   7.80 (Option ON)
user_tag_1            rob1   hub1   rob1                hub1
user_tag_2            rob2   hub2   rob2                hub2
context_user_tag_1    N/A    N/A    hub1                rob1
context_user_tag_2    N/A    N/A    hub2                rob2
User tags in other hub alarms and messages processed by the hub spooler (no changes since 7.70):
                      7.70, 7.80
user_tag_1            hub1
user_tag_2            hub2
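In configuration-file terms, the robot-level tags above can be sketched as follows. This fragment is illustrative only: the key names os_user1 and os_user2 are assumed to be the keys behind User Tag 1 and User Tag 2 in the controller section of robot.cfg, and the hub-level keys in hub.cfg are assumed to follow the same naming. Configure tags through the hub and robot configuration interfaces where possible.

```
<controller>
   os_user1 = rob1
   os_user2 = rob2
</controller>

<hub>
   os_user1 = hub1
   os_user2 = hub2
</hub>
```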
Hub
Advanced
SSL
Tunnel Settings
Hub List
Name Services
Queue List
Robot List
Tunnel
1 - Tunnel Server
2 - Tunnel Client
3 - Tunnel Access List
Note: To access the hub configuration interface, select the hub robot in the Admin Console navigation pane. In the Probes list, click the
checkmark for the hub probe, and select Configure.
Hub
To view information about the hub, and adjust log file settings, use the hub node.
Probe Information contains the probe name, the start time, the installed version, and the vendor
Hub Information contains the hub name, domain, IP address, hub address, /domain/hub_name/robot_name/hub, and the length of the
up time
General Configuration is where to modify user tags and log file settings:
Identification property User Tag 1 and Identification property User Tag 2 (Optional)
User tags are optional values that can be attached to probe-generated messages to control access to the data in USM.
On a robot system, user tags are specified in robot.cfg.
As of hub v7.70, user tags on a hub system are specified in hub.cfg. User tags that are defined in robot.cfg are
ignored.
Before hub v7.70, user tags on a hub system are read from robot.cfg.
Default: blank
Log Level specifies the level of alarm information that is saved in the log file.
0 - Fatal (default) logs the least information
5 - Trace logs all alarms
During normal operation, log at a low level to reduce disk usage
Increase the logging level during debugging
Log Size specifies the amount of data that is retained in the log file. Default, 1024 KB
Large log files can cause performance issues and can deplete disk space.
License Information is where to modify the license information for the hub
License contains the current license string.
With an invalid license:
Messages stop flowing from the hub to the subscribers, typically service probes
The robot spooler does not upload messages
Expire Date contains the date that the license expires. The field is populated automatically.
Licenses Available indicates how many robot licenses are available.
Licenses Total indicates the total number of robot licenses.
Advanced
Use the Advanced node to control the hub connectivity.
Hub Settings controls how the hub communicates
Hub Request Timeout specifies how long the hub waits for a response from other hubs. Default, 30 seconds
Hub Update Interval specifies how often the hub sends messages to other hubs. Default, 600 seconds
Origin identifies the sender for data that is sent from the probes.
The origin is used during report generation.
The origin is obtained from the controller probe configuration.
If the origin is not specified in the controller, the field is blank, and the hub name is used.
Disable IP Validation turns off the hub IP address validation for servers sending requests to probes. Disable IP address validation in a
Network Address Translation (NAT) environment.
Login Mode options
Normal (default) permits login from any robot that is connected to the hub
Local Machine Only permits a login only from the server which is hosting the hub
No Login disables login to the hub
Broadcast Configuration controls how the hub lets other hubs know it is active:
Broadcast On (default) enables the hub to broadcast status.
Broadcast Address is the hub broadcast IP address. Default, 255.255.255.255 (the default broadcast address for a local
network).
Lockout Configuration controls the settings for the hub to avoid brute-force password guessing.
Login Failure Count specifies the number of failed attempts which are allowed from a single IP address.
Lockout Time specifies the number of seconds that must pass before a user can attempt to log in after a failure.
Robot Settings controls the alarm settings for events that occur on the robots that are connected to the hub:
Inactive Robot Alarm Severity specifies the alarm level that is sent when a robot fails to respond.
Audit Settings for Robots enable or disable auditing for the hub robots. Each robot can be configured to use custom settings.
Auditing records important events, such as starting and stopping the robot.
Audit Once per User
Queue Settings specifies the behavior and the size of queues.
Reconnect Interval specifies the number of seconds between attempts to reconnect a disconnected hub. Default, 180 seconds
Disconnect Passive Queues specifies how long a queue can be passive, that is, receive no messages, before disconnection.
Default, 180 seconds
Post Reply Timeout specifies how long a hub waits for a reply to a message.
Alarm Queue Size specifies the size of the queue file on the hub. If the queue exceeds the threshold, an alarm is sent. Default, 10 MB
SSL
Use the Advanced, SSL node to specify the communication mode of hub-managed components.
SSL mode is used for robot-to-hub communication.
When the hubs are not connected with tunnels, the SSL mode is used for hub-to-hub communication.
The hub for a CA UIM component controls the SSL mode that is used by the component.
The hub propagates SSL settings to the robots, and robots propagate the SSL settings to the probes.
SSL settings are specific to each hub.
Set the SSL mode on each hub that requires SSL communications.
Mode
Normal (SSL mode 0) - No encryption. The OpenSSL transport layer is not used.
Compatibility (SSL mode 1) - The hub and robot can communicate without encryption or with OpenSSL encryption.
Components attempt to communicate with SSL. If the request is not acknowledged, the communication is unencrypted.
SSL Only (SSL mode 2) - OpenSSL encryption
Cipher Type specifies the cipher suite
Note:
Hub v7.80 supports the TLS protocol and TLS cipher suites for hub-to-hub tunnels, and hub-to-robot SSL settings.
To use TLS cipher suites between tunnel servers and tunnel clients:
Upgrade the hubs to v7.80
Select a cipher suite that resolves to the TLS protocol
Hub v7.71 and earlier support cipher suites that resolve to both TLS and SSLv3. TLS-only cipher suites are not supported.
If the tunnel server cipher is changed, restart the tunnel server and tunnel clients.
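As a rough sketch, the SSL mode described above corresponds to a numeric setting in the hub configuration file. The section and key names shown here (ssl_mode, ssl_cipher) are assumptions for illustration, not confirmed hub.cfg keys; set the SSL mode through the Admin Console rather than by editing the file where possible.

```
<hub>
   ssl_mode = 1
   ssl_cipher = DEFAULT
</hub>
```

Mode 1 (compatibility) is the usual choice during a staged rollout, because components fall back to unencrypted communication when SSL is not acknowledged.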
Tunnel Settings
Use the Advanced, Tunnel Settings node to control the behavior of the hub tunnels.
Tunnel Advanced Settings control how tunnels connect
Ignore Controller First Probe Port controls how tunnel ports are assigned.
Enabled - the hub uses the First Tunnel Port setting. If the hub has more than one tunnel server, use this setting.
Disabled - the tunnel uses the First Probe Port in the controller probe configuration.
First Tunnel Port specifies the port that is used by the first tunnel that you set up. The tunnel server increments the port number for
each additional tunnel, and assigns that port to the tunnel client. The client keeps that port as long as the hub is running.
The server does not track disconnected clients. The port number increments for each tunnel client that connects to the server, and previously used ports are not reassigned while clients remain connected. When there are no active clients, the counter is reset.
If the field is blank, the operating system assigns random ports.
Hang Timeout specifies the interval between attempts to restart the tunnel. The tunnel server continuously monitors the status of the
tunnels. If a tunnel does not respond, the hub continues to attempt a tunnel restart until the tunnel is active. Default, 120 seconds
Tunnel SSL Session Cache controls SSL caching
Use Client Cache / Use Server Cache SSL sessions are cached, and previous session credentials are used. Enable both options to
reduce the server-client connection time.
Server Cache Timeout specifies how long the cached sessions are valid for reuse by the client. Default, 7200 seconds (two hours)
Server Cache Size Default, 1024 KB
Hub List
The Hub List node lists the hubs in a CA UIM domain, displays the hub information, and monitors the hub status.
Hub List provides the following information about each hub:
Domain
Name
Status
Version of the hub probe
Last Updated date and time when the hub probe was last restarted
IP address
Port
To monitor the status of other hubs:
Actions, Alive Check monitor the status of the selected hub.
Actions, Response Check monitor the response time, connect - reconnect and no transfer, between the current hub and the
selected hub.
Actions, Transfer Check transfers data from the current hub to the selected hub, and monitors the transfer rate.
Name Services
Use the Name Services node to connect hubs that are separated by firewalls, routers, or that are in a NAT environment.
Static Hub List Entry enter information about the static route:
Active the route is active
Synchronize the hub sends status information to the static hub
Hostname/IP the address of the static hub
Actions, Create Static Hub sets up the static route
Static Hub List displays the hubs to which there is a static route from the hub being configured:
Active indicates that the route is active.
Synchronize indicates that the hub is sending status information to the static hub
Name, IP, Domain, and Robot Name identify the static hub
Actions, Remove Static Hub removes the selected static hub
Network Aliases specifies the return address for requests from remote hubs in a NAT environment
From Address is the address from which the remote hub sends requests
To Address is the address to which the responses are sent
Queue List
Use the Queue List node to create hub-to-hub queues.
Queue List Entry add a new queue subject
Subject To Add specify the new subject. Some subjects are reserved for use by CA UIM probes. See Reserved UIM Subject IDs.
Actions, Add Subject To List add a queue subject immediately so that it can be used in a new queue
Queue List Configuration enter information for new queues, or view the configuration of the existing queues. Some fields are specific to
the type of queue.
New and Delete add and delete queues
Queue Name the name of the new queue
Active the queue status
Type the type of queue: attach, post, or get
Hub Address (get queues) the address of the hub that hosts the attach queue
Subject (attach or post queues) the types of the messages to collect in the queue
Remote Queue Name (get queues) the name of the attach queue on the remote hub
Remote Queue List (get queues) the list of attach queues that are found in the domain
Bulk Size the number of messages that are transferred in one package
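To make the attach/get pairing concrete, the following sketch shows what an attach queue on a secondary hub and the matching get queue on the primary hub might look like in hub.cfg. The section layout and key names (queues, active, type, subject, remote_queue_name, address, bulk_size) are illustrative assumptions derived from the fields listed above; create queues through the configuration interface rather than by hand.

On the secondary hub (attach queue):

```
<queues>
   <discovery_attach>
      active = yes
      type = attach
      subject = probe_discovery
   </discovery_attach>
</queues>
```

On the primary hub (get queue):

```
<queues>
   <discovery_get>
      active = yes
      type = get
      remote_queue_name = discovery_attach
      address = /domain/secondary_hub/secondary_robot/hub
      bulk_size = 100
   </discovery_get>
</queues>
```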
Robot List
The Robot List node lists the hub-controlled robots.
Robot List displays the following information about each robot:
Name
Status
IP address
Version
OS
Robot commands
Actions, Alive Check monitor the status of the selected robot
Actions, Restart restart the selected robot
Tunnel
Use the Tunnel node to enable tunneling on a tunnel server or a tunnel client.
Select Tunnel Active, and click Save to enable tunneling.
1 - Tunnel Server
Use the Tunnel, 1 - Tunnel Server node to configure a hub as a tunnel server.
Certificate Authority (CA) Initialization designate a hub as a certificate authority
Note: Designating a certificate authority is a one-time task. When a certificate authority is specified, Tunnel Server CA Is
Initialized is displayed.
2 - Tunnel Client
Use the Tunnel, 2 - Tunnel Client node to configure a hub to be a tunnel client.
Client Certificate Configuration lets you add, delete, and view tunnel client certificates:
New and Delete add and delete tunnel client certificates.
Certificate ID the number that is assigned to the certificate
Active the certificate status
Server * specifies the IP address of the tunnel server hub
Server Port * the port to use for tunneled data
Check Server 'Common Name' Value the tunnel client verifies that the IP address of the tunnel server matches the common name in the server certificate.
Disable the setting in NAT environments. We recommend enabling the setting in other environments.
Description describes the tunnel
Password * the password of the tunnel client certificate
Keep Alive the interval in seconds at which small data packets are sent
Certificate * paste the client certificate text from the tunnel server hub
3 - Tunnel Access List
Use the Tunnel, 3 - Tunnel Access List node to restrict the access privileges for CA UIM users, addresses, and commands.
Note: Log is typically used for testing commands against targets. The result is recorded in the hub log file.
This article explains how to use Infrastructure Manager (IM) to configure a hub. For an overview of the hub, see hub.
Important! Do not connect a hub with both a tunnel and a static route.
In some situations, data can be transmitted over the insecure static route rather than over the secure tunnel.
Delete static routes that are used to configure a tunnel. Do not retain static routes when tunnels exist.
Note: Disable Synchronize on slow networks. Disabling synchronization reduces the network traffic. Disable the option on both hubs.
3. Enter the IP address of the static hub. The system fills in the Hub Information fields (hub name, domain, and robot name) with the
information retrieved from the remote hub.
4. Click OK. The static route is created and the hub appears in the list of static hubs.
To delete a static route, select it and click Delete.
Note: The type of queue you select determines which fields are active in the New Queue dialog.
Subject - Used by
alarm, alarm2, alarm_new, alarm_update - Alarm messages
alarm_close - Messages that are sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign - Messages that are sent when a client assigns an alarm to a user
alarm_stats - Statistical event messages that the NAS probe generates. The messages contain the severity level summary information for open alarms.
audit - Audit messages: probe package distributions, activations, and other audit messages
probe_discovery - Device information
QOS_BASELINE
QOS_DEFINITION
QOS_MESSAGE
Example Queue
The following example shows how to configure an attach and a get queue pair with the subject probe_discovery. Automated discovery data flows
from the nonprimary hubs to the discovery server on the primary hub.
5. In the Issued Certificates field, click View to display the certificate information.
6. Click Copy. Open a text editor, such as Notepad, and paste the certificate into a new file. Save the file to a location where the tunnel
client can access it. Exit Certificate Information.
7. Click Apply, and click Yes to restart the probe.
For high restrictions, start with the accept rules, then create a deny rule to deny access to all others.
For low restrictions, start with the deny rules, then create an accept rule to grant access to all others.
Rules can contain regular expressions.
To add a rule:
1. Open the Access List tab.
2. In Edit Rule, enter:
Source IP - IP address of the source hub, robot, or probe
Destination Address - IP address of the target hub, robot, or probe
Probe Command - the specific command to allow or deny. Use the Probe Utility to view the probe command set.
User - the user to allow or to deny access
3. Select the mode - Accept, Deny, or Log
4. Click Add. The rule is added to the rule table.
In the rule table, use Move Up and Move Down to change the rule priority.
5. Click Apply to activate the rule list.
Networks that use Network Address Translation (NAT) affect how a tunnel is configured.
The following scenarios describe three possible configurations.
Important! When a tunnel is configured, the tunnel replaces the static hub and NAT setup in the hub configuration.
The client certificate must be issued to the common name that is visible to the server. In this case, that is
55.111.
Server address in NAT environment
Clear Check Server Common Name in Tunnel Client Setup to disable server common name checking. The client sees 202.1.1.1, but the
server certificate contains the common name 10.1.1.1. If server common name checking is enabled, the communication fails.
Server and Client addresses in NAT environment
Previously, in a nontunneled domain, hubs that are configured for unencrypted communication could decode encrypted messages.
In a multiple-hub domain, upgrading to v7.70 does not allow this scenario. See Impact of Hub SSL Mode When Upgrading
Nontunneled Hubs in the Hub (hub) Release Notes.
Note: Any tunnels set up between hubs remain after you upgrade, and communication will continue.
We strongly recommend that you connect all hubs with tunnels.
Note: Hub v7.80 supports the TLS protocol by using TLS cipher suites for tunnels between hubs, and hub-to-robot SSL settings.
To restrict tunnel communication to TLS cipher suites, upgrade the hubs to v7.80. Select a cipher suite that resolves to TLS.
To use TLS with hubs that are at v7.71 and earlier, use a cipher suite resolving to TLS and SSLv3.
To use a TLS cipher suite for hub-to-robot SSL settings, use a cipher suite resolving to TLS and SSLv3.
Restart the tunnel server and tunnel clients when:
The tunnel server cipher suite is changed
The tunnel server hub is reverted to a prior release and the tunnel clients are using a TLS cipher suite
Alarm Configuration
Advanced alarm configuration lets you trigger alarms in certain situations. Add alarm keys to the hub section.
You can trigger alarms to:
Send an alarm when there are a significant number of subscriptions or queues (which can decrease hub performance). Specify:
subscriber_max_threshold - the number of subscribers that triggers an alarm
subscriber_max_severity - alarm severity
Send an alarm when a queue reaches a certain size. Specify:
queue_growth_size - queue size that triggers an alarm
queue_growth_severity - alarm severity
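For example, a hub.cfg fragment that enables both alarms might look like the following. The key names are taken from the list above, and the section placement follows the instruction to add the keys to the hub section; the threshold and severity values shown are illustrative, not recommended settings.

```
<hub>
   subscriber_max_threshold = 100
   subscriber_max_severity = 4
   queue_growth_size = 20
   queue_growth_severity = 3
</hub>
```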
LDAP Configuration
You can add two keys in the /LDAP/server section of hub.cfg to affect how the hub communicates with the LDAP server.
Timeout specifies the number of seconds to spend on each LDAP operation, such as searching or binding (authentication) operations.
Default, 15 seconds
codepage is used to translate characters from UTF-8 encoding to ANSI. When text comes from the LDAP library as UTF-8, the
codepage translates the characters into ANSI.
Windows - Specify a valid codepage. The hub LDAP library uses the MultiByteToWideChar and WideCharToMultiByte
functions to translate between ANSI and UTF-8. The functions take a codepage as a parameter. Default, 28591 (ISO-8859-1)
Note: The codepage key is not shipped with the hub configuration file. The default codepage is ISO 8859-1 (Latin-1).
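A hedged example of the /LDAP/server fragment in hub.cfg, using the two keys and their documented defaults (the key casing shown here is an assumption; check the casing your hub version writes):

```
<LDAP>
   <server>
      timeout = 15
      codepage = 28591
   </server>
</LDAP>
```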
passive_robot_threads - The hub communicates with a robot through worker threads from a thread pool. We recommend setting the pool to 50 percent of the number of passive robots. Default, 10
passive_robot_interval - The number of seconds between attempts to retrieve messages from a passive robot. Default, 15 seconds
passive_robot_messages - The number of messages the hub accepts from a passive robot in a single retrieval interval. Default, 1000
passive_robot_comms_timeout - The amount of time the comms API blocks a call to a robot that is not responding. Default, 15 seconds
passive_robot_max_interval - The interval between retries of an unresponsive passive robot doubles every 10 minutes, up to this value.
passive_robot_restart_wait - The number of seconds the hub waits for passive robot management threads to stop before killing the threads. Default, 60 seconds
Important! To avoid monitoring delays, seek the advice of Support before modifying these keys.
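Collected into a single hub.cfg sketch, the passive-robot keys and their documented defaults look like this. The values shown are the defaults from the table above, except passive_robot_max_interval, which has no documented default; the value shown for it is purely illustrative.

```
<hub>
   passive_robot_threads = 10
   passive_robot_interval = 15
   passive_robot_messages = 1000
   passive_robot_comms_timeout = 15
   passive_robot_max_interval = 600
   passive_robot_restart_wait = 60
</hub>
```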
General Tab
Hub Advanced Settings
General Advanced Settings
SSL Advanced Settings
LDAP Advanced Settings
Hubs Tab
Robots Tab
Name Services Tab
Queues Tab
Tunnels Tab
Server Configuration
Client Configuration
Access List
Advanced
Status Tab
General Tab
The General tab contains basic hub information:
Hub information
Hub name
Domain - to which the hub belongs
Hub IP address
Hub address - UIM format: /domain/hub_name/robot_name/hub
Version - number and distribution date of the hub probe
Uptime - length of time from the last restart
Modify - open Edit Hub Address. Edit the hub name and the domain name. If these parameters are modified, the hub controller
probe restarts.
License information - A hub maintains the licenses for all the connected robots.
When the license is invalid:
The message flow from the hub to the service probes and other subscribers stops
The messages from the robot spoolers stop
The license key contains the following fields:
Licenses in use - the number of robots that are connected to the hub, and the number of robots the license allows.
Expire date - when the license expires. An asterisk (*) indicates an unlimited license.
Owner - The owner of the license
Modify - open Edit License. License keys are provided by CA and must be entered exactly.
Log in Settings for this hub
Normal (login allowed) - allow users to log in to the hub from any robot.
Local machine only - allow normal logins from the hub server. Attempts to log in from remote robots are refused
No login - disable local or remote login to the hub.
Log Level - specify the level of detail that is written to the log file. To reduce the disk usage, log at a low level. Increase the logging level
for debugging.
Log Size - the size of the log file. Default, 1024 KB
Advanced - view and configure advanced options for the hub
Enable tunneling - activate the Tunnel configuration tab. To disable tunnels, clear the option, and click Apply.
Disable IP validation - use this option with Network Address Translation (NAT). See, Setting up a Tunnel in a NAT Environment.
Statistics - display the traffic statistics for the hub for the previous 12 hours
The graph shows the number of messages that are sent and received per minute, and the number of requests. Specify a time period
in the Period section. Click Get to update the values.
See, Checking Traffic Statistics.
Monitor - display the current hub traffic. See, Monitoring the Message Flow.
View Log - display the contents of the hub log file. Use the log settings to set the level of detail. The Log Viewer window contains:
File - save or print the file
Edit - copy the contents and search in the log file
Actions - limit the output in the window, and highlight text or dates within the log file
Start and Stop - start and stop the log file updates
Settings - open Hub Advanced Settings. See, Hub Advanced Settings.
Hub Advanced Settings
General, Settings
The Hub Advanced Settings dialog contains three sections:
General
SSL
LDAP
Hub Settings
Hub Request Timeout - the timeout value for hub communication requests. Default, 30 seconds
Hub Update Interval - the interval at which the hub sends status information to the other hubs. Default, 600 seconds
Queue Settings
Reconnect Interval - the interval at which a disconnected queue is reconnected. Default, 180 seconds
Disconnect passive queues - the interval at which passive queues (no messages) are disconnected. Default, 180 seconds
Post Reply Timeout - the length of time a hub waits for a reply to a message. If no response is received within this interval, a timeout
occurs.
Alarm on queue size - the size of the queue file (in MB) on the hub. If the queue exceeds this threshold, an alarm is sent. Default, 10
MB
Lock Out Time
The hub implements extra security measures to avoid leaving the system vulnerable to brute-force password guessing.
If the number of consecutive login failures from a user or an IP address reaches the Lock After Fails value, login is blocked
until the Lock Out Time has passed.
Important! These changes are not persistent. The changes do not survive a hub restart.
Origin - where a message comes from. QoS messages from probes are tagged with a name to identify the origin of the data. By default,
the origin is the name of the parent hub of the probe.
To override the origin value:
Change the value. The new value is used for hub-managed QoS messages.
Set the origin at the robot level. Use Setup, Advanced in the controller configuration GUI.
Audit Settings for Robots - specify the recording of important events, such as starting and stopping the robot. The setting is used for all
hub robots. Select one of the following options:
Override: audit off
Override: audit on
Use audit settings from robot
SSL Advanced Settings
Note: Hub v7.80 supports the TLS protocol by using TLS cipher suites for tunnels between hubs, and hub-to-robot SSL settings.
To restrict tunnel communication to TLS cipher suites, upgrade the hubs to v7.80. Select a cipher suite that resolves to TLS.
To use TLS with hubs that are at v7.71 and earlier, use a cipher suite resolving to TLS and SSLv3.
To use a TLS cipher suite for hub-to-robot SSL settings, use a cipher suite resolving to TLS and SSLv3.
Restart the tunnel server and tunnel clients when:
The tunnel server cipher suite is changed
The tunnel server hub is reverted to a prior release and the tunnel clients are using a TLS cipher suite
Cipher Type specifies the Cipher Suite that is used by the OpenSSL library.
LDAP Advanced Settings
Note: Due to the limited availability of the LDAP library, Direct LDAP is only available on Linux and Windows hubs. Native
LDAP is not supported on Solaris.
Note: You can specify multiple LDAP servers in this field. Separate each server with a space. The first entry is the primary
LDAP server. More entries are secondary servers, which are used when the primary server is unavailable. If a nonprimary
server is used, logins can take more time.
Server Type - two LDAP server types are supported: Active Directory and eDirectory
Authentication Sequence - the hub authentication sequence. If you select Nimsoft, LDAP, the user is verified against Nimsoft user
credentials first. If verification fails, the hub attempts to verify using the LDAP server.
Use SSL - use SSL for LDAP communication. Most LDAP servers are configured to use SSL.
User and Password - the user name and password that is needed for querying the LDAP server
Active Directory - the user is specified as an ordinary user name
eDirectory - the user is specified as a path to the user in the format CN=yyy,O=xxx. CN is the user name, and O is the
organization.
Group Container (DN) - a group container in LDAP which defines where, in the LDAP hierarchy, to search for groups. Click Test to
verify that the container is valid.
User Container (DN) - a user container in LDAP which defines more specifically where to search for users in the LDAP structure.
Nimsoft Proxy Hub - The hub can be configured to specify a UIM probe address for login.
Proxy Hub - The drop-down list is empty by default. Click Refresh next to the drop-down list to perform a gethubs probe request
on the hub. The drop-down list is populated with a list of known hubs.
Proxy Retries - the number of retries to perform when there are communication errors.
Authentication Sequence - if you select Nimsoft, LDAP, the user is verified against Nimsoft user credentials first. If the
authentication fails, the hub tries to verify the credentials using LDAP server credentials.
Proxy Timeout - the time (in seconds) before the proxy is timed out.
Hubs Tab
The hubs tab lists all the known hubs, and displays information in different colors.
Blue - The hub is in the same domain as the hub you are currently logged in to.
Black - The hub is outside the current domain.
Red - The hub status is unknown. Typically, red is displayed when the hub is not running.
The hub list contains the following information about each hub:
Status indicator:
Green - running
Red - not running
Yellow - status unknown
Domain
Hub name
Version of the hub probe
Updated: shows when the hub was last updated
IP address for the hub
Port number for the hub
Right-clicking in the window displays four options:
Alive Check rechecks the status of the selected hub.
Response Check checks the response time (connect - reconnect, no transfer) between your hub and the one selected in the list.
Transfer Check transfers data from your hub to the selected hub, then checks the transfer rate.
Remove removes the selected hub from the hubs address list. If the hub is running, it can appear later.
Robots Tab
The Robot tab lets you set the alarm level for robots and displays robot information.
Inactive Robot Setup - If one of the robots that are listed is unavailable, set the severity level of the alarm that is issued.
Registered Robots displays the following information for each robot:
Name
Type - regular or passive
IP address
Version of the robot software
Created - when the robot was installed
Last Update - when the software was last updated
Operating system of the robot host system
Remove - the selected active or passive robot is removed from the list. An active robot can reappear later because active robots
periodically request that the hub add them to the Registered Robots list.
Add Passive Robot - open the dialog to add a passive robot
Note the Synchronize option in the New Static Hub dialog. If the synchronize option is selected, the parent hub sends status information
to the static hub. The parent hub receives status information from the static hub, unless you also disable the Synchronize option on the
static hub. You can disable the synchronize option to reduce network traffic.
Important! Do not connect hubs with both a tunnel and a static route. In some situations, data is transmitted over the insecure
static route rather than over the secure tunnel. If you create a tunnel between two hubs, delete any existing static routes.
Network Alias - the return address of a remote NAT hub
On hub A, set up the From address and the To address for hub B.
Queues Tab
To edit a message queue, double-click the message queue, or select the message queue and click Edit.
To define a new queue, click New.
A queue is a holding area for messages passing through the hub. Queues are temporary or permanent:
Permanent queue - content survives a hub restart
Permanent queues are used by service probes to receive all messages. If the service probe is not running, the messages are held in
the queue. When the probe starts, the messages are delivered.
Temporary queue - content is cleared during restarts
Temporary queues are typically used for events that are sent to management consoles.
All queues that are defined on the Queues tab are permanent queues. Permanent queues have a name that is related to their purpose. The
permanent queue, NAS, is attached to the Nimsoft Alarm Server (NAS).
You can create a permanent queue between two hubs. Define the queue as a post type queue. Use the full UIM address of the other hub.
A Post queue sends a directed stream of messages to the destination hub.
An Attach queue creates a permanent queue for a client Get queue to attach to.
A Get queue receives messages from a permanent Attach queue on another hub.
For example, the following queue, named get-hub-4, is defined as a Get queue.
get-hub-4 receives messages from the Attach queue xprone-attach
Tunnels Tab
Select the Enable Tunneling option on the hub, General tab to enable the Tunnels tab.
Use the server configuration tab to configure the listening side of a tunnel.
Active - activate the tunnel server
When Active is selected, the Certificate Authority Setup dialog opens. See, Setting up a Tunnel.
Common name - the IP address of the tunnel server hub
Expire date - the date the server certificate expires
Port - the port that the tunnel server is listening on. Open this port in your router or firewall for incoming connections.
Security settings - select None, Low, Medium, High, or Custom. Use the Custom setting to define the security protocol. See, http://www.openssl.org/docs/apps/ciphers.html
Start and Stop - start and stop the tunnel server
Server - display the server details and the server certificate
CA - display the CA details and CA certificate
New - open the Client Certificate Setup dialog. You can create certificates for the clients you open for access. Supply a certificate
password. The client requires the password, the certificate (encrypted text), and the server port number.
Use the client configuration tab to configure the connecting side of a tunnel.
Server - tunnel server IP address or hostname
Port - tunnel server port number
Heartbeat - Keep-alive message interval
Description - description of the tunnel connection
New - opens the New Tunnel Connection dialog. You can create a new tunnel connection to the server that generated the certificate.
Active Tunnel - Activate the tunnel connection
Check Server CommonName - Clear this option to disable the Server IP address verification (see, Setting up a Tunnel in a NAT
Environment).
Description - description of the tunnel connection
Server - IP address of the tunnel server
Password - the password that you received with the server certificate
Server Port - the server communication port. Default: 48003
Keep-alive interval - small data packets are sent at the specified interval
Certificate - paste the client certificate in this field. See Creating Client Certificates for more information.
Edit - edit the selected server connection
Delete - delete the selected server connection
Certificate - display the selected client certificate
Access List
By default, all requests and messages are routed through the tunnel. The routing is transparent to CA UIM users.
Use the Access List to set access rules for tunnels. Define the rules to restrict the access privileges for UIM addresses, commands, and users.
Access Lists are defined on the tunnel client hub.
Access list rules:
Accept rules enable access. Set up rules to grant access to probes, robots, and hubs, and to execute specific commands for users.
Deny rules disallow access for the specified addresses, commands, or users.
Log rules log all requests through the tunnel. Use log rules for testing. View the results in the hub log file.
Use Edit Rule to add, modify, or delete access rules. Access rules consist of four criteria. When all four criteria are met, the rule is
triggered.
Source IP is the name of the source hub, robot, or probe.
Destination Address is the address of the target hub, robot, or probe.
Probe Command is the specific command to allow or deny. The command set varies by probe. To view a command set, open the
Probe Utility.
User is the user to allow or deny access.
Note: Regular expressions are allowed.
The rules table displays the rules that you have created. The order of the rules is important. The first rule in the list is processed first.
Processing stops on the first rule that matches all four criteria.
Use the Move Up and Move Down buttons to change the order of the rules in the list.
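To make the ordering concrete, here is a hypothetical access-list fragment (the section and key names approximate what the hub stores, and the regular-expression values are examples). Rule 1 accepts the administrator user for any source, destination, and command; rule 2 then denies everyone else. Swapping the two rules would deny everyone, including administrator, because processing stops at the first rule that matches all four criteria.

```
<access_list>
   <1>
      mode = accept
      source_ip = .*
      destination = .*
      probe_command = .*
      user = administrator
   </1>
   <2>
      mode = deny
      source_ip = .*
      destination = .*
      probe_command = .*
      user = .*
   </2>
</access_list>
```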
Advanced
Use the advanced tab to assign the first tunnel port, establish the hang timeout, and configure the SSL session cache.
Ignore first probe port settings from controller
The first tunnel is automatically assigned the port number in First Probe Port Number on the Setup, Advanced tab in the controller configuration.
If more than one tunnel is defined, select this option to enable the First Tunnel Port field.
First Tunnel Port
For more than one tunnel, you can specify the first port in the range of ports to be used.
When this field is blank, the operating system assigns random ports.
Clients are assigned ports from the configured port range, and keep the port as long as the hub is running.
Servers assign ports from the configured port number and increase for each new client connection. If there are no active
clients, the hub resets the counter.
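For example, assuming hypothetical key names in the hub's tunnel section, a configuration like the following would give the first tunnel client port 49000, the second 49001, and so on. The value 49000 is an arbitrary example; choose a range that does not overlap the port range the controller probe uses.

```
<tunnel>
   ignore_controller_first_probe_port = yes
   first_tunnel_port = 49000
</tunnel>
```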
Tunnel is Hanging Timeout
The hub continuously checks if one or more of the active tunnels are hanging. No new connections can be established through tunnels
that are hanging.
If a tunnel is hanging, the hub attempts to restart the tunnel. If the restart fails, the hub performs a restart after the specified number of
seconds.
SSL Session Cache
Use Server Cache enables caching of SSL sessions, and reuse of session credentials. If Use Client Cache is enabled on the client,
Use Server Cache speeds up the connection time.
Server Cache Timeout defines how long the cached sessions are valid for reuse by the client.
Server Cache Size defines how many sessions can be stored in the cache. When the cache is full, the oldest sessions are deleted as
new connections are established.
Use Client Cache enables caching of SSL sessions on the client hub.
Status Tab
This tab contains four subsections that provide status information about the queues, subjects, and tunnels you have defined.
Subscribers/Queues displays a list with status information about all subscribers and queues on the hub. You can view the messages
that the hub forwards. Use the status to assist with debugging and load monitoring for the hub. The fields in the list are:
Name of the queue
Type of subscriber
Queued shows the number of messages waiting to be transferred. If you do not use spooling, this number is typically 0, for as long as
the subscriber is alive.
Sent shows the number of sent messages
Bulk Size is the maximum number of messages that are sent at once
Subject/Queue is the name of the queue or subject that the subscriber subscribes to
ID for the connected probe or program
Established shows when the hub connected to the subscriber
Address of the subscriber
Connection is the address of the subscriber network connection
Subjects displays a count of messages by subject since the last restart
Tunnel Status displays two windows. The upper window, which shows all tunnels that the hub is running, provides this information:
Peer Hub is the IP address or hostname of the tunnel peer
Started shows the initial tunnel connection time
Last shows the time of the last connection through the tunnel
Connection stats (ms) are the statistics for the time that is taken to set up the tunnel connection
A low minimum value can indicate a low bandwidth
A high minimum value and a high average value can indicate packet loss
Connections shows the number of connections made
Traffic in/out shows the amount of data that is received and sent through the tunnel
When you select a tunnel in the upper window, the connections appear in the lower window:
State of the connection (idle or active)
Start time of the connection
Last transfer time
In and Out show the amount of data that is received or sent
Important: If your tunnel server will have multiple tunnel clients, you can control the port assignments instead of letting the hub assign
them based on the configuration for the controller probe. To control them, follow the steps in Controlling the Ports Assigned to Tunnels
before you create client certificates.
1. Ensure that both hubs appear in the Admin Console navigation pane. If either does not, create a static route to the hub.
2. Determine which hub will be the tunnel server.
Recommendation: Because the tunnel server uses a fair amount of computing power, designate the system with the lower load as the
tunnel server. If a central hub will have several remote hubs attached to it, make the remote hubs the tunnel servers so that each remote
hub only adds a small amount of overhead to the central hub.
Important: This configuration is recommended for advanced users only. Contact Support if you need assistance.
1. In Admin Console, expand the hub that will be the tunnel server. Open its hub probe in the configuration GUI and navigate to Advanced
> Tunnel Settings.
2. In Tunnel Advanced Settings:
Enable Ignore Controller First Probe Port.
Specify the First Tunnel Port, the port to be used by the first tunnel you set up. For each additional tunnel, the tunnel server
increments the number and assigns that port to the tunnel client. The client keeps that port as long as the hub is running. Note the
following:
The server does not keep track of disconnected clients. If a tunnel client is connected to the server, this number increments, even if
a previously used port becomes available. However, if there are no active clients, the counter resets.
If you plan to configure more than one tunnel, we recommend you specify the first port. Make sure you do NOT use the port range
that the controller probe uses.
If this field is blank, the operating system assigns random ports.
3. Click Save.
Recommendation: Use Infrastructure Manager to create these lists. Tunnel access lists created with the Admin Console hub
configuration GUI may have issues. Refer to Access List in the Infrastructure Manager hub configuration guide for details.
Reserved subjects:
alarm - Alarm messages
alarm2
alarm_new
alarm_update
alarm_close - Message sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign - Message sent when a client assigns an alarm to a user
alarm_stats - Statistical event messages generated by the NAS probe that contain severity level summary information for all open alarms
audit
probe_discovery - Device information
QOS_BASELINE
QOS_DEFINITION
QOS_MESSAGE
Example Queue
The following example shows how to configure an attach and get queue pair with the subject probe_discovery. This allows automated discovery
data to flow up from secondary hubs to the discovery server on the primary hub.
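A sketch of that pair, using approximate hub.cfg section and key names and a hypothetical secondary-hub address. On each secondary hub, an attach queue collects probe_discovery messages:

```
<postroute>
   <discovery-attach>
      active = yes
      type = attach
      subject = probe_discovery
   </discovery-attach>
</postroute>
```

On the primary hub, a matching get queue drains each secondary hub's attach queue:

```
<postroute>
   <discovery-get>
      active = yes
      type = get
      addr = /Domain/secondary-hub/secondary-robot/hub
      remote_queue = discovery-attach
      bulk_size = 100
   </discovery-get>
</postroute>
```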
Important: The 7.70 release of the UIM hub and robot has improved the robustness of SSL communication. Prior to this version, in a non-tunneled domain, hubs that were configured to communicate unencrypted were able to decode and respond to encrypted messages. In a multi-hub domain, upgrading to v7.70 will break this type of communication. For details, see Impact of Hub SSL Mode When Upgrading Non-tunneled Hubs in the Hub (hub) Release Notes.
Note: Any tunnels set up between hubs will remain after you upgrade, and communication will continue. CA strongly recommends that
you connect all UIM hubs with tunnels to help ensure communication integrity.
hub
Advanced
SSL
Tunnel Settings
Hub List
Name Services
Queue List
Robot List
Tunnel
1 - Tunnel Server
2 - Tunnel Client
3 - Tunnel Access List
To access the hub configuration interface, select the hub's robot in the Admin Console navigation pane. In the Probes list, click the
arrow to the left of the hub probe and select Configure.
hub
The hub node lets you view information about the hub and adjust log file settings.
Probe Information displays the probe name, start time, version and vendor.
Hub Information displays the hub name, domain, IP address, hub address (/domain/hub_name/robot_name/hub), and uptime data.
License Information displays details about the license used by the hub's robot, including the total number of licenses and the number available. An invalid license stops the message flow from the hub to its subscribers (mostly service probes) and prevents the robot spoolers from uploading their messages.
General Configuration lets you modify log file settings:
Log Level specifies the level of alarm information saved in the log file. 0 - Fatal (default) logs the least; 5 - Trace logs all alarms.
Recommendation: Log as little as possible during normal operation to reduce disk consumption. Increase the level when debugging.
Log Size controls the amount of data retained in the log file (in KB; default is 1024). Large log files can cause performance issues, so use caution when changing this size.
Actions > Set In-Memory Log Level writes the changes to the log file immediately without restarting the hub, which lets you view
more detail about the hub's current activity. The settings are retained until the hub restarts.
Advanced
The Advanced node allows you to control the hub's connectivity behavior.
Hub Settings controls how the hub communicates:
Hub Request Timeout specifies how long the hub waits for a response from other hubs. Default: 30 seconds.
Hub Update Interval specifies how often the hub sends its messages to the other hubs. Default: 600 seconds.
Origin identifies the sender for data sent by the probes and is used when reports are generated. The hub obtains the origin from the controller probe configuration; if no origin is specified in the controller, this field is blank and the hub name is used.
Disable IP Validation turns off the IP address validation the hub does for all computers sending requests to its probes. It is typically
used when using NAT (Network Address Translation).
Login Mode provides three options:
Normal (default) allows logins from any robot connected to the hub.
Local Machine Only allows logins only from the computer hosting the hub. Attempts from any other robot connected to the hub
are refused.
No Login disables all logins to the hub.
Broadcast Configuration controls whether and where the hub lets other hubs know it is active:
Broadcast On (default) enables the hub to broadcast its status.
Broadcast Address is the IP address on which the hub broadcasts. Default is 255.255.255.255 (the default broadcast address for
any local network).
Lockout Configuration controls the lockout settings for the hub to avoid leaving the system vulnerable to brute-force password
guessing:
Login Failure Count specifies the number of attempts from a single IP address.
Lockout Time specifies the number of seconds that must pass before a user can attempt to log in after a failure.
Robot Settings controls the alarm settings for events that occur on all robots connected to the hub:
Inactive Robot Alarm Severity specifies the severity of the alarm sent when a robot fails to respond.
Audit Settings for Robots lets you turn auditing on or off for all of the hub's robots, or allow each robot to use its own settings.
Note: Auditing records important events, such as starting and stopping the robot.
Audit Once per User
Queue Settings controls the behavior and size of queues:
Reconnect Interval is the number of seconds between a disconnected hub's attempts to reconnect (default is 180).
Disconnect Passive Queues specifies how long a queue can be passive (receive no messages) before being disconnected (default
is 180).
Post Reply Timeout specifies how long a hub waits for a reply to a message. A timeout occurs if no response is received within this
interval.
Alarm Queue Size is the size of the queue file on the hub. An alarm is sent if the queue exceeds this threshold (default is 10 MB).
SSL
The Advanced > SSL node lets you configure the hub's SSL mode, which specifies the communication mode for components managed by the hub. It is primarily used for robot-to-hub communication. However, when hubs are not connected with tunnels, hub-to-hub communication is also controlled by each hub's SSL mode.
SSL settings for UIM components are controlled by each component's hub. The hub propagates SSL settings to the robots; each robot then propagates the settings to its probes. SSL settings are specific to each hub. Set them on each hub that requires SSL.
Mode provides three options:
Normal (SSL mode 0) No encryption. The OpenSSL transport layer is not used.
Compatibility mode (SSL mode 1) Enables the hub and robot to communicate either without encryption or with OpenSSL
encryption. Components first attempt to communicate via SSL. If a request is not acknowledged, the component sends its requests to
the target unencrypted.
SSL Only (SSL mode 2) OpenSSL encryption only.
Cypher Type specifies the Cypher Suite used by the OpenSSL library.
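As an illustration only, the mode corresponds to a small numeric setting in the hub configuration. The key names below are assumptions about what the GUI writes, and DEFAULT is simply an example OpenSSL cipher list string:

```
<hub>
   ssl_mode = 1
   ssl_cipher = DEFAULT
</hub>
```

Mode 1 (compatibility) is the usual choice while migrating a mixed environment; move to mode 2 only after every robot and hub in the domain can communicate over SSL.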
Tunnel Settings
The Advanced > Tunnel Settings node lets you control the behavior of the hub's tunnels.
Tunnel Advanced Settings control how tunnels connect:
Ignore Controller First Probe Port controls how tunnel ports are assigned.
Enabled: the hub uses the First Tunnel Port setting (recommended if the hub will have more than one tunnel server).
Disabled: the tunnel is assigned the port number specified as First Probe Port in the controller probe configuration.
First Tunnel Port specifies the port to be used by the first tunnel you set up. For each additional tunnel, the tunnel server increments
the number and assigns that port to the tunnel client. The client keeps that port as long as the hub is running. Note the following:
The server does not keep track of disconnected clients. If a tunnel client is connected to the server, this number increments, even if
a previously used port becomes available. However, if there are no active clients, the counter resets.
If you plan to configure more than one tunnel, we recommend you specify the first port. Make sure you do NOT use the port range
that the controller probe uses.
If this field is blank, the operating system assigns random ports.
Hang Timeout (in seconds, default is 120) specifies the interval between automatic tunnel restart attempts. The tunnel server
continuously checks the status of its tunnels. If a tunnel does not respond, the hub attempts to restart it. If it does not respond within
the time specified, it attempts another restart, and will continue to do so until the tunnel is active.
Tunnel SSL Session Cache controls SSL caching:
Use Client Cache / Use Server Cache enables caching of SSL sessions, which allows previous session credentials to be used.
Enabling both options significantly speeds up the server/client connection time.
Server Cache Timeout (in seconds) specifies how long the cached sessions are valid for reuse by the client. Default is 7200 (2
hours).
Server Cache Size specifies how much data is stored in the cache. Default is 1024 KB.
Hub List
The Hub List node lists all the hubs within a UIM domain, displays information about them, and lets you check their status.
Hub List provides the following information about each hub:
Domain
Name
Status
Version of the hub probe
Last Updated, date and time when the hub probe was last restarted
IP address
Port
Three commands let you check the status of other hubs:
Actions > Alive Check checks the status of the selected hub.
Actions > Response Check checks the response time (connect - reconnect, no transfer) between your hub and the one selected in
the list.
Actions > Transfer Check transfers data from your hub to the one selected in the list and checks the transfer rate.
Name Services
The Name Services node lets you ensure hubs separated by firewalls or routers can discover each other and that hubs in a NAT environment
can return requests.
Static Hub List Entry lets you enter information for the static route:
Active: enable to ensure the route is active upon creation.
Synchronize: enable to ensure the hub sends status information to the static hub.
Hostname/IP of the static hub.
Actions > Create Static Hub sets up the static route.
Static Hub List displays the hubs to which there is a static route from the hub being configured:
Active indicates the route is active.
Synchronize indicates the hub is sending status information to the static hub.
Name, IP, Domain, and Robot Name identify the static hub.
Actions > Remove Static Hub removes the selected static hub.
Network Aliases let the hub know the appropriate return address for requests from remote hubs in a NAT environment:
From Address is the address from which the remote hub sends requests.
To Address is the address to which the responses should be sent.
Queue List
Navigation: hub > Queue List
The Queue List node lets you create hub-to-hub queues.
Queue List Entry lets you add a new queue subject.
Subject To Add lets you specify the new subject.
Note: Some subjects are reserved for use by UIM probes. See Reserved UIM Subject IDs.
Actions > Add Subject To List adds a queue subject immediately so it can be used in a new queue.
Queue List Configuration lets you enter information for new queues or view the configuration of existing queues. Some fields are
specific to the type of queue being created.
New and Delete let you add and delete queues.
Queue Name is the name of the queue being created.
Active shows the queue status.
Type specifies the type of queue being created: attach, post or get.
Hub Address (get queues) is the UIM address of the hub that has the corresponding attach queue.
Subject (attach or post queues) specifies the type(s) of messages to collect in the queue.
Remote Queue Name (get queues) is the name of the corresponding attach queue.
Remote Queue List (get queues) displays available attach queues found in the domain.
Bulk Size specifies the number of messages to be transferred in one package.
Robot List
The Robot List node lists all the robots controlled by the hub, displays information about them, and lets you restart them.
Robot List displays the following information about each robot:
Name
Status
IP address
Version of the robot probes
OS version and information
Two commands are available.
Actions > Alive Check checks the status of the selected robot.
Actions > Restart restarts the selected robot.
Tunnel
Navigation: hub > Tunnel
The Tunnel node enables tunneling on a tunnel server or tunnel client. This must be done once on each hub that will have a tunnel:
Tunnel Active: Check this option and then click Save to enable tunneling.
1 - Tunnel Server
The Tunnel > 1 - Tunnel Server section lets you configure a hub to be a tunnel server.
Certificate Authority (CA) Initialization lets you designate a hub as a Certificate Authority.
Note: This is a one-time task. After it has been done, this section will display Tunnel Server CA Is Initialized.
2 - Tunnel Client
The Tunnel > 2 - Tunnel Client node lets you configure a hub to be a tunnel client.
Client Certificate Configuration lets you add, delete, and view tunnel client certificates:
New and Delete add and delete tunnel client certificates.
Certificate ID is the number assigned to the certificate.
Active shows the certificate status.
Server * specifies the IP address of the tunnel server hub.
Server Port * specifies the port to be used for tunneled data.
Check Server 'Common Name' Value makes the tunnel server verify that the tunnel is coming from the IP address specified in the
client certificate. IP address mapping requires that this be disabled in NAT environments. However, CA recommends you leave this
option enabled in all other cases.
Description lets you describe the tunnel.
Password * is the password that was defined when the tunnel client certificate was created.
Keep Alive (in seconds) specifies the interval at which small data packets are sent, to keep idle connections from being dropped by firewalls.
Certificate * is where you paste the client certificate text (which was created on the tunnel server hub).
The Tunnel > 3 - Tunnel Access List node lets you restrict the access privileges for UIM users, addresses, and commands.
Recommendation: Due to issues with tunnel access lists created with this release of the Admin Console hub configuration GUI, CA
recommends you use Infrastructure Manager to create these lists. Refer to Access List in the Infrastructure Manager hub configuration
guide for details.
Note: This is normally used for debugging, to test commands against targets before setting them up as accept or deny rules. The result can be viewed in the hub log file.
Setting up hub-to-hub communication. If you have any secondary hubs in your deployment, you must create queues between them and
the primary hub. If they are separated from the primary hub by a firewall, you will also need to create a static route to the hub, then create
tunnels and queues to connect them to the primary hub.
Checking traffic statistics for the hub and monitoring its message flow.
Performing advanced configuration to tune hub performance.
See the following sections for details:
Important! Do not connect a hub with both a tunnel and a static route. In some situations, data could be transmitted over the
insecure static route rather than over the secure tunnel. If you set up a static route so that you can configure a tunnel, make
sure you delete the static route after the tunnel is complete.
If you have any secondary hubs in your deployment, you must create queues so that messages from those hubs can reach the primary hub. You
will create either:
Attach queues on all secondary hubs to collect messages, and corresponding get queues on any intermediary hubs and on the primary
hub.
Post queues, which stream messages directly to a destination hub.
Note: The type of queue you select determines which fields are active in the New Queue dialog.
Reserved subjects:
alarm - Alarm messages
alarm2
alarm_new
alarm_update
alarm_close - Message sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign - Message sent when a client assigns an alarm to a user
alarm_stats - Statistical event messages generated by the NAS probe that contain severity level summary information for all open alarms
audit
probe_discovery - Device information
QOS_BASELINE
QOS_DEFINITION
QOS_MESSAGE
Example Queue
The following example shows how to configure an attach and get queue pair with the subject probe_discovery. This allows automated discovery
data to flow up from secondary hubs to the discovery server on the primary hub.
The solution is to set up a secure (SSL) tunnel between two hubs that are separated by a firewall. The tunnel sets up a VPN (Virtual Private Network) connection between the two hubs. All requests and messages are routed over the tunnel and dispatched on the other side. This routing is transparent to users.
Tunnels can be created during hub installation, or afterward by configuring the hubs. The following process explains how to set up tunnels between hubs that are already installed. Click the links for more information.
To set up a tunnel, you must:
1. Determine which hub will be the tunnel server.
The Tunnel Server is the hub that initiates the setup of the tunnel.
The Tunnel Client accepts the attempt to set up the tunnel.
Recommendation: Because the tunnel server uses a fair amount of computing power, designate the system with the lower load as the
tunnel server. If a central hub will have several remote hubs attached to it, make the remote hubs the tunnel servers so that each remote
hub only adds a small amount of overhead to the central hub.
2. Ensure that both hubs appear in the navigation tree. Hubs discover each other by sending out broadcast (UDP) messages. However,
hubs separated by routers and firewalls are unable to discover other hubs by this mechanism. You must configure a static route to these
hubs.
3. Set up the tunnel server as a certificate authority. This creates a server certificate and gives the hub the ability to generate digital client
certificates.
4. Create the certificate for the tunnel client. This task is done on the tunnel server.
5. Install the certificate on the tunnel client.
6. Remove the static route to either hub if you created one.
7. Create queues between the hubs, just as you would for secondary hubs that are not connected with tunnels. Refer to Setting up Queues.
8. Optional: On the tunnel client, you can set up access list rules to restrict tunnel access privileges for specified UIM addresses,
commands, or users.
Note: Networks that use NAT (Network Address Translation) affect how a tunnel is configured. For example configurations, refer to Setting up a
Tunnel in a NAT Environment.
4. Click OK.
5. In the Issued Certificates field, click View to display the certificate information.
6. Click Copy. Open a text editor (such as Notepad) and paste the certificate into a new blank file. Save the file to a location where the
tunnel client can access it. Exit the Certificate Information dialog.
7. Click Apply, then click Yes when asked to restart the probe.
Tip: When making a restrictive list, start with a few Accept rules, then make a Deny rule that denies access for all others. When making a less
restrictive list, start with a few Deny rules, then make an Accept rule that gives access for all others.
To add a rule:
1. Navigate to the Access List tab.
2. In the Edit Rule area, enter:
Source IP - IP address for the source hub, robot or probe
Destination Address - address of the target hub, robot or probe
Probe Command - the specific command you want to allow or deny (to view a probe's command set, open the Probe Utility for the
probe)
User - name of the user you want to allow or deny access
3. Select the mode: Accept, Deny, or Log.
4. Click Add. The rule is added to the rule table.
In the rules table, use the Move Up and Move Down buttons to change the order of the rules. The order sets the priority.
In the example below, the first rule allows the administrator user on all nodes to access the target hub through the tunnel. The second
rule denies access to all users. If you place the Deny rule first, all users - including administrator - are denied access.
Important! You should be aware that when a tunnel is configured, it replaces the static hub and NAT setup in the hub configuration.
The Client certificate must be issued to (CommonName)--the IP address that is visible to the Server, which is in this case 10.2.1.111, not 193.71.55.111.
Server address in NAT environment
Uncheck the Check Server CommonName option in the Tunnel Client Setup Window. The reason for this is that the Server certificate has
10.1.1.1 as CommonName, not 202.1.1.1 which is what the client sees.
Server and Client addresses in NAT environment
Combine the two methods described above. The Client certificate must be issued to (CommonName)--the IP address that is visible to the Server
(10.2.1.111) and uncheck the Check Server CommonName option in the Tunnel Client Setup Window.
Important: The 7.70 release of the UIM hub and robot has improved the robustness of SSL communication. Prior to this version, in a non-tunneled domain, hubs that were configured to communicate unencrypted were able to decode and respond to encrypted messages. In a multi-hub domain, upgrading to v7.70 will break this type of communication. For details, see Impact of Hub SSL Mode When Upgrading Non-tunneled Hubs in the Hub (hub) Release Notes.
Note: Any tunnels set up between hubs will remain after you upgrade, and communication will continue. CA strongly recommends that
you connect all UIM hubs with tunnels to help ensure communication integrity.
Alarm Configuration
Advanced alarm configuration lets you trigger alarms in certain situations. Add alarm keys to the hub section.
You can trigger alarms to:
Send an alarm when there are a significant number of subscriptions or queues (which can decrease hub performance). Specify:
LDAP Configuration
You can add two keys in the /LDAP/server section to control how the hub communicates with the LDAP server.
Timeout specifies the number of seconds to spend on each LDAP operation, such as searching or binding (authentication) operations.
Default: 15 seconds.
codepage is used when translating characters from UTF-8 encoding to ANSI. When text comes from the LDAP library as UTF-8, UIM uses this codepage to translate the characters into ANSI.
Windows: Specify a valid codepage. For a list of codepages, go to . Note that the hub LDAP library uses the MultibyteToWideChar and WideCharToMultiByte functions to translate to and from ANSI/UTF-8. These functions take a codepage as a parameter. Default: 28591
Unix: Use iconv functions. Refer to http://www.gnu.org/software/libiconv. Default: ISO-8859-1
Note: The codepage key is not shipped with the hub configuration file. The default codepage is ISO 8859-1 Latin 1;
Western European (ISO).
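Put together, the two keys might look like this in the /LDAP/server section of hub.cfg. The values shown are the Windows defaults described above, and the surrounding section syntax is approximate:

```
<LDAP>
   <server>
      timeout = 15
      codepage = 28591
   </server>
</LDAP>
```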
The following keys control passive robot handling (defaults shown in parentheses):
passive_robot_threads (default: 10)
The hub communicates with a robot through worker threads from a thread pool. CA recommends you set the pool to 50% of the number of passive robots managed by the hub.
passive_robot_interval (default: 15 seconds)
The number of seconds between attempts to retrieve messages from a passive robot.
passive_robot_messages (default: 1000)
The number of messages the hub will accept from a passive robot in a single retrieval interval.
passive_robot_comms_timeout (default: 15 seconds)
The amount of time the comms API will block a call to a robot that is not responding.
passive_robot_max_interval
The interval between retries of a non-responding passive robot doubles every 10 minutes, up to this value.
passive_robot_restart_wait (default: 60 seconds)
The number of seconds the hub waits for passive robot management threads to stop before it kills them.
Important! This setting should not be modified without advice from support because it can cause a significant delay in monitoring functions.
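For reference, a raw-configure sketch of these keys in the hub section, using the documented defaults. passive_robot_max_interval is omitted because no default is documented; adjust passive_robot_threads to roughly 50% of your passive robot count, as recommended above:

```
<hub>
   passive_robot_threads = 10
   passive_robot_interval = 15
   passive_robot_messages = 1000
   passive_robot_comms_timeout = 15
   passive_robot_restart_wait = 60
</hub>
```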
General Tab
Hub Advanced Settings
General Advanced Settings
SSL Advanced Settings
LDAP Advanced Settings
Hubs Tab
Robots Tab
Name Services Tab
Queues Tab
Tunnels Tab
Server Configuration
Client Configuration
Access List
Advanced
Status Tab
General Tab
The General tab contains the basic hub information in the following sections.
Hub information provides the following:
Hub name
Domain to which the hub belongs
Hub IP address
Hub address in UIM format: /domain/hub_name/robot_name/hub
Version number and distribution date of the hub probe
Uptime, the length of time the hub probe has been running since the last time it was started
Modify (button) opens the Edit Hub Address dialog, enabling you to edit the hub name and domain (note that the hub's controller
probe restarts if you modify these parameters)
License information: Each hub maintains a license system used by all of the robots connected to the hub. An invalid license causes
the message flow from the hub to its subscribers (mostly service probes) to stop. It also stops the various robot spoolers from uploading
their messages as long as the license key is invalid. The license key is built based on the following fields:
Licenses in use shows the number of robots currently connected to this hub, and the number of robots the license allows.
Expire date specifies when the license expires. An asterisk (*) indicates an unlimited license.
Owner of the license.
Modify (button) opens the Edit License dialog, which contains the hub's license key. License keys are provided by CA and must be
entered exactly as specified.
Login Settings for this hub lets you configure the login settings for the hub.
Normal (login allowed) allows users to log on to the hub from any robot.
Local machine only allows normal login on the hub only when logging in from the computer hosting the hub. Attempts to log in from
other robots are refused.
No login disables login to the hub from either the local machine or a robot connected to the hub.
Log Level lets you specify the level of detail written to the log file.
Recommendation: Log as little as possible during normal operation to reduce disk consumption, and increase the level of detail when
debugging.
Log Size sets the size of the log file (default is 1024 KB). Large log files can cause performance issues, so use caution when
changing this size.
Advanced allows you to view and configure advanced options for the hub.
Enable tunneling activates the Tunnels tab, where you can configure the tunnels. To disable tunnels, uncheck this option and click Apply.
Disable IP validation. When a computer sends a request to a probe, the computer's IP address is validated. This option is typically
used when using NAT (Network Address Translation). See Setting up a Tunnel in a NAT Environment.
Statistics (button) displays traffic statistics for the hub for the previous 12 hours.
The graph shows the number of messages sent and received per minute, in addition to the number of requests. The Period section
lets you specify a time period. Click Get to update the values.
See Checking Traffic Statistics for more information.
Monitor (button) displays the current hub traffic. See Monitoring the Message Flow for more information.
View Log (button) displays the contents of the hub's log file. The log settings allow you to set the level of detail for the logging
function. The Log Viewer window provides the following:
File lets you save or print the file.
Edit lets you copy the contents and search in the log file.
Actions lets you limit the output in the window and highlight text or the date within the log file.
Start and Stop buttons let you start and stop the log file updates.
Settings (button) opens the Hub Advanced Settings dialog. See Hub Advanced Settings for more information.
Hub Advanced Settings
Important! These changes are not persistent; that is, they do not survive a hub stop and start.
Origin identifies where a message came from. QoS messages from probes are tagged with a name to identify the origin of the data. By
default, the name of the probe's parent hub is the origin. To override that value, you can:
Change the value here. The new value will be used for all QoS messages handled by the hub.
Set the origin at the robot level (in the Setup > Advanced section in controller configuration GUI).
Audit Settings for Robots allows the recording of important events, such as starting and stopping the robot. This setting is used for all
robots serviced by this hub. Select one of the following options:
Override: audit off
Override: audit on
Use audit settings from robot
SSL Advanced Settings
LDAP Advanced Settings
There are two configuration options for LDAP: direct LDAP or Nimsoft Proxy Hub.
Direct LDAP
The hub can be configured to forward login requests to an LDAP server. This makes it possible to log on to the UIM consoles with LDAP
credentials. Users belonging to different groups in LDAP can be assigned to different UIM Access Control Lists (ACLs).
Note: Direct LDAP is only available on Linux and Windows hubs, due to the availability of the LDAP library the hub uses. Native LDAP is
not supported on Solaris.
LDAP authentication includes:
Server Name: The hub can be configured to point to a specific LDAP server, using an IP address or host name. A Lookup button lets you
test the communication. If your LDAP server uses a non-standard port, you must use the syntax "hostname:port" to indicate the server
name.
Note: You can specify multiple servers in this field, each separated with a space. The first entry acts as a primary server, while the
others act as secondary servers (taking over if the primary server goes down). Logins may take more time if a secondary server has
taken over.
Server Type: Choose an LDAP server type. Currently two server types are supported: Active Directory and eDirectory.
Authentication Sequence: Specify whether the hub authenticates using the LDAP login or Nimsoft Login first. For example, if you
select Nimsoft > LDAP, this means that the user will be verified against Nimsoft user credentials first. If this fails the hub will try to
verify the credentials against LDAP server credentials.
Use SSL: Select this option if you want to use SSL during LDAP communication. Most LDAP servers are configured to use SSL.
User and Password: You must specify a user name and a password to be used by the hub when accessing the LDAP server to
retrieve information:
Active Directory - the user can be specified as an ordinary user name.
eDirectory - the user must be specified as a path to the user in LDAP in the format CN=yyy,O=xxx, where CN is the user name and O
is the organization.
Group Container (DN): Specify a group container in LDAP to define where in the LDAP structure to search for groups. Click Test to
check if the container is valid.
User Container (DN): Specify a user container in LDAP to define more specifically where in the LDAP structure to search for users.
Nimsoft Proxy Hub
The hub can be configured to specify a UIM probe address to log in through.
Proxy Hub: The drop-down list is empty by default. Click the Refresh icon next to the drop-down list to perform a gethubs probe
request on the hub you are configuring, which will populate the list with the hubs it knows about.
Proxy Retries: Specify the number of retries to perform in case of communication errors (network errors).
Authentication Sequence: Specify whether the hub authenticates using the LDAP login or Nimsoft Login first. For example, if you
select Nimsoft > LDAP, this means that the user will be verified against Nimsoft user credentials first. If this fails the hub will try to
verify the credentials against LDAP server credentials.
Proxy Timeout: Specify the time (in seconds per attempt) after which the proxy will be timed out.
Hubs Tab
This tab lists all known hubs and displays their information in different text colors:
Blue: Hub is within the same domain as the hub you are currently logged on to.
Black: Hub is outside the domain.
Red: Hub status is unknown, typically because the hub is not running.
Robots Tab
The Robots tab lets you set the alarm level for robots and displays robot information.
Inactive Robot setup lets you set the severity level of the alarm issued if one of the robots in the list becomes unavailable.
Registered Robots displays the following information for each robot controlled by the hub:
Name
Type - regular or passive
IP address
Version of the robot software
Created - when the robot was installed
Last Update - when the software was last updated
Operating system of the robot's host system
Right-clicking in the window opens a small menu with the following options:
Restart
This option only re-reads the configuration file for the selected robot. It does not do a stop and restart of the robot. If you have made any
changes to the robot configuration you must stop and restart the robot.
Check
Checks the selected robot.
Remove
Removes the selected active or passive robot from the list. An active robot may show up again later, since active robots periodically
request that the hub add them to the Registered Robots list.
Add Passive Robot
Opens the dialog to add a passive robot.
Name Services Tab
Note the Synchronize option in the New Static Hub dialog. If this option is not checked, the parent hub will not send status info to the
static hub. The parent hub will still receive status info from the static hub, unless you disable the Synchronize option on the static hub as
well.
This option can be disabled to reduce network traffic if your network runs on a telephone line or ISDN.
Important! Do not connect a hub with both a tunnel and a static route. In some situations, data could be transmitted over the
insecure static route rather than over the secure tunnel. If you set up a static route so that you can tunnel to a hub, make sure
you delete the static route after the tunnel is configured.
Network Alias tells the local hub the return address on requests from a remote NAT hub.
On hub A, set up the From address and the To address for hub B.
On hub B, set up the From address and the To address for hub A.
When hub B sends a request to hub A, the request will contain hub B's From address. Hub A then knows that hub B's To address must
be used when returning a request to hub B.
Queues Tab
The Queues tab lists all defined message queues. These include the queues that are automatically deployed, and those that an administrator
adds as needed. For example, if alarms need to be communicated from a secondary hub to the primary hub, you would create an attach queue
named nas with the subject alarm to forward alarms.
To edit a message queue, double-click the message queue (or select the message queue and click the Edit button).
To define a new queue, click New.
A queue is a holding area for messages passing through the hub. Queues are temporary or permanent:
Permanent queue content survives a hub restart. Permanent queues are meant for service probes that need to pick up all messages,
regardless of whether the service probe was running when they were created.
Temporary queue content is cleared during restarts. These queues are typically used for events to management consoles.
All queues defined on the Queues tab are permanent queues. Permanent queues are given a name related to their purpose. The permanent
queue named NAS is attached to the NAS (Nimsoft Alarm Server). You can set up a permanent queue from one hub to another by defining it as a
post type queue with the full UIM address of the other hub.
A Post queue sends a directed stream of messages to the destination hub.
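As an illustration, a permanent post queue that streams alarm messages to another hub might be sketched in the raw hub configuration as follows. The key names inside the queue section and the address shown are assumptions for illustration only; in practice you define queues on the Queues tab:

```
<queues>
   <alarms_to_primary>
      active = yes
      type = post
      subject = alarm
      address = /domain/primary_hub/primary_robot/hub
   </alarms_to_primary>
</queues>
```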
Tunnels Tab
This tab is enabled if the Enable Tunneling option is checked on the General tab of the hub configuration GUI.
Most companies have one or more firewalls in their networks, both internally between different networks and externally against a DMZ or Internet.
This makes it challenging to administer and monitor the whole network from a central location.
The solution is to set up a tunnel between two hubs that are separated by a firewall. The tunnel:
Sets up a VPN-like (Virtual Private Network) connection between the hubs. All UIM requests and messages are routed over the tunnel
and dispatched on the other side. This routing is transparent to all UIM users.
Requires that the firewall open one port for connection to the target hub.
Is implemented using the SSL (Secure Socket Layer) protocol.
Security is handled in two ways: certificates that authenticate the Client, and encryption to secure the network traffic over the Internet.
Authorization and Authentication
The tunnel provides authorization and authentication by using certificates. Both the Client and the Server need valid certificates issued by
the same CA (Certificate Authority) in order to set up a connection. When a tunnel is set up, the machine receiving the
connection (the Server) is its own CA and will only accept certificates issued by itself.
Encryption
The encryption setting spans from None to High. With no encryption the traffic is still authenticated, so None is
recommended for tunnels within LANs and WANs. Be careful when selecting a higher encryption level, since it is more
resource intensive for the machines at both ends of the tunnel.
Important! Do not use static hubs (listed under the Name Services tab) when setting up a tunnel.
Server Configuration
The configuration tasks under this tab are for the server (listening) side of the tunnel. The tab contains the following fields and buttons.
Active activates the tunnel server. The Certificate Authority Setup dialog appears. See Setting up a Tunnel for more information.
Common name is the IP address of the hub on the server side.
Expire date is the date the server certificate expires.
Port specifies the port that the tunnel server is listening on. This is the port that you have to open in your router or firewall for incoming
connections.
Security settings lets you select None, Low, Medium, High, or Custom, where you define your own security setting. For custom
definition: See http://www.openssl.org/docs/apps/ciphers.html
Note: High gives the highest degree of encryption, but slows the data traffic considerably. Normally None will be sufficient where the data is not sensitive.
Client Configuration
The configuration tasks under this tab are for the client (connecting) side of the tunnel.
The fields are:
Server
The tunnel server's IP address or hostname.
Port
The tunnel server's port number.
Heartbeat
Keep-alive message interval.
Description
Brief description of the tunnel connection.
New button
Opens the New Tunnel Connection dialog. You can create a new Tunnel connection to the server that has generated the certificate.
Active Tunnel
Activates the defined tunnel connection.
Check Server CommonName
Uncheck this option to disable the Server IP address check on connection (see Setting up a Tunnel in a NAT Environment).
Description
Brief description of the tunnel connection.
Server
The IP address of the server on the server end of the tunnel.
Password
The password you have received with your certificate from the server side.
Server Port
The communication port on the server side. Default is 48003. It is recommended that you do not change the default port.
Keep-alive interval
Small data packets are sent at the specified interval. This is to allow for firewall connection disruption on idle connections.
Certificate
You paste the received certificate in this field. See Creating Client Certificates for more information.
Edit button
Edits the selected server connection.
Delete button
Deletes the selected server connection.
Certificate button
Displays the selected client certificate.
Access List
This tab allows you to set access rules for tunnels. By default, all UIM requests and messages can be routed over the tunnel and dispatched on
the other side. This routing is transparent to UIM users.
The Access List lets you define a set of rules to restrict the access privileges for specified UIM addresses, commands, or users. The Access List
must be defined on the tunnel client hub.
Advanced
This tab allows you to assign the first tunnel port, establish the hanging timeout, and configure the SSL session cache.
The tab contains the following options:
Ignore first probe port settings from controller
It is not necessary to select this option if only one tunnel is defined, as the tunnel is automatically assigned the port number specified as First probe port number on the Setup > Advanced tab in the controller configuration.
If more than one tunnel is defined, select this option to enable the First Tunnel Port field.
First Tunnel Port
If you have configured more than one tunnel, you should specify the first port in the range of ports to be used by the tunnels.
Important! Do NOT use the same port that is specified for the controller probe's first probe port.
If this field is blank, random ports will be assigned by the operating system.
Clients are assigned ports from the configured port range and keep that port as long as the hub is running.
Servers assign ports from the configured port number and increment the number for each new client connection. If there are no
active clients, the hub resets the counter.
Tunnel is Hanging Timeout
The hub continuously checks if one or more of the active tunnels are hanging. No new connections can be established through tunnels
that are hanging.
If one or more tunnels are hanging, the hub attempts to restart the tunnel(s). If the restart fails, the hub performs a restart after the
specified number of seconds.
SSL Session Cache
Use Server Cache enables caching of SSL sessions so that previous session credentials can be reused. This speeds up the connection time
between the client and the server (assuming Use Client Cache is enabled on the client).
Server Cache Timeout defines how long the cached sessions are valid for reuse by the client.
Server Cache Size defines how many sessions can be stored in the cache. When the cache is full, the oldest sessions are deleted as
new connections are established.
Use Client Cache enables caching of SSL sessions on the client hub.
Status Tab
This tab contains four subsections that provide status information about the queues, subjects and tunnels you have defined.
Subscribers/Queues displays a list with status information on all subscribers/queues on this hub. You can view the messages received
by the hub and forwarded to interested parties. This status can be used to assist with debugging and load monitoring for your hub. The
fields in the list are:
Name of the queue
Type of subscriber
Queued shows the number of messages waiting to be transferred (unless you use message spooling, this number should be 0 in
normal operation as long as the subscriber is alive)
Sent shows the number of messages sent
Bulk Size is the maximum number of messages sent at once
Subject/Queue is the name of the queue or subject that the subscriber subscribes to
ID for the connected probe or program
Established shows when the hub connected to the subscriber
Address of the subscriber
Connection is the address of the subscriber's network connection
Subjects shows a count of all messages that have been transferred since the last (re)start, grouped by the subject. This information can
assist you with debugging and load monitoring for your hub.
Tunnel Status displays two windows. The upper window, which shows all tunnels that the hub is running, provides this information:
Peer Hub is the IP address or hostname of the tunnel's peer
Started shows the initial tunnel connection time
Last shows the time of the last connection through the tunnel
Connection stats (ms) are the statistics for the time taken to set up the tunnel connection (a low minimum value could indicate a low
bandwidth; a high minimum value and a high average value could indicate packet loss)
Connections shows the number of connections made
Traffic in/out shows the amount of data received/sent through the tunnel
When you select a tunnel in the upper window, its connections appear in the lower window, which displays:
State of the connection (idle or active)
Start time of the connection
Last transfer time
In and Out show the amount of data received or sent
Address is the hostname of the target of the request
Command specifies the command executed on the target of the connection
Tunnel Statistics has statistics on SSL and on various events (such as server start time or when the last connection was received). Use
the drop-down list to select the server or client you want to view.
Important: If your tunnel server will have multiple tunnel clients, you can control the port assignments instead of letting the hub assign
them based on the configuration for the controller probe. To control them, follow the steps in Controlling the Ports Assigned to Tunnels
before you create client certificates.
1. Ensure that both hubs appear in the Admin Console navigation pane. If either does not, create a static route to the hub.
2. Determine which hub will be the tunnel server.
Recommendation: Because the tunnel server uses a fair amount of computing power, designate the system with the lower load as the
tunnel server. If a central hub will have several remote hubs attached to it, make the remote hubs the tunnel servers so that each remote
hub only adds a small amount of overhead to the central hub.
3. Activate tunneling on the tunnel server:
a. In Admin Console, expand the hub that will be the tunnel server, then open its hub probe in the configuration GUI.
b. Select the Tunnel node. Check Tunnel Active and click Save.
4. Set up the tunnel server as a Certificate Authority. This creates a Certificate Authority certificate and enables the hub to create client
certificates.
a. Navigate to 1 - Tunnel Server.
b. In Tunnel Server CA Initialization, enter information about your organization.
c. Select Actions > Perform Tunnel Server CA Initialization. The CA certificate appears in the tunnel client certificate list.
d. Modify Tunnel Server Settings (optional):
Set Server Port to any available port (default is 48003).
Increase the Security Setting as desired.
5. Create the tunnel client certificate:
a. Under Client Certificate Configuration, specify the following:
Organization name, administrator email address, location (optional).
Common Name - enter the IP address of the tunnel client hub.
Note: Use a regular expression to specify multiple tunnel client hubs and create multiple certificates.
Password - Create a password to be used to establish trust between the tunnel server and client. You will enter this password
when you install the certificate on the client.
Expiration days - Specify how long the certificate will be active.
b. Select Actions > Create Client Certificate. The certificate appears in the tunnel client certificate list.
c. Copy all of the text in the Certificate field (below the Tunnel Client Certificate List) and close the GUI.
Note: Save the certificate to a file if you are not going to set up the tunnel client now.
6. Set up the tunnel client:
a. In Admin Console, expand the hub that will be the tunnel client and open its hub probe.
b. Navigate to Tunnel, check Tunnel Active, and click Save.
c. Navigate to 2 - Tunnel Client.
d. Under Client Certificates Configuration, click New. In the fields that appear:
Leave Certificate ID blank.
Server is the tunnel server's IP address.
If your environment uses NAT (network address translation), disable Check Server 'Common Name' Value.
Note: When this option is enabled, the tunnel server must verify that the tunnel is coming from the IP address specified in the
certificate. IP address mapping requires that this be disabled in NAT-ed environments. However, CA recommends you leave this
option enabled in all other cases.
Optional: Enter a Description.
Enter the Password you created for the tunnel server.
Optional: modify the Keep Alive setting.
In the Certificate field, paste the text you copied from the tunnel server.
e. Click Save at the top of the page.
7. If you created a static route to a hub that is now connected to the message bus by a tunnel, you must delete the static route:
a. Expand the hub from which you configured the static route, then open its hub probe in the configuration GUI.
b. Navigate to Name Services and remove the static route.
Important! This must be done to ensure that all UIM data flows through the secure tunnel and not through the static route.
Important: This configuration is recommended for advanced users only. Contact Support if you need assistance.
1. In Admin Console, expand the hub that will be the tunnel server. Open its hub probe in the configuration GUI and navigate to Advanced
> Tunnel Settings.
2. In Tunnel Advanced Settings:
Enable Ignore Controller First Probe Port.
Specify the First Tunnel Port, the port to be used by the first tunnel you set up. For each additional tunnel, the tunnel server
increments the number and assigns that port to the tunnel client. The client keeps that port as long as the hub is running. Note the
following:
The server does not keep track of disconnected clients. If a tunnel client is connected to the server, this number increments, even if
a previously used port becomes available. However, if there are no active clients, the counter resets.
If you plan to configure more than one tunnel, we recommend you specify the first port. Make sure you do NOT use the port range
that the controller probe uses.
If this field is blank, the operating system assigns random ports.
3. Click Save.
An attach queue makes messages available for retrieval by a get queue on another hub. You can create one queue that collects all messages, or create separate queues for different messages.
A get queue retrieves messages collected by an attach queue on another hub.
A post queue sends a stream of messages (based on subject) directly to a destination hub.
An attach or post queue's subject attribute determines which messages are directed to the queue:
The wildcard (*) subject collects all messages in one queue.
Queues can collect messages for more than one subject. Add a new subject with all desired subjects separated by commas (for example,
alarms, alarms2).
A number of subjects are reserved for use by UIM components. They are listed in Reserved UIM Subject IDs.
Keep in mind that queues are first-in-first-out lists, which means messages in a wildcard queue are not prioritized based on subject. If a hub
transfers thousands of messages each second, a critical alarm message might have to wait behind less urgent QoS messages.
Recommendation: In a high-volume environment, create separate queues for important subjects, such as alarm, or for subjects that will create
many messages. Create one multiple-subject queue for all subjects that are not critical.
To see an example of how queues can be set up for discovery, refer to Example Queue.
The reserved subjects are listed below (subject, description, and the component that generates the messages, where known):
alarm, alarm2, alarm_new, alarm_update
Alarm messages
alarm_close
Message sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign
Message sent when a client assigns an alarm
alarm_stats
Statistical event messages generated by the NAS probe that contain severity level summary information for all open alarms
audit
Messages generated by the audit probe
probe_discovery
Device information generated by discovery probes
QOS_BASELINE, QOS_DEFINITION, QOS_MESSAGE
QoS-related messages
Example Queue
The following example shows how to configure an attach and get queue pair with the subject probe_discovery. This allows automated discovery
data to flow up from secondary hubs to the discovery server on the primary hub.
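A sketch of such a pair, with hypothetical key names, queue names, and addresses (queues are normally created through the Queues tab). The attach queue on the secondary hub holds probe_discovery messages, and the get queue on the primary hub pulls them across. On the secondary hub:

```
<queues>
   <discovery_fwd>
      active = yes
      type = attach
      subject = probe_discovery
   </discovery_fwd>
</queues>
```

And on the primary hub, pointing back at the secondary hub's attach queue:

```
<queues>
   <discovery_get>
      active = yes
      type = get
      remote_queue = discovery_fwd
      address = /domain/secondary_hub/secondary_robot/hub
   </discovery_get>
</queues>
```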
2. Modify the settings as needed. Refer to Advanced for details on the settings.
3. Click Save at the top of the page.
hub
Advanced
SSL
Tunnel Settings
Hub List
Name Services
Queue List
Robot List
Tunnel
1 - Tunnel Server
2 - Tunnel Client
3 - Tunnel Access List
To access the hub configuration interface, select the hub's robot in the Admin Console navigation pane. In the Probes list, click the
arrow to the left of the hub probe and select Configure.
hub
Navigation: hub
This section lets you view information about the hub and adjust log file settings.
Probe Information
This section displays the probe name, start time, version and vendor.
Hub Information
This section displays the hub name, domain, IP address, hub address (/domain/hub_name/robot_name/hub), and uptime data.
License Information
This section displays details about the license used for the hub's robot; the total number of licenses and the number available
are also shown. An invalid license stops the message flow from the hub to its subscribers (mostly service probes) and prevents the robot
spoolers from uploading their messages.
General Configuration
This section lets you modify log file settings.
Log Level specifies the level of alarm information saved in the log file. 0 - Fatal (default) logs the least; 5 - Trace logs all alarms.
Recommendation: Log as little as possible during normal operation to reduce disk consumption. Increase the level when debugging.
Log Size controls the amount of data retained in the log file (in KB, default is 1024). Large log files can cause performance issues,
so use caution when changing this size.
One command is available.
Actions > Set In-Memory Log Level makes changes to the log file settings take effect immediately without restarting the hub, which
lets you view more detail about the hub's current activity. The settings are retained until the hub restarts.
Advanced
Navigation: hub > Advanced
This section allows you to control the hub's connectivity behavior.
Hub Settings
This section controls how the hub communicates.
Hub Request Timeout specifies how long the hub waits for a response from other hubs. Default: 30 seconds.
Hub Update Interval specifies how often the hub sends its messages to the other hubs. Default: 600 seconds.
Origin identifies the sender for data sent by the probes. It is used when reports are generated. This field obtains the origin from the
controller probe configuration; if the origin is not specified in the controller, the field is blank and the hub name is used.
Disable IP Validation turns off the IP address validation the hub does for all computers sending requests to its probes. It is typically
used when using NAT (Network Address Translation).
Login Mode provides three options:
Normal (default) allows logins from any robot connected to the hub.
Local Machine Only allows logins only from the computer hosting the hub. Attempts from any other robot connected to the hub
are refused.
No Login disables all logins to the hub.
Broadcast Configuration
This section controls whether and where the hub lets other hubs know it is active.
Broadcast On (default) enables the hub to broadcast its status.
Broadcast Address is the IP address on which the hub broadcasts. Default is 255.255.255.255 (the default broadcast address for
any local network).
Lockout Configuration
This section controls the lockout settings for the hub to avoid leaving the system vulnerable to brute-force password guessing.
Login Failure Count specifies the number of attempts from a single IP address.
Lockout Time specifies the number of seconds that must pass before a user can attempt to log in after a failure.
Robot Settings
This section controls the alarm settings for events that occur on all robots connected to the hub.
Inactive Robot Alarm Severity specifies the level of warning sent when a robot fails to respond.
Audit Settings for Robots lets you turn auditing on or off for all of the hub's robots, or allow each robot to use its own settings.
Note: Auditing records important events, such as starting and stopping the robot.
Audit Once per User
Queue Settings
This section controls the behavior and size of queues.
Reconnect Interval is the number of seconds between a disconnected hub's attempts to reconnect (default is 180).
Disconnect Passive Queues specifies how long a queue can be passive (receive no messages) before being disconnected (default
is 180).
Post Reply Timeout specifies how long a hub waits for a reply to a message. A timeout occurs if no response is received within this
interval.
Alarm Queue Size is the size of the queue file on the hub. An alarm is sent if the queue exceeds this threshold (default is 10 MB).
SSL
Hub List
Navigation: hub > Hub List
This section lists all the hubs within a UIM domain, displays information about them, and lets you check their status.
Hub List
This section displays the following information about each hub:
Domain
Name
Status
Version of the hub probe
Last Updated, date and time when the hub probe was last restarted
IP address
Port
Three commands let you check the status of other hubs:
Actions > Alive Check checks the status of the selected hub.
Actions > Response Check checks the response time (connect - reconnect, no transfer) between your hub and the one selected in
the list.
Actions > Transfer Check transfers data from your hub to the one selected in the list and checks the transfer rate.
Name Services
Navigation: hub > Name Services
This section lets you ensure hubs separated by firewalls or routers can discover each other and that hubs in a NAT environment can return
requests.
Static Hub List Entry
This section lets you enter information for the static route.
Active: enable to ensure the route is active upon creation.
Synchronize: enable to ensure the hub sends status information to the static hub.
Hostname/IP of the static hub.
One command is available.
Actions > Create Static Hub sets up the static route.
Static Hub List
This section displays the hubs to which there is a static route from the hub being configured.
Active indicates the route is active.
Synchronize indicates the hub is sending status information to the static hub.
Name, IP, Domain, and Robot Name identify the static hub.
One command is available.
Actions > Remove Static Hub removes the selected static hub.
Network Aliases
In a NAT environment, network aliases let the hub know the appropriate return address for requests from remote hubs.
From Address is the address from which the remote hub sends requests.
To Address is the address to which the responses should be sent.
Queue List
Navigation: hub > Queue List
This section lets you create hub-to-hub queues.
Queue List Entry
This section lets you add a new queue subject.
Subject To Add lets you specify the new subject.
Note: Some subjects are reserved for use by UIM probes. See Reserved UIM Subject IDs.
One command is available.
Actions > Add Subject To List adds a queue subject immediately so it can be used in a new queue.
Queue List Configuration
This section lets you enter information for new queues or view the configuration of existing queues. Some fields are specific to the type of
queue being created.
New and Delete let you add and delete queues.
Queue Name is the name of the queue being created.
Active shows the queue status.
Type specifies the type of queue being created: attach, post or get.
Hub Address (get queues) is the UIM address of the hub that has the corresponding attach queue.
Subject (attach or post queues) specifies the type(s) of messages to collect in the queue.
Remote Queue Name (get queues) is the name of the corresponding attach queue.
Remote Queue List (get queues) displays available attach queues found in the domain.
Bulk Size specifies the number of messages to be transferred in one package.
Robot List
Navigation: hub > Robot List
This section lists all the robots controlled by the hub, displays information about them, and lets you restart them.
Robot List
This section displays the following information about each robot.
Name
Status
IP address
Version of the robot probes
OS version and information
Two commands are available.
Actions > Alive Check checks the status of the selected robot.
Actions > Restart restarts the selected robot.
Tunnel
Navigation: hub > Tunnel
This section enables tunneling on a tunnel server or tunnel client. This must be done once on each hub that will have a tunnel.
1. Tunnel Activation
Tunnel Active: Check this option and then click Save to enable tunneling.
1 - Tunnel Server
CA Certificates
This section lets you create the CA certificates, which give the hub the authority to issue client certificates.
Organization Name, Organization Unit Name, and Email Address identify the issuing entity.
Country Name, State or Province Name, and Locality Name are the location of the receiving entity.
Common Name is the IPv4 or IPv6 address (hexadecimal format) for the tunnel server hub.
Beginning Date and Ending Date specify when the certificate is valid.
Client Certificate Configuration
This section lets you create client certificates.
Note: Every tunnel client that will connect with the tunnel server requires a unique client certificate.
Organization Name, Organization Unit Name, and Email Address identify the receiving entity.
Country Name, State or Province Name, and Locality Name are the location of the receiving entity.
Common Name is the IPv4 or IPv6 address (hexadecimal format) for the tunnel client hub.
Note: The tunnel client hub must be active when the certificate is created.
Password lets you specify the password that will allow the tunnel client hub to access the tunnel server.
Beginning Date and Ending Date show when the certificate is valid.
Certificate * displays the client certificate text, which must be copied to the tunnel client hub configuration.
One command is available.
Actions > Create Tunnel Server Client Certificate creates the certificate.
Client Certificate List
This section lists your client certificates.
New and Delete let you add and delete certificates.
Rows in the table display information about the certificates.
Fields below the table display details for the selected certificate.
Certificate * displays the certificate text, which must be copied and pasted into the tunnel client hub configuration.
2 - Tunnel Client
Important! Do not connect a hub with both a tunnel and a static route. In some situations, data could be transmitted over the
insecure static route rather than over the secure tunnel. If you set up a static route so that you can configure a tunnel, make
sure you delete the static route after the tunnel is complete.
Setting up Queues
If you have any secondary hubs in your deployment, you must create queues so that messages from those hubs can reach the primary hub. You
will create:
Attach queues on all secondary hubs. These queues collect messages.
Corresponding get queues on any intermediary hubs and on the primary hub.
You also can create post queues, which stream messages directly to a destination hub. To learn more, refer to About UIM Queues.
Follow these steps to set up a hub-to-hub queue.
Note: The type of queue you select determines which fields are active in the New Queue dialog.
Subject - Used by
alarm - Alarm messages
alarm2
alarm_new
alarm_update
alarm_close - Message sent when a client closes (acknowledges) an alarm and removes it from the currently active alarms
alarm_assign - Message sent when a client assigns an alarm to a user
alarm_stats - Statistical event messages generated by the NAS probe that contain severity level summary information for all open alarms
audit
probe_discovery - Device information
QOS_BASELINE
QOS_DEFINITION
QOS_MESSAGE
Example Queue
The following example shows how to configure an attach and get queue pair with the subject probe_discovery. This allows automated discovery
data to flow up from secondary hubs to the discovery server on the primary hub.
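The attach/get pair can be sketched in configuration form. The queue names, the address, and the exact key names below are illustrative assumptions; in practice you define queues through the GUI fields described under Queue List Configuration.

```
# Illustrative only: on the secondary hub, an attach queue
# that collects discovery messages
<discovery_attach>
   active = yes
   type = attach
   subject = probe_discovery
</discovery_attach>

# Illustrative only: on the primary hub, a get queue that pulls
# from the attach queue above
<discovery_get>
   active = yes
   type = get
   remote_queue_name = discovery_attach
   address = /Domain/secondary-hub/secondary-robot/hub
</discovery_get>
```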
Setting up a Tunnel
Most companies have one or more firewalls in their network, both internally between different networks and externally against the Internet or a
network DMZ.
Because network administrators are often reluctant to open a firewall for the number of IP addresses and ports that management applications
require, it can be difficult to administer and monitor the whole network from a central location.
The solution is to set up a secure shell (SSH) tunnel between two hubs that are separated by a firewall. The tunnel sets up a VPN (Virtual Private
Network) connection between the two hubs. All requests and messages are routed over the tunnel and dispatched on the other side. This routing
is transparent to users.
Tunnels can be created during hub installation, or afterward by configuring the hubs. This following process explains how to set up tunnels
between hubs that are already installed. Click the links for more information.
To set up a tunnel, you must:
1. Determine which hub will be the tunnel server.
The Tunnel Server is the hub that initiates the setup of the tunnel.
The Tunnel Client accepts the attempt to setup the tunnel.
Recommendation: Because the tunnel server uses a fair amount of computing power, designate the system with the lower load as the
tunnel server. If a central hub will have several remote hubs attached to it, make the remote hubs the tunnel servers so that each remote
hub only adds a small amount of overhead to the central hub.
2. Ensure that both hubs appear in the navigation tree. Hubs discover each other by sending out broadcast (UDP) messages. However,
hubs separated by routers and firewalls are unable to discover other hubs by this mechanism. You must configure a static route to these
hubs.
3. Set up the tunnel server as a certificate authority. This creates a server certificate and gives the hub the ability to generate digital client
certificates.
4. Create the certificate for the tunnel client. This task is done on the tunnel server.
5. Install the certificate on the tunnel client.
6. Remove the static route to either hub if you created one.
7. Create queues between the hubs, just as you would for secondary hubs that are not connected with tunnels. Refer to Setting up Queues.
8. Optional: On the tunnel client, you can set up access list rules to restrict tunnel access privileges for specified UIM addresses,
commands, or users.
Note: Networks that use NAT (Network Address Translation) affect how a tunnel is configured. For example configurations, refer to Setting up a
Tunnel in a NAT Environment.
1. In Infrastructure Manager, navigate to the hub that will be the tunnel server and open its hub probe in the configuration GUI.
2. On the Tunnels tab, navigate to Server Configuration and click New.
3. In the Client Certificate Setup dialog, enter your information.
Who (optional) - Company name, organizational unit, and administrator email address
Where (optional) - Company location
Authentication (required) - Enter the following:
Common name - IP address of the tunnel client hub
Password - Create a password to be entered when you install the certificate on the tunnel client hub
Expire days - Number of days the certificate will be valid
4. Click OK.
5. In the Issued Certificates field, click View to display the certificate information.
6. Click Copy. Open a text editor (such as Notepad) and paste the certificate into a new blank file. Save the file to a location where the
tunnel client can access it. Exit the Certificate Information dialog.
7. Click Apply, then click Yes when asked to restart the probe.
Tip: When making a restrictive list, start with a few Accept rules, then make a Deny rule that denies access for all others. When making a less
restrictive list, start with a few Deny rules, then make an Accept rule that gives access for all others.
1. In Infrastructure Manager, navigate to the hub that will be the tunnel client and open its hub probe in the configuration GUI.
2. On the General tab, select Enable Tunneling.
3. Go to the Tunnels tab and click Client Configuration.
4. Click New. In the New Tunnel Connection dialog:
Leave the Active Tunnel and Check Server Common Name options enabled.
Enter the IP address of the tunnel server hub.
Enter the password you created when you generated the certificate.
Copy the certificate text from the file you saved it to, and paste it in the Certificate field.
Enter the server port or leave it blank to let the system assign the port.
If desired, specify a keep-alive interval.
Click OK to close the dialog.
Important! You should be aware that when a tunnel is configured, it replaces the static hub and NAT setup in the hub configuration.
The Client certificate must be issued to (CommonName) the IP address that is visible to the Server, which in this case is 10.2.1.111, not 193.71.55.111.
Server address in NAT environment
Uncheck the Check Server CommonName option in the Tunnel Client Setup window. The reason is that the Server certificate has 10.1.1.1 as its CommonName, not 202.1.1.1, which is what the client sees.
Server and Client addresses in NAT environment
Combine the two methods described above: issue the Client certificate to the IP address that is visible to the Server (10.2.1.111), and uncheck the Check Server CommonName option in the Tunnel Client Setup window.
Select a message in the list to open a callout containing vital information about the message, such as QoS name, QoS source, QoS target, the time the sample was recorded, the sample value, and so on.
Tunnel allows you to view the commands performed on the active tunnels. Click Start to view the following information for each
command:
IP address and port number of the client that executed the command.
Start and stop time for the transfer.
Total time used by the request through the tunnel (both directions).
UIM address to which the command was sent.
Last command issued during the message transfer.
Number of bytes sent and received during the session.
Alarm Configuration
Advanced alarm configuration lets you trigger alarms in certain situations. Add alarm keys to the hub section.
You can trigger alarms to:
Send an alarm when there are a significant number of subscriptions or queues (which can decrease hub performance). Specify:
subscriber_max_threshold - the number of subscribers that triggers an alarm
subscriber_max_severity - alarm severity
Send an alarm when a queue reaches a certain size. Specify:
queue_growth_size - queue size that triggers an alarm
queue_growth_severity - alarm severity
queue_connect_severity - severity of queue connect failures
Force the hub to restart if a tunnel is hanging and is unable to reconnect. Specify:
tunnel_hang_timeout - hang time that triggers an alarm
tunnel_hang_retries - number of times the tunnel will try to reconnect before the hub is restarted
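Collected into hub.cfg form, the alarm keys above sit in the hub section. The values shown here are illustrative placeholders, not recommendations:

```
<hub>
   subscriber_max_threshold = 100
   subscriber_max_severity = 4
   queue_growth_size = 10000
   queue_growth_severity = 3
   queue_connect_severity = 2
   tunnel_hang_timeout = 300
   tunnel_hang_retries = 3
</hub>
```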
The httpd probe (hyper-text transfer protocol daemon) provides a simple http server that can be used to share information across the intranet.
LDAP Configuration
You can add two keys in the /LDAP/server section to affect how the hub communicates with the LDAP protocol.
Timeout
Number of seconds to spend on each LDAP operation, such as searching or binding (authentication) operations. Default: 15 seconds.
codepage
Codepage to use when translating characters from UTF-8 encoding to ANSI. When text comes from the LDAP library as UTF-8, UIM
uses this codepage to translate the characters into ANSI.
Windows: Specify a valid codepage. For a list of codepages, go to . Note that the hub LDAP library uses the MultibyteToWideChar
and WideCharToMultiByte functions to translate to and from ANSI/UTF-8. These functions take a codepage as a parameter.
Default: 28591
Unix: Use iconv functions. Refer to http://www.gnu.org/software/libiconv.
Default: ISO-8859-1
The codepage key is not shipped with the hub configuration file. The default codepage is ISO 8859-1 Latin 1; Western European (ISO).
Important! This should not be modified without advice from support because it can cause a significant delay in monitoring
functions.
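To picture what this translation does, the following Python sketch (an illustration, not hub code) converts UTF-8 text, as it might arrive from an LDAP library, into the default codepage ISO-8859-1 (Windows identifier 28591):

```python
# UTF-8 bytes as they might arrive from the LDAP library
utf8_bytes = "Ångström café".encode("utf-8")

# Translate into ANSI using the default codepage, ISO-8859-1 (28591).
# Characters outside the codepage would raise an error; the hub's
# codepage key exists to let you pick a page that covers your locale.
ansi_bytes = utf8_bytes.decode("utf-8").encode("iso-8859-1")

print(ansi_bytes)
```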
Passive Robot Configuration
Advanced configuration lets you control how a hub communicates with its passive robots. Add any or all of the following keys
to the hub section.
passive_robot_threads
The hub communicates with a robot through worker threads from a thread pool. The default is 10, but we recommend
you set the pool to 50% of the number of passive robots you have.
passive_robot_interval
The number of seconds between attempts to retrieve messages from a passive robot. Default is 15 seconds.
passive_robot_messages
The number of messages the hub will accept from a passive robot in a single retrieval interval. Default is 1000.
passive_robot_comms_timeout
The amount of time the comms API will block a call to a robot that is not responding. Default is 15 seconds.
passive_robot_max_interval
The interval between retries of a non-responding passive robot doubles every 10 minutes, up to this value.
passive_robot_restart_wait
The hub will wait for this length of time for passive robot management threads to stop before it kills them. Default is 60
seconds.
Important! This should not be modified without advice from support because it can cause a significant delay in monitoring
functions.
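Collected into hub.cfg form, the passive robot keys above would look something like this. The values are the defaults where the text states them; passive_robot_max_interval has no stated default, so its value here is an illustrative placeholder:

```
<hub>
   passive_robot_threads = 10
   passive_robot_interval = 15
   passive_robot_messages = 1000
   passive_robot_comms_timeout = 15
   passive_robot_max_interval = 600
   passive_robot_restart_wait = 60
</hub>
```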
General Tab
Hub Advanced Settings
General Advanced Settings
SSL Advanced Settings
LDAP Advanced Settings
Hubs Tab
Robots Tab
Name Services Tab
Queues Tab
Reserved UIM Subject IDs
Tunnels Tab
Server Configuration
Client Configuration
Access List
Advanced
Status Tab
General Tab
The General tab contains the basic hub information in the following sections.
Hub information provides the following:
Hub name
Domain to which the hub belongs
Hub IP address
Hub address in UIM format: /domain/hub_name/robot_name/hub
Version number and distribution date of the hub probe
Uptime, the length of time the hub probe has been running since the last time it was started
Modify (button) opens the Edit Hub Address dialog, enabling you to edit the hub name and domain (note that the hub's controller
probe restarts if you modify these parameters)
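The UIM address format shown above splits cleanly into its four components. A small illustrative sketch (parse_uim_address is a hypothetical helper, not a UIM API):

```python
def parse_uim_address(address: str):
    """Split a UIM address of the form /domain/hub_name/robot_name/probe
    into its four components."""
    parts = address.strip("/").split("/")
    if len(parts) != 4:
        raise ValueError(f"not a full UIM probe address: {address!r}")
    domain, hub_name, robot_name, probe = parts
    return domain, hub_name, robot_name, probe

print(parse_uim_address("/MyDomain/primary-hub/hub-robot/hub"))
```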
License information
Each hub maintains a license system used by all of the robots connected to the hub. An invalid license causes the message flow from the
hub to its subscribers (mostly service probes) to stop. It also stops the various robot spoolers from uploading their messages as long as
the license key is invalid. The license key is built based on the following fields:
Licenses in use shows the number of robots currently connected to this hub, and the number of robots the license allows.
Expire date specifies when the license expires. An asterisk (*) indicates an unlimited license.
Owner of the license.
Modify (button) opens the Edit License dialog, which contains the hub's license key. License keys are provided by CA and must be
entered exactly as specified.
Login Settings for this hub
This section lets you configure your log in settings for the hub.
Normal (login allowed) allows users to log on to the hub from any robot.
Local machine only allows login only when attempted from the computer hosting the hub. Attempts to log in from other robots are refused.
No login disables login to the hub from either the local machine or a robot connected to the hub.
Log Level lets you specify the level of detail written to the log file.
Recommendation: Log as little as possible during normal operation to reduce disk consumption, and increase the level of detail when
debugging.
Log Size sets the size of the log file (default is 1024 KB). Large log files can cause performance issues, so use caution when changing this size.
Advanced allows you to view and configure advanced options for the hub.
Enable tunneling activates the tunnel tab, where you can configure the tunnels. To disable tunnels, uncheck this option and click Apply.
Disable IP validation. When a computer sends a request to a probe, the computer's IP address is validated. This option is typically
used when using NAT (Network Address Translation). See Setting up a Tunnel in a NAT Environment.
Statistics (button) displays traffic statistics for the hub for the previous 12 hours.
The graph shows the number of messages sent and received per minute, in addition to the number of requests. The Period section
lets you specify a time period. Click Get to update the values.
See Checking Traffic Statistics for more information.
Monitor (button) displays the current hub traffic. See Monitoring the Message Flow for more information.
View Log (button) displays the contents of the hub's log file. The log settings allow you to set the level of detail for the logging function.
The Log Viewer window provides the following:
File lets you save or print the file.
Edit lets you copy the contents and search in the log file.
Actions lets you limit the output in the window and highlight text or the date within the log file.
Start and Stop buttons let you start and stop the log file updates.
Settings (button) opens the Hub Advanced Settings dialog. See Hub Advanced Settings for more information.
General
SSL
LDAP
Important! These changes are not persistent; they do not survive a hub stop and start.
Origin identifies where a message came from. QoS messages from probes are tagged with a name to identify the origin of the data. By
default, the name of the probe's parent hub is the origin. To override that value, you can:
Change the value here. The new value will be used for all QoS messages handled by the hub.
Set the origin at the robot level (in the Setup > Advanced section in controller configuration GUI).
Audit Settings for Robots allows the recording of important events, such as starting and stopping the robot. This setting is used for all
robots serviced by this hub. Select one of the following options:
Override: audit off
Override: audit on
Use audit settings from robot
SSL Advanced Settings
LDAP Advanced Settings
There are two configuration options for LDAP: direct LDAP or Nimsoft Proxy Hub.
Direct LDAP
The hub can be configured to forward login requests to a LDAP server. This makes it possible to log on to the UIM consoles with LDAP
credentials. Users belonging to different groups in LDAP can be assigned to different UIM Access Control Lists (ACLs).
Note: Direct LDAP is only available on Linux and Windows hubs, due to the availability of the LDAP library the hub uses. Native LDAP is
not supported on Solaris.
LDAP authentication includes:
Server Name: The hub can be configured to point to a specific LDAP server, using an IP address or host name. A Lookup button lets you
test the communication. If the LDAP server uses a non-standard port, use the syntax "hostname:port" to indicate the server name.
Note: You can specify multiple servers in this field, each separated with a space. The first entry acts as a primary server, while the
others act as secondary servers (taking over if the primary server goes down). Logins may take more time if a secondary server has
taken over.
Server Type: Choose an LDAP server type. Currently two server types are supported: Active Directory and eDirectory.
Authentication Sequence: Specify whether the hub authenticates using the LDAP login or Nimsoft Login first. For example, if you
select Nimsoft > LDAP, this means that the user will be verified against Nimsoft user credentials first. If this fails the hub will try to
verify the credentials against LDAP server credentials.
Use SSL: Select this option if you want to use SSL during LDAP communication. Most LDAP servers are configured to use SSL.
User and Password: You must specify a user name and a password to be used by the hub when accessing the LDAP server to
retrieve information:
Active Directory - the user can be specified as an ordinary user name.
eDirectory - the user must be specified as a path to the user in LDAP in the format CN=yyy,O=xxx, where CN is the user name and O
is the organization.
Group Container (DN): Specify a group container in LDAP to define where in the LDAP structure to search for groups. Click Test to
check if the container is valid.
User Container (DN): Specify a user container in LDAP to define more specifically where in the LDAP structure to search for users.
Nimsoft Proxy Hub
The hub can be configured to specify a UIM probe address to log in through.
Proxy Hub: The drop-down list is empty by default. Click the Refresh icon next to the drop-down list to perform a gethubs probe
request on the hub you are configuring, which populates the drop-down list with the hubs it knows about.
Proxy Retries: Specify the number of retries to perform in case of communication errors (network errors).
Authentication Sequence: Specify whether the hub authenticates using the LDAP login or Nimsoft Login first. For example, if you
select Nimsoft > LDAP, this means that the user will be verified against Nimsoft user credentials first. If this fails the hub will try to
verify the credentials against LDAP server credentials.
Proxy Timeout: Specify the time (in seconds per attempt) after which the proxy will be timed out.
Hubs Tab
This tab lists all known hubs and displays their information in different text colors:
Blue: Hub is within the same domain as the hub you are currently logged on to.
Black: Hub is outside the domain.
Red: Hub status is unknown, typically because the hub is not running.
Domain
Hub name
Version of the hub probe
Updated: shows when the hub was last updated
IP address for the hub
Port number for the hub
Right-clicking in the window displays four options:
Alive Check rechecks the status of the selected hub.
Response Check checks the response time (connect - reconnect, no transfer) between your hub and the one selected in the list.
Transfer Check transfers data from your hub to the selected hub, then checks the transfer rate.
Remove removes the selected hub from the hubs address list. The hub may appear later if it is running.
Robots Tab
The Robots tab lets you set the alarm level for robots and displays robot information.
Inactive Robot setup lets you set the severity level of the alarm issued if one of the robots in the list becomes unavailable.
Registered Robots displays the following information for each robot controlled by the hub:
Name
Type - regular or passive
IP address
Version of the robot software
Created - when the robot was installed
Last Update - when the software was last updated
Operating system of the robot's host system
Right-clicking in the window opens a small menu with the following options:
Restart
This option only re-reads the configuration file for the selected robot; it does not stop and restart the robot. If you have made any
changes to the robot configuration, you must stop and restart the robot.
Check
Checks the selected robot.
Remove
Removes the selected active or passive robot from the list. An active robot may show up again later, since active robots periodically
request that the hub add them to the Registered Robots list.
Add Passive Robot
Opens the dialog to add a passive robot.
Note the Synchronize option in the New Static Hub dialog. If this option is not checked, the parent hub does not send status information
to the static hub. The parent hub still receives status information from the static hub, unless you disable the Synchronize option on the
static hub as well.
This option can be disabled to reduce network traffic if your network runs over a telephone line or ISDN.
Important! Do not connect a hub with both a tunnel and a static route. In some situations, data could be transmitted over the
insecure static route rather than over the secure tunnel. If you set up a static route so that you can tunnel to a hub, make sure
you delete the static route after the tunnel is configured.
Network Alias tells the local hub the return address on requests from a remote NAT hub.
On hub A, set up the From address and the To address for hub B.
On hub B, set up the From address and the To address for hub A.
When hub B sends a request to hub A, the request contains hub B's From address. Hub A then knows that hub B's To address must
be used when returning a response to hub B.
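The alias lookup described above can be pictured as a simple table keyed on the From address. This is a schematic illustration in Python, not hub internals, and the addresses are invented NAT examples:

```python
# Network alias table: From address -> To address for returning requests.
aliases = {
    "192.168.1.10": "10.0.0.10",   # illustrative NAT mapping
}

def return_address(from_addr: str) -> str:
    """Return the address responses should be sent to; with no alias
    defined, reply to the address the request came from."""
    return aliases.get(from_addr, from_addr)

print(return_address("192.168.1.10"))
print(return_address("10.0.0.99"))
```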
Queues Tab
The Queues tab lists all defined message queues. These include the queues that are automatically deployed, and those that an administrator
adds as needed. For example, if alarms need to be communicated from a secondary hub to the primary hub, you would create an attach queue
named nas with the subject alarm to forward alarms.
To edit a message queue, double-click the message queue (or select the message queue and click the Edit button).
To define a new queue, click New.
A queue is a holding area for messages passing through the hub. Queues are temporary or permanent:
Permanent queue content survives a hub restart. Permanent queues are meant for service probes that need to pick up all messages,
regardless of whether the service probe was running when they were created.
Temporary queue content is cleared during restarts. These queues are typically used for events to management consoles.
All queues defined on the Queues tab are permanent queues. Permanent queues are given a name related to their purpose. The permanent
queue named NAS is attached to the NAS (Nimsoft Alarm Server). You can set up a permanent queue from one hub to another by defining it as a
post type queue with the full UIM address of the other hub.
A Post queue sends a directed stream of messages to the destination hub.
An Attach queue creates a permanent queue for a client to attach to.
A Get queue gets messages from a permanent attach queue on another hub.
The queue defined below (called get-hub-4) is defined as a get queue, getting messages from the queue xprone-attach defined on the
hub /HubTest/wm-hub-4/vm-hub-4/hub.
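Sketched in configuration form, the get queue described above would carry the following. The queue name, remote queue, and address come from the example; the key names themselves are illustrative assumptions:

```
<get-hub-4>
   active = yes
   type = get
   remote_queue_name = xprone-attach
   address = /HubTest/wm-hub-4/vm-hub-4/hub
</get-hub-4>
```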
Tunnels Tab
This tab is enabled if the Enable Tunneling option is checked on the General tab of the hub configuration GUI.
Most companies have one or more firewalls in their networks, both internally between different networks and externally against a DMZ or Internet.
This makes it challenging to administer and monitor the whole network from a central location.
The solution is to set up a tunnel between two hubs that are separated by a firewall. The tunnel:
Sets up a VPN-like (Virtual Private Network) connection between the hubs. All UIM requests and messages are routed over the tunnel
and dispatched on the other side. This routing is transparent to all UIM users.
Requires that the firewall open one port for connection to the target hub.
Is implemented using the SSL (Secure Sockets Layer) protocol.
Security is handled in two ways: certificates that authenticate the Client, and encryption that secures the network traffic over the Internet.
Authorization and Authentication
The tunnel provides authorization and authentication by using certificates. Both the Client and the Server need valid certificates issued by
the same CA (Certificate Authority) in order to set up a connection. In the case of setting up a tunnel, the machine receiving the
connection (the Server) is its own CA and will only accept certificates issued by itself.
Encryption
The encryption setting ranges from None to High. With no encryption, the traffic is still authenticated, so None is
recommended for tunnels within LANs and WANs. Be careful when selecting a higher encryption level, since it is more
resource intensive for the machines at both ends of the tunnel.
Important! Do not use static hubs (listed under the Name Services tab) when setting up a tunnel.
Configuration tasks under this tab are for the server (listening) side of the tunnel. The tab contains the following fields and buttons.
Active activates the tunnel server. The Certificate Authority Setup dialog appears. See Setting up a Tunnel for more information.
Common name is the IP address of the hub on the server side.
Expire date is the date the server certificate expires.
Port specifies the port that the tunnel server is listening on. This is the port that you have to open in your router or firewall for incoming
connections.
Security settings lets you select None, Low, Medium, High, or Custom, where you define your own security setting. For custom
definitions, see http://www.openssl.org/docs/apps/ciphers.html
Note: High gives the highest degree of encryption but slows the data traffic considerably. Normally None is sufficient; data is not
encrypted, but it is still authenticated.
Start and Stop starts or stops the tunnel server.
Server displays the server details and the server certificate.
CA displays the CA details and CA certificate.
New opens the Client Certificate Setup dialog. You can create certificates for the clients you will open for access. When creating the
certificate, you must set a password. This password, the certificate (encrypted text), and the server's port number must be sent to the
client site.
Delete deletes the selected client certificate.
View displays the selected client certificate.
Client Configuration
The configuration tasks under this tab are for the client (connecting) side of the tunnel.
The fields are:
Server
The tunnel server's IP address or hostname.
Port
The tunnel server's port number.
Heartbeat
Keep-alive message interval.
Description
Brief description of the tunnel connection.
New button
Opens the New Tunnel Connection dialog. You can create a new tunnel connection to the server that generated the certificate.
Active Tunnel
Activates the defined tunnel connection.
Check Server CommonName
Uncheck this option to disable the Server IP address check on connection (see Setting up a Tunnel in a NAT Environment).
Description
Brief description of the tunnel connection.
Server
The IP address of the server on the server end of the tunnel.
Password
The password you have received with your certificate from the server side.
Server Port
The communication port on the server side. Default is 48003. It is recommended that you do not change the default port.
Keep-alive interval
Small data packets are sent at the specified interval to keep firewalls from dropping idle connections.
Certificate
You paste the received certificate in this field. See Creating Client Certificates for more information.
Edit button
Edits the selected server connection.
Delete button
Deletes the selected server connection.
Certificate button
Displays the selected client certificate.
Access List
This tab allows you to set access rules for tunnels. By default, all UIM requests and messages can be routed over the tunnel and dispatched on
the other side. This routing is transparent to UIM users.
The Access List lets you define a set of rules to restrict the access privileges for specified UIM addresses, commands, or users. The Access List
must be defined on the tunnel client hub.
You can create three types of rules:
Accept rules enable access. Set up a rule to give access to a UIM component (such as a probe, robot or hub) to execute one or more
specific commands for one or more users.
Deny rules disallow access for the specified addresses, commands or users.
Log rules log all requests through the tunnel. This is normally used for debugging purposes when testing commands against targets
before setting them up as accept or deny rules. The results can be viewed in the hub log file before you commit your deny or accept rules.
The tab contains two sections:
Edit Rule lets you add, modify or remove access rules. Four criteria are used when defining rules, and a rule is triggered based on
matching all four criteria:
Source IP is the name of the source hub, robot or probe.
Destination Address is the address of the target hub, robot or probe.
Probe Command is the specific command you want to allow or deny. The command-set varies from probe to probe. To view a
command set, open the Probe Utility.
User is the user you want to allow or deny access.
Note: Regular expression is allowed.
The rules table displays all the rules you have created. The order of the rules defined is important. The first rule in the list is processed
first. Processing stops on the first rule that matches all four criteria.
Use the Move Up and Move Down buttons to change the order of the rules in the list.
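The first-match, four-criteria evaluation described above can be sketched as follows (the sample rules, the use of full regular-expression matching, and the default action when no rule matches are illustrative assumptions; the hub's internal rule format is not shown in this document):

```python
import re

# Each rule: (action, source, destination, command, user), where the
# four criteria are regular expressions, as the note above allows.
rules = [
    ("deny",   r".*", r".*nas.*", r"^nas_set_.*", r".*"),   # hypothetical rule
    ("accept", r".*", r".*",      r".*",          r".*"),
]

def evaluate(source, destination, command, user):
    """Return the action of the first rule matching all four criteria."""
    for action, r_src, r_dst, r_cmd, r_usr in rules:
        if (re.fullmatch(r_src, source) and re.fullmatch(r_dst, destination)
                and re.fullmatch(r_cmd, command) and re.fullmatch(r_usr, user)):
            return action   # processing stops at the first full match
    return "accept"         # assumed default when no rule matches

print(evaluate("10.0.0.5", "/dom/hub/robot/nas", "nas_set_config", "admin"))  # deny
print(evaluate("10.0.0.5", "/dom/hub/robot/cdm", "get_info", "admin"))        # accept
```

Because processing stops on the first match, moving a broad accept rule above a narrow deny rule disables the deny rule, which is why the Move Up and Move Down ordering matters.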
Advanced
This tab allows you to assign the first tunnel port, establish the hanging timeout, and configure the SSL session cache.
The tab contains the following options:
Ignore first probe port settings from controller
It is not necessary to select this option if only one tunnel is defined, as the tunnel is automatically assigned the port number specified as
First probe port number on the Setup > Advanced tab in the controller configuration.
If more than one tunnel is defined, select this option to enable the First Tunnel Port field.
First Tunnel Port
If you have configured more than one tunnel, you should specify the first port in the range of ports to be used by the tunnels.
Important! Do NOT use the same port that is specified for the controller probe's first probe port.
If this field is blank, random ports will be assigned by the operating system.
Clients are assigned ports from the configured port range and keep that port as long as the hub is running.
Servers assign ports from the configured port number and increment the number for each new client connection. If there are no
active clients, the hub resets the counter.
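The server-side behavior described above (assign from the configured port, increment per client connection, reset when no clients remain active) can be sketched like this (the starting port value is an illustrative assumption):

```python
class TunnelServerPorts:
    """Sketch of the documented server-side port behavior: ports start at
    the configured First Tunnel Port and increment for each new client
    connection; the counter resets when no clients remain active."""
    def __init__(self, first_port):
        self.first_port = first_port
        self.next_port = first_port
        self.active = set()

    def connect(self):
        port = self.next_port
        self.next_port += 1
        self.active.add(port)
        return port

    def disconnect(self, port):
        self.active.discard(port)
        if not self.active:            # no active clients: reset the counter
            self.next_port = self.first_port

srv = TunnelServerPorts(first_port=49000)   # hypothetical First Tunnel Port
a, b = srv.connect(), srv.connect()
print(a, b)          # 49000 49001
srv.disconnect(a)
srv.disconnect(b)
print(srv.connect()) # 49000 again after the reset
```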
Tunnel is Hanging Timeout
The hub continuously checks if one or more of the active tunnels are hanging. No new connections can be established through tunnels
that are hanging.
If one or more tunnels are hanging, the hub attempts to restart the tunnel(s). If the restart fails, the hub performs a restart after the
specified number of seconds.
SSL Session Cache
Use Server Cache enables caching of SSL sessions and the reuse of previous session credentials. This speeds up the connection time
between the client and the server (assuming Use Client Cache is enabled on the client).
Server Cache Timeout defines how long the cached sessions are valid for reuse by the client.
Server Cache Size defines how many sessions can be stored in the cache. When the cache is full, the oldest sessions are deleted as
new connections are established.
Use Client Cache enables caching of SSL sessions on the client hub.
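The cache semantics described above (a validity timeout, a fixed size, oldest sessions deleted as new connections arrive) can be sketched as follows (the session IDs, timestamps, and timeout values are illustrative assumptions):

```python
from collections import OrderedDict

class SessionCache:
    """Sketch of the documented SSL session cache: entries expire after
    `timeout` seconds and, when the cache is full, the oldest sessions
    are deleted as new connections are established."""
    def __init__(self, timeout, size):
        self.timeout, self.size = timeout, size
        self._cache = OrderedDict()          # session_id -> creation time

    def store(self, session_id, now):
        if len(self._cache) >= self.size:
            self._cache.popitem(last=False)  # evict the oldest session
        self._cache[session_id] = now

    def lookup(self, session_id, now):
        created = self._cache.get(session_id)
        if created is None or now - created > self.timeout:
            return False                      # miss: full handshake needed
        return True                           # hit: credentials are reused

cache = SessionCache(timeout=300, size=2)
cache.store("s1", now=0)
cache.store("s2", now=10)
cache.store("s3", now=20)            # cache full: "s1" is evicted
print(cache.lookup("s1", now=30))    # False (evicted)
print(cache.lookup("s2", now=400))   # False (expired)
print(cache.lookup("s3", now=30))    # True
```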
Status Tab
This tab contains four subsections that provide status information about the queues, subjects and tunnels you have defined.
Subscribers/Queues displays a list with status information on all subscribers/queues on this hub. You can view the messages received
by the hub and forwarded to interested parties. This status can be used to assist with debugging and load monitoring for your hub. The
fields in the list are:
Name of the queue
Type of subscriber
Queued shows the number of messages waiting to be transferred (unless you use message spooling, this number should be 0 in
normal operation as long as the subscriber is alive)
Sent shows the number of messages sent
Bulk Size is the maximum number of messages sent at once
Subject/Queue is the name of the queue or subject that the subscriber subscribes to
ID for the connected probe or program
Established shows when the hub connected to the subscriber
Address of the subscriber
Connection is the address of the subscriber's network connection
Subjects shows a count of all messages that have been transferred since the last (re)start, grouped by the subject. This information can
assist you with debugging and load monitoring for your hub.
Tunnel Status displays two windows. The upper window, which shows all tunnels that the hub is running, provides this information:
Peer Hub is the IP address or hostname of the tunnel's peer
Started shows the initial tunnel connection time
Last shows the time of the last connection through the tunnel
Connection stats (ms) are the statistics for the time taken to set up the tunnel connection (a low minimum value could indicate a low
bandwidth; a high minimum value and a high average value could indicate packet loss)
Connections shows the number of connections made
Traffic in/out shows the amount of data received/sent through the tunnel
When you select a tunnel in the upper window, its connections appear in the lower window, which displays:
State of the connection (idle or active)
Start time of the connection
Last transfer time
In and Out show the amount of data received or sent
Address is the hostname of the target of the request
Command specifies the command executed on the target of the connection
Tunnel Statistics has statistics on SSL and on various events (such as server start time or when the last connection was received). Use
the drop-down list to select the server or client you want to view.
If a non-functioning tunnel will significantly impact your operations, increase the level of alarm sent if a connection is lost or cannot be
made.
These settings are found on the Advanced > Tunnel Settings node in the Admin Console hub configuration GUI.
Queues
If the size of a get or post queue never shrinks to zero or if it always has many messages, increase the Bulk Size on the queue. This
allows the hub to transfer multiple messages in one packet.
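A back-of-the-envelope sketch of why a larger Bulk Size helps (assuming one packet per bulk; the queue depth and bulk sizes are illustrative numbers):

```python
import math

def packets_needed(queued_messages, bulk_size):
    """Number of transfers needed to drain a queue when the hub sends
    up to bulk_size messages per packet."""
    return math.ceil(queued_messages / bulk_size)

# With 10,000 queued messages, raising Bulk Size from 1 to 100 cuts
# the number of packets from 10,000 to 100:
print(packets_needed(10_000, 1))    # 10000
print(packets_needed(10_000, 100))  # 100
```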
Hub Troubleshooting
If your problem is not addressed here:
Look for a solution or ask other users for help on the CA UIM Community Forum.
Contact Support.
Send us feedback with the "rate this page" link below. We will strive to include a solution in the next release of this document.
The Microsoft Hyper-V Monitoring probe allows you to monitor the health and performance of Microsoft Hyper-V servers. The probe collects
the necessary information about the following systems:
Host Operating System (OS)
Corresponding hypervisor system (Windows 2008/2012/2012 R2 Server + Hyper-V / Windows 2008 Server Core + Hyper-V)
Virtual Machines that are configured on the Host OS
Note: Version 3.0 or later of the probe is now available only through the web-based GUI. The Infrastructure Manager (IM) GUI is only
available for version 2.2 or earlier.
The probe allows you to define alarms and their corresponding threshold values. You can compare the actual data at customizable intervals using
generated QoS messages. The probe then generates alarms when the corresponding threshold values are breached.
The probe monitors the following entities on the host:
CPU
Memory
Disk
Network
Resource Pool
The probe also monitors the following entities of each Virtual Machine (VM) on the host:
CPU
Memory
Disk
Network
Version 3.10 and later of the probe allow you to create configuration templates. The templates apply only to the specific instance of
the probe on the robot. Only existing profiles can be configured using templates.
More information:
hyperv (Microsoft Hyper-V Monitoring) Release Notes
Note: The hyperv probe is configured on the network system only. You cannot monitor the local host with this probe.
The probe can monitor Windows Server 2008, 2008 R2, 2012 and 2012 R2 hosts. The hyperv probe configures the host system and automatically
displays the list of virtual machines and their associated monitoring checkpoints.
Important! The hyperv role must be enabled on the Windows Server environment for the probe to connect and collect required
information.
The probe has counters to monitor the NUMA topology of the Windows Server 2012 and Windows Server 2012 R2. The NUMA topology is used
to optimize the performance of high-performing applications (like SQL Server) by efficiently scheduling threads and allocating memory.
The recommended counters for NUMA are as follows:
Note: The default monitoring interval is 10 minutes. It is recommended that you keep this default.
4. Create a profile with the authentication details of the hyper-v server to be monitored.
Refer to Managing Profiles for more information.
Note: The profile goes into a pending state while it fetches the data for that particular host. After you reload the page or reopen the GUI
after some time, you can see the tree structure of the CPU, disk, network, resource pool, and VMs for the particular host.
5. Activate the profile and save the configuration to establish the connection and discover resources.
The host, hypervisor, and VM resources are automatically discovered by the probe and displayed as nodes.
6. Configure the monitors for the required nodes.
7. Save the configuration to start monitoring.
Preconfiguration Requirements
The following are the preconfiguration requirements of the Microsoft Hyper-V probe:
Server Configuration
Client Configuration
HTTP Configuration
Migration Prerequisite
Note: If any of the following commands using single quotation marks fail, try to run the command with double quotation marks or without
the quotation marks.
1. Ensure that WinRM is enabled on the host.
Note: WinRM is enabled by default. However, if WinRM is disabled, use the following Microsoft links to enable WinRM on the
respective platforms:
Windows Server 2012 R2
Windows Server 2008 R2
If you follow the procedures in the links, skip the next step and proceed directly to step 3.
2. Open a PowerShell window on the server as an administrator and enter the following command:
Get-ExecutionPolicy
If the response says Restricted, you must change the setting to Unrestricted or RemoteSigned. For example, to set it to RemoteSigned:
a. Enter the following command:
Set-ExecutionPolicy RemoteSigned
b. Enter Y to accept the policy.
c. Enter the Get-ExecutionPolicy command again to verify the setting.
d. Enter the following command to enable all firewall exception rules:
Configure-SMRemoting.ps1 -force -enable
3. Open a Windows Powershell window on the server as the Administrator user.
4. Enter the following command:
winrm quickconfig
5. Enter Y to accept the changes.
This configures WinRM with default settings.
6. Enter the following command to check the authentication status:
winrm get winrm/config/service
You see a section in the response similar to the following:
RootSDDL = O:NSG:BAD:P(A;;GA;;;BA)S:P(AU;FA;GA;;;WD)(AU;SA;GWGX;;;WD)
MaxConcurrentOperations = 4294967295
MaxConcurrentOperationsPerUser = 15
EnumerationTimeoutms = 60000
MaxConnections = 25
MaxPacketRetrievalTimeSeconds = 120
AllowUnencrypted = false
Auth
    Basic = false
    Kerberos = true
    Negotiate = true
    Certificate = false
    CredSSP = false
    CbtHardeningLevel = Relaxed
DefaultPorts
    HTTP = 5985
    HTTPS = 5986
IPv4Filter = *
IPv6Filter = *
EnableCompatibilityHttpListener = false
EnableCompatibilityHttpsListener = false
CertificateThumbprint
7. Enter the following command to enable basic authentication:
winrm set winrm/config/service/auth '@{Basic="true"}'
8. Enter the following command to allow unencrypted data:
winrm set winrm/config/service '@{AllowUnencrypted="true"}'
9. Enter the following command to trust all hosts:
winrm set winrm/config/client '@{TrustedHosts="*"}'
To trust only specified hosts, list the host names, as in the following example:
winrm set winrm/config/client '@{TrustedHosts="host1, host2, host3"}'
10. Enter the following command to provide sufficient memory, 1024 MB, for the probe to execute PowerShell commands on the server:
winrm set winrm/config/winrs '@{MaxMemoryPerShellMB="1024"}'
Note: If you see the message: "Process is terminated due to StackOverflowException," in the log file, increase the memory
value in this setting.
Note: If any of the following commands using single quotations fail, try to run the command without the quotation marks.
1. Open a PowerShell window on the client and enter the following command:
Get-ExecutionPolicy
If the response says Restricted, you must change the setting to Unrestricted or RemoteSigned. For example, to set it to RemoteSigned:
a. Enter the following command:
Set-ExecutionPolicy RemoteSigned
b. Enter Y to accept the policy.
c. Enter the Get-ExecutionPolicy command again to verify the setting.
2. Open a Command Prompt window on the client as the Administrator user.
3. Enter the following command:
winrm quickconfig
4. Enter Y to accept the changes.
This configures WinRM with default settings.
5.
Note: If you see the message: "Process is terminated due to StackOverflowException," in the log file, increase the memory
value in this setting.
Migration Prerequisite
If you are migrating from an earlier version of the probe to version 3.0, you must clear the temporary files on each system that accesses
the robot with the probe to be migrated.
1. Open the Run window.
2. Type %temp% and click OK.
The local Temp directory of the system opens.
3. Open the Util directory.
4. Delete all files in the directory.
The temporary files have now been cleared.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold
settings allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
Configure a Node
This procedure provides the information to configure a section within a node.
Each section within a node enables you to configure the properties of the probe for monitoring the response of user-defined queries.
Follow these steps:
1. Select the appropriate navigation path.
2. Update the field information and click Save.
The specified section of the probe is configured. You can now define connections and profiles for the monitored instance.
Managing Profiles
You can configure the probe to create, modify or delete a profile to monitor the hypervisor system. You can configure the profile to send the QoS
messages and generate alarms on response time.
Create Profile
You can create a profile to monitor a Microsoft Hyper-V environment.
Follow these steps:
1. Click the Options icon next to the hyperv node in the navigation pane.
2. Click the Add New Profile option.
3. Update the field information and click Submit.
The profile is displayed as a node.
4. Navigate to the <Profile Name> node.
5. Click Verify Selection from the Actions drop-down.
A connection is established and the authentication details are verified.
6. Click Save.
The profile is saved and the nodes are auto discovered for the profile.
Modify Profile
You can modify a profile to change the authentication and connection details of a hypervisor.
Follow these steps:
1. Click the <Profile Name> node to view Resource Setup.
2. Modify the required field information.
3. Click Save.
The profile is saved with the updated details.
Delete Profile
You can delete a profile that is no longer required.
Follow these steps:
1. Click the Options icon next to the <Profile Name> node in the navigation pane.
2. Click the Delete Profile option.
3. Click Save.
The profile is deleted from the probe.
This article describes the various GUI elements of the Microsoft Hyper-V Monitoring (hyperv) probe.
Note: The hypervisor and the hyperv probe must be in the same locale when monitoring the hypervisor remotely.
Contents
The icons in the tree indicate the type of object the node contains. For VMs, the color of the icon also indicates the status of the VM.
- Closed folder. Organizational node used to group similar objects. Click the node to expand it.
- Open folder. Organizational node used to group similar objects. Click the triangle next to the folder to collapse it.
- Resource
- Server
- VM that is running
- VM that is stopped
- VM that is paused
- Network interface
hyperv Node
This node lets you view the probe information and configure the logging properties.
Navigation: hyperv
View or modify the following values as required:
hyperv > Probe Information
This section displays information about the probe name, probe version, start time of the probe, and the probe vendor.
hyperv > Probe Setup
This section lets you configure the detail level of the log file.
Log Level: defines the level of detail written to the log file. The levels range from fatal errors only (0 - Fatal) to extremely detailed
information for troubleshooting (5 - Trace).
Default: 0 - Fatal
Note: Log as little as possible during normal operation to minimize disk consumption.
Password: defines a password to authenticate the given username to log in to the host system.
Server Version: specifies the version of the Windows Server hypervisor. The probe supports monitoring for version 2008, 2012, and 2012
R2.
Default: 2008
Note: Specify the correct version of the server, otherwise the values for some checkpoints are not populated correctly. For example,
Hyper-V Virtual IDE Controller.
Active: activates the profile for service monitoring. By default, the profile is active.
Interval (secs): defines the time interval (in seconds) after which the probe collects data from the monitors.
Default/Recommended: 600
Note: The minimum recommended interval for the probe is 300 seconds.
Alarm Message: specifies the alarm to be generated when the profile is not responding.
Example: The profile does not respond if there is a connection failure or an inventory update failure.
Default: ResourceCritical
Note: The performance counters are visible in a tabular form. You can select any one counter in the table and configure its properties.
Similarly, you can configure the other performance counters that are visible under the subsequent nodes.
CPU Node
This node allows you to configure the performance counters (in percentage) for all the CPUs present on the host machine.
Navigation: hyperv > Profile Name > Host Name > CPU
<CPU Name> Node
This node allows you to configure the performance counters for each CPU present on the host machine.
Navigation: hyperv > Profile Name > Host Name > CPU > CPU Name
Disk Node
This node allows you to configure the performance counters for the disk(s) present on the host machine.
Navigation: hyperv > Profile Name > Host Name > Disk
<Volume Name> Node
This node allows you to configure the performance counters for each disk volume on the host machine.
Navigation: hyperv > Profile Name > Host Name > Disk > Volume Name
Memory Node
This node allows you to configure the performance counters (in memory size) for the RAM present on the host machine.
Navigation: hyperv > Profile Name > Host Name > Memory
Network Node
This node allows you to configure the performance counters for all the network interfaces present on the host machine.
Navigation: hyperv > Profile Name > Host Name > Network
<Interface Name> Node
This node allows you to configure the performance counters for each network interface present on the host machine.
Navigation: hyperv > Profile Name > Host Name > Network > Interface Name
Resource Pool
Resource pools in Hyper-V indicate the total resources available in the hypervisor for the installed VMs. The resource pool allows easy sharing
and monitoring of resources between the different VMs. This node groups all available resources on the host machine for the hypervisor.
Navigation: hyperv > Profile Name > Host Name > Resource Pool
The performance counters are divided into the following categories:
CPU
Disk
Memory
Network
Each category is represented as a node under the Resource Pool node.
The Resource Pool and the associated CPU, Disk, Memory, and Network nodes do not have a section or field and are used to contain a
categorized list of resources in the resource pool.
<Resource Pool CPU Name> Node
This node allows you to configure the health, capacity, and availability performance counters for the shared virtual CPU in the resource pool.
Navigation: hyperv > Profile Name > Host Name > Resource Pool > CPU > Resource Pool CPU Name
<Resource Pool Disk Name> Node
This node allows you to configure the health, capacity, and availability performance counters for the shared virtual disks in the resource pool.
Navigation: hyperv > Profile Name > Host Name > Resource Pool > Disk > Resource Pool Disk Name
<Resource Pool Memory Name> Node
This node allows you to configure the health, capacity, and availability performance counters for the shared virtual RAM in the resource pool.
Navigation: hyperv > Profile Name > Host Name > Resource Pool > Memory > Resource Pool Memory Name
This node allows you to configure the health, capacity, and availability performance counters for the shared virtual network interfaces in the
resource pool.
Navigation: hyperv > Profile Name > Host Name > Resource Pool > Network > Resource Pool Network Name
Resources Node
Resources in the probe are the various network interfaces and Virtual Machines (VMs) present on the host system. This node allows you to
configure the performance counters of these resources.
Navigation: hyperv > Profile Name > Host Name > Resources
The performance counters are divided into the following categories:
Network
Switch
VMs
Each category is represented as a node under the Resources node.
The Network, Switch, and VMs nodes do not have a section or field and are used to contain a categorized list of network resources and VMs.
<Resource Network Name> Node
This node allows you to configure the speed and throughput performance counters for the shared virtual network interfaces.
Navigation: hyperv > Profile Name > Host Name > Resources > Network > Resource Network Name
<Resource Switch Name> Node
This node allows you to configure the speed and throughput performance counters for the shared virtual network switch(es).
Navigation: hyperv > Profile Name > Host Name > Resources > Switch > Resource Switch Name
<Resource VM Name> Node
This node allows you to configure the performance counters for the various virtual machines present in the hypervisor.
Navigation: hyperv > Profile Name > Host Name > Resources > VMs > Resource VM Name
The performance counters are divided into the following categories:
CPU
Disk
Memory
Network
Each category is represented as a node under the <Resource VM Name> node.
CPU Node
This node allows you to configure the performance counters for the virtual CPU allocated to the virtual machine.
Navigation: hyperv > Profile Name > Host Name > Resources > Resource VM Name > CPU
Disk Node
This node allows you to configure the performance counters for the virtual disk allocated to the virtual machine.
Navigation: hyperv > Profile Name > Host Name > Resources > Resource VM Name > Disk
Memory Node
This node allows you to configure the performance counters for the virtual RAM allocated to the virtual machine.
Navigation: hyperv > Profile Name > Host Name > Resources > Resource VM Name > Memory
Network Node
This node allows you to configure the performance counters for the virtual network interface allocated to the virtual machine.
Navigation: hyperv > Profile Name > Host Name > Resources > Resource VM Name > Network
This entire structure of nodes is repeated for each profile configured in the probe.
hyperv Metrics
This article describes the metrics that can be configured for the Microsoft Hyper-V Monitoring (hyperv) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the Microsoft Hyper-V Monitoring (hyperv) probe.
Note: The CPU Load Percentage QoS has been discontinued from version 3.0.
Monitor Name                    Units                   Version
QOS_CPU_TIME_PCT                Percent                 3.0
QOS_CPU_RESERVATION             Number                  3.0
QOS_CPU_LIMIT                   Number                  3.0
QOS_CPU_SPEED                   Megahertz               3.0
QOS_DISK_KBPS                   Kilobytes/Second        3.0
QOS_DISK_IO_KB                  Kilobytes               3.0
QOS_DISK_SECTOR_IO              Sectors/Second          3.0
QOS_DISK_SPACE_GB               Gigabytes               3.0
QOS_DISK_SPACE_PCT              Percent                 3.0
QOS_NETWORK_KBPS                Kilobytes/Second        3.0
QOS_MEMORY_ALLOCATED            Megabytes               3.0
QOS_MEMORY_FREE                 Megabytes               3.0
QOS_UPTIME                      Seconds                 3.0
QOS_RESOURCE_POOL_CAPACITY      Allocation Units        3.0
QOS_RESOURCE_POOL_STATUS        Status                  3.0
QOS_RESOURCE_POOL_RESERVED      Allocation Units        3.0
QOS_CPU_HALTS                   Halts/Second            3.0
QOS_CPU_HALTS_COSTS             Number                  3.0
QOS_IO_INSTRUCTION              Instructions/Second     3.0
QOS_IO_INSTRUCTION_COST         Number                  3.0
QOS_PAGE_FAULT                  Faults/Second           3.0
QOS_PAGE_FAULT_COST             Number                  3.0
QOS_INTERRUPTS                  Interrupts/Second       3.0
QOS_PARTITION_PAGES             Number                  3.0
QOS_VIRTUAL_TLB                 Flushes/Second          3.0
QOS_GPA_SPACE                   Modifications/Second    3.0
QOS_VIRTUAL_TLB_PAGES           Number                  3.0
QOS_PAGE_SPACE_MB               Megabytes               3.0
QOS_NUMBER_PROCESSORS           Number                  3.0
QOS_PARTITIONS                  Number                  3.0
QOS_ADDRESS_SPACE               Number                  3.0
QOS_CONNECTED_CLIENTS           Number                  3.0
QOS_TOTAL_PAGES                 Number                  3.0
QOS_NETWORK_PACKET_IO           Packets/Second          3.0
QOS_VIRTUAL_MACHINE_HEALTH      Status                  3.0
QOS_NUMBER_VMS                  Number                  3.0
QOS_GPA_PAGES                   Number                  3.0
Alert Metrics Default Settings
Alarm Name                      Warning Threshold   Warning Severity   Error Threshold   Error Severity   Description
ResourceCritical                None                None               None              Critical
MonitorWarning                  None                None               None              Warning
MonitorCritical                 None                None               None              Critical
CPUWarning                      None                None               None              Warning
CPUCritical                     None                None               None              Critical
MemoryWarning                   None                None               None              Warning
MemoryGenericWarning            None                None               None              Warning
MemoryCritical                  None                None               None              Critical
NetworkWarning                  None                None               None              Warning
NetworkCritical                 None                None               None              Critical
DiskWarning                     None                None               None              Warning
DiskCritical                    None                None               None              Critical
Status                          None                None               None              Major
service_state                   None                None               None              Major
ResourcePoolWarning             None                None               None              Warning
ResourcePoolCapacityWarning     None                None               None              Warning
ResourcePoolStatusWarning       None                None               None              Warning
ResourcePoolReservedWarning     None                None               None              Warning
EventAlarm                      None                None               None              Major            The event source, the event code, the event category, and the event description are displayed.
UptimeWarning                   None                None               None              Warning
The ibm_svc probe uses the SMI-S provider and the Command Line Interpreter (CLI) to retrieve status, configuration, and statistics data of the
IBM SVC storage server. The probe supports user authentication through SSH only. Therefore, you require a password or a valid SSH key file to
access the CLI.
More information:
ibm_svc (IBM SVC Monitoring) Release Notes
Contents
Verify Prerequisites
Set Up General Properties
Create and Configure a Profile
Verify Prerequisites
The ibm_svc probe requires a user account to access the web interface and command interpreter of the IBM System Storage SAN Volume
Controller (SVC). The user must have the privileges to retrieve the data from the storage server using SMI-S and CLI. Verify that required
hardware and software is available before you configure the probe. For more information, see ibm_svc (IBM SVC Monitoring) Release Notes.
Note: Reduce this interval to generate alarms more frequently. However, a shorter interval also increases the system load.
You can configure the monitors to set up alarms and QoS. The nodes in the profile tree have both numeric and string monitors.
Numeric monitors: Used to set up alarm conditions and QoS messages for monitors that generate numeric metric data.
String monitors: Used to set up alarm conditions for monitors that generate string metric data. String monitors do not generate
QoS messages.
Follow these steps:
1. Navigate to the required node.
2. Select the monitor to configure the properties.
3. Set up the alarm conditions and QoS messages for monitors that generate numeric metric data.
Monitor (Numeric)
The following fields are configured before the alarm and threshold configuration:
Value Definition: select the type of value that the probe uses for alarms and QoS. The value definition can be Current Value,
Average Value Last n Samples, Delta Value (Current - Previous), or Delta Per Second.
Number of Samples: specify the number of values to consider for the probe metric calculations for QoS and alarms.
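The arithmetic behind the four value definitions can be illustrated with a short sketch. This is an illustration of the calculations the options describe, not the probe's internal code; the sample list and the 600-second interval are assumptions for the example.

```python
def compute_value(samples, definition, n=4, interval=600):
    """samples: metric values, oldest first. interval: seconds between samples."""
    if definition == "current":                 # Current Value
        return samples[-1]
    if definition == "average":                 # Average Value Last n Samples
        window = samples[-n:]
        return sum(window) / len(window)
    if definition == "delta":                   # Delta Value (Current - Previous)
        return samples[-1] - samples[-2]
    if definition == "delta_per_second":        # Delta Per Second
        return (samples[-1] - samples[-2]) / interval
    raise ValueError(definition)

samples = [100, 120, 150, 210]
print(compute_value(samples, "current"))            # 210
print(compute_value(samples, "average", n=2))       # 180.0
print(compute_value(samples, "delta"))              # 60
print(compute_value(samples, "delta_per_second"))   # 0.1
```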
Monitor (String)
The following fields are configured before the alarm and threshold configuration:
High Operator: select an operator to match the retrieved value with the threshold for the maximum limit. The drop-down list has
the = (equal to) and the != (not equal to) options.
Default: !=
High Threshold: specify the maximum threshold for the monitor.
High Message Name: specify the alarm message that is displayed when the threshold is breached for the maximum limit.
Default: MonitorError
Low Operator: select an operator to match the retrieved value with the threshold for the minimum limit. The drop-down list has the
disabled, = (equal to), and the != (not equal to) options.
Default: disabled (indicates that the low threshold alarm is disabled)
Low Threshold: specify the minimum threshold for the monitor.
Low Message Name: specify the alarm message that is displayed when the threshold is breached for the minimum limit.
4. Click Save to save the configuration.
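For string monitors, the high and low checks reduce to simple equality tests against the configured thresholds. A minimal sketch of that logic, assuming the defaults described above; the sample status values are illustrative, not probe output:

```python
def evaluate_string_monitor(value, high_op="!=", high_thr="", high_msg="MonitorError",
                            low_op="disabled", low_thr="", low_msg="MonitorWarning"):
    """Return the alarm message name to issue, or None when no threshold is breached."""
    ops = {"=": lambda a, b: a == b, "!=": lambda a, b: a != b}
    if high_op in ops and ops[high_op](value, high_thr):
        return high_msg
    if low_op in ops and ops[low_op](value, low_thr):   # 'disabled' skips this check
        return low_msg
    return None

# Alarm when the retrieved status is not 'OK' (high operator '!=', threshold 'OK').
print(evaluate_string_monitor("Degraded", high_op="!=", high_thr="OK"))  # MonitorError
print(evaluate_string_monitor("OK", high_op="!=", high_thr="OK"))        # None
```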
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ibm_svc Node
<Resource Name> Node
<Host Name> Node
Clusters Node
<Cluster Name> Node
ibm_svc Node
This node lets you view the probe information and configure the log properties of the probe.
Navigation: ibm_svc
Set or modify the following values, as needed:
ibm_svc > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the vendor who created the probe.
ibm_svc > Probe Setup
This section allows you to configure the general setup properties of the probe.
Log Level: specifies the level of details that are written to the log file.
Default: 5 - Trace
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Note: The Low Operator, Low Threshold, and Low Message Name fields are used for configuring low thresholds.
Typically, the low threshold generates a warning alarm and the high threshold generates an error alarm.
Clusters Node
The Clusters node is used for classifying the clusters that are available in the IBM SVC storage system. This node does not contain any
fields or sections for configuring the monitors of the resource.
Disk Controller
Hosts
IO Groups
MDisk
MDisk Groups
Nodes
Ports
VDisk
The cluster element nodes further contain sub nodes for configuring monitors of different sections of an element. All element nodes and their
corresponding sub nodes contain only the Monitors section. The Monitors section is used for configuring monitors of that IBM SVC component.
Contents
Verify Prerequisites
Create a Resource
Configure the Monitors
Configure Monitors Manually
Edit Monitor Properties
Use Templates
Use Auto Configurations
Add a Template to Auto Configurations
Add a Monitor to Auto Configurations
Manage Alarm Messages
Verify Prerequisites
The ibm_svc probe requires a user account to access the web interface and command interpreter of the IBM System Storage SAN Volume
Controller (SVC). The user must have the privileges to retrieve the data from the storage server using SMI-S and CLI. Verify that required
hardware and software is available before you configure the probe. For more information, see ibm_svc (IBM SVC Monitoring) Release Notes.
Create a Resource
You can create a resource to monitor the IBM SVC storage system. This resource establishes a connection between the storage system and the
probe to collect data about the storage system. This data is used to generate alarms and QoS messages. The ibm_svc probe enables you to
create more than one resource.
Follow these steps:
1. Click the Create New Resource icon on the toolbar. You can also right-click the Resources node in the navigation pane and select the
New Resource option.
The Resource dialog appears.
2. Enter the following information:
Hostname or IP Address: define the Hostname or IP address for accessing the IBM SVC storage system.
Port: define the port on which the SMI-S provider is listening. The default port is 5988 for HTTP, and 5989 for HTTPS.
Active: select the checkbox to enable the monitoring of the resource.
Username: define the user account for accessing the SMI-S provider and the command-line interface.
Password: define the password for the user account.
Alarm Message: specify the alarm message to be issued when the probe is not able to connect with the resource.
Check Interval: specify the time interval in seconds after which the probe retrieves data from the storage system.
Default: 600
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Namespace: specifies root/ibm as the namespace for the IBM SVC storage system. This field is read-only.
Storage Type: select the storage type to monitor. The available options are: SVC and V7000.
Use SSL: select the checkbox to enable the probe to use HTTPS to connect with the storage system.
3. Click Test.
The probe discovers the IBM SVC system.
4. Click Apply on the dialog.
5. Click OK to restart the probe when prompted.
The new resource, which connects the probe to the IBM SVC storage system, appears under the Resources node. You can also right-click an
existing resource and select the Edit option for updating the resource properties.
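The Port and Use SSL fields together determine how the probe addresses the SMI-S provider. The sketch below shows that relationship using the defaults stated above (5988 for HTTP, 5989 for HTTPS); the URL layout follows the common CIM-XML convention and is an illustration, not the probe's source code.

```python
def smis_url(host, use_ssl=False, port=None):
    """Pick the default SMI-S port (5988 HTTP / 5989 HTTPS) unless one is given."""
    if port is None:
        port = 5989 if use_ssl else 5988
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}"

print(smis_url("192.168.1.10"))                # http://192.168.1.10:5988
print(smis_url("192.168.1.10", use_ssl=True))  # https://192.168.1.10:5989
```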
The Monitor Properties dialog allows you to define the thresholds for generating alarms and configure the QoS messages.
Follow these steps:
1. Click the desired component in the left pane.
The list of its monitors is displayed in the right pane.
2. Double-click the required monitor. You can also right-click the monitor and select Edit from the menu.
The Monitor Properties dialog appears.
3. Enter the following information in this dialog:
Resources: specify the monitor name.
Key: identify the component type and component property.
Description: specify the additional information about the monitor.
Value Definition: specify the type of value that is compared with the threshold value to generate alarms. This value type is also used
in the QoS messages.
Samples: specify the number of samples to use for The average value last option in the Value Definition field. The maximum
number of samples is 4.
Active: select the checkbox to activate the Enable Alarming and Publish Quality of Service (QoS) fields for configuring the alarms
and QoS.
Enable Alarming: select the checkbox to activate the Operator, Threshold, Unit, and Message ID fields for configuring the alarms.
You can configure both high and low thresholds. The low threshold generates a warning alarm and the high threshold generates an
error alarm.
Initially you can set the high threshold to a default value or the current value and disable the low threshold.
Publish Quality of Service (QoS): allows you to select the QoS message that the monitor displays from the QoS Name drop-down
list.
4. Configure the monitor properties and click OK.
The configuration properties of the monitor take effect, and the probe generates alarms and QoS accordingly.
Use Templates
Templates are reusable sets of monitors. You can create templates and drag-and-drop them to the Auto Configurations node. The template
monitors are applied to all the relevant components of a resource. You can also apply a template to an individual component of a resource, and
edit an existing template to update the list of monitors and their corresponding monitoring parameters.
Follow these steps:
1. Click the Create New Template icon on the toolbar.
You can also right-click the Templates node in the navigation pane and select the New Template option from the menu.
The Template Properties dialog appears.
2. Define the template name and description.
3. Click OK.
4. Add the monitors to the template in one of the following ways:
Drag-and-drop the monitors from the right pane onto the template node in the left pane.
Right-click the monitor and select Add to Template option from the menu.
5. Configure the monitor properties. For more information, see the Edit Monitor Properties section.
Note: Select the Active and Enable Alarming options in the Monitor Properties dialog, which enables the monitor to collect
data.
6. Drag-and-drop the template on the Auto Configurations node and apply the template monitors to the resource.
You can also drag-and-drop the template onto an individual component of the resource.
7. Click Apply.
8. Click OK to restart the probe when prompted.
The template is added in the left pane of the probe.
Important! Adding excessive monitors or templates to the Auto Configurations node can overburden the system.
The auto configurations feature is implemented through two sub nodes of the All Resources node in the navigation pane:
Auto Configurations Node: Lists the templates and individual monitors for the resource. The probe searches through the resources and
applies relevant monitors to its components.
Auto Monitors Node: Lists the auto monitors for applying them on new devices. The properties of these monitors are inherited from the
monitors and templates of the Auto Configurations node.
You can add a template to the Auto Configurations node of a resource by applying all template monitors to all components of the resource. The
auto configuration is automatically applied to new devices for the resource.
Follow these steps:
1. Click the Templates node in the navigation pane.
The list of templates is displayed in the content pane.
2. Drag-and-drop the template from the content pane onto the Auto Configurations node.
Note: Drag-and-drop a template onto a component of the resource for applying the template only to that component.
3. Click the Auto Configurations node and verify that the template is listed in the content pane.
4. Click Apply.
5. Click OK to restart the probe when prompted.
The monitors of the template are applied to all components of the resource.
Add a Monitor to Auto Configurations
You can add a single monitor to the Auto Configurations node of a resource to apply the monitor to all components of a resource.
Follow these steps:
1. Expand the All Resources node in the navigation pane and click a component.
The list of available monitors for the selected component is displayed in the content pane.
2. Drag-and-drop the required monitor on the Auto Configurations node.
Note: Use the Alarm Message Variables to provide real-time information in the Error Alarm Text and Clear Alarm Text
fields.
Error Severity: specify the severity of the error alarm.
Subsystem String/Id: specify the subsystem ID of the alarm message. The watcher uses this subsystem ID.
Note: These messages are configured under all monitoring profiles of the ibm_svc probe.
Resources Node
Resource IP Address Node
Templates Node
The Right Pane
The Tool Buttons
General Setup
Alarm Message Variables
The Resources node displays a list of IBM SVC resources, which are configured in the probe for monitoring. Each resource is used for
establishing a connection between the probe and IBM SVC system for collecting and storing data for the monitored components. The Resources
node also displays the connection status for each resource:
A status icon next to each resource indicates either that the connection to the host is OK or that the system is not available.
The Resource IP Address node for the IBM SVC storage has the following nodes:
Auto Configurations
Configures the unmonitored devices automatically. You can add one or more checkpoints (or templates) to this node using the drag-and-drop
feature.
Auto Monitors
Lists the auto monitors and the location of the component in the Auto Configurations node where these are applied.
All Monitors
Lists all monitors that are configured in the Auto Configurations and Auto Monitors nodes.
Clusters
Displays the cluster hierarchy of the storage system. You can drill down to view various elements such as Disk Controllers, Hosts, IO
Groups, MDisk, MDisk Groups, Nodes, Ports, and VDisk and devices with other physical elements of the IBM SVC storage system.
Templates Node
The Templates node displays a list of monitoring templates, which contain a list of monitors with their corresponding monitoring properties. You
can drag-and-drop a template onto a resource to apply all template monitors and start monitoring the resource.
The probe contains the default template: Default Auto Configurations.
If you select the Templates node in the navigation pane, the right pane lists Template definitions.
Right-click in the content pane to open a context menu. The menu items are New, Edit, Activate, and Rename for managing the selected
object.
Note: When a monitor is selected, the Refresh menu item refreshes the display only; it does not fetch updated values. The new values are
fetched after the poll interval of the selected resource. See the section for details.
The General Setup button on the toolbar displays the Setup dialog for configuring the log level. Specify one of the following log options:
0 = Fatal
1 = Errors
2 = Warnings
3 = Information
4 = Debug
5 = Trace
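A higher log level includes everything emitted at the levels below it. The gate behind such a setting can be sketched as follows; this is a generic illustration of level-based filtering, not the probe's actual logger:

```python
def should_log(message_level, configured_level):
    """Emit a message only when its level is at or below the configured verbosity."""
    return message_level <= configured_level

print(should_log(4, 3))  # False: debug-level messages are suppressed at level 3
print(should_log(1, 3))  # True: error-level messages pass at level 3
```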
Alarm Message Variables
For alarm messages, the Error Alarm Text and Clear Alarm Text fields of the Message Properties dialog let you use variables. These
variables are resolved with values specific to each instance of the alarm message when the message is issued. The dollar sign ($) is placed
before these variables. You can use the following variables:
$Resource: resource referred to in an alarm message.
$Source: source IP address for alarms and QoS data for the resource.
$Monitor: monitor (checkpoint) referred to in the alarm message.
$Desc: description of the monitor.
$Key: monitor key (typically the same as the name of the monitor).
$Value: current value of the monitor.
$Oper: operator to be combined with the value and the threshold in the alarm message.
$Thr: threshold value of the alarm.
$Unit: unit to be combined with the value in the alarm message (for example, Boolean).
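The variable expansion described above behaves like ordinary `$`-prefixed template substitution. A small sketch of that behavior; the message text and the sample values are made up for illustration and are not probe defaults:

```python
from string import Template

def expand_alarm(text, **values):
    """Replace $-prefixed variables with the values of the current alarm instance."""
    return Template(text).safe_substitute(values)

msg = "$Monitor on $Resource is $Value $Unit (threshold $Oper $Thr)"
print(expand_alarm(msg, Monitor="cpu_pc", Resource="svc-cluster",
                   Value=95, Unit="Percent", Oper=">=", Thr=90))
# cpu_pc on svc-cluster is 95 Percent (threshold >= 90)
```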
ibm_svc Metrics
The following table describes the metrics that can be configured using the IBM SVC Monitoring (ibm_svc) probe.
QoS Metrics
QoS Monitor Name | Units | Description | Version
QOS_STORAGE_RESOURCE_CONFIGURATION_EXECUTION_TIME | Seconds | | v1.0
QOS_STORAGE_RESOURCE_DISCOVERY_EXECUTION_TIME | Seconds | | v1.0
QOS_STORAGE_RESOURCE_STATS_EXECUTION_TIME | Seconds | | v1.0
QOS_STORAGE_CLUSTER__PERCENT_FREE_CAPACITY | Percent | % Free Capacity | v1.0
QOS_STORAGE_CLUSTER__PERCENT_USED_CAPACITY | Percent | % Used Capacity | v1.0
QOS_STORAGE_CLUSTER__CONSOLE_PORT | Unit | Console Port | v1.0
QOS_STORAGE_CLUSTER__ALLOCATED_CAPACITY | Tera Bytes | Allocated Capacity | v1.0
QOS_STORAGE_CLUSTER__AVAILABLE_CAPACITY | Tera Bytes | Available Capacity | v1.0
QOS_STORAGE_CLUSTER__TOTAL_CAPACITY | Tera Bytes | | v1.0
QOS_STORAGE_CLUSTER__BACKEND_STORAGE_CAPACITY | Tera Bytes | | v1.0
QOS_STORAGE_CLUSTER__STATISTICS_FREQUENCY | Minute | Statistics Frequency | v1.0
QOS_STORAGE_CLUSTER__MAX_NUMBER_OF_NODES | Count | | v1.0
QOS_STORAGE_CLUSTER__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_CLUSTER__POOL_CAPACITY | Tera Bytes | Pool Capacity | v1.0
QOS_STORAGE_CLUSTER__STATISTICS_STATUS | Unit | Statistics Status | v1.0
QOS_STORAGE_CLUSTER__TOTAL_USED_CAPACITY | Tera Bytes | | v1.0
QOS_STORAGE_CLUSTER__TOTAL_OVERALLOCATION | Percent | Total Overallocation | v1.0
QOS_STORAGE_CLUSTER__TOTAL_VDISK_COPY_CAPACITY | Tera Bytes | | v1.0
QOS_STORAGE_CLUSTER__CONNECTION_TYPE | Unit | Connection Type | v1.0
QOS_STORAGE_CLUSTER__FC_PORT_SPEED | MBps | | v1.0
QOS_STORAGE_CLUSTER_COMPRESSION_CPU_PC | Percent | | v1.0
QOS_STORAGE_CLUSTER_CPU_PC | Percent | | v1.0
QOS_STORAGE_CLUSTER_FC_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_FC_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_SAS_MB | MBps | SAS throughput | v1.0
QOS_STORAGE_CLUSTER_SAS_IO | Count/sec | SAS IOPS | v1.0
QOS_STORAGE_CLUSTER_ISCSI_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_ISCSI_IO | Count/sec | iSCSI IOPS | v1.0
QOS_STORAGE_CLUSTER_WRITE_CACHE_PC | Count | | v1.0
QOS_STORAGE_CLUSTER_TOTAL_CACHE_PC | Count | | v1.0
QOS_STORAGE_CLUSTER_VDISK_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_VDISK_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_VDISK_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_MDISK_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_MDISK_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_MDISK_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_VDISK_W_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_VDISK_W_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_VDISK_W_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_MDISK_W_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_MDISK_W_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_MDISK_W_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_VDISK_R_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_VDISK_R_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_VDISK_R_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_MDISK_R_MB | MBps | | v1.0
QOS_STORAGE_CLUSTER_MDISK_R_IO | Count/sec | | v1.0
QOS_STORAGE_CLUSTER_MDISK_R_MS | Milliseconds | | v1.0
QOS_STORAGE_CLUSTER_POWER_W | Watts | Power Consumed | v1.0
QOS_STORAGE_CLUSTER_TEMP_C | Celsius | Temperature in Celsius | v1.0
QOS_STORAGE_CLUSTER_TEMP_F | Fahrenheit | Temperature in Fahrenheit | v1.0
QOS_STORAGE_NODE__FAILOVER_ACTIVE | Unit | | v1.0
QOS_STORAGE_NODE__IS_CONFIG_NODE | Unit | Configuration Node | v1.0
QOS_STORAGE_NODE__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_NODE_COMPRESSION_CPU_PC | Percent | | v1.0
QOS_STORAGE_NODE_CPU_PC | Percent | | v1.0
QOS_STORAGE_NODE_FC_MB | MBps | | v1.0
QOS_STORAGE_NODE_FC_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_SAS_MB | MBps | SAS throughput | v1.0
QOS_STORAGE_NODE_SAS_IO | Count/sec | SAS IOPS | v1.0
QOS_STORAGE_NODE_ISCSI_MB | MBps | | v1.0
QOS_STORAGE_NODE_ISCSI_IO | Count/sec | iSCSI IOPS | v1.0
QOS_STORAGE_NODE_WRITE_CACHE_PC | Count | | v1.0
QOS_STORAGE_NODE_TOTAL_CACHE_PC | Count | | v1.0
QOS_STORAGE_NODE_VDISK_MB | MBps | | v1.0
QOS_STORAGE_NODE_VDISK_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_VDISK_MS | Milliseconds | | v1.0
QOS_STORAGE_NODE_MDISK_MB | MBps | | v1.0
QOS_STORAGE_NODE_MDISK_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_MDISK_MS | Milliseconds | | v1.0
QOS_STORAGE_NODE_VDISK_W_MB | MBps | | v1.0
QOS_STORAGE_NODE_VDISK_W_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_VDISK_W_MS | Milliseconds | | v1.0
QOS_STORAGE_NODE_MDISK_W_MB | MBps | | v1.0
QOS_STORAGE_NODE_MDISK_W_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_MDISK_W_MS | Milliseconds | | v1.0
QOS_STORAGE_NODE_VDISK_R_MB | MBps | | v1.0
QOS_STORAGE_NODE_VDISK_R_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_VDISK_R_MS | Milliseconds | | v1.0
QOS_STORAGE_NODE_MDISK_R_MB | MBps | | v1.0
QOS_STORAGE_NODE_MDISK_R_IO | Count/sec | | v1.0
QOS_STORAGE_NODE_MDISK_R_MS | Milliseconds | | v1.0
QOS_STORAGE_MDISK_RB | Bytes | | v1.0
QOS_STORAGE_MDISK__READ_I_OS_PER_SEC | Count/sec | | v1.0
QOS_STORAGE_MDISK_RO | Count | | v1.0
QOS_STORAGE_MDISK_WO | Count | | v1.0
QOS_STORAGE_MDISK__WRITE_I_OS_PER_SEC | Count/sec | | v1.0
QOS_STORAGE_MDISK__TOTAL_I_OS_PER_SEC | Count/sec | MDisk TotalIOPS | v1.0
QOS_STORAGE_MDISK_RE | Milliseconds | | v1.0
QOS_STORAGE_MDISK_WE | Milliseconds | | v1.0
QOS_STORAGE_MDISK__TOTAL_THROUGHPUT | KBps | | v1.0
QOS_STORAGE_IOGROUP__FLASH_COPY_FREE_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__FLASH_COPY_TOTAL_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__MIRROR_FREE_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__MIRROR_TOTAL_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__R_A_I_D_FREE_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__R_A_I_D_TOTAL_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__NUMBER_OF_HOSTS | Count | | v1.0
QOS_STORAGE_IOGROUP__NUMBER_OF_NODES | Count | | v1.0
QOS_STORAGE_IOGROUP__NUMBER_OF_VOLUMES | Count | | v1.0
QOS_STORAGE_IOGROUP__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_IOGROUP__REMOTE_COPY_FREE_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__REMOTE_COPY_TOTAL_MEMORY | Mega Bytes | | v1.0
QOS_STORAGE_IOGROUP__COMPRESSION_SUPPORTED | Unit | | v1.0
QOS_STORAGE_POOL__PERCENT_FREE_CAPACITY | Percent | % Free Capacity | v1.0
QOS_STORAGE_POOL__PERCENT_USED_CAPACITY | Percent | % Used Capacity | v1.0
QOS_STORAGE_POOL__NATIVE_STATUS | State | | v1.0
QOS_STORAGE_POOL__NUMBER_OF_BACKEND_VOLUMES | Count | | v1.0
QOS_STORAGE_POOL__NUMBER_OF_STORAGE_VOLUMES | Count | | v1.0
QOS_STORAGE_POOL__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_POOL__REMAINING_MANAGED_SPACE | Tera Bytes | | v1.0
QOS_STORAGE_POOL__TOTAL_MANAGED_SPACE | Tera Bytes | | v1.0
QOS_STORAGE_POOL__VIRTUAL_CAPACITY | Tera Bytes | | v1.0
QOS_STORAGE_POOL__OVERALLOCATION | Tera Bytes | Pool Overallocation | v1.0
QOS_STORAGE_POOL__PRIMORDIAL | Unit | Is Primordial? | v1.0
QOS_STORAGE_MDISK__AVAILABILITY | State | Availability | v1.0
QOS_STORAGE_MDISK__ACCESS | State | | v1.0
QOS_STORAGE_MDISK__PERCENT_FREE_CAPACITY | Percent | % Free Capacity | v1.0
QOS_STORAGE_MDISK__PERCENT_USED_CAPACITY | Percent | % Used Capacity | v1.0
QOS_STORAGE_MDISK__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_MDISK__BLOCK_SIZE | Bytes | | v1.0
QOS_STORAGE_MDISK__NUMBER_OF_BLOCKS | Blocks | | v1.0
QOS_STORAGE_MDISK__CONSUMABLE_BLOCKS | Blocks | | v1.0
QOS_STORAGE_MDISK__EXTENT_STATUS | State | MDisk ExtentStatus | v1.0
QOS_STORAGE_MDISK__NO_SINGLE_POINT_OF_FAILURE | Unit | | v1.0
QOS_STORAGE_MDISK__DATA_REDUNDANCY | Unit | | v1.0
QOS_STORAGE_MDISK__PACKAGE_REDUNDANCY | Unit | | v1.0
QOS_STORAGE_MDISK__DELTA_RESERVATION | Unit | | v1.0
QOS_STORAGE_MDISK__PRIMORDIAL | Unit | MDisk is Primordial? | v1.0
QOS_STORAGE_MDISK__MODE | Unit | MDisk Mode | v1.0
QOS_STORAGE_MDISK__MAX_PATH_COUNT | Count | | v1.0
QOS_STORAGE_MDISK__PATH_COUNT | Count | | v1.0
QOS_STORAGE_MDISK__SEQUENTIAL_ACCESS | Unit | | v1.0
QOS_STORAGE_MDISK__QUORUM_INDEX | Unit | | v1.0
QOS_STORAGE_MDISK__CAPACITY | Tera Bytes | MDisk Capacity | v1.0
QOS_STORAGE_MDISK__REMAINING_MANAGED_SPACE | Unit | | v1.0
QOS_STORAGE_FCPORT__FULL_DUPLEX | Unit | | v1.0
QOS_STORAGE_FCPORT__MAX_SPEED | GBps | | v1.0
QOS_STORAGE_FCPORT__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_FCPORT__PORT_NUMBER | Unit | | v1.0
QOS_STORAGE_FCPORT__PORT_TYPE | Unit | | v1.0
QOS_STORAGE_FCPORT__SPEED | GBps | | v1.0
QOS_STORAGE_FCPORT__SUPPORTED_MAXIMUM_TRANSMISSION_UNIT | Giga Bytes | | v1.0
QOS_STORAGE_FCPORT__USAGE_RESTRICTION | State | | v1.0
QOS_STORAGE_PORT__NUMBER_OF_PORTS | Count | | v1.0
QOS_STORAGE_PORT__NODE_LOGGED_IN_COUNT | Count | | v1.0
QOS_STORAGE_PORT__PORT_AUTHENTICATED | Unit | | v1.0
QOS_STORAGE_BACKENDCTR__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_BACKENDCTR__VOLUME_LINK_COUNT | Count | | v1.0
QOS_STORAGE_BACKENDCTR__VOLUME_MAX_LINK_COUNT | Count | | v1.0
QOS_STORAGE_VDISK__AVAILABILITY | State | VDisk Availability | v1.0
QOS_STORAGE_VDISK__ACCESS | State | | v1.0
QOS_STORAGE_VDISK__PERCENT_FREE_CAPACITY | Percent | | v1.0
QOS_STORAGE_VDISK__PERCENT_USED_CAPACITY | Percent | | v1.0
QOS_STORAGE_VDISK__BLOCK_SIZE | Bytes | | v1.0
QOS_STORAGE_VDISK__CONSUMABLE_BLOCKS | Blocks | | v1.0
QOS_STORAGE_VDISK__CONTROLLED | Unit | VDisk Controlled | v1.0
QOS_STORAGE_VDISK__COPY_COUNT | Count | | v1.0
QOS_STORAGE_VDISK__DATA_REDUNDANCY | Unit | | v1.0
QOS_STORAGE_VDISK__NO_SINGLE_POINT_OF_FAILURE | Unit | | v1.0
QOS_STORAGE_VDISK__NUMBER_OF_BLOCKS | Count | | v1.0
QOS_STORAGE_VDISK__OPERATIONAL_STATUS | State | | v1.0
QOS_STORAGE_VDISK__PRIMORDIAL | Unit | VDisk Primordial | v1.0
QOS_STORAGE_VDISK__THINLY_PROVISIONED | Unit | | v1.0
QOS_STORAGE_VDISK__COMPRESSED | Unit | VDisk Compressed | v1.0
QOS_STORAGE_VDISK__THROTTLE_M_B_S | Unit | | v1.0
QOS_STORAGE_VDISK__SYNC_RATE | Unit | | v1.0
QOS_STORAGE_VDISK__FLASH_COPY_MAP_COUNT | Unit | | v1.0
QOS_STORAGE_VDISK__CACHE_MODE | Unit | | v1.0
QOS_STORAGE_VDISK__CACHE_STATE | State | | v1.0
QOS_STORAGE_VDISK__CAPACITY | Giga Bytes | VDisk Capacity | v1.0
QOS_STORAGE_VDISK__REMAINING_MANAGED_SPACE | Unit | | v1.0
QOS_STORAGE_HOST__ACCESS_GRANTED | Unit | | v1.0
QOS_STORAGE_HOST__MAX_UNITS_CONTROLLED | Count | | v1.0
QOS_STORAGE_HOST__OPERATIONAL_STATUS | State | | v1.0
Array Groups
Logical Volumes
Disks
More information:
ibm-ds (IBM Disk Storage Systems) Monitoring Release Notes
ibm-ds AC Configuration
This article describes the configuration concepts and procedures to set up the IBM Disk Storage System Monitoring (ibm-ds) probe. The
ibm-ds probe is configured to monitor components, such as arrays, controllers, hosts, logical units, and disks. Once a connection is established
between the probe and the IBM disk storage system, you can configure monitors for generating the QoS and alarms.
The following diagram outlines the process to configure the probe to monitor the IBM disk storage systems.
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ibm-ds (IBM Disk Storage
Systems Monitoring) Release Notes.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
ibm-ds Node
<Resource IP Address> Node
<IP Address> Node
Arrays Node
<Array Name> Node
ibm-ds Node
This node lets you view the probe information and configure the log properties of the probe.
Navigation: ibm-ds
Set or modify the following values, as required:
ibm-ds > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ibm-ds > Probe Setup
This section lets you configure the log properties of the probe.
Log Level: specifies the level of details that are written to the log file.
Default: 3 - Info
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when
debugging.
Username: specifies a valid domain name and username to be used by the probe to access the SMI-S provider.
Password: specifies a valid password for the given username.
Alarm Message: specifies the alarm message to be generated when the probe is unable to connect with the IBM DSxxxx system.
Default: ResourceCritical
Use SSL: allows the probe to use HTTPS for connecting to the IBM SVC storage system.
Namespace: identifies the namespace that is supported by the Device Manager and the SMI-S version.
Note: The Low Operator, Low Threshold, and Low Message Name fields are used for configuring low thresholds. The
low threshold generates a warning alarm and the high threshold generates an error alarm.
Similarly, you can configure the Discovery, Statistics, and Status monitors.
Arrays Node
The Arrays node is used for classifying the list of arrays, which are available in the IBM disk storage system.
Note: This node does not contain fields or sections for configuring the monitors of the resource.
ibm-ds IM Configuration
This article describes the configuration concepts and procedures to set up the ibm-ds probe. The probe configuration includes establishing a
communication link between the probe and IBM DSxxxx system. The probe enables you to:
Define the IBM resource for monitoring.
Configure the monitors for getting required alarms and data.
Use templates and auto configuration options to configure the monitors.
Manage the alarm messages.
The following diagram outlines the process to configure the probe to monitor the IBM DSxxxx system.
Contents
Verify Prerequisites
Create a Resource
Set Up Verification
Set the Logging Level
Add Monitors
Using Auto Configurations
Using Templates
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ibm-ds (IBM Disk Storage
Systems Monitoring) Release Notes.
Create a Resource
The probe requires a resource to connect to the IBM DS server to collect data about the server. This data is used to generate alarms and QoS
messages. The ibm-ds probe enables you to create more than one resource.
Follow these steps:
1. Click the New Resource icon on the toolbar. You can also right-click the Resources node in the navigation pane and select the New
Resource option.
The Resource[New] dialog appears.
2. Enter the following information:
Hostname or IP address: specifies the host name or the IP address of the IBM DSxxxx system that you want to monitor.
Port: specifies the port to connect to the SMI-S provider.
Default: 443
Username: specifies a valid domain name and username to be used by the probe to access the SMI-S provider.
Password: specifies a valid password for the given username.
Alarm Message: specifies the alarm message to be issued when the probe is unable to connect to the resource.
Default: ResourceCritical
Note: For more information about creating or modifying a message, see Using Message Pool Manager.
Check Interval: specifies the time interval after which the probe retrieves data from the IBM DS server. The value can be set in
seconds, minutes, or hours. CA recommends polling once every 10 minutes. The polling interval cannot be smaller than the time
required to collect the data.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
IMPORTANT! If you are unable to connect to the IBM DS system even after correctly entering values for all the fields in the
Resource dialog, you can use a tool to view and manipulate the CIM objects on the SMI-S server. Tools such as
CimNavigator (not affiliated with or endorsed by CA Unified Infrastructure Management) can help you identify
configuration issues.
Set Up Verification
This section describes how to verify your configuration and communication with the IBM storage device. To verify your setup, perform the
following steps:
1. Verify the probe configuration.
2. Verify network communication.
Note: If you cannot collect performance data, ensure that the firmware version of the IBM system is 7.0 or later. Upgrade the firmware,
if required.
Add Monitors
You can manually select monitors to be measured.
Follow these steps:
1.
Note: You can also select the All Monitors node to list all active monitors in the content pane. Then, you can select monitors,
as required.
When the monitors for a resource are visible in the content pane, the Value column shows the current values for the monitors. To enable the
probe to send QoS data or alarms for threshold breaches, you can use the information in the Value column to modify the properties for each
monitor.
Edit Monitor Properties
To edit the properties of a monitor, double-click the monitor, or right-click it and select Edit. The Monitor Properties dialog appears.
Set or modify the following values, as required:
Value Definition: allows you to select the value that is used to generate alarms and QoS.
Samples: allows you to select the number of samples to use for The average value last option in the Value Definition field. The
maximum number of samples is 4.
Active: allows you to select the checkbox to activate or deactivate the monitor without changing any of its settings.
Enable Alarming: allows you to select this option to enable the alarm capability of the monitor.
Note: Enabling this option in the Monitor Properties dialog also enables the monitor in the content pane. You can enable or
disable monitoring of the checkpoint from the content pane or with this field.
Alarms: specifies the alarm properties of the monitor by defining high and low thresholds. By default, the high threshold is set to a default
value or the current value. Set this value to match your organizational requirements. The low threshold is initially disabled. You can
select and configure an operator other than "disabled" from the list to match your organizational requirements.
Operator: allows you to select the operator to use when setting the alarm threshold for the measured value. For example, >= 90 means
the monitor is in alarm condition if the measured value is at or above 90; = 90 means the monitor is in alarm condition only when the
measured value is exactly 90.
Threshold: defines the alarm threshold value. An alarm message is sent when this threshold is exceeded.
Unit: indicates a unit for the monitored value.
Example: %, Mbytes, or KB. The field is read-only.
Message ID: allows you to select the alarm message to be issued when the specified threshold value is breached. These messages
reside in the message pool. You can manage the messages in the Message Pool Manager.
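The operator and threshold semantics described above can be sketched as a simple comparison. The function and operator table below are illustrative only, not the probe's actual implementation.

```python
# Hypothetical sketch of how a monitor's operator and threshold decide an
# alarm condition; the names and OPERATORS mapping are illustrative.
import operator

OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "=": operator.eq,
    "<=": operator.le,
    "<": operator.lt,
}

def in_alarm(measured, op, threshold):
    """Return True when the measured value breaches the threshold."""
    return OPERATORS[op](measured, threshold)

# ">= 90" alarms at or above 90; "= 90" alarms only at exactly 90.
print(in_alarm(92, ">=", 90))  # True
print(in_alarm(92, "=", 90))   # False
```

This is why a >= operator with a high threshold is the usual choice for utilization-style metrics, while = is only useful for exact status codes.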
Using Message Pool Manager
You can add, remove, or modify alarm messages. These messages are sent when a QoS threshold is breached.
Add or Edit an Alarm Message
Follow these steps:
1. Click the Message Pool Manager button on the toolbar.
The Message Pool dialog appears.
2. Click Add to add an alarm message or click Edit to modify an existing message.
The Message Properties dialog appears.
3. Set or modify the following values, as required:
Identification Name: specifies a name for the message.
Token: specifies the type of alarm, either "monitor_error" or "resource_error".
Error Alarm Text: specifies the alarm text that is sent when a violation occurs. You can also use variables in this field.
Example: $monitor
This variable puts the actual monitor name in the alarm text. Examples of available variables are $resource, $host, $port, $descr,
$key, $unit, $value, $oper, and $thr.
Clear Alarm Text (OK): specifies the text that is sent when an alarm is cleared.
Error Severity: specifies the severity of the alarm.
Subsystem string/id: specifies the alarm subsystem ID that defines the alarm source.
4. Click OK to save the new message.
The message is added or modified.
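The variable expansion in the alarm text can be sketched with simple template substitution. The variable names ($monitor, $host, and so on) come from the documentation above; the expansion code and sample values are hypothetical, not the probe's implementation.

```python
# Illustrative sketch of expanding alarm-text variables such as $monitor,
# $host, $value, $unit, $oper, and $thr into a finished alarm message.
from string import Template

alarm_text = Template("$monitor on $host is $value $unit (threshold $oper $thr)")
message = alarm_text.substitute(
    monitor="ReadHitRatio", host="ds8000-a", value=42,
    unit="%", oper="<=", thr=50,
)
print(message)  # ReadHitRatio on ds8000-a is 42 % (threshold <= 50)
```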
IMPORTANT! Adding too many monitors or templates to the Auto Configurations node can overburden the system.
Auto Monitors
This node lists the auto monitors created for previously unmonitored devices. The values are based on the content added to the Auto
Configurations node. Auto monitors are only created for devices that are not being monitored when the device is discovered.
Using Templates
Templates let you define reusable sets of monitors to apply to the available auto configurations. Applying monitoring with templates provides
consistent monitoring across multiple devices.
You can create your own templates and can define a set of monitors belonging to them. You can then apply these templates to the Auto
Configurations node in the navigation pane by dragging and dropping the checkpoints of the template.
If you drag-and-drop a checkpoint of a template to the Auto Configurations node, the monitors are applied to all relevant components of the IBM
DSxxxx system and to new DSxxxx components as they are added.
Create a Template
You can create a template in either of the following ways:
Click the New Template button on the toolbar.
Right-click the Templates node in the navigation pane, and select New Template.
In the Template Properties dialog that appears, enter a name and description for the new template and click OK.
To edit an existing template, right-click the template under the Templates node in the navigation pane, and select Edit.
Apply a Template
Follow these steps:
1. Click the Templates node in the navigation pane to list all available templates in the content pane.
2. Select the desired template from the list in the content pane.
3. Drag-and-drop it on the Auto Configurations node in the navigation pane.
4. Click the Auto Configurations node to verify that the template content was successfully added.
Resources Node
The Resources node displays a list of IBM DS resources, which are configured in the probe for monitoring. Each resource is an SMI-S provider
that can discover DSxxxx systems and provide a connection to them, allowing the probe to collect and store data for the monitored components.
The Resources node also displays the connection status for each resource with an icon that indicates one of the following states:
Connection to the host is OK.
System is not available.
System is trying to connect.
Each resource contains Hosts, Service Processors, Storage Pools, and Volumes, along with the other physical elements of the IBM DS storage system.
Templates Node
The Templates node displays a list of monitoring templates, each of which contains a list of monitors with their corresponding monitoring
properties. You can drag-and-drop a template monitor on a resource to apply all template monitors and start monitoring the resource.
The probe contains the following default templates:
Status Monitors
Controller Monitors
Disk Monitors
Port Monitors
Array Monitors
Pool Monitors
Default Auto Configurations
Logical Volume Monitors
Note: When a monitor is selected, the Refresh menu item refreshes the display only; it does not retrieve updated values. New values are
retrieved after the poll interval of the selected resource.
ibm-ds Metrics
This section describes the metrics for the IBM Disk Storage System Monitoring (ibm-ds) probe.
Contents
QoS Metrics
QoS Operational Status Values
QoS Metrics
This section contains the QoS metrics for the probe.
QoS Monitor
Unit
Description
Version
QOS_STORAGE_ARRAY_KBYTES_READ
KB
1.0
QOS_STORAGE_ARRAY_KBYTES_READ_RATE
KB/Sec
1.0
QOS_STORAGE_ARRAY_KBYTES_TRANSFERRED
KB
1.0
QOS_STORAGE_ARRAY_KBYTES_TRANSFERRED_RATE
KB/Sec
1.0
QOS_STORAGE_ARRAY_KBYTES_WRITTEN
KB
1.0
QOS_STORAGE_ARRAY_KBYTES_WRITTEN_RATE
KB/Sec
1.0
QOS_STORAGE_ARRAY_MAX_HOT_SPARES
Count
1.0
QOS_STORAGE_ARRAY_MAX_SNAPSHOTS
Count
1.0
QOS_STORAGE_ARRAY_MAX_STORAGE_VOLUMES
Count
1.0
QOS_STORAGE_ARRAY_MAX_VOL_COPYS
Count
1.0
QOS_STORAGE_ARRAY_OPERATIONAL_STATUS
The operational status of the array. For more information, see QoS
Operational Status Values.
1.0
QOS_STORAGE_ARRAY_READ_HIT_IOS
Count
1.0
QOS_STORAGE_ARRAY_READ_HIT_RATIO
Percent
1.0
QOS_STORAGE_ARRAY_READ_IOS
Count
1.0
QOS_STORAGE_ARRAY_READ_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_ARRAY_REMAINING_MANAGED_SPACE
Gigabytes
1.0
QOS_STORAGE_ARRAY_TOTAL_IOS
Count
1.0
QOS_STORAGE_ARRAY_TOTAL_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_ARRAY_TOTAL_MANAGED_SPACE
Gigabytes
1.0
QOS_STORAGE_ARRAY_WRITE_HIT_IOS
Count
1.0
QOS_STORAGE_ARRAY_WRITE_HIT_RATIO
Percent
1.0
QOS_STORAGE_ARRAY_WRITE_IOS
Count
1.0
QOS_STORAGE_ARRAY_WRITE_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_ARRAY_USED_MANAGED_SPACE
Gigabytes
1.0
QOS_STORAGE_ARRAY__PERCENT_USED_CAPACITY
Percent
1.0
QOS_STORAGE_COMPONENT_OPERATIONAL_STATUS
1.0
QOS_STORAGE_COMPONENT_TOTAL_OUTPUT_POWER
1.0
QOS_STORAGE_DISK_AVAILABILITY
Disk Availability
1.0
QOS_STORAGE_DISK_BLOCK_SIZE
1.0
QOS_STORAGE_DISK_CAPABILITIES
Disk Capabilities
1.0
QOS_STORAGE_DISK_CAPACITY
Gigabytes
1.0
QOS_STORAGE_DISK_CONSUMABLE_BLOCKS
1.0
QOS_STORAGE_DISK_CONSUMABLE_CAPACITY
Gigabytes
1.0
QOS_STORAGE_DISK_KBYTES_READ
KB
1.0
QOS_STORAGE_DISK_KBYTES_READ_RATE
KB/Sec
1.0
QOS_STORAGE_DISK_KBYTES_TRANSFERRED
KB
1.0
QOS_STORAGE_DISK_KBYTES_TRANSFERRED_RATE
KB/Sec
1.0
QOS_STORAGE_DISK_KBYTES_WRITTEN
KB
1.0
QOS_STORAGE_DISK_KBYTES_WRITTEN_RATE
KB/Sec
1.0
QOS_STORAGE_DISK_NOMINAL_ROTATION_RATE
RPM
1.0
QOS_STORAGE_DISK_NUMBER_OF_BLOCKS
1.0
QOS_STORAGE_DISK_OPERATIONAL_STATUS
The operational status of the disk. For more information, see QoS
Operational Status Values.
1.0
QOS_STORAGE_DISK_READ_IOS
Count
1.0
QOS_STORAGE_DISK_READ_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_DISK_READ_TIME_MAX
ms
1.0
QOS_STORAGE_DISK_RECOVERED_ERRORS
Count
1.0
QOS_STORAGE_DISK_RETRIED_IOS
Count
1.0
QOS_STORAGE_DISK_SPARE_STATUS
1.0
QOS_STORAGE_DISK_TIMEOUTS
Count
1.0
QOS_STORAGE_DISK_TOTAL_IOS
Count
1.0
QOS_STORAGE_DISK_TOTAL_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_DISK_UNRECOVERED_ERRORS
Count
1.0
QOS_STORAGE_DISK_WRITE_IOS
Count
1.0
QOS_STORAGE_DISK_WRITE_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_DISK_WRITE_TIME_MAX
ms
1.0
QOS_STORAGE_POOL_CAPACITY_USED_PERCENT
Percent
1.0
QOS_STORAGE_POOL_OPERATIONAL_STATUS
The operational status of the storage pool. For more information, see
QoS Operational Status Values.
1.0
QOS_STORAGE_POOL_REMAINING_MANAGED_SPACE
Gigabytes
1.0
QOS_STORAGE_POOL_TOTAL_MANAGED_SPACE
Gigabytes
1.0
QOS_STORAGE_PORT_KBYTES_READ
KB
1.0
QOS_STORAGE_PORT_KBYTES_READ_RATE
KB/Sec
1.0
QOS_STORAGE_PORT_KBYTES_TRANSFERRED
KB
1.0
QOS_STORAGE_PORT_KBYTES_TRANSFERRED_RATE
KB/Sec
1.0
QOS_STORAGE_PORT_KBYTES_WRITTEN
KB
1.0
QOS_STORAGE_PORT_KBYTES_WRITTEN_RATE
KB/Sec
1.0
QOS_STORAGE_PORT_MAINT_OP
Count
1.0
QOS_STORAGE_PORT_MAX_SPEED
1.0
QOS_STORAGE_PORT_OPERATIONAL_STATUS
The operational status of the port. For more information, see QoS
Operational Status Values.
1.0
QOS_STORAGE_PORT_PORT_NUMBER
1.0
QOS_STORAGE_PORT_PORT_TYPE
1.0
QOS_STORAGE_PORT_READ_IOS
Count
1.0
QOS_STORAGE_PORT_READ_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_PORT_SPEED
1.0
QOS_STORAGE_PORT_TOTAL_IOS
Count
The total number of IO operations from the port in the sample period
1.0
QOS_STORAGE_PORT_TOTAL_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_PORT_WRITE_IOS
Count
1.0
QOS_STORAGE_PORT_WRITE_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_SP_CACHE_MEMORY_SIZE
MB
1.0
QOS_STORAGE_SP_KBYTES_READ
KB
1.0
QOS_STORAGE_SP_KBYTES_READ_RATE
KB/Sec
1.0
QOS_STORAGE_SP_KBYTES_TRANSFERRED
KB
1.0
QOS_STORAGE_SP_KBYTES_TRANSFERRED_RATE
KB/Sec
1.0
QOS_STORAGE_SP_KBYTES_WRITTEN
KB
1.0
QOS_STORAGE_SP_KBYTES_WRITTEN_RATE
KB/Sec
1.0
QOS_STORAGE_SP_OPERATIONAL_STATUS
1.0
QOS_STORAGE_SP_PROCESSOR_MEMORY_SIZE
MB
1.0
QOS_STORAGE_SP_READ_HIT_IOS
Count
1.0
QOS_STORAGE_SP_READ_HIT_RATIO
Percent
1.0
QOS_STORAGE_SP_READ_IOS
Count
1.0
QOS_STORAGE_SP_READ_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_SP_RESET_CAPABILITY
1.0
QOS_STORAGE_SP_TOTAL_IOS
Count
1.0
QOS_STORAGE_SP_TOTAL_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_SP_WRITE_HIT_IOS
Count
1.0
QOS_STORAGE_SP_WRITE_HIT_RATIO
Percent
1.0
QOS_STORAGE_SP_WRITE_IOS
Count
1.0
QOS_STORAGE_SP_WRITE_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_VOL_BLOCK_SIZE
1.0
QOS_STORAGE_VOL_CAPACITY
Gigabytes
1.0
QOS_STORAGE_VOL_CONSUMABLE_BLOCKS
Blocks
1.0
QOS_STORAGE_VOL_CONSUMABLE_CAPACITY
Gigabytes
1.0
QOS_STORAGE_VOL_KBYTES_READ
KB
1.0
QOS_STORAGE_VOL_KBYTES_READ_RATE
KB/Sec
1.0
QOS_STORAGE_VOL_KBYTES_TRANSFERRED
KB
1.0
QOS_STORAGE_VOL_KBYTES_TRANSFERRED_RATE
KB/Sec
1.0
QOS_STORAGE_VOL_KBYTES_WRITTEN
KB
1.0
QOS_STORAGE_VOL_KBYTES_WRITTEN_RATE
KB/Sec
1.0
QOS_STORAGE_VOL_NUMBER_OF_BLOCKS
Count
1.0
QOS_STORAGE_VOL_OPERATIONAL_STATUS
The operational status of the LUN. For more information, see QoS
Operational Status Values.
1.0
QOS_STORAGE_VOL_READ_HIT_IOS
Count
1.0
QOS_STORAGE_VOL_READ_HIT_RATIO
Percent
1.0
QOS_STORAGE_VOL_READ_IOS
Count
1.0
QOS_STORAGE_VOL_READ_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_VOL_READ_TIME_MAX
ms
1.0
QOS_STORAGE_VOL_TOTAL_IOS
Count
1.0
QOS_STORAGE_VOL_TOTAL_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_VOL_WRITE_HIT_IOS
Count
1.0
QOS_STORAGE_VOL_WRITE_HIT_RATIO
Percent
1.0
QOS_STORAGE_VOL_WRITE_IOS
Count
1.0
QOS_STORAGE_VOL_WRITE_IOS_RATE
Count/Sec
1.0
QOS_STORAGE_VOL_WRITE_TIME_MAX
ms
1.0
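The *_RATE and *_RATIO metrics in the table above follow the conventional derivations from the raw counters. The sketch below (not the probe's code, and the sample numbers are invented) shows how such values would typically be computed over one poll interval.

```python
# Illustrative derivation of rate and hit-ratio metrics from raw counters.
def kbytes_rate(prev_kbytes, curr_kbytes, interval_secs):
    """KB/Sec from two cumulative KBYTES_READ/WRITTEN samples."""
    return (curr_kbytes - prev_kbytes) / interval_secs

def hit_ratio(hit_ios, total_ios):
    """Percent of IOs served from cache, e.g. READ_HIT_RATIO."""
    return 100.0 * hit_ios / total_ios if total_ios else 0.0

print(kbytes_rate(10_000, 40_000, 300))  # 100.0 (KB/Sec)
print(hit_ratio(75, 100))                # 75.0 (Percent)
```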
QoS Operational Status Values
Unknown(0)
Other(1)
OK(2)
Degraded(3)
Stressed(4)
Predictive Failure(5)
Error(6)
Non-Recoverable Error(7)
Starting(8)
Stopping(9)
Stopped(10)
In Service(11)
No Contact(12)
Lost Communication(13)
Aborted(14)
Dormant(15)
Supporting Entity in Error(16)
Completed(17)
More information:
ibmvm (IBM Virtualization Monitoring) Release Notes
Configuration Overview
Add Resource Profiles
Add Monitoring
Alarm Thresholds
Configuration Overview
At a high level, configuring the probe consists of the following steps:
1. Add a Resource profile for each IBM virtualization-enabled system that you want to monitor.
2. Add monitors to the appropriate system components and configure monitor data.
3.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for the connection to establish.
Username
A valid username to be used by the probe to log in to the IBM server.
Password
A valid password to be used by the probe to log in to the IBM server.
Alarm Message
The alarm message to be sent if the resource does not respond.
4. Click Submit.
The profile is created and appears in the tree.
Add Monitoring
Once you add a resource profile, the components of the resource are displayed in the tree. Click a node in the tree to see any associated
monitors for that component. Configure the QoS measurements that you want to collect data for, and any alarms or events you want, by modifying
the appropriate fields.
Note: Users of CA Unified Infrastructure Management Snap can skip this step. The Default configuration is automatically applied when you
activate the probe in Snap.
Follow these steps:
1. Go to ibmvm > profile name > resource name.
2. Click a managed system, device, virtual I/O server (VIO), or virtual machine (VM) name. It might be necessary to expand the node in
the tree to view the monitors and QoS metrics.
The available monitors appear in a table on the right side of the screen.
3. Select the monitor that you want to modify in the table.
4. Change monitor settings in the fields below the table.
5. Click Save at the top of the screen.
When the new configuration is loaded, a Success dialog appears.
6. Click OK.
The tree is updated with the new configuration.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Important! Alarm threshold settings are dependent on the baseline_engine probe. If you do not have the correct version of
baseline_engine configured, you will not see the additional threshold options.
This article describes how to apply monitoring for the IBM VM Monitoring (ibmvm) probe with templates.
Note: This article describes how to apply monitoring with templates for a single probe. For more information about how to use policies
to configure templates for multiple probes, see Configure Probes with Policies in the CA Unified Infrastructure Management wiki.
Contents
Overview
Verify Prerequisites
Enable Bulk Configuration
Using Templates
Apply a Default Template
Modify and Apply a Default Template
Create a Template
Create Template Filters
Add Rules to a Template Filter
Add Monitors to a Template Filter
Activate a Template
Using the Template Summary View
View Configuration in the All Monitors Table
Edit Configuration in Context
Overview
Applying monitoring with templates saves time compared to manual monitor configuration and provides consistent monitoring across multiple
devices. At a high level, applying monitoring with templates is a process in which you:
1. Enable bulk configuration
Before using the template editor, you first enable bulk configuration. This feature is disabled by default. It is also disabled if the probe has
been previously configured. Bulk configuration is possible only on a newly deployed probe (v2.3 or later) with no configuration.
2. Use the template editor
Once bulk configuration is enabled, you can copy and modify a default template or create a new template to define unique monitoring
configurations for an individual device or groups of devices in your environment.
Verify Prerequisites
Important! Bulk configuration is only possible on a newly deployed v2.3 (or later) probe with no previous configuration. When you
enable bulk configuration, Infrastructure Manager is disabled and the Template Editor appears in the Admin Console GUI. Once you
enable bulk configuration, there is no supported process for going back to manual configuration. Be sure that you fully understand and
accept the consequences of enabling bulk configuration before enabling it.
Using Templates
The template editor allows you to configure and apply monitoring templates. Templates reduce the time that you need for manual configuration
and provide consistent monitoring across the devices in your network. You can configure monitoring on many targets with a well-defined template.
You can also create multiple templates to define unique configurations for all devices and groups of target devices in your environment.
You can use the template editor to:
Copy, modify, and apply a default template
Create and apply a new template
You can customize any template that you copy or create by configuring:
Precedence
Precedence controls the order of template application. The probe applies a template with a precedence of one after a template with a
precedence of two. If there are any overlapping configurations between the two templates, then the settings in the template with a
precedence of one override the settings in the other template. If the precedence numbers are equal, then the templates are applied in
alphabetical order.
Note: The default template is applied with a precedence of 100. Be sure to set the precedence of your other templates to a
number lower than 100 so that the probe applies them at a higher priority than the default template. We recommend using
increments of 10. Using increments of 10, you can easily add custom templates and assign them a precedence that results in the
probe applying them in the order that you desire.
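The precedence rule above can be sketched as a merge in application order. The function, tuple layout, and settings dictionaries are illustrative only, not the probe's implementation.

```python
# Sketch of the documented precedence rule: templates with a higher
# precedence number are applied first, so a template with a lower number
# is applied later and "wins" any overlapping settings; ties apply in
# alphabetical order by name.
def effective_settings(templates):
    """templates: list of (name, precedence, settings-dict) tuples."""
    # Higher precedence number first; within a tie, alphabetical by name.
    order = sorted(templates, key=lambda t: (-t[1], t[0]))
    merged = {}
    for name, precedence, settings in order:
        merged.update(settings)  # later (lower-number) templates override
    return merged

result = effective_settings([
    ("Default", 100, {"interval": 300, "alarms": False}),
    ("Custom", 10, {"alarms": True}),
])
print(result)  # {'interval': 300, 'alarms': True}
```

Here the custom template (precedence 10) overrides the default template's alarms setting while inheriting the interval it does not redefine.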
Filters
Filters let you control how the probe applies monitors based on attributes of the target device.
Rules
Rules apply to a device filter to create divisions within a group of systems or reduce the set of devices that the probe monitors.
Monitors
Monitors collect quality of service (QoS), event, and alarm data.
Note: Wait for the component discovery process to complete before using templates. Some QoS metrics are only applied to
components on specific devices. Determine what device types exist in your environment before you activate a template.
The default templates contain settings for a recommended monitor configuration. These default configurations include high-value metrics for
supported interfaces and network devices. Using these default configurations helps you to quickly start collecting data for your environment. To
preserve these recommended monitor configurations, the default templates are read-only. Because a default template is read-only, you first copy
and rename it before you apply it.
You can modify a default template to meet your specific needs. When your modifications are complete, you activate the template. The probe then
applies the template to the appropriate devices and components.
Follow these steps:
1. In Admin Console, select the probe and click Configure.
2. Click ibmvm > Template Editor.
3. Select either UMP_Metrics or VM_and_Host_Template.
4. Click Options (...) > Copy.
5. Enter the name of the template and a description.
6. (Optional) Determine if you want to modify the default precedence setting.
7. Click Submit.
8. Click Save.
9. (Optional) Create template filters.
10. (Optional) Add rules to a template filter.
11. (Optional) Add monitors to a template filter.
12. In the navigation tree, select the template that you created in steps 1-5.
The template set up dialog appears in the detail pane.
13. Check Active.
The probe applies the template with the modified settings.
Create a Template
Note: Do not activate the template if you want to configure template monitoring filters or rules. If you change the template state
to active, the probe immediately applies the configuration.
7. Click Submit.
8. Click Save.
The system creates a template that you can configure and activate.
For more information, see Create Template Filters, Add Rules to a Template Filter, Add Monitors to a Template Filter, and Activate a
Template.
Create Template Filters
Filters let you control how the probe applies monitors based on attributes of the target device.
Follow these steps:
1. In the template editor, select the template that you created.
2. Choose any node that has the Options (...) icon next to it.
3. Enter a descriptive name for the filter and a precedence.
4. Repeat the previous steps for every template that requires filters.
5. Click Submit.
The template filter is created.
Note: You must activate the template for the probe to apply the filter configuration. When you change the template state to
active, the probe immediately applies all template configuration, including filters, rules, and monitors.
A filter allows you to control which devices and monitors are associated with a particular template. You specify more device criteria by using rules.
Filters usually contain one or more rules to define the types of devices for the template. You can add rules to a device filter to create divisions
within a group of systems or reduce the set of devices that the probe monitors. For example, you can add a rule to apply a configuration to all
devices with the name Cisco.
Note: If no rules exist, the probe always applies the monitor configuration in an active template to all applicable devices.
Note: You must activate the template for the probe to apply the rule configuration. When you change the template state to active, the
probe immediately applies all template configuration, including filters, rules, and monitors.
Device filters contain a set of commonly used monitors that you can configure to meet your specific needs.
Follow these steps:
1. Click ibmvm > Template Editor > ibmvm probe > template name.
2. Click the desired device filter.
3. In the Detail pane, in the Monitors section, select any monitor.
A configuration dialog for the monitor appears below.
4. Check Include in Template to turn on the monitor.
5. (Optional) Check Publish Data.
6. (Optional) Check Publish Alarms.
Enter the desired settings for these required fields:
Value Definition
High Threshold
High Message Name
Low Operator
Low Threshold
Low Message Name
See Configure Alarm Thresholds for details.
7. Click Save.
The monitor is added.
Note: You must activate the template for the probe to apply the monitor configuration. When you change the template state to active,
the probe immediately applies all template configuration, including filters, rules, and monitors.
Activate a Template
The probe does not automatically apply the template configuration. The probe only applies templates in an active state. The template icon in the
navigation pane indicates the state of the template: inactive or active. You must activate a template to apply it.
Important! The monitor settings in a template override any monitor settings in the probe configuration GUI.
Note: You cannot modify a configured monitor in the probe configuration GUI once you activate a template.
Using the Template Summary View
1. The top section shows the general template settings:
Template Name
Description
Precedence
Activation Status
2. The middle section shows the All Monitors table that lists all monitors available in the template.
3. The bottom section shows the detailed configuration for a monitor selected in the table.
View Configuration in the All Monitors Table
You can view all of the monitors included in any template configuration in the All Monitors table.
Follow these steps:
1. In the left-hand navigation tree, select a template.
The Template Summary view appears in the detail pane. In the Monitors Included in Template table, you see all of the monitors available
for the template and their configuration status.
2. To see details for a specific monitor, click on it.
The configuration details appear below the table.
Edit Configuration in Context
When you copy a default read-only template or create your own new template, you can edit the template's monitors from the Template Summary
View.
Note: For a default read-only template, you can view but cannot edit the template's monitor configuration in Template Summary View.
Tree Hierarchy
Tree Icons
ibmvm Node
Template Editor
Profiles Node
Resource Node
<Managed System> Node
<Device> Nodes
VIOs Node
VMs Node
Tree Hierarchy
Once a Resource profile is added, the components of the Resource are displayed in the tree. Click a node in the tree to see the alarms, events, or
monitors available for that component. The tree contains a hierarchical representation of the components that exist in the IBM virtualization
environment.
Tree Icons
The icons in the tree indicate the type of object the node contains. For VMs, the color of the icon also indicates the status of the VM.
- Closed folder. Organizational node that is used to group similar objects. Click the node to expand it.
- Open folder. Organizational node that is used to group similar objects. Click the triangle next to the folder to collapse it.
- The profile icon indicates the status of subcomponent discovery.
- Unknown.
- Storage pool
- Datastore
The VM icon indicates the state of the VM.
- Running
- Stopped
- Paused
- VIOs or VMs
ibmvm Node
Navigation: ibmvm Node
The ibmvm node contains the following sections:
ibmvm > Probe Information
This section displays read-only information about the probe.
ibmvm > Probe Setup
You can set up the probe's Log Level. The default level is 3 (Recommended). You can set the log level to 4 or 5 for troubleshooting and
return it to 3 for normal operations.
ibmvm >
You can manually add a device profile. The profile icon indicates the status of subcomponent discovery.
Fields to know:
Hostname
The hostname or IP address of the IBM server you want to monitor.
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for the connection to establish.
Username
A valid username to be used by the probe to log in to the IBM server.
Password
A valid password to be used by the probe to log in to the IBM server.
Alarm Message
The alarm message to be sent if the resource does not respond.
Template Editor
You use the template editor to apply monitoring templates, with optional filters, rules, and monitors, to one or more resources.
Note: For more information about how to use the Template Editor, see v2.3 Apply Monitoring with Templates.
Profiles Node
This section describes how to manage your device configuration. You can modify or delete a device profile, and validate the credentials for each
device.
Navigation: ibmvm Node > profile name
Note: If you use an IPv6 address, you must follow the Java standard of enclosing the IPv6 address in square brackets. For
example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack
trace error that includes the exception: Caused by: java.lang.NumberFormatException: For input string:
"f0d0:0:0:0:0:0:0:10.0.00.0".
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for the connection to establish.
Username
A valid username to be used by the probe to log in to the IBM server.
Password
A valid password to be used by the probe to log in to the IBM server.
Alarm Message
The alarm message to be sent if the resource does not respond.
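The IPv6 bracketing requirement described in the note above can be illustrated with a small helper. This function is hypothetical and not part of the probe; it only shows how a hostname value could be normalized before it is entered in the profile.

```python
# Wrap a bare IPv6 literal in square brackets, as the probe's Java-based
# parsing requires; hostnames and IPv4 addresses pass through unchanged.
import ipaddress

def normalize_host(host):
    """Return the host string in the form the probe accepts."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host  # hostname or already-bracketed literal
    if addr.version == 6:
        return f"[{host}]"
    return host

print(normalize_host("f0d0::10"))     # [f0d0::10]
print(normalize_host("192.168.1.5"))  # 192.168.1.5
```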
Resource Node
The Resource node contains the managed systems that are associated with the resource.
Navigation: ibmvm > profile name > resource name
Click to view a Hosts node containing the managed systems that are on the resource.
Note: The value definition affects the QoS publication interval. For example: If you set the value definition to an "average
of n," the probe waits n cycles before it sends any QoS data to the Discovery server. If you set the value definition to
"delta," the probe waits two cycles before it sends any QoS data to the Discovery server.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means alarm condition if the measured value is at or above 90.
= 90 means alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
<Device> Nodes
<Device> nodes exist for managed systems, VIOs, and VMs. Possible devices are CPUs, CPU Pools, disks, memory, network interfaces, and
storage pools.
Navigation options:
Managed system device -- ibmvm Node > profile name > resource name > Hosts > managed system name > device name
VIO device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs > VIO name > device name
VM device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VMs > VM name > device name
Note: If you click a <Device> node, you might see a table with QoS metrics or more named device nodes. You must click the named device
nodes to view a table with QoS metrics.
<Device> > Monitors
You can change monitor settings in the fields below the table. Select a monitor in the table to view the monitor configuration fields.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Metric Type Id
The unique ID of the QoS metric.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value that is measured
Delta Value (Current - Previous) -- The delta value that is calculated from the current and the previous measured sample
Delta Per Second -- The delta value that is calculated from the samples that are measured within a second
Average Value Last n Samples -- The user specifies a count and the value is averaged based on the last "count" items
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means alarm condition if the measured value is at or above 90.
= 90 means alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message is sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
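The four Value Definition options listed above can be sketched as simple arithmetic over the last few samples. The function, sample data, and interval are illustrative only, not the probe's implementation.

```python
# Illustrative computation of the four Value Definition options applied
# to a series of measured samples (oldest first).
def value_definition(samples, mode, n=4, interval_secs=60):
    """Return the value used for alarming and QoS under the given mode."""
    if mode == "current":
        return samples[-1]                         # most recent sample
    if mode == "delta":
        return samples[-1] - samples[-2]           # current minus previous
    if mode == "delta_per_second":
        return (samples[-1] - samples[-2]) / interval_secs
    if mode == "average":
        window = samples[-n:]                      # last n samples
        return sum(window) / len(window)
    raise ValueError(mode)

samples = [100, 160, 220, 400]
print(value_definition(samples, "current"))           # 400
print(value_definition(samples, "delta"))             # 180
print(value_definition(samples, "delta_per_second"))  # 3.0
print(value_definition(samples, "average", n=2))      # 310.0
```

This also explains the publication note above: a delta needs two samples before it can be computed, and an average of n needs n samples.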
VIOs Node
The VIOs node contains the VIO resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs
Click to view the VIO resources. Each VIO resource contains nodes for the devices that are associated with the VIO. For more information about
devices, see <Device> Nodes.
VMs Node
The VMs node contains the VM resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VMs
Click to view the VM resources. Each VM resource contains nodes for the devices that are associated with the VM. For more information about
devices, see <Device> Nodes.
Important!
Thresholds for the following three types of monitors can only be configured in Admin Console:
Event Forwarding Monitors
Alarm Forwarding Monitors
Monitors with Boolean, Enumeration, or String-only Metrics
Overview
Probe Configuration
General Setup
Create a New Resource
Message Pool Manager
Add a New Alarm Message
Delete an Alarm Message
Edit an Alarm Message
Adding Monitors
Manually Selecting Monitors to be Measured
Enabling the Monitors for QoS and Alarming
Edit Monitor Properties
Using Templates
Copy a Default Template
Create a New Template
Add Monitors to a Template
Apply a Template
Using Automatic Configurations
Adding a Template to the Auto Configurations Node
Adding a Monitor to the Auto Configurations Node
Exploring the Contents of the Auto Configurations Node
Overview
The probe does not monitor anything automatically. You need to define what to monitor. Perform the following steps to configure a probe:
1. Connect to a resource (Hardware Management Console or Integrated Virtualization Manager).
2. Add monitors (checkpoints).
3. Configure the checkpoints to send QoS data and alarms if the thresholds specified are breached.
The following component entities can be monitored on the Managed System:
CPU
CPU Pool
Disk
Memory
Network
Storage Pool
The following component entities can be monitored on the virtual I/O server (VIOs) and virtual machines (VMs):
CPU
Disk
Memory
Network
Important! Configuration of the probe -- through the Unified Management Portal (UMP), using the Admin Console portlet (AC) -- is not
compatible with the configuration through the Infrastructure Manager interface described here. Do not mix or interchange configuration
methods! If you do, the result will be unpredictable monitoring of your system.
Probe Configuration
Double-click the line representing the probe in Infrastructure Manager to open the probe GUI.
When the probe is initially installed, this screen displays with the following:
An empty Resources node
An empty Templates node.
General Setup
Click the General Setup button to set the level of detail written to the log file for the ibmvm probe. The log level is a sliding scale. Level 1 logs
only fatal errors. Level 5 logs extremely detailed information used for debugging purposes. The default level is 3 (recommended). You can set the
log level to 4 or 5 for troubleshooting and return it to 3 for normal operations. Log as little as possible during normal operation to minimize disk
consumption.
Click the Apply button to implement the new log level immediately.
Note: The probe allows you to change the log level without restarting the probe.
Right click Resources in the navigation pane and select New Resource.
The Resource (New) dialog box appears. Enter the appropriate field information:
Hostname or IP address
The hostname or IP address of the IBM server you want to monitor.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Port
The SSH port for the target system.
Active
Select this checkbox to activate monitoring of the resource.
SSH connection timeout (sec)
Time to wait for connection to establish.
Check interval
How often the probe checks the values of the monitors.
Username
A valid username to be used by the probe to log on to the IBM server.
Password
A valid password to be used by the probe to log on to the IBM server.
Group
The group you want the resource associated with.
Alarm Message
The alarm message to be sent if the resource does not respond.
Note: You can edit the message or create your own message using Message Pool Manager.
Check Interval
The check interval defines how often the probe checks the values of the monitors. This can be set in seconds, minutes or hours. The IBM
server data is updated once per minute. We recommend polling once every 10 minutes. The polling interval should not be smaller than
the time required to collect the data.
Test button
Click the Test button to verify the connection to the Resource.
Note: The very first time you click the Test button after you have created your first resource, you may receive an error message. In that
case, wait for at least 20 seconds and click the Test button again.
After completing the fields and testing that the connection works, click OK to add the Resource.
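The bracket requirement for IPv6 addresses noted above follows from how Java parses host strings. The sketch below is illustrative, not the probe's own code: the ssh:// URI strings and the shortened address are invented, and it uses java.net.URI to show that Java only recognizes the host and port when the IPv6 literal is bracket-enclosed.

```java
public class Ipv6HostCheck {
    // Returns the host component java.net.URI extracts from the string,
    // or null when the authority cannot be parsed as host:port.
    static String parsedHost(String uri) {
        try {
            return new java.net.URI(uri).getHost();
        } catch (java.net.URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Bracketed IPv6 literal: host and port are recognized.
        System.out.println(parsedHost("ssh://[f0d0::1]:22"));  // [f0d0::1]
        // Unbracketed literal: the extra colons defeat host:port parsing.
        System.out.println(parsedHost("ssh://f0d0::1:22"));    // null
    }
}
```

The probe itself surfaces this failure as a NumberFormatException stack trace rather than a null host, but the underlying cause is the same unbracketed literal.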
3. Complete the following fields:
Token
The type of alarm, either "monitor_error" or "resource_error".
Error Alarm Text
The alarm text sent when a violation occurs. Variables can be used in this field.
Example: $monitor
This variable will put the actual monitor name in the alarm text. There are several available variables: $resource, $host, $port, $descr,
$key, $unit, $value, $oper, and $thr.
Clear Alarm Text (OK)
The text sent when an alarm is cleared.
Error Severity
Severity of the alarm.
Subsystem string/id
The ID within NAS that corresponds to the probe or a common component such as CPU. This is used for categorizing alarms.
4. Click OK to save the new message.
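As a sketch of how the $-variables expand into the final alarm text, the following hypothetical helper substitutes the variables listed above. The expansion logic and the sample values are assumptions for illustration, not the probe's actual implementation.

```java
import java.util.Map;

public class AlarmTextDemo {
    // Replaces each $name occurrence in the template with its value.
    // The variable names match the list above; the logic is illustrative.
    static String expand(String template, Map<String, String> vars) {
        String out = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            out = out.replace("$" + e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "$monitor on $host is $value$unit (threshold $oper $thr)";
        Map<String, String> vars = Map.of(
                "monitor", "CPU Utilization",
                "host", "hmc01",
                "value", "95",
                "unit", "%",
                "oper", ">=",
                "thr", "90");
        System.out.println(expand(text, vars));
        // CPU Utilization on hmc01 is 95% (threshold >= 90)
    }
}
```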
Delete an Alarm Message
Adding Monitors
There are three different ways to add monitors to ibmvm entities:
Manually select the monitors
To manually select and enable monitors, navigate to the target entity within the Resource. This lists its monitors in the right pane. Use the
available check-boxes to enable QoS monitoring for the selected metrics. To enable Alarm thresholding, you will need to launch the Edit
Monitor dialog.
Use Templates
Templates let you define reusable sets of monitors to apply to various ibmvm monitored entities.
See the section Using Templates for further information.
Use Auto Configurations
Auto Configuration is a powerful way to automatically add monitors to be measured. Monitors are created for new devices (that is, ones
not currently monitored) that would otherwise need manual configuration to be monitored.
See the section Using Automatic Configurations for further information.
Manually Selecting Monitors to be Measured
To select a monitor you want to be measured for a Resource, click the Resource node in the navigation pane, and navigate through the
Resources hierarchy. Select a folder in the hierarchy to see the monitors for it, listed in the right pane. Click the check box beside the Monitors
you want to be active.
Note: You can also add monitors to be measured using templates (see the section Using Templates).
Select the All Monitors node to list all monitors currently being measured in the right pane. You can select or deselect monitors here as well.
Selecting the checkbox next to a monitor name only enables the monitor. To configure the probe to send QoS data and/or send alarms you must
modify the properties for each monitor.
Double-click a monitor (or right-click and select Edit) to launch the monitor's properties dialog. See Edit Monitor Properties for further information.
Edit Monitor Properties
Double-click a monitor (or right-click and select Edit) to launch the monitor's properties dialog.
Update the fields as necessary. The fields are:
Resources
This is a read-only field that contains the name of the resource associated with the monitor. The monitor name is displayed
when the monitor is retrieved from the IBM Server.
Key
This is a read-only field describing the monitor key.
Description
This is a read-only field describing the monitor. This description appears when the monitor is retrieved from the IBM Server.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
The current value -- The most current value measured will be used.
The delta value (current - previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta per second -- The delta value calculated from the samples measured within a second will be used.
The average value (cur + prev)/2 -- The current plus the previous value of the sample divided by 2.
The average value last... -- The user specifies a count. The value is then averaged based on the last "count" items.
Active
Activates monitoring on the probe.
Enable Alarming
Activates alarming.
Note that the monitor will also be selected in the list of monitors in the right pane when this option is selected. You can enable/disable
monitoring from that list.
Operator
The operator to be used when setting the alarm threshold for the measured value.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Unit
The unit of the monitored value (for example %, Mbytes etc.). The field is read-only.
Message ID
The alarm message to be issued if the specified threshold value is breached. These messages are kept in the message pool. The
messages can be modified in the Message Pool Manager.
Using Templates
Templates provide an easy way to consistently monitor your IBM virtualization environment by allowing you to predefine reusable sets of
monitors.
These default templates are included with the probe:
UMP Metrics
VM and Host Template
The default templates contain commonly used metric configurations that let you quickly apply monitoring.
You can also create your own templates and define a set of monitors belonging to each. You can then apply these templates to anything in the
Resources or Auto Configurations hierarchies in the navigation pane by dragging the template and dropping it on the appropriate item. This
assigns the template monitors to the drop point and everything below it.
Copy a Default Template
You can apply a default template as you do any other template. However, you may want to copy the default template and then apply the copy.
Copying the default template allows you to make modifications to the copies without losing the original default template's monitor settings.
Follow these steps:
1. Click the Templates node in the navigation pane.
2. Select the default template.
3. Right-click and select Copy Template.
The Templates Properties dialog appears.
4. Give the copy a name and description.
The default template is copied and appears under the Templates node and in the content pane.
Create a New Template
After you create a template and define a set of monitors for that template, you can perform the following actions:
Drag and drop the template into the resource hierarchy where you want to monitor.
This action applies the monitors to a resource entity and any subordinate items. Any templates applied within the Resources hierarchy
are static monitors. The static monitors override any auto monitors for that specific resource entity.
Drag and drop the template monitors into the Auto Configurations node to add the template contents to the list of auto configuration
monitors.
This action applies the monitors to any new devices that are not currently monitored when the probe searches the environment for new
devices.
You can perform both actions within a single probe. You can place general-purpose templates into Auto Configuration, and apply special-purpose
templates to override the Auto Configuration templates on specific nodes, for specific purposes. See Using Automatic Configurations for details on
Auto Configuration.
Follow these steps:
1. Click the Templates node in the navigation pane to list all available templates in the content pane.
2. Select the desired template from the list in the content pane.
3. Drag and drop it on the Auto Configurations node in the navigation pane.
4. Click the Auto Configurations node to verify that the template's content was successfully added.
Important! If you are experiencing performance problems, we recommend increasing the polling cycle and/or the memory configuration
for the probe. Increase memory when the probe is running out of memory. Increase polling cycle when the collection takes longer than
the configured interval.
You can add a single monitor (checkpoint) to the Auto Configurations node.
To list available monitors:
1. Select the Resource node in the navigation pane and navigate to the point of interest.
To verify that the monitors were successfully added, click the Auto Configurations node in the navigation pane.
To edit the properties for a monitor, right-click in the list and choose Edit from the menu. See the section Edit Monitor Properties for
detailed information.
To delete a monitor from the list, right-click in the list and choose Delete from the menu.
Note: You must click the Apply button and restart the probe to activate configuration changes.
All defined Auto Monitors are listed under the Auto Monitors node. When you restart the probe, it searches through the Resource's entities. For
each one that is currently not monitored, an Auto Monitor is created for each of the monitors listed under the Auto Configurations node.
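The restart-time rule described above can be sketched in code. This is a hypothetical illustration with invented entity and monitor names, not the probe's implementation: each Resource entity without an existing monitor receives one Auto Monitor per Auto Configuration entry.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AutoConfigSketch {
    // For every entity not already monitored, assign one Auto Monitor
    // per auto-configuration entry; monitored entities are untouched.
    static Map<String, List<String>> buildAutoMonitors(
            Collection<String> entities, Set<String> monitored, List<String> autoConfigs) {
        Map<String, List<String>> autoMonitors = new LinkedHashMap<>();
        for (String entity : entities) {
            if (!monitored.contains(entity)) {
                autoMonitors.put(entity, new ArrayList<>(autoConfigs));
            }
        }
        return autoMonitors;
    }

    public static void main(String[] args) {
        List<String> entities = List.of("vm01", "vm02", "vio01");
        Set<String> monitored = Set.of("vm01"); // already has monitors
        List<String> autoConfigs = List.of("CPU Utilization", "Memory Used");
        System.out.println(buildAutoMonitors(entities, monitored, autoConfigs));
        // vm02 and vio01 each receive both monitors; vm01 is left alone.
    }
}
```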
This article describes the fields and features in the Infrastructure Manager interface for the IBM VM Monitoring (ibmvm) probe.
Contents
Note: The probe is only configurable in Admin Console if you see the following message: "The probe is running in bulk configure mode.
This GUI is not supported in bulk configure mode. Exiting." For more information, see the v2.3 ibmvm AC Configuration guide.
Toolbar Buttons
The Resource is configured as a link to an IBM virtualization enabled system. The icon for each Resource node indicates the status of the
resource:
Note: If you use an IPv6 address for a resource, you must follow the Java standard of enclosing the IPv6 address in square brackets.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace
error that includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Inactive
Marked for deletion
Unable to connect
New (not yet saved)
Connected and inventory is ready to browse
Loading inventory and not ready to browse
Each Resource node contains the following sub-hierarchies:
Auto Configurations
The monitors under this node are used to automatically configure newly discovered devices. Drag-and-drop individual monitors or
templates (which define a set of monitors) onto this node.
Auto Monitors
Lists the monitors that were created based on the Auto-Configuration entries and the inventory available on the Resource.
All Monitors
Lists all monitors for the Resource, including Auto Monitors and manually configured monitors.
IBM Virtualization Environment hierarchy
Lists the Machine Resources, CPUs, CPU Pools, Disks, Memory, Network, Storage Pools, MSPPs, VIOs, and VMs available in the IBM
virtualization environment for monitoring.
Note: A node for Network will only appear if the device can be monitored. If the node does not appear, and the log level is set
to 2, a message is generated indicating that the network adapter does not return valid output. This is not an issue.
Templates
Templates are a useful tool for defining checkpoints to be monitored on the various managed systems or Logical Partitions. This node contains
the following default templates:
VM and Host Templates
UMP Metrics
The default templates contain commonly used monitoring configurations that enable you to quickly apply monitoring.
For more information about how to apply monitoring with templates, see the Using Templates section of the v2.3 ibmvm IM Configuration article.
This section contains configuration details that are specific to the ibmvm probe.
Contents
Configuration Overview
Add Resource Profiles
Add Monitoring
Alarm Thresholds
Configuration Overview
At a high level, configuring the probe consists of the following steps:
1. Add a Resource profile for each IBM virtualization enabled system you want to monitor.
2. Add monitors to the appropriate system components and configure monitor data.
3. Configure alarm thresholds as needed.
Add Monitoring
Once you add a resource profile, the components of the resource are displayed in the tree. Click a node in the tree to see any associated
monitors for that component. Configure the QoS measurements you want to collect data for, and any alarms or events you want, by modifying the
appropriate fields.
Note: Users of CA Unified Infrastructure Management Snap can skip this step. The Default configuration is automatically applied when you
activate the probe in Snap.
Follow these steps:
1. Go to ibmvm > profile name > resource name.
2. Click on a managed system, device, virtual I/O server (VIO) or virtual machines (VMs) name. It might be necessary to expand the node in
the tree to view the monitors and QoS metrics.
The available monitors appear in a table on the right side of the screen.
3. Select the monitor you want to modify in the table.
4. Change monitor settings in the fields below the table.
5. Click Save at the top of the screen.
When the new configuration is loaded, a Success dialog appears.
6. Click OK.
The tree is updated with the new configuration.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Important! Alarm threshold settings are dependent on the baseline_engine probe. If you do not have the correct version of
baseline_engine configured, you will not see the additional threshold options.
Contents
Tree Hierarchy
Tree Icons
ibmvm Node
Profiles Node
Resource Node
<Managed System> Node
<Device> Nodes
VIOs Node
VMs Node
Tree Hierarchy
Once a Resource profile is added, the components of the Resource are displayed in the tree. Click a node in the tree to see the alarms, events, or
monitors available for that component. The tree contains a hierarchical representation of the components that exist in the IBM virtualization
environment.
Tree Icons
The icons in the tree indicate the type of object the node contains. For VMs, the color of the icon also indicates the status of the VM.
- Closed folder. Organizational node used to group similar objects. Click the node to expand it.
- Open folder. Organizational node used to group similar objects. Click the triangle next to the folder to collapse it.
- The profile icon indicates the status of subcomponent discovery.
- OK. Discovery of subcomponents is completed.
- Unknown.
- Storage pool
- Datastore
- The VM icon indicates the state of the VM.
- Running
- Stopped
- Paused
- Network interface
- VIOs or VMs
ibmvm Node
Navigation: ibmvm Node
Set or modify the following values based on your requirements.
ibmvm > Probe Information
This section displays read-only information about the probe.
ibmvm > Probe Setup
Fields to know:
Log Level: Select the amount of log information you would like to collect for this probe.
Default: 3 (Recommended)
Profiles Node
This section describes how to manage your device configuration. You can modify or delete a device profile, and validate the credentials for each
device.
Navigation: ibmvm Node > profile name
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for connection to establish.
Username
A valid username to be used by the probe to log on to the IBM server.
Password
A valid password to be used by the probe to log on to the IBM server.
Alarm Message
The alarm message to be sent if the resource does not respond.
Resource Node
The Resource node contains the managed systems associated with the resource.
Navigation: ibmvm > profile name > resource name
Click to view a Hosts node containing the managed systems on the resource.
<Device> Nodes
<Device> nodes exist for managed systems, VIOs, and VMs. Possible devices are CPUs, CPU Pools, disks, memory, network interfaces, and
storage pools.
Navigation options:
Managed system device -- ibmvm Node > profile name > resource name > Hosts > managed system name > device name
VIO device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs > VIO name > device name
VM device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VMs > VM name > device name
Note: If you click a <Device> node, you might see a table with QoS metrics or additional named device nodes. You must click the named device
nodes to view a table with QoS metrics.
<Device> > Monitors
Select a monitor in the table to view the monitor configuration fields. You can change monitor settings in the fields below the table.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example %, Mbytes etc.). The field is read-only.
Metric Type Id
Identifies the unique ID of the QoS.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value measured will be used.
Delta Value (Current - Previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second -- The delta value calculated from the samples measured within a second will be used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
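The Value Definition options above can be sketched numerically. The helper below is illustrative only (the sample values are invented, and it is not the probe's code); for Delta Per Second it assumes the delta is normalized by the number of seconds between the two samples.

```java
import java.util.List;

public class ValueDefinitions {
    // Current Value: the most recent sample.
    static double current(List<Double> s) { return s.get(s.size() - 1); }

    // Delta Value (Current - Previous).
    static double delta(List<Double> s) {
        return s.get(s.size() - 1) - s.get(s.size() - 2);
    }

    // Delta Per Second: delta normalized over the sampling interval
    // (assumption: interval length in seconds is known).
    static double deltaPerSecond(List<Double> s, double secondsBetween) {
        return delta(s) / secondsBetween;
    }

    // Average Value Last n Samples, with n = "Number of Samples".
    static double averageLastN(List<Double> s, int n) {
        return s.subList(s.size() - n, s.size()).stream()
                .mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Double> samples = List.of(10.0, 20.0, 40.0, 70.0);
        System.out.println(current(samples));              // 70.0
        System.out.println(delta(samples));                // 30.0
        System.out.println(deltaPerSecond(samples, 600));  // 0.05 (10-minute interval)
        System.out.println(averageLastN(samples, 3));      // ~43.33
    }
}
```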
VIOs Node
The VIOs node contains the VIO resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs
Click to view the VIO resources. Each VIO resource contains nodes for the devices associated with the VIO. For more information about devices,
see <Device> Nodes.
VMs Node
The VMs node contains the VM resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VMs
Click to view the VM resources. Each VM resource contains nodes for the devices associated with the VM. For more information about devices,
see <Device> Nodes.
Monitoring Capabilities
The following component entities can be monitored on the Managed System:
CPU
CPU Pool
Disk
Memory
Network
Storage Pool
The following component entities can be monitored on the virtual I/O server (VIOs) and virtual machines (VMs):
CPU
Disk
Memory
Network
Probe Configuration
Double-click the line representing the probe in Infrastructure Manager to open the IBM virtualization probe GUI.
When the probe is initially installed, this screen displays with the following:
An empty Resources node
An empty Templates node
General Setup
Click the General Setup button to set the level of detail written to the log file for the ibmvm probe. The log level is a sliding scale: level 1 logs
only fatal errors, while level 5 logs extremely detailed information used for debugging purposes. Log as little as possible during normal operation
to minimize disk consumption.
Click the Apply button to implement the new log level immediately.
Note: The probe allows you to change the log level without restarting the probe.
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Port
The SSH port for the target system.
Active
Select this checkbox to activate monitoring of the resource.
SSH connection timeout (sec)
Time to wait for connection to establish.
Check interval
How often the probe checks the values of the monitors.
Username
A valid username to be used by the probe to log in to the IBM server.
Password
A valid password to be used by the probe to log in to the IBM server.
Group
The group that you want the resource associated with.
Alarm Message
The alarm message to be sent if the resource does not respond.
Note: You can edit the message or create your own message using Message Pool Manager.
Check Interval
The check interval defines how often the probe checks the values of the monitors. This can be set in seconds, minutes or hours. The IBM
server data is updated once per minute. We recommend polling once every 10 minutes. The polling interval should not be smaller than
the time required to collect the data.
Test button
Click the Test button to verify the connection to the Resource.
Note: The very first time you click the Test button after you have created your first resource, you may receive an error message. In that
case, wait for at least 20 seconds and click the Test button again.
After completing the fields and testing that the connection works, click OK to add the Resource.
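The check interval guidance above reduces to a simple rule: the configured interval should be at least the time the collection actually takes, with 600 seconds (10 minutes) as the recommended default. The helper below is an illustrative sketch with hypothetical numbers, not part of the probe.

```java
public class CheckIntervalGuard {
    static final long RECOMMENDED_INTERVAL_SEC = 600; // 10 minutes

    // True when the polling interval is no smaller than the time the
    // last collection actually took, per the guidance above.
    static boolean intervalIsSafe(long intervalSec, long lastCollectionSec) {
        return intervalSec >= lastCollectionSec;
    }

    public static void main(String[] args) {
        System.out.println(intervalIsSafe(RECOMMENDED_INTERVAL_SEC, 45)); // true
        System.out.println(intervalIsSafe(30, 45));                      // false: too aggressive
    }
}
```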
Adding Monitors
There are three different ways to add monitors to ibmvm entities:
Manually select the monitors
To manually select and enable monitors, navigate to the target entity within the Resource. This lists its monitors in the right pane. Use the
available check-boxes to enable QoS monitoring for the selected metrics. To enable Alarm thresholding, launch the Edit Monitor dialog.
See the section Manually Selecting Monitors to be Measured.
Use Templates
Templates let you define reusable sets of monitors to apply to various ibmvm monitored entities.
See the section Using Templates for further information.
Use Auto Configurations
Auto Configuration is a powerful way to automatically add monitors to be measured. Monitors are created for new devices (that is, ones
not currently monitored) that would otherwise need manual configuration to be monitored.
See the section Using Automatic Configurations for further information.
Manually Selecting Monitors to be Measured
To select a monitor you want to be measured for a Resource, click the Resource node in the navigation pane, and navigate through the
Resources hierarchy. Select a folder in the hierarchy to see the monitors for it, listed in the right pane. Click the check box next to the Monitors
you want to be active.
Note: You can also add monitors to be measured using templates (see the section Using Templates).
Select the All Monitors node to list all monitors currently being measured in the right pane. You can select or clear monitors here, too.
Green icon - the monitor is configured and active
Gray icon - the monitor is configured but not active
Black icon - the monitor is not configured
Note: If a monitor name is in italics, the configuration has been changed but the changes have not yet been applied.
Selecting the checkbox next to a monitor name only enables the monitor. To configure the probe to send QoS data or send alarms you must
modify the properties for each monitor.
Double-click a monitor (or right-click and select Edit) to launch the monitor's properties dialog.
Update the fields as necessary. The fields are:
Resources
This is a read-only field that contains the name of the resource associated with the monitor. The monitor name is displayed
when the monitor is retrieved from the IBM Server.
Key
This is a read-only field describing the monitor key.
Description
This is a read-only field describing the monitor. This description appears when the monitor is retrieved from the IBM Server.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
The current value -- The most current value measured will be used.
The delta value (current - previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta per second -- The delta value calculated from the samples measured within a second will be used.
The average value (cur + prev)/2 -- The current plus the previous value of the sample divided by 2.
The average value last... -- The user specifies a count. The value is then averaged based on the last "count" items.
Active
Activates monitoring on the probe.
Enable Alarming
Activates alarming.
Note that the monitor will also be selected in the list of monitors in the right pane when this option is selected. You can enable/disable
monitoring from that list.
Operator
The operator to be used when setting the alarm threshold for the measured value.
Example:
>= 90 means an alarm condition if the measured value is 90 or above.
= 90 means an alarm condition if the measured value is exactly 90.
Threshold
The alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Unit
The unit of the monitored value (for example %, Mbytes etc.). The field is read-only.
Message ID
The alarm message to be issued if the specified threshold value is breached. These messages are kept in the message pool. The
messages can be modified in the Message Pool Manager.
Publish Quality of Service
Select this option if you want QoS messages to be issued on the monitor.
QoS Name
The name to be used on the QoS message issued.
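The Operator and Threshold fields above combine into a single comparison against the measured value. The sketch below is illustrative; the operator tokens and sample values are assumptions, not the probe's code.

```java
public class ThresholdCheck {
    // True when the measured value breaches the threshold under the
    // configured operator.
    static boolean breach(String operator, double value, double threshold) {
        switch (operator) {
            case ">=": return value >= threshold;
            case "<=": return value <= threshold;
            case "=":  return value == threshold;
            default:   throw new IllegalArgumentException("unknown operator: " + operator);
        }
    }

    public static void main(String[] args) {
        System.out.println(breach(">=", 95, 90)); // true: alarm is sent
        System.out.println(breach("=", 89, 90));  // false: no alarm
    }
}
```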
Using Templates
Templates provide an easy way to consistently monitor your IBM virtualization environment by allowing you to predefine reusable sets of
monitors. After you create a template and define a set of monitors for that template, you can perform the following actions:
Drag and drop the template into the resource hierarchy where you want to monitor.
This action applies the monitors to a resource entity and any subordinate items. Any templates applied within the Resources hierarchy
are static monitors. The static monitors override any auto monitors for that specific resource entity.
Drag and drop the template monitors into the Auto Configurations node to add the template contents to the list of auto configuration
monitors.
This action applies the monitors to any new devices that are not currently monitored when the probe searches the environment for new
devices.
You can perform both actions within a single probe. You can place general-purpose templates into Auto Configuration, and apply special-purpose
templates to override the Auto Configuration templates on specific nodes, for specific purposes. See Using Automatic Configurations for details on
Auto Configuration.
Create a New Template
Right-click the Templates node in the navigation pane, and select New Template from the menu.
In the resulting Template Properties dialog, specify a Name and a Description for the new template.
Note: You can also edit an existing template: Select one of the templates defined under the Templates node in the navigation pane,
right-click it, and select Edit from the menu.
Add Monitors to a Template
Drag the template to the Auto Configurations node or the Resource entity (For example, VMs, Memory, Storage Pool, or VIO) where you want it
applied, and drop it there.
Note: You can drop the template on an object containing multiple subordinate objects. This applies the template to the entity and all its
subordinate entities. A static monitor is created for this entity.
Important! If you are experiencing performance problems, we recommend increasing the polling cycle or the memory configuration for
the probe. Increase memory when the probe is running out of memory. Increase polling cycle when the collection takes longer than the
configured interval.
You can add a single monitor (checkpoint) to the Auto Configurations node.
To list available monitors:
1. Select the Resource node in the navigation pane and navigate to the point of interest.
2. Select an object to list its monitors in the right pane.
3. Add the monitor to the Auto Configurations node by dragging the monitor to the Auto Configurations node and dropping it there.
4. Click the Auto Configurations node and verify that the monitor was successfully added.
Note: You must click the Apply button and restart the probe to activate configuration changes.
Exploring the Contents of the Auto Configurations Node
To verify that the monitors were successfully added, click the Auto Configurations node in the navigation pane.
To edit the properties for a monitor, right-click in the list and select Edit from the menu. See the section Edit Monitor Properties for
detailed information.
To delete a monitor from the list, right-click in the list and select Delete from the menu.
Note: You must click the Apply button and restart the probe to activate configuration changes.
Checking the Auto Monitors Node
All defined Auto Monitors are listed under the Auto Monitors node. When you restart the probe, it searches through the Resource's entities. For
each one that is currently not monitored, an Auto Monitor is created for each of the monitors listed under the Auto Configurations node.
Toolbar Buttons
The configuration tool also contains a row of toolbar buttons:
Note: If you use an IPv6 address for a resource, you must follow the Java standard of enclosing the IPv6 address in square brackets.
For example: The input string [f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace
error that includes the exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
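The bracket requirement can be illustrated with standard Java URI parsing. This is a sketch for illustration only, using `java.net.URI` and a hypothetical helper; the probe's internal address handling may differ.

```java
import java.net.URI;

// Illustrates why an IPv6 address must be enclosed in square brackets:
// without them, the colons in the address are indistinguishable from
// the host:port separator.
public class Ipv6HostCheck {

    // Returns true when the given host yields a usable host component.
    static boolean hasValidHost(String host) {
        try {
            return URI.create("ssh://" + host + ":22").getHost() != null;
        } catch (IllegalArgumentException e) {
            return false; // rejected outright by the parser
        }
    }

    public static void main(String[] args) {
        // Bracketed IPv6 literal: the port is unambiguous.
        System.out.println(hasValidHost("[f0d0::a]"));
        // Unbracketed IPv6 literal: the colons swallow the port,
        // so no valid host component is produced.
        System.out.println(hasValidHost("f0d0::a"));
    }
}
```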
The icon for each Resource node indicates the status of the resource:
Inactive
Marked for deletion
Unable to connect
New (not yet saved)
Connected and inventory is ready to browse
Loading inventory and not ready to browse
Each Resource node contains the following sub-hierarchies:
Auto Configurations
The monitors under this node are used to automatically configure newly discovered devices. Drag-and-drop individual monitors or
templates (which define a set of monitors) onto this node.
Auto Monitors
Lists the monitors that were created based on the Auto-Configuration entries and the inventory available on the Resource.
All Monitors
Lists all monitors for the Resource, including Auto Monitors and manually configured monitors.
IBM Virtualization Environment hierarchy
Lists the Machine Resources, CPUs, CPU Pools, Disks, Memory, Network, Storage Pools, MSPPs, VIOs, and VMs available in the IBM
virtualization environment for monitoring.
Note: A node for Network will only appear if the device can be monitored. If the node does not appear, and the log level is set
to 2, a message is generated indicating that the network adapter does not return valid output. This is not an issue.
Templates
Templates are a useful tool for defining checkpoints to be monitored on the various managed systems or Logical Partitions.
This section contains configuration details that are specific to the ibmvm probe.
Contents
Configuration Overview
Add Resource Profiles
Add Monitoring
Alarm Thresholds
Configuration Overview
At a high level, configuring the probe consists of the following steps:
1. Add a Resource profile for each IBM virtualization enabled system you want to monitor.
2. Add monitors to the appropriate system components and configure monitor data.
Add Monitoring
Once you add a resource profile, the components of the resource are displayed in the tree. Click a node in the tree to see any associated
monitors for that component. Configure the QoS measurements you want to collect data for, and any alarms or events you want, by modifying the
appropriate fields.
Note: Users of CA Unified Infrastructure Management Snap can skip this step. The Default configuration is automatically applied when you
activate the probe in Snap.
Follow these steps:
1. Go to ibmvm > profile name > resource name.
2. Click on a managed system, device, virtual I/O server (VIO) or virtual machines (VMs) name. It might be necessary to expand the node in
the tree to view the monitors and QoS metrics.
The available monitors appear in a table on the right side of the screen.
3. Select the monitor you want to modify in the table.
4. Change monitor settings in the fields below the table.
5. Click Save at the top of the screen.
When the new configuration is loaded, a Success dialog appears.
6. Click OK.
The tree is updated with the new configuration.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
Important! Alarm threshold settings are dependent on the baseline_engine probe. If you do not have the correct version of
baseline_engine configured, you will not see the additional threshold options.
Tree Hierarchy
Tree Icons
ibmvm Node
Profiles Node
Resource Node
<Managed System> Node
<Device> Nodes
VIOs Node
VMs Node
Tree Hierarchy
Once a Resource profile is added, the components of the Resource are displayed in the tree. Click a node in the tree to see the alarms, events, or
monitors available for that component. The tree contains a hierarchical representation of the components that exist in the IBM virtualization
environment.
Tree Icons
The icons in the tree indicate the type of object the node contains. For VMs, the color of the icon also indicates the status of the VM.
- Closed folder. Organizational node used to group similar objects. Click the node to expand it.
- Open folder. Organizational node used to group similar objects. Click the triangle next to the folder to collapse it.
- Unknown.
- Storage pool
- Datastore
- The VM icon indicates the state of the VM.
- Running
- Stopped
- Paused
- Network interface
- VIOs or VMs
ibmvm Node
Navigation: ibmvm Node
Set or modify the following values based on your requirements.
ibmvm > Probe Information
This section displays read-only information about the probe.
ibmvm > Probe Setup
Fields to know:
Log Level: Select the amount of log information you would like to collect for this probe.
Default: 3 (Recommended)
ibmvm >
Note: You must follow the Java standard of enclosing an IPv6 address in square brackets. For example: The input string
[f0d0:0:0:0:0:0:0:10.0.00.0] works. But the input string f0d0:0:0:0:0:0:0:10.0.00.0 causes a stack trace error that includes the
exception: Caused by: java.lang.NumberFormatException: For input string: "f0d0:0:0:0:0:0:0:10.0.00.0".
Active
Select this checkbox to activate monitoring of the resource.
Port
The SSH port for the target system.
Interval (secs)
The time to wait for connection to establish.
Username
A valid username to be used by the probe to log on to the IBM server.
Password
A valid password to be used by the probe to log on to the IBM server.
Alarm Message
The alarm message to be sent if the resource does not respond.
The profile icon indicates the status of subcomponent discovery.
Profiles Node
This section describes how to manage your device configuration. You can modify or delete a device profile, and validate the credentials for each
device.
Navigation: ibmvm Node > profile name
Resource Node
The Resource node contains the managed systems associated with the resource.
Navigation: ibmvm > profile name > resource name
Click to view a Hosts node containing the managed systems on the resource.
<Device> Nodes
<Device> nodes exist for managed systems, VIOs, and VMs. Possible devices are CPUs, CPU Pools, disks, memory, network interfaces, and
storage pools.
Navigation options:
Managed system device -- ibmvm Node > profile name > resource name > Hosts > managed system name > device name
VIO device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs > VIO name > device name
VM device -- ibmvm Node > profile name > resource name > Hosts > managed system name > VMs > VM name > device name
Note: If you click a <Device> node, you might see a table with QoS metrics or additional named device nodes. You must click the named device
nodes to view a table with QoS metrics.
<Device> > Monitors
Select a monitor in the table to view the monitor configuration fields. You can change monitor settings in the fields below the table.
Fields to know:
QoS Name
The name to be used on the QoS message issued. The field is read-only.
Description
This is a read-only field, describing the monitor.
Units
The unit of the monitored value (for example, % or Mbytes). The field is read-only.
Metric Type Id
The unique ID of the QoS metric.
Publish Data
Select this option if you want QoS messages to be issued on the monitor.
Publish Alarms
Select this option if you want to activate alarms.
Value Definition
Value to be used for alarming and QoS.
You have the following options:
Current Value -- The most current value measured will be used.
Delta Value (Current - Previous) -- The delta value calculated from the current and the previous measured sample will be used.
Delta Per Second -- The delta value calculated from the samples measured within a second will be used.
Average Value Last n Samples -- The user specifies a count. The value is then averaged based on the last "count" items.
Number of Samples
The count of items for the Value Definition when set to Average Value Last n Samples.
Operator
The operator to be used when setting the high or low alarm threshold for the measured value. You must select Publish Alarms to
enable this setting.
Example:
>= 90 means an alarm condition occurs if the measured value is 90 or above.
= 90 means an alarm condition occurs if the measured value is exactly 90.
Threshold
The high or low alarm threshold value. An alarm message will be sent if this threshold is exceeded.
Message Name
The alarm message to be issued if the specified high or low threshold value is breached. These messages are kept in the message
pool.
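The value definitions and the operator/threshold behavior described above can be sketched in Java. This is a minimal illustration only; the method and class names are hypothetical and do not reflect the probe's internal API.

```java
import java.util.Arrays;

// Illustrative sketch of the monitor value definitions and threshold
// check described above (hypothetical names, not the probe's own code).
public class MonitorValue {

    // Delta Value (Current - Previous)
    static double delta(double current, double previous) {
        return current - previous;
    }

    // Delta Per Second: the delta normalized by the seconds between samples
    static double deltaPerSecond(double current, double previous, double elapsedSeconds) {
        return (current - previous) / elapsedSeconds;
    }

    // Average Value Last n Samples: averages the most recent "count" items
    static double averageLast(double[] samples, int count) {
        int from = Math.max(0, samples.length - count);
        return Arrays.stream(samples, from, samples.length).average().orElse(0.0);
    }

    // Threshold check: for example, operator ">=" with threshold 90 raises
    // an alarm condition when the measured value is 90 or above.
    static boolean breached(String operator, double threshold, double value) {
        switch (operator) {
            case ">=": return value >= threshold;
            case "<=": return value <= threshold;
            case "=":  return value == threshold;
            case ">":  return value > threshold;
            case "<":  return value < threshold;
            default:   throw new IllegalArgumentException(operator);
        }
    }

    public static void main(String[] args) {
        System.out.println(delta(75.0, 60.0));                            // 15.0
        System.out.println(deltaPerSecond(600.0, 0.0, 60.0));             // 10.0
        System.out.println(averageLast(new double[]{10, 20, 30, 40}, 2)); // 35.0
        System.out.println(breached(">=", 90.0, 92.5));                   // true
    }
}
```

Note how "= 90" breaches only at exactly 90, while ">= 90" breaches at 90 or above.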
VIOs Node
The VIOs node contains the VIO resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VIOs
Click to view the VIO resources. Each VIO resource contains nodes for the devices associated with the VIO. For more information about devices,
see <Device> Nodes.
VMs Node
The VMs node contains the VM resources on a managed system.
Navigation: ibmvm Node > profile name > resource name > Hosts > managed system name > VMs
Click to view the VM resources. Each VM resource contains nodes for the devices associated with the VM. For more information about devices,
see <Device> Nodes.
ibmvm Metrics
The following table lists the metrics you can collect with the IBM Virtualization Monitoring (ibmvm) probe.
Resource | Metric Name | Unit | Description | Version
RESOURCE | Build | String | | v1.0
RESOURCE | Release | String | | v1.0
RESOURCE | Response Time | Milliseconds | | v1.0
RESOURCE | Version | String | | v1.0
RESOURCE_MEM | Buffers | Megabytes | | v2.1
RESOURCE_MEM | Percent Used | Percent | | v2.1
RESOURCE_MEM | Total | Megabytes | | v2.1
RESOURCE_MEM | Used | Megabytes | | v2.1
RESOURCE_SWAP | Cached | Megabytes | | v2.1
RESOURCE_SWAP | Percent Used | Percent | | v2.1
RESOURCE_SWAP | Total | Megabytes | | v2.1
RESOURCE_SWAP | Used | Megabytes | | v2.1
RESOURCE_CPU_GROUP | Average Kernel Utilization | Percent | | v2.1
RESOURCE_CPU_GROUP | Average User Utilization | Percent | | v2.1
RESOURCE_CPU | I/O Waiting | Percent | | v2.1
RESOURCE_CPU | Idle Task | Percent | | v2.1
RESOURCE_CPU | System Mode | Percent | | v2.1
RESOURCE_CPU | User Mode | Percent | | v2.1
RESOURCE_DISK | Filesystem | String | | v2.1
RESOURCE_DISK | Percent Used | Percent | | v2.1
RESOURCE_DISK | Total | Megabytes | | v2.1
RESOURCE_DISK | Used | Megabytes | | v2.1
HOST | Active Memory Sharing Capable | Boolean | | v1.0
HOST | CoD Memory Capable | Boolean | Capacity on Demand: the ability to add compute capacity in the form of CPU or memory to a running system by simply activating it. The resources must be pre-staged in the system prior to use and are (typically) turned on with an activation key. Valid values: 0 - not capable, 1 - capable. | v1.0
HOST | CoD Processor Capable | Boolean | Capacity on Demand: the ability to add compute capacity in the form of CPU or memory to a running system by simply activating it. The resources must be pre-staged in the system prior to use and are (typically) turned on with an activation key. Valid values: 0 - not capable, 1 - capable. | v1.0
HOST | Managed System Execution State | String | | v1.0
HOST | Managed System Execution State Value | Integer | | v1.0
HOST | Maximum LPARs Supported | Number | | v1.0
HOST | Model | String | | v1.0
HOST | Serial Number | String | | v1.0
HOST | VIO Total | Number | | v1.0
HOST | VM Total | Number | | v1.0
HOST | VMs Stopped | Number | | v1.0
HOST | Virtual Fiber Channel Capable | Boolean | | v1.0
HOST | Virtual I/O Server Capable | Boolean | | v1.0
HOST | Virtual Switch Capable | Boolean | | v1.0
HOST_CPU | Available Processing Units | Number | Current number of configurable processing units on the managed system that are not assigned to partitions. | v1.0
HOST_CPU | Configurable Processing Units | Number | | v1.0
HOST_CPU | Deconfigured Processing Units | Number | The number of processing units on the managed system that have been unconfigured. This includes processing units that have been unconfigured by the system due to hardware failure, and processing units that have been manually unconfigured. | v1.0
HOST_CPU | Installed Processing Units | Number | | v1.0
HOST_MEMORY | Assigned Memory | Megabytes | | v1.0
HOST_MEMORY | Configurable Memory | Megabytes | | v1.0
HOST_MEMORY | Deconfigured Memory | Megabytes | The amount of memory, in megabytes, on the managed system that has been unconfigured. This includes memory that has been unconfigured by the system due to hardware failure, and memory that has been manually unconfigured. | v1.0
HOST_MEMORY | Installed Memory | Megabytes | | v1.0
HOST_MEMORY | Percent Memory Assigned | Percent | | v1.0
HOST_MEMORY | System Firmware Current Memory | Megabytes | | v1.0
HOST_MEMORY | Unassigned Memory | Megabytes | | v1.0
HOST_COD | Available CoD Processors | Count | | v2.1
HOST_COD | CoD Memory Activated | Megabytes | | v2.1
HOST_COD | CoD Memory Available | Megabytes | | v2.1
HOST_COD | CoD Memory Requested Days Available | Days | | v2.1
HOST_COD | CoD Memory Requested Days Left | Days | | v2.1
HOST_COD | CoD Memory Requested Hours Left | Hrs | | v2.1
HOST_COD | CoD Memory State | String | The memory state for CoD. For example, "Available" or "Code Not Entered." | v2.1
HOST_COD | CoD Processor State | String | The processor state for CoD. For example, "Available" or "Code Not Entered." | v2.1
HOST_COD | CoD Processors Activated | Count | | v2.1
HOST_COD | CoD Processors Available | Count | | v2.1
HOST_COD | CoD Processors Day Hours Left | Hrs | | v2.1
HOST_COD | CoD Processors Requested Days Available | Days | | v2.1
HOST_COD | CoD Processors Requested Days Left | Days | | v2.1
HOST_COD | Total Permanent CoD Memory | Megabytes | | v2.1
HOST_COD | Unreturned CoD Memory | Megabytes | | v2.1
HOST_COD | Unreturned CoD Processors | Count | | v2.1
HOST_DISK_GROUP | Disks Missing Metrics | Number | | v2.1
HOST_DISK_GROUP | Failed Disks | Number | | v2.1
HOST_DISK | Disk Parent | String | | v1.0
HOST_DISK | Disk Size | Megabytes | | v1.0
HOST_DISK | Disk Status | String | | v1.0
HOST_DISK | Storage Pool | String | | v1.0
HOST_DISK | Disk Bandwidth Used (%) | Percent | | v2.3
HOST_DISK | Disk Data Transfer Rate | Kilobytes/Second | | v2.3
HOST_DISK | Disk KB Read | KB | | v2.1
HOST_DISK | Disk KB Write | KB | | v2.1
HOST_DISK | Transfers Per Second | tps | Number of transfers per second issued to the physical disk. A transfer is an I/O request to the physical disk. Multiple logical requests can be combined into a single I/O request to the disk. A transfer is of indeterminate size. | v2.1
HOST_NIC | Description | String | | v1.0
HOST_NIC | Kilobytes Received | Kilobytes | | v1.0
HOST_NIC | Kilobytes Sent | Kilobytes | | v1.0
HOST_NIC | Last Reset Time | String | | v1.0
HOST_NIC | Packets Received | Packets | | v1.0
HOST_NIC | Packets Sent | Packets | | v1.0
HOST_NIC | Status | String | | v1.0
CPU_POOL | Global Shared Processor Pool Size | Number | Size of the global shared processor pool. The Global Shared Processor Pool consists of the processors that are not already assigned to running partitions with dedicated processors. | v1.0
CPU_POOL | Global Shared Processor Pool Utilization | Percent | | v1.0
CPU_MSPP | Shared Processor Pool Size | Number | Size of the shared processor pool. Shared-processor pools have been available since the introduction of IBM's POWER5 processor-based systems. This technology allows you to share a group of processors between multiple LPARs. Shared pools allow LPARs to juggle processor usage to balance out overall processor usage. Originally, POWER5 systems allowed only one shared processor pool. POWER6 processors enable the capability to have multiple shared-processor pools. To employ multiple shared-processor pools, you must use PowerVM Standard or Enterprise Edition. | v2.0
CPU_MSPP | Shared Processor Pool Utilization | Percent | | v2.0
SR | Number of Backing Devices | Count | | v1.0
SR | Storage Pool Free | Megabytes | | v1.0
SR | Storage Pool Size | Megabytes | | v1.0
SR | Storage Pool Used | Megabytes | | v1.0
SR | Storage Pool Utilization | Percent | | v1.0
VIO | Average CPU Utilization | Percent | | v1.0
VIO | Disk Paging | Pages | | v1.0
VIO | Primary HMC | String | | v1.0
VIO | Primary HMC IP | String | | v1.0
VIO | Secondary HMC | String | | v1.0
VIO | Secondary HMC IP | String | | v1.0
VIO | System Disk Data Transfer Rate | Kilobytes/Second | | v1.0
VIO | VIOS Version | String | | v1.0
VIO | VM Execution State | String | | v1.0
VIO | VM Execution State Value | Integer | | v1.0
VIO_CPU | Active Processors | Processing Units | Number of processors or virtual processors that are varied on for the partition. | v2.1
VIO_CPU | Assigned Processors | Processing Units | | v2.1
VIO_CPU | Current Processing Units | Processing Units | | v2.1
VIO_CPU | Maximum Processors | Processing Units | | v2.1
VIO_CPU | Minimum Processors | Processing Units | | v2.1
VIO_CPU | Physical Processors Consumed | Processing Units | | v2.1
VIO_CPU | Processing Mode | String | | v2.1
VIO_CPU | Processor Entitlement Consumed | Percent | | v2.1
VIO_CPU | Processor Frequency | Megahertz | | v2.1
VIO_CPU | Processor Type | String | | v2.1
VIO_CPU | Runtime Processing Units | Processing Units | | v2.1
VIO_CPU | SMT Enabled | Boolean | | v2.1
VIO_CPU | SMT Threads | Count | | v2.1
VIO_CPU | Sharing Mode | String | | v2.1
VIO_DISK_GROUP | Disks Missing Metrics | Number | | v2.1
VIO_DISK_GROUP | Failed Disks | Number | | v2.1
VIO_DISK | Disk Bandwidth Used (%) | Percent | | v1.0
VIO_DISK | Disk Data Transfer Rate | Kilobytes/Second | | v1.0
VIO_DISK | Disk KB Read | KB | | v1.0
VIO_DISK | Disk KB Write | KB | | v1.0
VIO_DISK | Disk Size | Megabytes | | v2.1
VIO_DISK | Disk Status | String | | v2.1
VIO_DISK | Storage Pool | String | | v1.0
VIO_DISK | Transfers Per Second | tps | Number of transfers per second issued to the physical disk. A transfer is an I/O request to the physical disk. Multiple logical requests can be combined into a single I/O request to the disk. A transfer is of indeterminate size. | v1.0
VIO_DISK | Virtual Storage Adapter | String | | v2.1
VIO_NIC_GROUP | HEA Total | Number | | v2.1
VIO_NIC_GROUP | Interfaces Configured | Number | | v2.1
VIO_NIC_GROUP | Interfaces DOWN | Number | | v2.1
VIO_NIC_GROUP | Interfaces Detached | Number | | v2.1
VIO_NIC_GROUP | Interfaces UP | Number | | v2.1
VIO_NIC_GROUP | SEA Defined | Number | | v2.1
VIO_NIC_GROUP | SEA Total | Number | | v2.1
VM | VM Execution State | String | | v1.0
VM | VM Execution State Value | Integer | | v1.0
VM_CPU | Active Processors | Processing Units | | v1.0
VM_CPU | Assigned Processors | Processing Units | Number of processors or virtual processors that are varied on for the partition. | v1.0
VM_CPU | Current Processing Units | Processing Units | | v1.0
VM_CPU | Maximum Processors | Processing Units | | v1.0
VM_CPU | Minimum Processors | Processing Units | | v1.0
VM_CPU | Physical Processors Consumed | Processing Units | | v1.0
VM_CPU | Processing Mode | String | | v1.0
VM_CPU | Processor Entitlement Consumed | Percent | | v1.0
VM_CPU | Runtime Processing Units | Processing Units | | v1.0
VM_CPU | Sharing Mode | String | | v1.0
VM_MEMORY | Assigned Memory | Megabytes | | v1.0
VM_MEMORY | Maximum Memory | Megabytes | | v1.0
VM_MEMORY | Minimum Memory | Megabytes | | v1.0
VM_MEMORY | Used Memory | Megabytes | | v1.0
VM_DISK | Disk Bandwidth Used (%) | Percent | | v1.0
VM_DISK | Disk Data Transfer Rate | Kilobytes/Second | | v1.0
VM_DISK | Disk KB Read | KB | | v1.0
VM_DISK | Disk KB Write | KB | | v1.0
VM_DISK | Disk Size | Megabytes | | v1.0
VM_DISK | Disk Status | String | | v1.0
VM_DISK | Transfers Per Second | tps | Number of transfers per second issued to the physical disk. A transfer is an I/O request to the physical disk. Multiple logical requests can be combined into a single I/O request to the disk. A transfer is of indeterminate size. | v1.0
VM_DISK | Virtual Storage Adapter | String | | v1.0
VM_DISK_GROUP | Count Problem Storage Volumes | Number | | v1.0
VM_DISK_GROUP | Disks Missing Metrics | Number | | v1.0
VM_NIC | Kilobytes Received | Kilobytes | | v1.0
VM_NIC | Kilobytes Sent | Kilobytes | | v1.0
VM_NIC | Packets Received | Packets | | v1.0
VM_NIC | Packets Sent | Packets | | v1.0
VM_NIC | Status | String | | v1.0
Macro after logged on: the total calculated time between the completion of the login process and the completion of the macro execution
The probe also supports QoS (Quality of Service) data to generate trending data over a period of time. The QoS data reflects the actual
accessibility experienced by end users attempting to log in to the Citrix ICA servers. The probe generates QoS data for the following metrics:
Connect time
Session time
Login time
Logout time
Startup publish application time
Run macro script time
ICA ping
Total profile time
More information:
ica_response (Citrix Client Response Monitoring) Release Notes
ica_response AC Configuration
This section describes the configuration concepts and procedures for setting up the Citrix Client Response Monitoring probe. This probe is
configured to monitor the performance of the Citrix terminal server client connection, launch an application, and run a macro script. The following
diagram outlines the process to configure the probe to connect to the Citrix ICA servers, create a monitoring profile and configure alarms and
QoS.
Contents
Verify Prerequisites
Create a Monitoring Profile
Activate Monitors
Record a Macro
Verify Prerequisites
Verify that the required hardware and software are available before you configure the probe. For more information, see ica_response (Citrix
Client Response Monitoring) Release Notes.
Verify that you have installed any one of the following Citrix client software packages on the host system:
Citrix Receiver
Citrix Online Plug-in
Web Client Plug-in
Determine the user credentials to access the Citrix server.
Verify that PPM 2.80 or later is running on the primary hub.
Verify that you made the required registry changes.
Verify that you have changed the citrixfilepath variable value in the CFG file to the correct installation path of the Citrix Receiver.
Note: These registry changes are mandatory for the ICA functionality to work properly.
Note: For 64-bit, the hierarchy is HKEY_LOCAL_MACHINE --> SOFTWARE --> Wow6432Node --> Citrix --> ICA Client
6. Right-click on the ICA Client folder and select the New --> Key option.
A new folder is created as a child of the ICA Client parent folder.
7. Rename the new folder as CCM, right-click on this folder and select the NEW --> DWORD (32-bit) Value option.
A new attribute is created in the right-side pane of the Registry Editor window.
8. Rename the new attribute as AllowSimulationAPI.
9. Double-click the AllowSimulationAPI.
The Edit DWORD (32-bit) Value dialog appears.
10. Define 1 in the Value data field and click OK.
11. Restart your system.
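The manual Registry Editor steps above are equivalent to a single `reg add` command. The sketch below builds that command string for either architecture; the helper class is hypothetical and is not shipped with the probe (running the command still requires administrator rights).

```java
// Builds the "reg add" command equivalent to the manual Registry Editor
// steps above: create the CCM key under ICA Client and set the
// AllowSimulationAPI DWORD value to 1.
public class AllowSimulationApiKey {

    static String regAddCommand(boolean is64BitWindows) {
        // On 64-bit Windows the 32-bit ICA Client settings live under
        // the Wow6432Node branch.
        String base = is64BitWindows
                ? "HKEY_LOCAL_MACHINE\\SOFTWARE\\Wow6432Node\\Citrix\\ICA Client"
                : "HKEY_LOCAL_MACHINE\\SOFTWARE\\Citrix\\ICA Client";
        return "reg add \"" + base + "\\CCM\""
                + " /v AllowSimulationAPI /t REG_DWORD /d 1 /f";
    }

    public static void main(String[] args) {
        System.out.println(regAddCommand(true));
        System.out.println(regAddCommand(false));
    }
}
```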
1. Change the value of the citrixfilepath variable in the CFG file to the path where the Citrix client software is installed. For example, if it is
installed on the C:\ drive, then for:
32-bit Windows:
The value of the citrixfilepath variable is C:\Program Files\Citrix\ICA Client\.
64-bit Windows:
The value of the citrixfilepath variable is C:\Program Files (x86)\Citrix\ICA Client\.
Activate Monitors
After creating a monitoring profile, you must activate the required monitors to fetch monitoring data. These monitors let you generate QoS
data and generate alarms when the specified thresholds are breached.
Follow these steps:
1. Navigate to the Monitors node under the profile name node.
2.
Note: The Total Profile Time monitor only generates QoS messages. The probe does not generate alarms even if Publish
Alarms is selected.
4. Select the Publish Data option, if available, for generating QoS data for the monitor.
5. Click Save to apply these changes.
The probe restarts and reloads the probe GUI. The monitoring profile fetches the monitoring data of the configured monitors for generating alarms
and QoS.
Note: The probe takes considerable time to save and reload the configuration, so avoid saving the configuration frequently. We
recommend configuring all necessary monitors at once and then saving the configuration.
Record a Macro
Macro recording records user actions for playback at a later time. For macro recording, the ica_response probe provides the
conf_ica_macro_recording application, which is available in the location where the probe is deployed on your system.
You must first connect to the ICA server that you have defined in the selected profile. Once connected, you can open the
conf_ica_macro_recording application and record the steps that execute certain functionality. These steps can be saved for future
reference. To enable the macro recording functionality in the probe, enable the Run macro script option in the Macro Configuration
section under the Profile Name node.
After the connection is established, the Connect button is disabled and Log off is enabled in the ICA Macro Script Recorder dialog.
b. For example, to record a session containing some steps in Notepad, click Start on the desktop that appears in the window and select
the Notepad application.
The Notepad application starts in the window.
c. Click Record to start the recording and begin performing the steps.
After you click Record, recording starts and the Stop button is enabled, while the Record, Play step, and Clear buttons are disabled.
If you want mouse move events to be included in the macro script, select the Record mouse move events checkbox. Note that
mouse events create many commands.
The recorded commands are listed at the bottom of the application window.
d. After the required session has been recorded, click Stop in the dialog.
The macro recording stops.
Note: The Record, Play step, and Clear buttons are enabled again. Click Log off to log off the ICA server and disconnect.
e. Click Save and Exit to save the macro script and exit the macro recording functionality. The Save macro script dialog appears,
which allows you to save the macro script.
Click Play step to play the steps selected from the commands list at the bottom of the application window.
Click Clear to clear all recorded commands.
Click Exit to exit the macro recorder without saving the macro script.
Click Start --> Log off --> Disconnect, in the desktop that appears in the window, to disconnect from the ICA server.
f. Enter the name of the recording in the File box and click OK. The macro script is saved at the selected location with a .rec extension.
This location then appears in the Filename text box in the Application tab of the Profiles tab when you click the ellipsis (...) button.
Similarly, you can use the other options: Delete, New before, and New after.
Select New before to add a new entry before the selected line.
Select New after to add a new entry after the selected line.
The command line format is: Elapsed time;Command;Arg1;Arg2;Arg..
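The command line format above can be parsed with a simple split on the semicolon separator. This is an illustrative sketch only; the class name, the command name used in the example, and the assumption that the elapsed time is numeric are all hypothetical, as the document does not specify them.

```java
// Minimal sketch of parsing one recorded macro command line in the
// "Elapsed time;Command;Arg1;Arg2;..." format described above.
public class MacroLine {
    final long elapsed;      // elapsed time field (unit as recorded)
    final String command;    // the command name
    final String[] args;     // zero or more arguments

    MacroLine(long elapsed, String command, String[] args) {
        this.elapsed = elapsed;
        this.command = command;
        this.args = args;
    }

    static MacroLine parse(String line) {
        String[] parts = line.split(";");
        // Everything after the first two fields is an argument.
        String[] rest = new String[Math.max(0, parts.length - 2)];
        System.arraycopy(parts, 2, rest, 0, rest.length);
        return new MacroLine(Long.parseLong(parts[0]), parts[1], rest);
    }

    public static void main(String[] args) {
        // "MouseClick" is a made-up command name for illustration.
        MacroLine m = parse("1200;MouseClick;100;200");
        System.out.println(m.elapsed + " " + m.command + " " + m.args.length);
    }
}
```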
Probe Interface
ica_response Node
<Profile Name> Node
Monitors Node
Probe Interface
The probe interface is divided into a navigation pane and a details pane. The navigation pane contains a hierarchical representation of the probe
inventory which includes monitoring targets and configurable elements. The details pane usually contains information based on your selection in
the navigation pane.
ica_response Node
The ica_response node is used to configure the general settings of the Citrix Client Response Monitoring probe, which are applicable to all
monitoring profiles of the probe.
Navigation: ica_response
Set or modify the following values as needed:
ica_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ica_response > General Configuration
This section lets you configure the default log level and other probe-related alarms.
Log Level: specifies the level of details that are written to the log file.
Default: 0 - Fatal
Username: defines the user name having administrator rights on the host system, where the probe is installed. This user can be a
local user or a domain user and must have the appropriate rights to launch ica_response_poll.exe. If you do not enter any value in
these fields, then the exe is launched using the system account.
Password: defines the password of the corresponding user name.
Domain: defines the network domain name of the user.
Maximum Concurrent Sessions: specifies the number of profiles, which the probe can execute simultaneously. Each profile creates a
different session of the given user on the Citrix server.
Default: 5
Maximum Concurrent Sessions Reached Alarm: specifies the alarm message when the maximum number of concurrent sessions is
reached.
Default: MaxSessionsReached
Maximum Concurrent Sessions Reached Clear: specifies the clear alarm message when the number of concurrent sessions is less
than the maximum concurrent sessions limit.
Default: MaxSessionsOK
Profile Already Running Alarm: specifies the alarm message when the probe attempts to run an already running profile. For example,
a profile runs for a longer session than a monitoring interval and the probe runs the profile again before the previous session is
finished.
Default: ProfileAlreadyRunning
Profile Already Running Clear: specifies the clear alarm message when the profile already running alarm situation is rectified.
Default: ProfileAlreadyRunningOK
Use ICA File: enables the probe to fetch the necessary Citrix server connection settings from the ICA file.
Default: Not selected
Filename: defines the ICA file path. You can use the Browse button for navigating to the ICA file.
Override Address Setting in ICA file: overrides the Citrix server address details in the ICA file with the General Profile Configuration sec
tion details.
Override Authentication Setting in ICA file: overrides the user authentication details in the ICA file with the General Profile
Configuration section details.
Override Application Setting in ICA file: overrides the application configuration details in the ICA file with the Application Configuration
section details.
Encryption Level: defines the encryption level for communication between the probe and the Citrix server.
Default: None
Monitors Node
The Monitors node lets you configure QoS data and alarms for the monitoring profile. This node contains a table displaying the list of all monitors.
You can select one or more monitors from the list, then activate QoS data and configure the alarm thresholds and appropriate error alarm
messages for the selected monitors.
Navigation: ica_response > profile name > Monitors
Notes:
Though the Publish Alarms option is visible for the Total Profile Time monitor, only QoS is generated for this monitor.
If the alarms are ON for any QoS, then these alarms are generated based on both Threshold Value fields. Thus, you must provide
values for both thresholds to generate alarms.
ica_response IM Configuration
This article describes the configuration concepts and procedures for setting up the ica_response probe. This probe is configured to monitor the
performance of the Citrix terminal server client connection, launch an application, and run a macro script. The following diagram outlines the
process to configure the probe to connect to the Citrix ICA servers, create a monitoring profile and configure alarms and QoS.
Contents
Verify Prerequisites
Create Profile
Record a Macro
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ica_response (Citrix
Client Response Monitoring) Release Notes.
Verify that you have installed any one of the following Citrix Client software on the host system:
Citrix Receiver
Citrix Online Plug-in
Web Client Plug-in
Determine the user credentials to access the Citrix server.
Verify that you made the required registry changes.
Verify that you have changed the citrixfilepath variable value in the CFG file to mention the correct installed path of the Citrix receiver.
Perform Registry Changes
The ica_response probe requires certain registry changes on the host system (where the probe is installed) to establish connection with the Citrix Server.
Important! These registry changes are mandatory for the ICA functionality to work properly.
Note: For 64-bit, the hierarchy is HKEY_LOCAL_MACHINE --> SOFTWARE --> Wow6432Node -->Citrix --> ICA Client
6. Right-click on the ICA Client folder and select the New --> Key option.
A new folder gets created as a child of ICA Client parent folder.
7. Rename the new folder as CCM, right-click on this folder and select the NEW --> DWORD (32-bit) Value option.
A new attribute is created in the right-side pane of the Registry Editor window.
8. Rename the new attribute as AllowSimulationAPI.
9. Double-click the AllowSimulationAPI.
The Edit DWORD (32-bit) Value dialog appears.
10. Define 1 in the Value data field and click OK.
11. Restart your system.
Create Profile
This section describes the process to create a monitoring profile. You can create one or more monitoring profiles for monitoring a specific Citrix
server and cater to different monitoring requirements. For example, one monitoring profile executes application A and monitors the response time;
while another monitoring profile executes a macro and monitors the end user experience of application B.
Follow these steps:
1. Open the probe configuration interface.
The contents of the Profiles tab are displayed, by default.
2. Right-click in the profile pane, and click New to create a new profile.
The ICA Profile Properties dialog appears. Specify the values in the following fields:
Name: the name of the profile.
Connection: authentication details such as server IP for communication between the client and Citrix ICA server.
Check Interval: time interval between each response check. Do not overload the Citrix ICA server by checking too often. A minimum
time interval of 15 minutes is recommended.
3. Click the Alarms Tab to list the different alarm situations.
Define the Response properties (for alarm situations related to response time) or Event properties (for the other alarm situations) for
the selected alarm situation.
4. Click the Misc. tab. Select the QoS messages that you want to send and the timeout values for the different tasks performed by the
probe.
5. Click OK.
The new monitoring profile is created.
Record a Macro
Macro recording is a functionality that records user actions for playback at a later time.
You must first connect to the ICA server that you have defined in the selected profile by clicking Connect or by clicking Click here to connect to server. Once connected, you can open the required application and record the steps that execute certain functionality. These steps can be saved for future reference. To enable the macro recording functionality in the ica_response probe, enable the Run macro script option under the Application tab of the Profiles tab. This action enables the fields in this section, and you can start using the macro recording functionality to create and save a macro script.
You can either:
Create a new macro script using Macro recorder, or
Browse and locate the saved recording from the Filename option to select the path.
Click Macro recorder to perform other functionalities (playing the recording, modifying commands, and so on).
Follow these steps:
1. Select the Run macro script check box.
A warning message is displayed as shown in the following dialog:
2. Click OK.
3. Select the Start point value at which the macro is to be started in the test sequence. Available values are Before connect, After
connect, and After login.
4. Click Macro recorder.
The ICA Macro Script Recorder dialog appears.
Note: Initially, the main window of the application is empty.
5. Click Connect in this dialog or click the link Click here to connect to server to connect to the ICA server defined in the selected profile.
The server desktop appears. If the profile is configured to start a published application, this application appears in the window, when
started.
After the connection is established, the Connect button is disabled and Log off is enabled in the ICA Macro Script Recorder dialog.
6. For example, to record a session containing some steps in Notepad, click Start on the desktop and select the Notepad application.
The Notepad application appears.
7. Click Record to start the recording and start performing the steps.
After you click Record, recording starts and the Stop button is enabled. Also, the Record, Play step, and Clear buttons are disabled.
If you want mouse move events to be included in the macro script, you can select the Record mouse move events checkbox. The
mouse events create many commands.
The recorded commands are listed at the bottom of the application window.
8. After the required session has been recorded, click Stop in the dialog.
The macro recording stops.
The Record, Play step, and Clear buttons are enabled again. Click Log off to log off the ICA server and disconnect.
9. Click Save and Exit to save the macro script and exit from the macro recording functionality.
The Save macro script dialog appears which allows you to save the macro script.
Click Play step to play the steps selected from the commands list at the bottom of the application window.
Click Clear to clear all recorded commands.
Click Exit to exit the macro recorder without saving the macro script.
Click Start --> Log off --> Disconnect, in the desktop that appears in the window, to disconnect from the ICA server.
10. Enter the name of the recording in the File box and click OK.
The macro script gets saved at the selected location with .rec extension.
Note: This location appears in the Filename text box in the Application tab of the Profiles tab.
Notes:
Select New before to add a new entry before the selected line.
Select New after to add a new entry after the selected line.
The command line format is: Elapsed time;Command;Arg1;Arg2;Arg.. See the section Macro Functions Overview for a
description of the commands and arguments available.
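The semicolon-separated command format described above can be illustrated with a small parser. This is an illustrative sketch, not part of the probe; the function name parse_macro_line is our own.

```python
def parse_macro_line(line):
    # Parses one macro script line of the form
    # "Elapsed time;Command;Arg1;Arg2;..." into (elapsed, command, args).
    parts = line.strip().split(";")
    elapsed, command, args = int(parts[0]), parts[1], parts[2:]
    return elapsed, command, args

# A line from the Connect example: 0;Connect;800;600
elapsed, command, args = parse_macro_line("0;Connect;800;600")
```

Commands such as Disconnect take no arguments, so the argument list is simply empty for those lines.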
General Tab
Profiles Tab
General Tab
Application Tab
Alarms Tab
Client Settings Tab
Misc. Tab
The Message Pool Tab
The Status Tab
Macro Functions Overview
General Tab
Note: The ica_response_poll.exe is a child exe for each profile that is created by the user. For example, if the user creates four profiles, then there will be one ica_response.exe file in the probe installation directory and four ica_response_poll.exe files in that directory.
Alarms
Maximum concurrent sessions reached
Select the alarm message to be issued when the probe exceeds the maximum number of concurrent sessions. You can also select
the Clear message to be issued when the number of concurrent sessions is below the maximum threshold.
Profile already running
Select the alarm message to be issued if the probe attempts to run a profile that is already running. This happens when the profile
runs a long session, and the check interval instructs the probe to run the profile again before the previous session is finished.
Also select the Clear message to be issued when the alarm situation is cleared.
Maximum concurrent sessions
Specifies the maximum number of sessions (profiles) allowed running simultaneously.
Log Level
Sets the level of details written to the log file. Log as little as possible during normal operation, to minimize disk consumption.
Profiles Tab
This tab lists all the configured profiles and is used to add, modify, copy, or delete a profile. When you add or modify a profile, the Profile
Properties dialog is displayed with tabs - General, Application, Alarms, Client Settings, and Miscellaneous where you need to enter/select
the required information associated with the profile. Each entry in the Profiles list defines a monitoring profile for one Citrix ICA server logon/logoff
connection. The check boxes beside the active profiles are shown as selected. In order to define a valid profile, login user/password credentials
are mandatory. The password is encrypted in the probe configuration file.
Note: You must use different login users for different profiles defined to avoid conflicts if unexpected disconnects occur.
The following commands are available when you right-click in the profile pane:
New
Enables you to create a new profile by displaying the ICA Profile Properties dialog.
Edit
Enables you to edit profile properties for the selected profile by displaying the ICA Profile Properties dialog.
Copy
Makes a copy of the selected profile. The Profile Properties dialog appears with all the properties copied from the selected profile.
Rename the copied profile and click the OK button.
Delete
Deletes the selected profile. A confirmation dialog is displayed for the deletion.
When you select the New or Edit option as explained above, the ICA Profile Properties dialog appears. This dialog has five tabs: General, Application, Alarms, Client Settings, and Misc. All these tabs are explained in the subsequent topics.
General Tab
Authentication
Username
Specifies user credentials for communication between the client and the Citrix ICA server. You must specify a valid user, password, and domain. The probe uses the users configured in the profiles to log in to the Citrix server.
The user can be a local user on the Citrix server or domain user on the Citrix server.
The users must be part of the Remote Desktop Users group on the Citrix server.
The users must be part of the Terminal Server Computers group on the Citrix server.
The user must be able to publish the application manually.
Domain
Specifies the domain to use in ICA login.
Password
Defines the password to be used in ICA login. Note that this password must also be entered in the Confirm password field.
Check Interval
Specifies the time interval between each response check. Do not overload the Citrix ICA server by checking too often. A minimum
time interval of 15 minutes is recommended.
Alarms Tab
This tab lists the different alarm situations. You can define the Response properties (for alarm situations related to response time) or Event
properties (for the other alarm situations) for the selected alarm situation.
This dialog contains the following fields:
Response / Event properties
Response properties
Enables you to set the response properties, after selecting an alarm situation related to response time. You can specify/edit the
threshold for the selected alarm situation by right-clicking in the Thresholds window.
The New Threshold dialog lets you select an operand and a threshold value, an alarm message to be issued if the threshold is
breached and a severity level for the alarm message.
You can also select a clear message (and severity level) to be issued when the threshold value is no longer breached.
Event properties
When you select an alarm situation related to events other than response time, you can set the event properties.
Select an Alarm message to be issued if the selected error situation occurs and a severity level for the alarm message.
Further, select a Clear message (and severity level) to be issued when the alarm situation is cleared.
Misc. Tab
This tab allows you to select which QoS messages to send and timeout values for the different tasks performed by the probe.
The fields are explained below:
Send QoS
Check the QoS values you want to be sent:
Connect time
Logoff time
ICA ping time
Session time
Startup publish application time
Logon time
Run macro script time
Total profile time
Measures time taken for the total operations for the profile.
Gives NULL if:
Connect fails, Logon fails, Macro fails, Published application fails, ICA file fails, Logoff fails, or Session exit fails.
Timeout
Select the timeout values for different tasks performed by the probe. These are:
Connect
Logoff
Session
Publish application
Logon
Macro script
Miscellaneous
Logoff delay
Specify a logoff delay. This is the time the probe waits after the application start command has been executed before starting the logoff session.
If no application is to be started, the delay is the time from successful login completion until the logoff session starts.
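The NULL behavior of the Total profile time QoS described above can be sketched as a simple aggregation. This is our own illustration, assuming the total is the sum of the per-phase timings; the function name and dictionary layout are not part of the probe.

```python
def total_profile_time(phase_times):
    # phase_times: elapsed seconds per phase (connect, logon, macro,
    # published application, ICA file, logoff, session exit);
    # None marks a failed phase.
    if any(t is None for t in phase_times.values()):
        return None  # the QoS value is NULL when any phase fails
    return sum(phase_times.values())
```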
The Message Pool Tab
The Message Pool lists all available alarm messages. You can also add, modify, or delete alarm messages.
This list displays the add, edit and delete options on right-clicking. The following screen appears on clicking the New or Edit option.
The fields are explained as follows:
Name
Specifies the identification name of the alarm message. This name will appear in the pull-down list when selecting an alarm message on
the Alarms tab in the Profiles dialog.
Text
Specifies the alarm message text.
Typing $ in this field lists all valid variables.
Subsystem
Specifies the ID of the subsystem that is the source of this alarm. This ID is managed by the nas.
Severity
Specifies the severity of the alarm (clear, information, warning, minor, major or critical).
The Status Tab
The graph displays the total session time for the selected profile. All active profiles are listed below the graph.
Macro Functions Overview
The command line format is: Elapsed time;Command;Arg1;Arg2;Arg..
Connect
Arguments
1. Horizontal resolution.
2. Vertical resolution.
Example: 0;Connect;800;600
Disconnect
No arguments
KeyDown
Arguments: see KeySend
KeySend
Arguments
1. Keynumber
Example: 0;KeySend;65
Special key codes:
Shift = 16
Ctrl = 17
Alt = 18
AltGr = 17, 18 (Ctrl + Alt)
Caps Lock = 20
Samples:
Send Abc:
0;KeyDown;16    // Press shift
100;KeySend;65  // a (Down and up)
200;KeyUp;16    // Release shift
300;KeySend;66  // b (Down and up)
400;KeySend;67  // c (Down and up)
0;KeyDown;16    // Press shift
100;KeySend;49  // 2
200;KeyUp;16    // Release shift
KeyUp
Arguments: see KeySend
LogOff
No arguments.
MouseDown
Arguments:
1. Button id
2. Modifiers
3. X Position
4. Y Position
Button IDs: 1 - Left, 2 - Right, 4 - Middle.
Modifier = 0.
Example: 0;MouseDown;1;0;200;400
MouseMove
Arguments: see MouseDown
MouseUp
Arguments: see MouseDown
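The KeyDown/KeySend/KeyUp pattern in the Send Abc sample above can be generated programmatically. This is an illustrative sketch (the helper macro_for_text is our own, not part of the probe); it handles ASCII letters only, uses the Shift code 16 from the special key codes table, and spaces events 100 ms apart as in the sample.

```python
SHIFT = 16  # special key code for Shift, from the table above

def macro_for_text(text, step_ms=100):
    # Builds macro script lines that type `text`, following the
    # "Send Abc" sample: hold Shift around each uppercase letter.
    lines, t = [], 0
    for ch in text:
        if not ch.isalpha():
            continue  # sketch: ASCII letters only
        code = ord(ch.upper())  # KeySend takes the unshifted key number
        if ch.isupper():
            lines.append(f"{t};KeyDown;{SHIFT}")
            t += step_ms
        lines.append(f"{t};KeySend;{code}")
        t += step_ms
        if ch.isupper():
            lines.append(f"{t};KeyUp;{SHIFT}")
            t += step_ms
    return lines

# Reproduces the "Send Abc" sample line for line
print("\n".join(macro_for_text("Abc")))
```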
Contents
Verify Prerequisites
Create a Monitoring Profile
Activate Monitors
Record a Macro
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ica_response (Citrix
Client Response Monitoring) Release Notes.
Verify that you have installed any one of the following Citrix Client software on the host system.
Citrix Receiver
Citrix Online Plug-in
Web Client Plug-in
Determine the user credentials to access the Citrix server.
Verify that PPM 2.80 or later is running on the primary hub.
Verify that you made the required registry changes.
Verify that you have changed the citrixfilepath variable value in the CFG file to mention the correct installed path of the Citrix receiver.
Perform Registry Changes
The ica_response probe requires certain registry changes on the host system (where the probe is installed) to establish connection with the Citrix
Server.
Note: These registry changes are mandatory for the ICA functionality to work properly.
1.
recommended because these registry settings are used by the probe.
2. Install the appropriate Citrix client software on the computer where the probe is deployed.
3. Click Start, type regedit in the Search programs and files text box and press ENTER.
4. Select regedit.exe from the search results.
The Registry Editor window appears.
5. Open the hierarchy for HKEY_LOCAL_MACHINE as HKEY_LOCAL_MACHINE --> SOFTWARE --> Citrix --> ICA Client.
Note: For 64-bit, the hierarchy is HKEY_LOCAL_MACHINE --> SOFTWARE --> Wow6432Node -->Citrix --> ICA Client
6. Right-click on the ICA Client folder and select the New --> Key option.
A new folder gets created as a child of ICA Client parent folder.
7. Rename the new folder as CCM, right-click on this folder and select the NEW --> DWORD (32-bit) Value option.
A new attribute is created in the right-side pane of the Registry Editor window.
8. Rename the new attribute as AllowSimulationAPI.
9. Double-click the AllowSimulationAPI.
The Edit DWORD (32-bit) Value dialog appears.
10. Define 1 in the Value data field and click OK.
11. Restart your system.
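Steps 5 through 10 amount to creating one registry key and one DWORD value. As a sketch, the same change can be captured in a standard Windows .reg file (on 64-bit systems, use the Wow6432Node path from the note above instead):

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\ICA Client\CCM]
"AllowSimulationAPI"=dword:00000001
```

Importing such a file has the same effect as the manual Registry Editor steps; a restart is still required afterward.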
Change the citrixfilepath Variable Value
You must change the citrixfilepath variable value in the CFG file, using the raw configuration option, to specify the correct installation path of the Citrix receiver.
Follow these steps:
1. Change the value of the citrixfilepath variable, in the CFG file, to the path where the Citrix client software is installed. For example, if it is installed on the C:\ drive, then for:
32-bit Windows:
The value of the citrixfilepath variable is C:\Program Files\Citrix\ICA Client\.
64-bit Windows:
The value of the citrixfilepath variable is C:\Program Files (x86)\Citrix\ICA Client\.
Note: The paths must end with "\".
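As a sketch, the resulting entry in the probe's CFG file might look as follows on 64-bit Windows. The citrixfilepath key is from the text above; the <setup> section name is an assumption about where the key lives in this particular CFG file.

```
<setup>
   citrixfilepath = C:\Program Files (x86)\Citrix\ICA Client\
</setup>
```

Note that the path ends with a backslash, as the probe requires.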
5. Specify values in the following fields:
Active: activates the monitoring profile.
Profile Name: defines a unique name for the profile.
Connection: specifies the connection type of the profile, that is, whether the profile connects to the Citrix server or a published application.
Address: defines the IP address of the Citrix server for the profile to establish connection and capture the monitoring data. This field specifies the name of the application if the Published Application option is selected in the Connection drop-down.
Browser Address: defines the web URL or the IP address of the Citrix server for accessing the hosted application. This option does not allow you to define the application separately in the Application Configuration section and enable QoS data for the Startup Publish Application Time monitor.
Network Protocol: specifies the network protocol for locating the Citrix server or application.
Username: defines the user name that has appropriate access to the Citrix Server.
Password: defines the password for authenticating the Citrix server user.
Domain: defines the domain name of the Citrix server.
Check Interval: defines the time frequency for executing the monitoring profile. Do not overload the Citrix ICA server by checking too
often. A minimum time interval of 15 minutes is recommended.
Check Interval Unit: specifies the measurement unit of Check Interval field value.
6. Click Submit.
The profile appears as a separate node under the ica_response node in the navigation pane.
7. Click Save to update the probe configuration file with the new profile details.
The profile details are saved to probe configuration file and a Monitors node appears under the profile name node in the navigation
pane.
Activate Monitors
After creating a monitoring profile, you must activate the required monitors to fetch monitoring data. All these monitors allow you to generate alarms and QoS data when the specified thresholds are breached.
Follow these steps:
1. Navigate to the Monitors node under the profile name node.
2. Select the desired monitor from the Monitors section.
The related fields of the monitors are displayed below the Monitors section.
3. Select the Publish Alarms option and configure the alarm fields: Threshold Operator, Threshold Value, and Alarm Message.
Note: The Total Profile Time monitor only generates QoS messages. The probe does not generate alarms even if Publish
Alarms is selected.
4. Select the Publish Data option, if available, for generating QoS data for the monitor.
5. Click Save to apply these changes.
The probe restarts and reloads the probe GUI. The monitoring profile fetches the monitoring data of the configured monitors for generating alarms
and QoS.
Note: The probe takes considerable time to save and reload the configuration, so avoid saving the configuration frequently. The recommendation is to configure all necessary monitors at once and then save the configuration.
Record a Macro
Macro recording is a functionality that records user actions for playback at a later time. In ica_response, a conf_ica_macro_recording application is provided with the probe for using the macro recording functionality. This application is available in the location where the probe is deployed on your system.
You must first connect to the ICA server that you have defined in the selected profile. Once connected, you can open the conf_ica_macro_recording application and record the steps that execute certain functionality. These steps can be saved for future reference. To enable the macro recording functionality in the probe, you must enable the Run macro script option in the Macro Configuration section under the Profile Name node.
Create a Macro Script
Select the Run macro script option in the Macro Configuration section to start using the macro recording functionality to create and save a macro
script.
You can either:
Create a new macro script using Macro recorder, or
Browse and locate the saved recording from the File option.
Click Macro recorder to perform other functionalities (playing the recording, modifying commands, and so on).
Follow these steps to create a new macro script:
1. Select the Run macro script check box in the Macro Configuration section under the Profile Name node.
2. Locate and run the conf_ica_macro_recording application.
The Macro Recorder dialog appears.
3. Click Macro Recorder.
The ICA Macro Script Recorder dialog appears.
a. Click Connect in the above dialog or click inside the empty window, marked Click here to connect to server, to connect to the ICA server defined in the selected profile.
The server desktop appears in the window. If the profile is configured to start a published application, the application appears in the
window when started.
After the connection is established, the Connect button is disabled and Log off is enabled in the ICA Macro Script Recorder dialog.
b. For example, to record a session containing some steps in Notepad, click Start on the desktop that appears in the window and select the Notepad application.
The Notepad application starts in the window.
c. Click Record to start the recording and start performing the steps.
After you click Record, recording starts and the Stop button is enabled. Also, the Record, Play step, and Clear buttons are disabled.
If you want mouse move events to be included in the macro script, you can select the Record mouse move events checkbox. The
mouse events create many commands.
The recorded commands are listed at the bottom of the application window.
d. After the required session has been recorded, click Stop in the dialog.
The macro recording stops.
Note: The Record, Play step, and Clear buttons are enabled again. Click Log off to log off the ICA server and disconnect.
e. Click Save and Exit to save the macro script and exit from the macro recording functionality. The Save macro script dialog appears
which allows you to save the macro script.
Click Play step to play the steps selected from the commands list at the bottom of the application window.
Click Clear to clear all recorded commands.
Click Exit to exit the macro recorder without saving the macro script.
Click Start --> Log off --> Disconnect, in the desktop that appears in the window, to disconnect from the ICA server.
f. Enter the name of the recording in the File box and click OK. The macro script gets saved at the selected location with .rec extension.
This location now appears in the Filename text box in the Application tab of the Profiles tab, when you click the ellipsis (...) button.
Edit a Macro Script
To edit a macro script, you must modify the commands listed at the bottom of the application window, while recording the macro script. When you
right-click a command line in the macro script, four options are displayed - New before, New after, Edit and Delete. Select the Edit option to
modify the macro script.
Follow these steps:
1. Right-click the command that you want to modify and select the Edit option.
The Edit macro script line dialog appears.
Probe Interface
ica_response Node
<Profile Name> Node
Monitors Node
Probe Interface
The probe interface is divided into a navigation pane and a details pane. The navigation pane contains a hierarchical representation of the probe
inventory which includes monitoring targets and configurable elements. The details pane usually contains information based on your selection in
the navigation pane.
ica_response Node
The ica_response node is used to configure the general settings of the Citrix Client Response Monitoring probe, which are applicable to all
monitoring profiles of the probe.
Navigation: ica_response
Set or modify the following values as needed:
ica_response > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
ica_response > General Configuration
This section lets you configure the default log level and other probe-related alarms.
Log Level: specifies the level of details that are written to the log file.
Default: 0 - Fatal
Username: defines a user name with administrator rights on the host system where the probe is installed. This user can be a local user or a domain user and must have the appropriate rights to launch ica_response_poll.exe. If you do not enter any value in these fields, the exe is launched using the system account.
Password: defines the password of the corresponding user name.
<Profile Name> Node
The profile name node defines the monitoring parameters and configures appropriate QoS data and alarms for each profile.
Navigation: ica_response > profile name
Set or modify the following values, as needed:
profile name > General Profile Configuration
This section defines all the properties of the monitoring profile.
Active: activates the monitoring profile.
Default: Not selected.
Profile Name: defines a unique name of the profile.
Connection: specifies the connection type of the profile, that is, whether the profile connects to the Citrix server or the published application.
Default: Server
Address: defines the IP address of the Citrix server for the profile to establish a connection and capture the monitoring data. This field defines the name of the application if Published Application is selected in the Connection drop-down.
Browser Address: defines the web URL of the Citrix server for accessing the hosted application. This option disables the options for defining the application separately in the Application Configuration section and enabling QoS data for the Startup Publish Application Time monitor.
Network Protocol: specifies the network protocol for locating the Citrix server or application.
Default: HTTPonTCP
Username: defines the user name that has appropriate access to the Citrix Server.
Password: defines the password for authenticating the Citrix Server user.
Domain: defines the domain name of the Citrix server.
Check Interval: defines the time frequency for executing the monitoring profile.
Default: 60 Min
Check Interval Unit: defines the check interval unit.
Default: Hours
profile name > Application Configuration
This section defines the application details for monitoring application startup time during server monitoring.
Start Published Application: enables the monitoring profile to execute an application in each monitoring interval.
Application Name: defines the application that the monitoring profile executes to monitor the application startup time.
Arguments: defines any arguments that are required for the application startup.
Logoff Delay: defines the waiting time before logging off the monitoring session from the Citrix server after completing the monitoring
activities.
Default: 0
profile name > Macro Configuration
This section is used for running the Macro Recorder script on the Citrix ICA server.
Run Macro Script: enables you to run the Macro script on the Citrix ICA server.
File: enables you to locate the Macro script to be run on the Citrix ICA server.
Start Point: enables you to select the point in the test sequence at which the macro is started. The options are: Before connect, After connect, and After login.
profile name > Client Settings
This section is used for configuring an ICA file, which contains settings for connecting to the Citrix server, user authentication details, and application configuration details. This section is optional.
Use ICA File: enables the probe to fetch the necessary Citrix server connection settings from the ICA file.
Default: Not selected
Filename: defines the ICA file path. You can use the Browse button for navigating to the ICA file.
Override Address Setting in ICA file: overrides the Citrix server address details in the ICA file with the General Profile Configuration section details.
Override Authentication Setting in ICA file: overrides the user authentication details in the ICA file with the General Profile
Configuration section details.
Override Application Setting in ICA file: overrides the application configuration details in the ICA file with the Application Configuration
section details.
Encryption Level: defines the encryption level for communication between the probe and the Citrix server.
Default: None
Monitors Node
The Monitors node lets you configure QoS data and alarms for the monitoring profile. This node contains a table displaying the list of all monitors.
You can select one or more monitors from the list. Then you can activate QoS data and configure the alarm thresholds and appropriate error alarm messages for the selected monitors.
Navigation: ica_response > profile name > Monitors
Notes:
Though the Publish Alarms option is visible for the Total Profile Time monitor, only QoS is generated for this monitor.
If the alarms are ON against any QoS, then these alarms are generated based on both Threshold Value fields. Thus, it is
mandatory to provide values for both the thresholds to generate alarms.
Contents
Verify Prerequisites
Create Profile
Record a Macro
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see ica_response (Citrix
Client Response Monitoring) Release Notes.
Verify that you have installed any one of the following Citrix Client software on the host system:
Citrix Receiver
Citrix Online Plug-in
Web Client Plug-in
Determine the user credentials to access the Citrix server.
Verify that you made the required registry changes.
Verify that you have changed the citrixfilepath variable value in the CFG file to mention the correct installed path of the Citrix receiver.
Perform Registry Changes
The ica_response probe requires certain registry changes on the host system (where the probe is installed) to establish connection with the Citrix
Server.
Important! These registry changes are mandatory for the ICA functionality to work properly.
Note: For 64-bit, the hierarchy is HKEY_LOCAL_MACHINE --> SOFTWARE --> Wow6432Node -->Citrix --> ICA Client
6. Right-click on the ICA Client folder and select the New --> Key option.
A new folder gets created as a child of ICA Client parent folder.
7. Rename the new folder as CCM, right-click on this folder and select the NEW --> DWORD (32-bit) Value option.
A new attribute is created in the right-side pane of the Registry Editor window.
8. Rename the new attribute as AllowSimulationAPI.
9. Double-click the AllowSimulationAPI.
The Edit DWORD (32-bit) Value dialog appears.
10. Define 1 in the Value data field and click OK.
11. Restart your system.
Change the citrixfilepath Variable Value
You must change the citrixfilepath variable value in the CFG file, using the raw configuration option, to specify the correct installation path of the Citrix receiver.
Follow these steps:
1. Change the value of the citrixfilepath variable in the CFG file to the path where the Citrix client software is installed. For example, if it is
installed on the C:\ drive, then for:
32-bit Windows:
The value for the citrixfilepath variable is C:\Program Files\Citrix\ICA Client\.
64-bit Windows:
The value for the citrixfilepath variable is C:\Program Files (x86)\Citrix\ICA Client\.
Note: The paths must end with "\".
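As a sketch, the resulting raw-configure entry might look like the following in the probe CFG file. The <setup> section name is an assumption; place the key in whichever section already contains citrixfilepath:

```
<setup>
   citrixfilepath = C:\Program Files (x86)\Citrix\ICA Client\
</setup>
```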
Create Profile
This section describes the process to create a monitoring profile. You can create one or more monitoring profiles, each monitoring a specific
Citrix server and catering to different monitoring requirements. For example, one monitoring profile executes application A and monitors the
response time, while another monitoring profile executes a macro and monitors the end-user experience of application B.
Follow these steps:
1. Open the probe configuration interface.
The contents of the Profiles tab are displayed, by default.
2. Right-click in the profile pane, and click New to create a new profile.
The ICA Profile Properties dialog appears. Specify the values in the following fields:
Name: the name of the profile.
Connection: authentication details such as server IP for communication between the client and Citrix ICA server.
Check Interval: time interval between each response check. Do not overload the Citrix ICA server by checking too often. A minimum
time interval of 15 minutes is recommended.
3. Click the Alarms Tab to list the different alarm situations.
Define the Response properties (for alarm situations related to response time) or Event properties (for the other alarm situations) for
the selected alarm situation.
4. Click the Misc. tab. Select the QoS messages that you want to send and the timeout values for the different tasks performed by the
probe.
5. Click OK.
The new monitoring profile is created.
Record a Macro
Macro recording is a functionality that records user actions for playback at a later time.
First connect to the ICA server defined in the selected profile by clicking Connect or by clicking Click here to connect to server. Once
connected, you can open the required application and record the steps that execute certain functionality. These steps can be saved for future
reference. To enable the macro recording functionality in the ica_response probe, enable the Run macro script option under the Application
tab of the Profiles tab. This enables the fields in this section, and you can start using macro recording to create and save a macro script.
To edit a macro script, modify the commands listed at the bottom of the application window while recording the macro script. When
you right-click a command line in the macro script, four options are displayed: New before, New after, Edit, and Delete.
Select the Edit option to modify the macro script.
Follow these steps:
1. Right-click the command that you want to modify and select the Edit option.
The Edit macro script line dialog appears.
2. Perform the required modifications and click OK.
The selected macro script is now modified.
Similarly, you can perform the other functionalities - Delete, New before, and New after.
Notes:
Select New before to add a new entry before the selected line.
Select New after to add a new entry after the selected line.
The command line format is: Elapsed time;Command;Arg1;Arg2;Arg.. See the section Macro Functions Overview for a
description of the commands and arguments available.
Contents
General Tab
Profiles Tab
General Tab
Application Tab
Alarms Tab
Client Settings Tab
Misc. Tab
The Message Pool Tab
The Status Tab
Macro Functions Overview
General Tab
Note: ica_response_poll.exe is a child executable spawned for each profile that the user creates. For example, if the user creates four
profiles, there will be one ica_response.exe file in the probe installation directory and four ica_response_poll.exe files in
that directory.
Alarms
Maximum concurrent sessions reached
Select the alarm message to be issued when the probe exceeds the maximum number of concurrent sessions. You can also select
the Clear message to be issued when the number of concurrent sessions is below the maximum threshold.
Profile already running
Select the alarm message to be issued if the probe attempts to run a profile that is already running. This happens when the profile
runs a long session, and the check interval instructs the probe to run the profile again before the previous session is finished.
Also select the Clear message to be issued when the alarm situation is cleared.
Maximum concurrent sessions
Specifies the maximum number of sessions (profiles) allowed running simultaneously.
Log Level
Sets the level of details written to the log file. Log as little as possible during normal operation, to minimize disk consumption.
Profiles Tab
This tab lists all the configured profiles and is used to add, modify, copy, or delete a profile. When you add or modify a profile, the Profile
Properties dialog is displayed with tabs - General, Application, Alarms, Client Settings, and Miscellaneous where you need to enter/select
the required information associated with the profile. Each entry in the Profiles list defines a monitoring profile for one Citrix ICA server logon/logoff
connection. The check boxes beside the active profiles are shown as selected. In order to define a valid profile, login user/password credentials
are mandatory. The password is encrypted in the probe configuration file.
Note: Use a different login user for each profile to avoid conflicts if unexpected disconnects occur.
The following commands are available when you right-click in the profile pane:
New
Enables you to create a new profile by displaying the ICA Profile Properties dialog.
Edit
Enables you to edit profile properties for the selected profile by displaying the ICA Profile Properties dialog
Copy
Makes a copy of the selected profile. The Profile Properties dialog appears with all the properties copied from the selected profile.
Rename the copied profile and click the OK button.
Delete
Deletes the selected profile. A confirmation dialog is displayed for the deletion.
When you select the New or Edit option as explained above, the ICA Profile Properties dialog appears. This dialog has five tabs: General,
Application, Alarms, Client Settings, and Misc. All these tabs are explained in the subsequent topics.
General Tab
This tab is used to specify:
the name or IP address of the Citrix ICA Server
authentication details for communication between the client and Citrix ICA server
time interval between each response check
The General tab contains the following fields:
Name
Specifies a unique profile name.
Connection
Server
Specifies the name or IP address of the Citrix ICA Server.
Published application
Specifies the name of the published application (as it is published on the ICA Server) to be launched through this profile. This field
is optional.
If you use this option:
The probe uses the Browser address (see below) and searches for servers running the specified published application. The
Published application fields on the Application tab will be disabled.
You cannot measure Startup publish application time (this option will be disabled on the Misc tab).
Browser addr.
Defines the address the ICA Client object uses to locate the application or server.
Network protocol
Defines the protocol the ICA Client object uses to locate the application or server.
Valid protocols are:
HTTPonTCP
IPX
NetBIOS
SPX
UDP
Default: HTTPonTCP
Authentication
Username
Specifies user credentials for communication between the client and the Citrix ICA server. You must specify a valid user, password,
and domain. The probe uses the users configured in the profiles to log in to the Citrix server.
The user can be a local user or a domain user on the Citrix server.
The user must be part of the Remote Desktop Users group on the Citrix server.
The user must be part of the Terminal Server Computers group on the Citrix server.
The user must be able to publish the application manually.
Domain
Specifies the domain to use in ICA login.
Password
Defines the password to be used in ICA login. Note that this password must also be entered in the Confirm password field.
Check interval
Check Interval
Specifies the time interval between each response check. Do not overload the Citrix ICA server by checking too often. A minimum
time interval of 15 minutes is recommended.
Application Tab
This tab is used to:
enter the published application that should start immediately after successful login, and / or
select a script to run on the Citrix ICA server.
This tab contains the following fields:
Start published application
Selecting this option enables the Published application fields for input. Thus, you can select a published application that should start
immediately after the login.
Published application
Application
Defines the name of the published application that should start immediately after the login.
Arguments
Provides the argument list for the published application, if any.
Run macro script
Selecting this option enables the Macro script fields for input. Thus, you can select a script to be run on the Citrix ICA Server.
Macro script
Filename
The name of the script to be run. Either specify the name of the script with full path, or use the browse button
to locate the script. Click the Macro recorder button to create your own script.
Start point
Use this drop-down menu to select at which point in the test sequence the macro is started.
The options are:
Before connect
After connect
After login
ICA Client Object version
The probe detects and displays the ICA Client Object version running on the client computer.
Alarms Tab
This tab lists the different alarm situations. You can define the Response properties (for alarm situations related to response time) or Event
properties (for the other alarm situations) for the selected alarm situation.
This dialog contains the following fields:
Response / Event properties
Response properties
Enables you to set the response properties, after selecting an alarm situation related to response time. You can specify/edit the
threshold for the selected alarm situation by right-clicking in the Thresholds window.
The New Threshold dialog lets you select an operand and a threshold value, an alarm message to be issued if the threshold is
breached and a severity level for the alarm message.
You can also select a clear message (and severity level) to be issued when the threshold value is no longer breached.
Event properties
When you select an alarm situation related to events other than response time, you can set the event properties.
Select an Alarm message to be issued if the selected error situation occurs and a severity level for the alarm message.
Further, select a Clear message (and severity level) to be issued when the alarm situation is cleared.
Client Settings Tab
This tab is used to select whether to use an ICA file or not.
The fields are explained below:
Use ICA file
Select this option to use configuration settings other than those specified in the ica_response probe configuration file. If this
option is not selected, all other fields are grayed out and inactive. If selected, you can:
Click Browse to browse for the ICA file you want to use.
You can also select to override certain sections of the ICA file specified:
The address settings.
The authentication settings.
The application settings.
Encryption level
Sets the encryption level for the communication between the probe and the server.
Valid options are:
None
Basic
128 bits login only (only login session encrypted)
40 bits
56 bits
128 bits
Misc. Tab
This tab allows you to select which QoS messages to send and timeout values for the different tasks performed by the probe.
The fields are explained below:
Send QoS
Check the QoS values you want to be sent:
Connect time
Logoff time
ICA ping time
Session time
Startup publish application time
Logon time
Run macro script time
Total profile time
Measures the total time taken for all operations in the profile.
Gives NULL if:
Connect fails, Logon fails, Macro fails, Published application fails, ICA file fails, Logoff fails, or Session exit fails.
Timeout
Select the timeout values for different tasks performed by the probe. These are:
Connect
Logoff
Session
Publish application
Logon
Macro script
Miscellaneous
Logoff delay
Specify a logoff delay. This is the time the probe waits after the application start command is executed before starting the logoff
session.
If no application is to be started, the delay is the time from successful login until the logoff session starts.
The Message Pool Tab
The Message Pool lists all available alarm messages. You can also add, modify, or delete alarm messages.
Right-click in the list to display the New, Edit, and Delete options. A properties dialog appears when you click the New or Edit option.
The fields are explained as follows:
Name
Specifies the identification name of the alarm message. This name will appear in the pull-down list when selecting an alarm message on
the Alarms tab in the Profiles dialog.
Text
Specifies the alarm message text.
Typing $ in this field lists all valid variables.
Subsystem
Specifies the ID of the subsystem that is the source of this alarm. This ID is managed by the nas probe.
Severity
Specifies the severity of the alarm (clear, information, warning, minor, major or critical).
The Status Tab
The graph displays the total session time for the selected profile. All active profiles are listed below the graph.
Macro Functions Overview
This section lists all macro functions that can be used by the macro recorder.
Command line format:
Elapsed time;Command;Arg1;Arg2;Arg..
Connect
Arguments
1. Horizontal resolution.
2. Vertical resolution.
Example: 0;Connect;800;600
Disconnect
No arguments
KeyDown
Examples:
0;KeyDown;16   // Press shift
100;KeySend;65 // a (Down and up)
200;KeyUp;16   // Release shift
300;KeySend;66 // b (Down and up)
400;KeySend;67 // c (Down and up)

0;KeyDown;16   // Press shift
100;KeySend;49 // 2
200;KeyUp;16   // Release shift
KeyUp
Arguments: see KeySend
LogOff
No arguments.
MouseDown
Arguments:
1. Button id
2. Modifiers
3. X Position
4. Y Position
Button IDs: 1 - Left, 2 - Right, 4 - Middle.
Modifier = 0.
Example: 0;MouseDown;1;0;200;400
MouseMove
Arguments: see MouseDown
MouseUp
Arguments: see MouseDown
Pause
Arguments:
1. Delay in milliseconds.
Example: 0;Pause;1000
WindowActivate
Arguments:
1. Window title
Example: 0;WindowActivate;Calculator
WindowLocation
Arguments:
1. X position
2. Y position
3. Window title
Example: 0;WindowLocation;200;100;Calculator
WindowSize
Arguments:
1. Window width
2. Window height
3. Window title
Example: 0;WindowSize;300;200;Calculator
WindowWait
Arguments:
1. TimeOut in milliseconds.
2. Window title
Example: 0;WindowWait;10000;Calculator
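The macro command-line format described above (Elapsed time;Command;Arg1;Arg2;...) is easy to validate outside the probe. The following Python sketch, which is illustrative and not part of the product, parses macro script lines into structured records:

```python
from dataclasses import dataclass

@dataclass
class MacroLine:
    elapsed_ms: int  # elapsed time before the command runs
    command: str     # e.g. Connect, KeySend, MouseDown
    args: list       # remaining semicolon-separated arguments

def parse_macro_line(line: str) -> MacroLine:
    """Parse one 'Elapsed time;Command;Arg1;Arg2;...' macro line."""
    parts = line.strip().split(";")
    if len(parts) < 2:
        raise ValueError(f"malformed macro line: {line!r}")
    return MacroLine(elapsed_ms=int(parts[0]), command=parts[1],
                     args=parts[2:])

# Example from the Connect function above:
rec = parse_macro_line("0;Connect;800;600")
print(rec.command, rec.args)  # Connect ['800', '600']
```

A check like this can catch malformed lines before a profile runs the macro against the server.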
ica_response Metrics
This section describes the metrics that can be configured for the Citrix Client Response Monitoring (ica_response) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the ica_response probe.
Monitor Name | Units | Description | Version
QOS_ICA_CONNECT | Milliseconds | | v3.0
QOS_ICA_LOGON | Milliseconds | | v3.0
QOS_ICA_MACRO | Milliseconds | | v3.0
QOS_ICA_LOGOFF | Milliseconds | | v3.0
QOS_ICA_PING | Milliseconds | | v3.0
QOS_ICA_APPLICATION | Milliseconds | | v3.0
QOS_ICA_TOTAL | Milliseconds | | v3.0
QOS_ICA_SESSION | Milliseconds | Monitor session from start connect to disconnect again from the server. | v3.0
Alert Metrics Default Settings

Alarm Metric | Warning Threshold | Warning Severity | Error Threshold | Error Severity | Description
ConnectFailed | None | None | None | Critical |
ConnectionDisconnected | None | None | None | Critical | Connection disconnected
ConnectionTimeout | None | None | None | Critical |
ConnectResponseHigh | None | None | None | Critical |
IcaFileFailed | None | None | None | Critical |
LogoffFailed | None | None | None | Critical | Logoff failed
LogoffResponseHigh | None | None | None | Critical |
LogonFailed | None | None | None | Critical | Logon failed
LogonResponseHigh | None | None | None | Critical |
LogonTimeout | None | None | None | Critical |
Macro Failed | None | None | None | Critical |
 | None | None | None | Critical |
MaxSessionsReached | None | None | None | Critical |
ProfileAlreadyRunning | None | None | None | Critical |
PublishedAppFailed | None | None | None | Critical |
PublishedAppResponseHigh | None | None | None | Critical |
PublishedAppTimeOut | None | None | None | Critical |
SessionResponseHigh | None | None | None | Critical |
SessionTimeOut | None | None | None | Critical |
The echo response returns an ICMP message from the echo request when the echo request does not encounter any network issues. If an error
condition is encountered, such as the router identified in the echo request being unreachable, the echo response returns with an ICMP error in
the packet. If ping has been disabled on a device, the icmp probe generates an unreachable QoS message.
The icmp probe generates QoS messages based on the following response data:
Average Response Time (Default Monitor)
Maximum Response Time
Minimum Response Time
Successful Attempts
Failed Attempts
Percentage of Packet Loss (Default Monitor)
Service Availability (Default Monitor)
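As an illustration of how these response-data values relate to each other, the following Python sketch (not probe code) derives them from a list of per-packet round-trip times, with None marking a lost packet:

```python
def icmp_qos(rtts_ms):
    """Derive the QoS values listed above from per-packet round-trip
    times in milliseconds; None represents a lost (failed) packet."""
    ok = [r for r in rtts_ms if r is not None]
    failed = len(rtts_ms) - len(ok)
    return {
        "average_response_time": sum(ok) / len(ok) if ok else None,
        "maximum_response_time": max(ok) if ok else None,
        "minimum_response_time": min(ok) if ok else None,
        "successful_attempts": len(ok),
        "failed_attempts": failed,
        "packet_loss_pct": 100.0 * failed / len(rtts_ms),
        "service_available": bool(ok),
    }

# Three packets sent, one lost:
print(icmp_qos([12.0, None, 18.0]))
```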
You can deploy the icmp probe to any robot where the ppm v3.0 or later probe is running. Configure the icmp probe using only Admin Console.
This probe can monitor up to 50,000 resources in about 5 minutes. Use the Verify Filter option in the ICMP GUI to filter the resources displayed
in the navigation pane, as it is limited to displaying up to 255 resources at a time. The alarms generated from the icmp probe are displayed in
Infrastructure Manager, in reports displayed in CA Unified Management Portal (UMP), or from the Metrics tab in Unified Service Manager.
More Information:
icmp (Internet Control Message Protocol) Release Notes
Verify Prerequisites
Before configuring the icmp probe, verify that the required software and information are available.
Follow these steps:
1. Determine the path to the primary hub. See the Discovery Server topic for details.
2. Verify in Admin Console that the following probes are installed and running:
CA Unified Infrastructure Management discovery_server is running on the primary hub
ppm v3.0 probe is running on each hub with a robot running the icmp probe
alarm_enrichment v4.6 and nas v4.6 probes are running on each hub with a robot running the icmp probe
prediction_engine v1.0 or later is running on each hub with a robot running the icmp probe
3. Verify CA Unified Management Portal (UMP) is running on a hub if you want to look at metrics and alarms in UMP
4. Have the following information:
IP address for UMP
Login credentials for UMP
Path to your CA Unified Infrastructure Management primary hub
Configuration Overview
The following diagram shows the tasks you complete to configure monitoring devices to collect QoS data for availability, response time, and
packet loss for network devices.
Contents
Verify Prerequisites
Configuration Overview
Discovery of Network Devices
Configure icmp Probe Settings
Apply a Monitoring Configuration with the Probe Configuration GUI
Test the Connection to a Device
Add Discovery Filters
Manually Update Discovery
Filter the Number of Device Profiles Displayed
Note: When you enter a subnet mask, the number of IP addresses the mask represents is displayed (the number of effective
hosts minus two). Only /16 subnets or smaller are supported.
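The "effective hosts minus two" arithmetic in the note can be checked with Python's standard ipaddress module; for example, a /24 subnet has 2^(32-24) - 2 = 254 usable hosts:

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Number of usable host addresses in an IPv4 subnet
    (total addresses minus the network and broadcast addresses)."""
    net = ipaddress.ip_network(cidr, strict=False)
    return net.num_addresses - 2

print(usable_hosts("1.2.3.0/24"))   # 254
print(usable_hosts("10.0.0.0/16"))  # 65534
```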
Note: Do not select the Active check box until you want the probe to begin monitoring a device.
Note: For additional information about threshold settings, see Configuring Alarm Thresholds.
Important! Add filters before you manually update discovery. The probe only applies templates to new devices.
Note: If a discovery_server path is not specified, the icmp probe will assume that the discovery_server and the probe are
running on the same robot.
Discovery Scopes - Filter devices by IP address (1.2.3.4), IP range (1.2.3.0-100), or subnet mask (1.2.3.0/24).
Discovery Agents - Filter devices by the agent address, robot with the agent, or IP address of the robot with the agent. The agent
address must be the complete path, following the pattern /<domain>/<hub>/<robot>/discovery_agent.
Discovery Origins - Filter devices by origin. The origin is a name that is assigned to QoS data from probes to identify the origin of the
data. If you are an MSP, for example, typically the origin is the name of each customer. For enterprise customers, typically the hub
name is used.
Note: When you add information to a discovery scope, discovery agent, or discovery origins filter, the data entry field
appears below the filter table. Click New to add more rows to the table.
3. Click Save. The filters are active the next time you select icmp node > Options > Query Discover Server.
Note: If icmp templates have been activated and saved, settings configured in the templates are applied to the filtered devices.
2. Click Reload to refresh your screen. A profile for each device is listed under the icmp node. The profile icon indicates the status of the
subcomponent discovery.
Note: Wait for the component discovery process to complete before you create a monitoring configuration. Some QoS metrics are only
applied to components on specific devices. Determine what device types exist in your environment before applying a monitoring
configuration.
When you want to begin monitoring devices, go to icmp > Template Editor > template name, select the Active check box, and save
your changes. When you select icmp > Query Discovery Server, the template settings are applied to all newly discovered devices that
meet the rules specified with Discovery Filters.
Note: The icmp probe stops monitoring discovered devices when the Active check box for the Resource Settings is not
selected.
Click Discard to discard all changes that have not been saved.
Click Save to save all your changes.
You can create several filters for a single template. The precedence setting determines which filter applies to a device.
For each template, there must be at least one host filter and one icmp filter. The system displays an error message if a host filter or icmp
filter does not exist for a template.
Create a Template
When you create a template, you also configure host filters and icmp filters as part of the template.
Follow these steps:
1. Go to the icmp GUI and click Template Editor.
2. Click Options (...) > Create Template.
3. Enter the name of the template and a description.
4. (Optional) Determine if you must modify the default precedence setting.
Precedence controls the order of template application. The lower the precedence number, the higher the priority. The probe applies a
template with a precedence of 1 after a template with a precedence of 2. If there are any overlapping configurations between the two
templates, then the settings in the template with a precedence of 1 overrides the settings in the other template. If the precedence
numbers are equal, then the templates are applied in alphabetical order.
Note: Do not select the Active check box for the template until after you have configured template monitoring filters or rules.
5. Click Submit.
The system creates a template that you can configure. The hierarchical structure within a template is host filters, then icmp filters.
6. Click Save.
The template settings are saved. Configure the host filter and the icmp filter settings before you activate the template. The template is not
activated until you select the Active check box and click Save.
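The precedence behavior described in step 4 can be sketched as follows. Templates are applied from the highest precedence number down to the lowest, so a template with a lower number is applied last and wins on overlapping settings; the direction of the alphabetical tie-break shown here is an assumption:

```python
def apply_templates(templates):
    """templates: list of (name, precedence, settings-dict).
    Returns the merged settings after applying templates in order:
    higher precedence numbers first, lower numbers last (so the
    lower number overrides on overlap); equal numbers are applied
    in alphabetical order (assumed tie-break direction)."""
    order = sorted(templates, key=lambda t: (-t[1], t[0]))
    merged = {}
    for _name, _prec, settings in order:
        merged.update(settings)  # later (lower-number) templates override
    return merged

result = apply_templates([
    ("B", 1, {"interval": 300}),
    ("A", 2, {"interval": 600, "buffer": 64}),
])
print(result)  # {'interval': 300, 'buffer': 64}
```

Here template B (precedence 1) is applied after template A (precedence 2), so its interval value wins while A's non-overlapping buffer setting is kept.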
Using Filters
A filter allows you to control which network devices are associated with a particular template. You specify additional device criteria by using rules.
Filters usually contain one or more rules to define the types of devices for the template. You can add rules to a device filter to create divisions
within a group of systems or reduce the set of devices that are monitored by the probe. For example, you can add a rule to apply a monitoring
configuration to all devices with the IPV4 address that contains 1.1.1.1. Or you can add a rule to apply the monitor SuccessfulAttempts to all
devices with the Label <host name>.
Note: If no rules exist, the probe always applies the monitoring configuration in an active template to all applicable devices.
When you create a template, a Host Auto Filter for monitoring settings appears as a template node. You can modify the settings in the Host Auto
Filter or copy the Auto Filter to keep the original as a default filter. The Host Auto Filter contains the icmp and monitoring parameters and lets you
add rules to indicate which network devices will be configured with these settings.
Note: The precedence in the Auto Filter is set to zero (highest precedence) by default. Remember to change the precedence in
the Auto Filter to a higher number to allow the filters you're creating to be applied to devices.
Follow these steps:
1. In the template editor, under the template you created, click the Host node > Options (...) > Create Filter to create a new filter.
You can also click Host Auto Filter node > Options (...) > Copy to create a copy of the default host auto filter.
2. Enter a unique filter name and select a precedence.
3. Click Submit.
4. Open the filter you just created or copied.
5. In the Rules section click New.
6. Configure one or more rules to indicate which devices will use these monitoring settings.
Label - Devices that match the label (or component name).
Primary IPV4 - Devices that match the entered IPV4 address.
Primary IPV6 - Devices that match the entered IPV6 address.
Condition - Options include contains, does not contain, ends with, equals, not equals, regex (regular expression), and starts with.
Value - A value that applies to the Label and Condition.
7. In the Resource Setup section, click Include in Template to include the resource settings in the template.
Uncheck this option to exclude these settings from the template.
8. Enter values for the Interval and ICMP Buffer Size parameters.
These settings pertain to the robot where the icmp probe resides.
9. Click a QoS metric in the Monitors table.
10. To configure the monitoring settings, select Include in Template.
11. For the selected monitor, select the Publish Data, Publish Alarms, and Compute Baseline check boxes to allow baselines to be
computed and QoS data and alarms to be displayed in UMP or Infrastructure Manager.
12. Select and configure alarm thresholds for the selected metric. See Configuring Alarm Thresholds for details.
13. Save your changes.
You can return to this filter at a later time and activate the Resource settings (by selecting the Active check box) when you want the probe to
apply these settings to discovered targets.
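The rule conditions listed in step 6 can be modeled as simple predicates. The following Python sketch (illustrative only, not probe code) shows one way to evaluate them against a device field:

```python
import re

# Map of rule conditions (as listed in step 6) to predicate functions;
# each takes the device field value and the rule value.
CONDITIONS = {
    "contains":         lambda field, value: value in field,
    "does not contain": lambda field, value: value not in field,
    "ends with":        lambda field, value: field.endswith(value),
    "equals":           lambda field, value: field == value,
    "not equals":       lambda field, value: field != value,
    "regex":            lambda field, value: re.search(value, field) is not None,
    "starts with":      lambda field, value: field.startswith(value),
}

def rule_matches(field: str, condition: str, value: str) -> bool:
    """True if the device field satisfies the rule."""
    return CONDITIONS[condition](field, value)

print(rule_matches("1.1.1.10", "contains", "1.1.1.1"))      # True
print(rule_matches("web-server-01", "starts with", "web"))  # True
```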
When you create a template, an icmp Auto Filter for probe-specific settings appears as a template node. You can modify the settings in the icmp
Auto Filter or copy the Auto Filter to keep the original as a default filter. The icmp Auto Filter contains the icmp parameters and lets you add rules
to indicate which network devices will be configured with these settings.
Follow these steps:
1. In the template editor, under the template you created, click the icmp node > Options (...) > Create Filter to create a new filter.
You can also click icmp Auto Filter node > Options (...) > Copy to create a copy of the default icmp auto filter.
2. Enter a unique filter name and select a precedence.
Note: The precedence in the Auto Filter is set to zero (highest precedence) by default. Remember to change the precedence in
the Auto Filter to a higher number to allow the filters you're creating to be applied to devices.
3. Click Submit.
4. Open the filter you just created or copied.
5. In the Rules section click New.
6. Configure one or more rules to indicate which devices will use these icmp settings.
Label - Devices that match the label (or component name).
Primary IPV4 - Devices that match the entered IPV4 address.
Primary IPV6 - Devices that match the entered IPV6 address.
Condition - Options include contains, does not contain, ends with, equals, not equals, regex (regular expression), and starts with.
Value - A value that applies to the Label and Condition.
Activate a Template
The probe does not automatically apply the template monitoring configuration. The probe only applies templates that are in an active state. The
template icon in the navigation pane indicates the state of the template. Mouse over the icon to see the state (Inactive or Active).
Important! The monitor settings in a template override any monitor settings in the probe configuration GUI.
Apply a Template
You can apply activated templates to newly discovered devices.
1. Access the icmp GUI.
2. Go to icmp > Template Editor > template name > Host > host filter and verify that the precedence, rules, resource setup, and
monitoring settings are appropriate for the devices to which you want to apply the template. Then close the template editor.
3. Go to icmp > Discovery Filters and verify the discovery filtering settings are correct.
4. Select icmp > Options (...) > Query Discovery Server.
A Response dialog appears indicating the number of devices that match the rules configured with Discovery Filters.
5. Close the Response dialog.
6. Refresh the browser.
The newly discovered devices appear underneath the icmp node in the navigation pane. The template is applied to all newly discovered
devices.
After the template has been applied, you see a display panel containing links to the template and template filter used to configure a device.
This article explains the configuration information and options available through the Admin Console icmp configuration GUI and the Raw
Configure menu option.
Probe Interface
icmp Node
Device Profile
Resource Setup
Device Monitors
Discovery Filters Node
Template Editor
<Template Name> Node
Host Filters Node
<Host Filter> Node
ICMP Filters Node
<icmp> Node
Probe Interface
The probe interface is divided into a navigation pane and a details pane. The left navigation pane contains a hierarchical representation of the
probe inventory which includes monitoring targets and configurable elements. The right details pane usually contains information that is based on
your selection in the navigation pane.
icmp Node
Navigation: icmp
This section lets you view probe information, change the probe setup values, apply a filter to limit the number of devices that appear in the
navigation tree, discover devices based on entered attributes, and create a template to bulk configure icmp attributes for monitored devices.
Probe Information
This section provides the basic probe information and is read-only.
Probe Setup
This section provides general configuration details.
Log Level: Sets the amount of detail that is logged to the log file. The default is 1-Error.
Timeout (sec): The time limit for a device to send an ICMP echo response. If this time limit is exceeded, the ICMP echo response
contains an error code. The default is 2 seconds.
Number of Packets: The number of packets sent in an ICMP echo request. The default is 3 packets.
Delay Between Packets: The number of milliseconds to wait before sending another packet. The default is 1000 milliseconds.
Buffer Size: Maximum size, in bits, of the ICMP buffer. The default is 64 bits.
Interval in Seconds: Time interval between each ICMP echo request. The default is 600 seconds.
Override Source: Overrides the default QoS source with the provided value. Values include Robot, Hostname, and IP. The default
value for the QoS source is Robot where the probe is deployed.
Note: If you change the Override Source field after the initial configuration, multiple graphs display on the Unified Service
Management (USM) Metrics view (one for every QoS source value).
Profile Filtering
The value in the Filter Criteria field is used to limit the number of resources displayed in the navigation pane.
icmp > Actions > Verify Filter
Filter Criteria: Enter the complete IP address or use regex (regular expression) patterns. You cannot enter a range (for example,
10.1.1.1-9). Instead, to achieve the same results as entering a range, use 10.1.1.[1-9]{1}$.
Note: By default, the first 255 resources are visible under the icmp node in the navigation pane.
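The range-style pattern in the filter-criteria example above can be verified with any regex engine; for instance, in Python:

```python
import re

# The document's example pattern: matches 10.1.1.1 through 10.1.1.9
pattern = re.compile(r"10.1.1.[1-9]{1}$")

addresses = ["10.1.1.1", "10.1.1.9", "10.1.1.10", "10.1.2.5"]
matched = [a for a in addresses if pattern.search(a)]
print(matched)  # ['10.1.1.1', '10.1.1.9']
```

Note that the trailing $ anchor is what excludes 10.1.1.10, since its last character falls outside the [1-9] class.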
Device Profile
Navigation: icmp node > Options > Add New Profile
This section lets you create a profile for monitoring a specific device. This profile is displayed as a child node in the icmp tree.
Resource Setup
Active: Uses the interval and ICMP buffer size settings for the resource specified in the Hostname parameter.
Interval in Seconds: The time interval between each ICMP echo request. The default is 600 seconds.
ICMP Buffer Size: Maximum size, in bits, of the ICMP buffer. The default is 64 bits.
icmp > device profile > Action > Test Connection
Select this option to determine if a device is reachable. A Success dialog displays to indicate the probe successfully received a response
from the device.
icmp > device profile > Option (...) > Delete Profile
Select this option to delete a device profile.
Device Monitors
Compute Baseline: Select this option to enable thresholds. This option might not be available depending on your CA Unified
Infrastructure Management configuration.
Dynamic Alarm, Static Alarm, Time Over Threshold Alarm, Time To Threshold Alarm. For more information, see Configuring Alarm
Thresholds.
Template Editor
Navigation: icmp > Template Editor
The template editor allows you to create monitoring and icmp configuration templates with filters and filter rules.
Template Editor > icmp probe > Options (...) > Create Template
Fields to know:
Template Name: A meaningful name for the template
Description: Text description of the template
Precedence: A number that indicates the order of template application by the probe. The lower the precedence number, the higher the
priority. If the precedence numbers are equal, then the templates are applied in alphabetical order.
Active: Indicates the template state.
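The precedence rule above can be sketched as a sort key; the `Template` type is illustrative, not the probe's internal representation:

```python
from typing import NamedTuple

class Template(NamedTuple):
    name: str
    precedence: int
    active: bool

def application_order(templates):
    """Order in which templates are applied, per the documented rule:
    lower precedence number first, ties broken alphabetically by name.
    Inactive templates are skipped."""
    return sorted(
        (t for t in templates if t.active),
        key=lambda t: (t.precedence, t.name.lower()),
    )
```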
<Template Name> Node
Navigation: Template Editor > icmp probe > template name > Host Filters
An organizational node in the template tree. This node contains the device filters for the probe monitoring profile.
icmp probe > template name > Host > Options (...) > Create Filter
Click to add a new device filter to a template. Fields to know:
Filter Name: A meaningful name for the filter
Precedence: A number that indicates the order of filter application by the probe. The lower the precedence number, the higher the
priority. If the precedence numbers are equal, then the template filters are applied in alphabetical order.
<Host Filter> Node
Navigation: Template Editor > icmp probe > template name > Host Filters > device filter
Device filters apply monitoring configurations that are based on the attributes of target devices. Multiple filters can exist for a device.
host filter > Filter
View or modify the filter configuration. Fields to know:
Filter Name: A meaningful name for the filter
Precedence: A number that indicates the order of filter application by the probe. The lower the precedence number, the higher the
priority. If the precedence numbers are equal, then the template filters are applied in alphabetical order.
host filter > Rules
Control which devices are associated with a particular template. Enter a rule type, condition, and value.
host filter > Resource Setup
Control icmp settings for a device. Enter an interval and ICMP buffer size for a resource.
host filter > Monitors
Control monitoring settings for a device. Fields to know:
Monitors table: Lists the QoS metrics and the configured settings.
Include in Template: Select this check box to include the monitoring alarm threshold settings in the Host filter.
QoS Information: Display-only information about the QoS metric selected in the Monitors table.
Publish Data, Publish Alarms, Compute Baseline: Select these check boxes for each monitor to allow the probe to publish data, publish alarms, and/or compute baselines for the selected monitor.
Static Alarm, Dynamic Alarm, Time Over Threshold, and Time To Threshold Alarm: Select the appropriate check box to publish the
selected type of alarms in Infrastructure Manager or UMP. For the selected alarm type, configure the alarm severity and threshold
settings. See Configure Alarm Thresholds for more details.
host filter > Options (...)
Available options:
Copy - Click to create a new device filter with a similar configuration to an existing filter.
Delete Filter - Select Delete Filter and click Save to permanently remove a device filter.
Note: At least one host filter must exist for each template. If you inadvertently delete the only existing host filter for a template and click
Save, an error appears the next time you click Template Editor > icmp probe > template name > Host Filters. You'll be required to
delete the template and create a new one.
Navigation: Template Editor > icmp probe > template name > icmp Filters
An organizational node in the template tree. This node contains the icmp filters for the probe-specific settings.
icmp probe > template name > icmp > Options (...) > Create Filter
Click to add a new icmp filter to a template. Fields to know:
Filter Name: A meaningful name for the filter
Precedence: A number that indicates the order of filter application by the probe. The lower the precedence number, the higher the
priority. If the precedence numbers are equal, then the template filters are applied in alphabetical order.
<icmp> Node
Navigation: Template Editor > icmp probe > template name > icmp Filters > filter
ICMP filters apply probe-specific settings. Multiple filters can exist for a device.
icmp filter > Filter
View or modify the filter configuration. Fields to know:
Filter Name: A meaningful name for the filter
Precedence: A number that indicates the order of filter application by the probe. The lower the precedence number, the higher the
priority. If the precedence numbers are equal, then the template filters are applied in alphabetical order.
icmp filter > Rules
Control which devices are associated with a particular template. Enter a rule type, condition, and value.
icmp filter > Probe Setup
View or modify the filter configuration. Fields to know:
Include in Template: Select this check box to include the probe-specific settings in the icmp filter.
Log level: Sets the level of information saved to the log.
See Probe Setup for details about the remaining fields.
icmp filter > Options (...)
Available options:
Copy - Click to create a new icmp filter with a similar configuration to an existing filter.
Delete Filter - Select Delete Filter and click Save to permanently remove an icmp filter.
Note: At least one icmp filter must exist for each template. If you inadvertently delete the only existing icmp filter for a template and click
Save, an error appears the next time you click Template Editor > icmp probe > template name > icmp Filters. You'll be required to
delete the template and create a new one.
icmp Metrics
The following table describes the QoS metrics monitored by the Internet Control Message Protocol (icmp) probe.
Monitor Name | Unit | Description | Version
QOS_AVG_RESPONSE_TIME | Milliseconds | | v1.2
QOS_MAX_RESPONSE_TIME | Milliseconds | | v1.2
QOS_MIN_RESPONSE_TIME | Milliseconds | | v1.2
QOS_FAILED_ATTEMPTS | Count | | v1.2
QOS_SUCCESSFUL_ATTEMPTS | Count | | v1.2
QOS_PACKET_LOSS | Percent | | v1.2
QOS_SERVICE_AVAILABILITY | Percent | | v1.2
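The metrics in the table can all be derived from one request's per-packet results. A minimal sketch, assuming a timed-out echo reply is recorded as None (the probe's exact aggregation is not documented here):

```python
def icmp_metrics(response_times_ms):
    """Derive the icmp QoS metrics from a list of per-packet response
    times in milliseconds; None marks a packet whose reply timed out."""
    ok = [t for t in response_times_ms if t is not None]
    failed = len(response_times_ms) - len(ok)
    total = len(response_times_ms)
    return {
        "QOS_AVG_RESPONSE_TIME": sum(ok) / len(ok) if ok else None,
        "QOS_MAX_RESPONSE_TIME": max(ok) if ok else None,
        "QOS_MIN_RESPONSE_TIME": min(ok) if ok else None,
        "QOS_SUCCESSFUL_ATTEMPTS": len(ok),
        "QOS_FAILED_ATTEMPTS": failed,
        "QOS_PACKET_LOSS": 100.0 * failed / total,
        "QOS_SERVICE_AVAILABILITY": 100.0 * len(ok) / total,
    }
```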
More information:
iis (IIS Server Monitoring) Release Notes
Contents
Verify Prerequisites
Create Host Profile
Enable a Monitor
Monitor the Application Pool
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see iis (IIS Server Monitoring)
Release Notes.
Note: The probe does not support HTTP Response value monitor for HTTPS connections.
Enable a Monitor
The IIS Server Monitoring probe displays the list of available monitors in the right pane for the component selected in the left pane. You can configure the monitor properties for generating the corresponding QoS and alarm messages. Adding a monitor is optional because the probe has default monitors that are enabled.
Follow these steps:
1. Double-click the monitor in the right-pane. The Monitor Properties dialog appears.
2. Enter values for the following fields:
Description: Specify the monitor description.
Monitoring Object: Specify the counter name.
Enable Monitoring: Select this checkbox to generate alarms when the threshold is breached.
Value: Select which value to be compared with the Threshold Value as follows:
Current Value: Compares the last measured value with the Threshold Value.
Compute Average: Calculates the average of the values as measured in the given time interval.
Operator: Specifies the threshold operator used when comparing the threshold value with the actual value. For example, use the => operator to generate an alarm when the measured value is greater than or equal to the threshold value.
Threshold Value: Defines the alarm threshold value.
Unit: Identifies the unit of the monitored value. For example, % and Mbytes.
Message Token: Specifies the alarm message to be issued when the threshold value is breached.
Publish Quality of Service (QoS): Generates the QoS messages on the monitor.
3. Click OK and save the monitor properties.
4. Click Apply on the probe interface for applying the changes.
The probe is restarted and starts measuring the monitor value for generating QoS and alarm messages.
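The Value and Operator fields above combine into a single comparison. A hedged sketch follows; only the => operator appears in the documentation, so the other operator strings are illustrative:

```python
import operator

# Mapping from GUI-style operator strings to comparisons. Only "=>" is
# shown in the documentation; the rest are plausible companions.
OPERATORS = {
    "=>": operator.ge,
    "=<": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

def breaches(samples, value_mode, op, threshold):
    """value_mode is 'current' (the last measured value) or 'average'
    (the mean of the values measured in the interval)."""
    if value_mode == "current":
        value = samples[-1]
    else:
        value = sum(samples) / len(samples)
    return OPERATORS[op](value, threshold)
```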
Important! The Application Pool monitoring feature is supported for IIS 7.0 or later. The APP_POOL_WAS monitor must be available
on the Windows Performance Monitor of the system where IIS 7.0 server or later is installed.
The probe gives the following list of monitors for monitoring the application pool:
Current Worker Processes: fetches the total number of worker processes that are currently running in the application pool.
Recent Worker Process Failures: fetches the total number of times that the worker processes for the application pool have failed during
the rapid-fail protection interval.
Total Worker Process Startup Failures*: fetches the total number of times the Windows Process Activation Service (WAS) has failed to
start a worker process.
Current Application Pool State: fetches the current status value of the application pool, which can be one of the following values:
1 - Uninitialized
2 - Initialized
3 - Running
4 - Disabling
5 - Disabled
6 - Shutdown Pending
7 - Delete Pending
Application Pool Running: fetches the current running status of the application pool, which can be one of the following values:
1 - Running
0 - Not Running
Total Worker Process Failures*: fetches the total number of times the worker processes have crashed after starting the application pool.
Total Application Pool Recycles*: fetches the total number of times the application pool is recycled after the WAS is started.
Total Worker Process Ping Failures*: fetches the total number of times the WAS has failed to receive a ping response from the worker
process.
Total Worker Process Shutdown Failures*: fetches the total number of times the WAS has failed to shut down a worker process.
Application Pool Memory: fetches the total memory of all the worker processes of the application pool.
Application Pool CPU: fetches the total CPU used by all the worker processes of the application pool.
Note: The counters marked with an asterisk (*) return the difference between values of the current and last monitoring interval.
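The state values and the asterisked delta behavior above can be sketched directly; `interval_delta` is a hypothetical helper name:

```python
# The documented Current Application Pool State values.
POOL_STATES = {
    1: "Uninitialized",
    2: "Initialized",
    3: "Running",
    4: "Disabling",
    5: "Disabled",
    6: "Shutdown Pending",
    7: "Delete Pending",
}

def interval_delta(previous_total, current_total):
    """Counters marked with * report the difference between the current
    and the last monitoring interval, per the note above."""
    return current_total - previous_total
```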
You can configure more than one counter for each application pool for monitoring, after adding the IIS server host to the probe.
Follow these steps:
1. Right-click the Application Pool group for the appropriate IIS host and select the Rediscover option.
2. Move the appropriate application pool from the Available Application Pools list to the Selected Application Pools list.
3. Click OK.
The selected application pools appear under the Application Pool group in the left-pane.
4. Select the application pool from the left-pane. Its counters are displayed in the right-pane.
5. Select the check box for each checkpoint to activate the counter for monitoring.
6. Double-click the counter and configure the counter properties in the Monitor Properties dialog.
The probe starts fetching values for the activated counters of the application pool for generating QoS and alarm messages.
The probe configuration interface consists of a row of tool buttons and two window panes, as follows:
Left pane
Right pane
Toolbar buttons
Note: When you have made configuration modifications, you must click the Apply button to activate the new settings.
Activate All
Activates all checkpoints on the selected host.
Deactivate All
De-activates all checkpoints on the selected host.
You can move a host from one group to another by left-clicking the host, dragging it and dropping it in another group.
Edit
Opens the Monitor Properties dialog that enables you to modify the monitoring properties for the selected checkpoint.
Activate
Activates the selected checkpoint to be monitored by the probe. Note that you may also activate it by clicking the checkbox belonging
to the checkpoint.
Deactivate
Deactivates the selected checkpoint (if activated) after which the probe will stop monitoring the checkpoint.
Monitor
Opens the monitor window for the selected checkpoint showing the values recorded since the probe was started.
Note: The horizontal red line in the graph indicates the alarm threshold (in this case 90 %) defined for the checkpoint.
When you click and hold the left mouse button inside the graph, a red vertical line appears. If you continue to hold the left
mouse button down and move the cursor, you can read the exact value at different points in the graph. The value is displayed in the
upper part of the graph in the format: <Day> <Time> <Value>.
Right-clicking inside the monitor window lets you select the backlog (the time range shown in the monitor window). In addition, the
right-click menu lets you select the option Show Average. This adds a horizontal blue line in the graph, representing the average
sample value.
Note the status bar at the bottom of the monitor window. The following information can be found:
The number of samples since the probe was started.
The minimum value measured.
The average value measured.
The maximum value measured.
The backlog: This is the time range shown in the monitor window. The backlog can be set to 6, 12, 24 or 48 hours by
right-clicking inside the graph. Note that the graph cannot show a time range greater than the selected Maximum data storage
time setting in the Setup section of the GUI.
General Setup
Clicking this button opens the Setup dialog for the probe, allowing you to modify the general probe parameters.
Displays the database status for the monitored hosts in the right pane.
The list shows the first and last dates on which backlog and summary data were recorded and written to the database, and also the
number of records in the period.
You can create a new host profile by selecting the group it should belong to and clicking the Create a New Host Profile tool button. The
Profile [New] dialog box appears and prompts you for the hostname or IP address of the IIS host to monitor and some additional
parameters.
Hostname or IP address
Specifies the name or IP address of the host to be monitored. If the probe is located on the same computer as the IIS Server, you
should just specify localhost instead of the name or IP address of the host. This is to avoid authentication issues.
Active
When selected, activates monitoring of the checkpoints on the host. Note that you may also activate/deactivate monitoring
of the various checkpoints individually.
Group
Select the group folder in which to place the host from the drop-down list. Use the group folders to place the hosts in logical groups.
Alarm Message
The alarm message to be issued if the host does not respond. Using the Message pool, you can edit this message or add other
messages.
Server address for http response
The address to be tested for http response, in the format: http://193.71.55.8 or http://www.nimsoft.com.
Note: The probe does not support HTTP Response value monitor for HTTPS connections.
Description
Specifies a short optional description of the monitored host.
Data collection interval
Select the data collection interval (how often to collect data from the host) from the drop-down menu. This value overrules the
Common minimum data collection interval defined on the General Setup dialog.
Note: You may also type in a value. The minimum possible value for data collection interval is 1 min.
Filter port
Defines the port which will be used to communicate between the add-on and the probe. The default value is 999. If changed, the same
port number has to be set at the IIS probe as well.
Windows Authentication
Username and Password
The credentials to login (a valid user name and password with administrator privileges) on the monitored host.
Domain
Defines the DNS domain name of the monitored host when it is not the local machine. Leave the field blank when the monitored machine is the one hosting the probe.
Test
Clicking this button verifies that your Windows authentication works.
Http Server Authentication
None
Indicates no authentication against the http server (Windows authentication is still required for performance data).
Basic
Indicates basic authentication against the http server, using the Http server authentication Username and Password specified.
Windows
Uses the Windows authentication, Username and Password specified as described above (see Windows Authentication above).
Username and Password
The login properties (a valid user name and password with administrator privileges) on the http server.
Launch the Message Pool Manager
The alarm messages for each alarm situation are stored in the Message Pool. This option lets you customize the alarm text, and you can
also create your own messages.
Note: Variable expansion is supported in the message text. If you type a $ in the Error Alarm Text field, a dialog pops up, offering a
set of variables to be chosen.
Click this button to open the IIS Summary Report for the selected host. The window contains a graph, showing the HTTP response time and the
Cache hits/misses per day.
Note: You must select the IIS sub node for a host in the left pane to activate the button.
Click the drop-down menu in the upper left corner to display the graph with the values for:
Per hour
Displays the values with a resolution of one hour for the period selected using the From and To fields.
Click the Submit button to fetch the values for the selected period.
Per day
Displays the values with a resolution of one day for the period selected using the From and To fields.
Click the Submit button to fetch the values for the selected period.
Last day
Displays the values with a resolution of one hour for the last day.
Last month
Displays the values with a resolution of one hour for the last month.
When you click and hold the left mouse button down and move the cursor, you can read the exact value at different points in the
graph. The value is displayed in the upper part of the graph in the format: <Day> <Time> <Value>.
View IIS Server Requests
The iis probe enables you to view IIS server data (requests) in a separate window. To enable this functionality, some configuration tasks
must be done on the computer running the IIS server software.
Open the configuration tool for the probe by double-clicking the line representing the probe in the Infrastructure Manager.
Using the tool buttons located in the upper part of the window, you can select between different statistics views:
The most recent requests running on the IIS server (at the time of the last sample)
Request statistics (hourly, daily and monthly)
The entries in the list can be selected and copied to clipboard (CTRL + C), in the case you want to paste the entries into another
application (CTRL + V).
Left-click the entries to select them, or press CTRL + A to select all.
Request database status, showing the number of records in the request database
To delete some or all records, right-click in the list and select Delete Request Data.
The following dialog appears, asking you to select for which period to delete data. Make your selection and click the OK button.
Using the calendar functionality, you can select to show the statistics for a specific period. Select a period and click Submit.
Click this button to show/hide checkpoints from the list that are not available on the selected host.
Contents
Verify Prerequisites
Add Host
Alarm Thresholds
Monitor the Application Pool
Set Device ID Key Using Raw Configure
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see iis (IIS Server Monitoring)
Release Notes.
Add Host
You can add the IIS host server under the iis node to monitor the IIS server.
Follow these steps:
1. Click the Options(icon) next to the iis node in the navigation pane.
2. Click the Add New Host option.
3. Set or modify the following values in the Add New Host window:
Hostname or IP Address: Enter the name or IP address of the IIS server to be monitored.
Active: Enables you to activate the host for monitoring.
Default: Selected
Alarm Message: Specify the alarm message to be used when the host does not respond.
Default: MsgAgentError
URL for http response: Specify the address to be tested for http response. For example, http://10.112.69.14 or http://www.msn.com.
Description: Specify a short description of the monitored host.
Data Collection Interval (min): Specify how often the probe collects data from the host.
Default: 1
Filter Port: Define the port which is used to communicate between the IIS add-on (IISrequest.dll) and the probe. This option is used for
monitoring the IIS server remotely.
Default: 999
Windows Username: Define a valid user name with administrator privileges for logging in to the monitored host.
Windows Password: Define the password for logging in to the monitored host.
Windows Domain: Define the DNS domain name for locating the IIS server when it is not hosted on the local system. Leave the field
blank, when both IIS and probe are installed on the same system.
Http Server Authentication: Specify the authentication type as follows:
None: indicates no authentication against the http server (still requires the Windows authentication for performance data).
Basic: indicates basic authentication against the http server. The values of the Http Username and Http Password fields are used for
the authentication.
Windows: uses the Windows authentication credentials as specified in Windows Username and Windows Password fields.
Note: The probe does not support HTTP Response value monitor for HTTPS connections.
Http Username: Define a valid user name with administrator privileges of the http server.
Http Password: Define the password for logging in to the http server.
4. Click Submit.
5. Click the Test Windows Credential option from the Actions drop-down to verify the entered values.
The host is saved under the iis node. Every new host contains four child nodes (ASP, IIS, System, and Webservices), which are used
to configure the monitoring properties of the host.
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings
allow the probe to:
Send alarms when threshold criteria are met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
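Among the threshold types, Time Over Threshold is the least obvious: it suppresses alarms for short spikes and fires only when the breach persists. A hedged sketch of that idea, under the assumption that samples arrive at a fixed interval (the probe's actual evaluation is defined in Configuring Alarm Thresholds):

```python
def time_over_threshold(samples, threshold, window_sec, required_sec):
    """samples is a list of (timestamp_sec, value) pairs in time order.
    Returns True when the cumulative time the value spent above the
    threshold, inside the trailing window, reaches required_sec."""
    if len(samples) < 2:
        return False
    window_start = samples[-1][0] - window_sec
    over = 0.0
    # Each sample's value is assumed to hold until the next sample.
    for (t0, v0), (t1, _) in zip(samples, samples[1:]):
        if t0 >= window_start and v0 > threshold:
            over += t1 - t0
    return over >= required_sec
```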
Important! The Application Pool monitoring feature is applicable for IIS 7.0 or later. The APP_POOL_WAS counter must be available
on the Windows Performance Monitor of the system where IIS 7.0 server or later is installed.
Key | Value | Impact
Previous to v1.71 | NA | NA
Upgrade to v1.71 | No | No Impact
Upgrade to v1.71 | Yes | Previously, all QoS and alarms for the local host profile were generated on the old Device ID. Now, the probe generates all the QoS and alarms for local host profiles on the new Device ID, that is, the Robot Device ID. This breaks the old data continuity on the USM portal.
iis Node
Host-<Host Name> Node
<Host Name> Node
IIS Server Node
Application Pool Node
<Application Pool Name> Node
ASP Node
IIS Node
System Node
Webservices Node
iis Node
This section contains configuration details specific to the IIS Server Monitoring probe. In this node, you can view the probe information and can
configure the log properties of the probe. You can also configure the advanced properties of the probe to set maximum summary storage and
number of concurrent threads.
Navigation: iis
Set or modify the following values, if needed:
iis > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
iis > General Configuration
This section lets you configure the default log level for the probe.
Log Level: specifies the detail level of the log file.
Default: 3 - Info
iis > Advanced Configuration
This section lets you set the maximum summary storage time and the number of concurrent threads for the probe.
Max Summary Storage: specifies the maximum data storage time for local monitoring purposes. This value is the time range within which
the monitored values are picked for calculating the average value (used when setting the alarm threshold). This option does not affect the
QoS data.
Default: 6 hours
Maximum Concurrent Threads: specifies the maximum number of profiles the probe can run simultaneously. The valid range is 0 - 100.
Default: 20
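One common way to enforce such a concurrency cap is a fixed-size worker pool; the sketch below is illustrative only, and `poll_fn` is a hypothetical per-profile collection function, not a probe API:

```python
from concurrent.futures import ThreadPoolExecutor

def poll_profiles(profiles, poll_fn, max_threads=20):
    """Run poll_fn over all profiles with at most max_threads running
    simultaneously (default 20, matching the documented default).
    Results are returned in the same order as the input profiles."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(poll_fn, profiles))
```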
iis > Messages
This section lets you view the default alarm messages for the different error conditions.
Error: identifies the error message text. You can insert variables from the following list by typing "$":
state
expected_state
name
display_name
profile_name
credential
expected_credential
restarts
OK: identifies the clear message text.
Subsystem: identifies a descriptive text name for the subsystem.
Hostname or IP Address: defines the name or IP address of the IIS server to be monitored.
Active: activates the host monitoring.
Default: Selected
Alarm Message: specifies the alarm message when the host does not respond.
Default: MsgAgentError
URL for http response: specifies the address to be tested for http response. For example, http://10.112.69.14 or http://www.msn.com.
Description: specifies a short description of the monitored host.
Data Collection Interval (min): specifies how often the probe collects data from the host.
Default: 1
Filter Port: defines the port which is used to communicate between the IIS add-on (IISrequest.dll) and the probe. This option is used for
monitoring the IIS server remotely.
Default: 999
Windows Username: defines a valid user name with administrator privileges for logging in to the monitored host.
Windows Password: defines the password for logging in to the monitored host.
Windows Domain: defines the DNS domain name for locating the IIS server when it is not hosted on the local system. Leave the field
blank, when both IIS and probe are installed on the same system.
Http Server Authentication: specifies the authentication type as follows:
None: indicates no authentication against the http server (still requires the Windows authentication for performance data).
Basic: indicates basic authentication against the http server. The values of the Http Username and Http Password fields are used for the
authentication.
Windows: uses the Windows authentication credentials as specified in Windows Username and Windows Password fields.
Note: The probe does not support HTTP Response value monitor for HTTPS connections.
Http Username: defines a valid user name with administrator privileges of the http server.
Http Password: defines the password for logging in to the http server.
Note: Click the Test Windows Credential option from the Actions drop-down to verify the entered values.
Total Worker Process Startup Failures*: fetches the total number of times the Windows Process Activation Service (WAS) has failed to
start a worker process.
Current Application Pool State: fetches the current status value of the application pool, which can be one of the following values:
1 - Uninitialized
2 - Initialized
3 - Running
4 - Disabling
5 - Disabled
6 - Shutdown Pending
7 - Delete Pending
Application Pool Running: fetches the current running status of the application pool, which can be one of the following values:
1 - Running
0 - Not Running
Total Worker Process Failures*: fetches the total number of times the worker processes have crashed after starting the application pool.
Total Application Pool Recycles*: fetches the total number of times the application pool is recycled after the WAS is started.
Total Worker Process Ping Failures*: fetches the total number of times the WAS has failed to receive a ping response from the worker
process.
Total Worker Process Shutdown Failures*: fetches the total number of times the WAS has failed to shut down a worker process.
Application Pool Memory: fetches the total memory of all the worker processes of the application pool.
Application Pool CPU: fetches the total CPU used by all the worker processes of the application pool.
Note: The counters marked with an asterisk (*) return the difference between values of the current and last monitoring interval.
Click the Options (icon) next to the Application Pool node and select the Delete Application Pool option to delete the Application Pool.
Alternatively, you can move the application pool back from the Selected list to the Available list in the Application Pool node, and click
Save to stop monitoring the application pool.
ASP Node
The ASP node is used to configure the following counters of the IIS server:
Requests Bytes In Total
Requests Executing
Requests Queued
IIS Node
The IIS node is used to configure the following counters of the IIS server:
Host Availability
HTTP Response Time
HTTP Response Value
IIS max request time
IIS Status Value
Total Allowed Async I/O Requests
URI Cache Flushes
URI Cache Hits
URI Cache Hits %
URI Cache Hits/minute
URI Cache Misses
URI Cache Misses/minute
System Node
The System node is used to configure the following counters of the IIS server:
Available Physical Memory
Memory In Use
Memory Usage
NW Bytes Total/sec
Total CPU
Total Memory
Webservices Node
The Webservices node is used to configure the following counters of the IIS server:
Bytes Received/sec
Bytes Sent/sec
Bytes Total/sec
Connection Attempts/sec
Current Connections
Get Requests/sec
Measured Async I/O Bandwidth Usage
iis Metrics
This section contains the metrics for the IIS Server Monitoring (iis) probe.
Contents
QoS Metrics
Alert Metric Default Settings
QoS Metrics
This table contains the QoS metrics for the IIS Server Monitoring probe.
Monitor Name | Units | Description | Version
QOS_ASYNCIO_BWUSAGE | | | v1.0
QOS_BYTESRECEIVD_PS | Bytes received/sec | | v1.0
QOS_BYTESSENT_PS | Bytes sent/sec | | v1.0
QOS_BYTESTOTAL_PS | Bytes Total/sec | | v1.0
QOS_CONNATTEMPTS_PS | Connection attempts/sec | | v1.0
QOS_CPU_USAGE | Percent | CPU Usage | v1.0
QOS_CURRENTCONNECTS | Current connections | Current connections | v1.0
QOS_IIS_DISK_USAGE | Megabytes | Disk Usage | v1.0
QOS_GET_REQUESTS_PS | Get Requests/sec | | v1.0
QOS_IIS_HTTPRESTIME | Milliseconds | | v1.0
QOS_IIS_USEROBJ | Value | | v1.0
QOS_MEMAVPHY | Megabytes | | v1.0
QOS_MEMORY_INUSE | Percent | Memory in Use % | v1.0
QOS_MEMORY_USAGE | Megabytes | Memory Usage | v1.0
QOS_MEMORYAVAILPHYS | Megabytes | | v1.0
QOS_NETWORK_BTPS | Bytes Total/sec | | v1.0
QOS_WSC_CACHEHITSPC | Cache hits | | v1.0
QOS_WSC_CACHEHITSPM | Cache hits/minute | | v1.0
QOS_WSC_CACHEMISSPM | Cache misses/minute | | v1.0
This table describes the QoS metrics on the Application pool for the IIS Server Monitoring probe.
QoS on Application Pool | Units | Description | Version
QOS_IIS_RECENT_WRKR_PROCESSES_FAILURE | Count | Total number of times the worker processes for the application pool have failed during the rapid-fail protection interval. | 1.0
QOS_IIS_WRKR_PROCESSES | Count | | 1.0
QOS_IIS_TOT_WRKR_PROCESS_STARTUP_FAILURE | Count | | 1.0
QOS_IIS_APP_RUNNING_STATE | State | | 1.0
QOS_IIS_APP_RUNNING | State | | 1.0
QOS_IIS_TOT_WRKR_PROCESSES_FAILURE | Count | Total number of times the worker processes have crashed after starting the application pool. | 1.0
QOS_IIS_TOT_APP_POOLS_RECYCLES | Count | Total number of times the application pool is recycled after the WAS is started. | 1.0
QOS_IIS_TOT_WRKR_PROCESSES_PING_FAILURES | Count | Total number of times the WAS has failed to receive a ping response from the worker process. | 1.0
QOS_IIS_TOT_WRKR_PROCESSES_SHUTDOWN_FAILURES | Count | Total number of times the WAS has failed to shut down a worker process. | 1.0
QOS_IIS_APP_POOL_MEMORY | Percent | | 1.0
QOS_IIS_APP_POOL_CPU | Percent | | 1.0
Error Threshold | Error Severity | Description
MsgAgentError | Critical |
MsgPerfHandleError | Minor |
MsgPerSelectedError | Warning |
MsgHostAvailError | Warning |
MsgWarning | Warning | Message warning
MsgError | Critical | Message error
Note: Probes that support SNMP on Linux (interface_traffic, snmptd, and snmpget) use an SNMP library. This library can cause newer
Linux systems to issue a message in the Linux console log. The SNMP library supports older versions of glibc, which require this flag for
sockets to work correctly. The network portion of the glibc library sends the message, which indicates that an unsupported flag is being
passed to the setsockopt function. The library ignores this flag, so you can also ignore the message.
More information:
interface_traffic (Interface Traffic Monitoring) Release Notes
Contents
Add Connection
Create Group
Configure Alarm Messages
Set Default Interface Parameters
Set Bulk Configuration
Get Routing Table
Get Interface Details
Create Virtual Interface
Generate Checksums
Use Regular Expressions
Port: Specify the port to be used to send SNMP requests to the host.
Default: 161
Timeout: Select the timeout value for the SNMP requests.
Default: 1 second
Retries: Set the maximum number of SNMP request attempts without a response from the device before the device is considered
not available.
Default: 5
Community/password
Specify the community, if SNMPv1 or SNMPv2c is selected from the SNMP Version field.
Specify the password, if SNMPv3 is selected from the SNMP Version field and enabled when Security is selected as AuthNoPriv
or AuthPriv.
Show Community/password: Display the string in the Community/password field as plain text, if selected.
Username: Specify a username to access the monitored device.
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when SNMPv3 is
selected from the SNMP Version field.
Note: The OK button (on the Host Profile dialog) is enabled after the SNMP query completes successfully.
Note:
You must select a connection from the Connection Information drop-down before clicking the Start Query button to
collect SNMP data.
You can also select <Add New> from the Connection Information drop-down to add a new connection. Refer to Add
Connection for more information.
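The timeout and retries semantics described above (a request is retried up to the configured number of times before the device is considered unavailable) can be sketched as follows. The `poll_with_retries` helper and its injected sender are hypothetical names for illustration, not probe internals:

```python
def poll_with_retries(send_request, retries=5):
    """Attempt an SNMP request up to `retries` times.

    `send_request` stands in for one SNMP GET with the configured timeout
    (default 1 second); it returns the response, or None on timeout.
    Only after every attempt fails is the device considered not available.
    """
    for _ in range(retries):
        response = send_request()
        if response is not None:
            return response          # the device answered
    return None                      # device considered not available
```

With the defaults (5 retries, 1-second timeout), an unreachable device is declared unavailable after roughly 5 seconds of polling.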
7.
Default: 1 second
Retries: Set the maximum number of SNMP request attempts without a response from the device before the device is considered
not available.
Default: 5
Monitoring group: Specify the name of the group where the host profile is created.
Default: Default group
8. Select Activate interface monitor to automatically activate monitoring of the interfaces that are detected on the hosts that are found by
the query.
You can select one of the three available monitoring criteria:
For all interfaces that are up
Monitoring is activated for all interfaces detected that are up and running (green indicator).
For all interfaces that are matching
Monitoring is activated for all interfaces that are detected with Interface names matching the string specified.
Example: FastEthernet0, Ethernet0, and Ethernet1 all match the string Eth
For all interfaces matching index
Monitoring is activated for all interfaces that are detected with an ID matching the index specified. This setting can be a single ID or a
comma-separated list.
9. Click Start Query.
10. Click OK.
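The three monitoring criteria in step 8 can be sketched as a simple filter over the discovered interfaces. The function and field names below are illustrative assumptions, not probe code:

```python
def select_interfaces(interfaces, mode, pattern=None, indexes=None):
    """Pick which detected interfaces to activate for monitoring.

    interfaces: list of dicts with 'index', 'name', and 'up' keys
    mode: 'up' (all interfaces that are up), 'name' (names matching a
          string), or 'index' (a single ID or comma-separated list)
    """
    if mode == "up":
        return [i for i in interfaces if i["up"]]
    if mode == "name":
        # Substring match, as in the example: "Eth" matches FastEthernet0,
        # Ethernet0, and Ethernet1.
        return [i for i in interfaces if pattern in i["name"]]
    if mode == "index":
        wanted = {int(x) for x in indexes.split(",")}
        return [i for i in interfaces if i["index"] in wanted]
    raise ValueError(f"unknown criterion: {mode}")
```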
Drag-and-Drop Hosts
This section describes how to drag-and-drop a host record to create a new host profile.
Follow these steps:
1. Open the file with the IP addresses or hostnames in an editor such as Microsoft Word or WordPad.
2. Select the SNMP hosts to monitor.
3. Drag the selection to the applicable group in the navigation pane.
The SNMP Query window appears.
4. Select a connection to use for all hosts in the specified range from the Connection Information drop-down list.
Default: Auto
8. Click OK.
Notes:
You must specify at least one of Low threshold and High threshold; specifying one of these values is mandatory.
An error message is displayed if the using values option is selected with both thresholds and you specify only one of
Low threshold and High threshold, or neither.
The error message is also displayed if one of the checkboxes is selected and you do not enter a value for it.
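The validation described in the notes reduces to a small check. This sketch is illustrative; the probe's actual error strings differ:

```python
def validate_thresholds(low=None, high=None):
    """At least one of Low threshold and High threshold must be given."""
    if low is None and high is None:
        return "error: specify Low threshold, High threshold, or both"
    return "ok"
```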
Alarm when traffic less than or equal to: select to issue alarms with the selected severity level when traffic is less than or equal to
the defined value on the selected interface:
Inbound traffic: Alarms are issued when inbound traffic is less than or equal to defined value.
Outbound traffic: Alarms are issued when outbound traffic is less than or equal to defined value.
Both interfaces: Alarms are issued when both inbound and outbound traffic is less than or equal to defined value. This is the
default selection.
Any interface: Alarms are issued if either inbound or outbound traffic is less than or equal to defined value.
Max (Extreme) value: Enter the maximum value for the traffic.
Action required: Select the action to be performed when the max value is breached. Action required has the following values:
No Action: The probe generates the alarms and QoS with the actual values, if the applicable fields are selected.
Use max value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm that is related to the threshold is sent. Also, NULL QoS is sent.
Use zero: Zero is used in alarms and QoS.
Send Alarm when Max value is breached: Select if you want the probe to send an alarm when the defined max value is breached.
Note: This field is enabled only when maximum value for traffic Max (Extreme) value is provided.
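The four Action required values determine which value is reported when a sample breaches the Max (Extreme) value. A minimal sketch, assuming a hypothetical `qos_value_after_breach` helper (alarm sending is governed separately by the Send Alarm checkbox):

```python
def qos_value_after_breach(value, max_value, action):
    """Return the value to report in QoS after the Max (Extreme) check."""
    if value <= max_value:
        return value                     # no breach: report as-is
    return {
        "No Action": value,              # actual value is kept
        "Use max value": max_value,      # clamp to the defined maximum
        "Discard": None,                 # NULL QoS; no threshold alarm
        "Use zero": 0,                   # zero is used in alarms and QoS
    }[action]
```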
Set Default: Click to display the Default interface settings dialog. Refer to Set Default Interface Parameters for more information.
Each option, when selected, opens a confirmation dialog.
Monitor: click to display the interface traffic in a graphical format.
3. You can configure settings for Error Packets, Discarded Packets, and Processed Packets in the Packets tab.
Set or modify the following values:
Error Packets
The number of packets that could not be transmitted because of errors. You can select QoS to be published as:
Packets per second
Number of packets
Packets per second and number of packets
Percentage %
Publish Quality of Service (QoS): select this option to publish the Quality of Service (number of packets with errors) for the selected
interface when checked.
Enable monitoring: enable monitoring of the number of packets with errors on the interface.
Max. error packets: specify the maximum number of packets with errors on the interface that are allowed before an alarm is issued.
Alarm severity level: select the severity level of the alarms issued when the maximum monitoring threshold is breached.
Max (Extreme) value: specify the threshold for the extreme maximum number of packets with errors on the interface that are
allowed before an alarm is issued.
Extreme severity level: select the severity level of the alarms issued when the extreme maximum monitoring threshold is breached.
Note: The alarm severity and message string of Max (Extreme) value can be changed from the Message Pool dialog.
Action Required: select the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: Zero is used in alarms and QoS.
Send Alarm when Max value is breached: select this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
Discarded Packets
The number of packets that were discarded from the interface. You can select QoS to be published as:
Packets per second
Number of packets
Packets per second and number of packets
Percentage %
Publish Quality of Service (QoS): select this option to publish the Quality of Service (number of discarded packets) for the selected
interface when checked.
Enable monitoring: enable monitoring of the number of discarded packets on the interface.
Max. discarded: specify the maximum number of packets discarded by the interface that are allowed before an alarm is issued.
Alarm severity level: select the severity level of the alarms issued when the maximum monitoring threshold is breached.
Max (Extreme) value: specify the threshold for the extreme maximum number of discarded packets from the interface that are
allowed before an alarm is issued.
Extreme severity level: select the severity level of the alarms issued when the extreme maximum monitoring threshold is breached.
Note: The alarm severity and message string of Max (Extreme) value can be changed from the Message Pool dialog.
Action Required: select the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: Zero is used in alarms and QoS.
Send Alarm when Max value is breached: select this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
Processed Packets
The number of packets that were processed by the interface. You can select QoS to be published as:
Packets per second
Number of packets
Packets per second and number of packets
Percentage %
Publish Quality of Service (QoS): select this option to publish the Quality of Service (number of processed packets) for the selected
interface when checked.
Enable monitoring: enable monitoring of the number of processed packets on the interface. You can specify the maximum number of
packets processed by the interface that are allowed before an alarm is issued.
Max (Extreme) value: specify the threshold for the extreme maximum number of processed packets from the interface that are allowed
before an alarm is issued.
Action Required: select the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: Zero is used in alarms and QoS.
Send Alarm when Max value is breached: select this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
4. You can specify the length of the output packet queue in the Queue Length tab. Queue length is the maximum permissible number of
packets in the output queue before the alarm is raised.
Set or modify the following values:
Publish Quality of Service: select this option to publish Quality of Service (the length of the output packet queue) for the selected
interface when checked.
Enable monitoring: enable monitoring of the length of the output packet queue on the interface.
Max. packets in the output queue before alarm: specify the maximum number of packets in the output queue on the
interface allowed before an alarm is issued.
Alarm severity level: select the severity level of Queue length alarms.
5. You can configure the current and desired operational state of the interface in the State tab.
Set or modify the following values in the State tab:
Interface Operational State
Configures monitoring of the current operational state of the interface.
Publish Quality of Service: select this option to publish Quality of Service (the state of the interface).
Enable monitoring: enable monitoring of the operational state of the interface.
Legal states: select the expected states of the interface from the multiple options provided.
Note: A down state indicates that no packets can be passed. If the current state of the interface is not in the list of
legal states, an alarm with the selected severity level is generated.
Alarm severity level: select the severity level of alarms when the state is not as expected.
Interface Administrative State:
Configures monitoring of the desired state of the interface.
Publish Quality of Service: select this option to publish Quality of Service (the state of the interface).
Enable monitoring: enable monitoring of the operational state of the interface.
Legal states: select the expected states of the interface from the multiple options provided.
Note: If the current state of the interface is not in the list of legal states, an alarm with the selected severity level is
generated.
Alarm severity level: select the severity level of alarms when the state is not as expected.
Ignore Operational State when admin state is not as expected
Ignores the operational state when admin state is not as expected, when selected.
Note: This check box is enabled only when both Enable monitoring checkboxes are selected.
6. You can specify advanced configuration for an interface in the Advanced tab.
Set or modify the following values:
Interface speed: The valid options are:
Automatic detection: The speed of the interface is automatically detected.
Manual override: specify a value to override the detected interface speed. The speed can be specified using one of these units:
B/s, Kb/s, Mb/s, and Gb/s. The speed rate specified is equally divided on inbound and outbound traffic.
Manual override per direction: override the speed individually per direction. The speed can be specified using one of these units:
B/s, Kb/s, Mb/s, and Gb/s.
Total: specify the value and unit for the manual override.
Override Outbound Traffic Monitoring: override the high and low thresholds for Outbound Traffic monitoring.
QoS Destination: override the QoS target value with interface name, description, and user-specified description. Depending on the
option selected in this field, the same appears in the QoS.
User specified description: define a custom description. When listing the interfaces for a host, this description appears in the User
column. This option makes it easier to distinguish between different interfaces.
7. Click OK to save the interface configuration.
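Under the Manual override option in the Advanced tab, the specified rate is divided equally between inbound and outbound traffic. A sketch of that arithmetic follows; the function name and decimal unit table are assumptions made for illustration:

```python
# Assumed decimal unit conversions: B/s is bytes (8 bits), the rest are bits.
UNIT_BITS = {"B/s": 8, "Kb/s": 1_000, "Mb/s": 1_000_000, "Gb/s": 1_000_000_000}

def per_direction_speed(total, unit="Mb/s"):
    """Split a manually overridden interface speed equally per direction.

    Returns (inbound, outbound) in bits per second.
    """
    bits = total * UNIT_BITS[unit]
    return bits / 2, bits / 2
```

For example, a 100 Mb/s override gives 50 Mb/s inbound and 50 Mb/s outbound.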
Add Connection
This section describes how to create a new connection. The same settings can be applied to the Add Host Profile section.
Follow these steps:
1. Click the General Setup button on the toolbar.
The Setup window appears.
2. Open the Advanced tab.
3. Click the
Note: Enabled only when SNMPv3 is selected from the SNMP Version field.
Port: specify the port to be used to send SNMP requests to the host.
Default: 161
Community/password
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when SNMPv3 is selected.
to delete a connection.
Create Group
This section allows you to create a group in the left pane of the probe. Groups are used to categorize and arrange profiles in the probe.
Example: A group can be created for each department with users of exchange servers within the same network.
Follow these steps:
1. Click the New Folder button on the toolbar.
A new group is created alongside the Default Group.
2. Specify a name of the group.
3. Press Enter.
1. Click the Message Pool Manager button on the toolbar.
The Message Pool window appears.
2. Click
Clear Alarm Text (OK): specify the alarm text in case no error occurs.
Error Severity: select the severity level for the error message.
Subsystem string/id: select the subsystem id for the message.
4. Click OK.
The new message is added or the selected message is modified.
Note: Variable expansion in the message text is supported. Enter $ in the Alarm text fields to open the variable expansion pop-up with
a set of variables.
1. Click the Set the default interface parameters button on the toolbar.
The Default interface settings window appears.
You can select one of the following options:
Default interface settings (General) to use the predefined standards. This is the default selection.
Default interface settings (ifType Specific) to select interfaces of a specific type.
Default interface settings (ifName specific) to select interfaces with names matching the specified regular expression.
Refer to Using Regular Expressions for more information.
Default interface settings (ifSpeed specific) to select from interfaces with specified transfer speed.
Note: The settings are applied in the following order of priority:
Regular expression (ifName)
ifSpeed
ifType
Default settings (General).
Note: You can also click Set Default in the InterfaceName window and set the configuration as the applicable default
interface.
1. Click the Bulk Configuration button on the toolbar.
The Bulk Configuration window appears.
2. Set or modify the following values:
Select Agents
All Agents: indicates that the configuration parameters are distributed to all the monitored SNMP hosts.
Only active ones: indicates that the configuration parameters are distributed to all the active monitored SNMP hosts.
All agents matching: specify the pattern of host names to which the configuration parameters are distributed. Refer to Using Regular
Expressions for more information.
All agents in the group: select the group of hosts to which the configuration parameters are distributed.
Selected Agents: indicates that the configuration parameters are distributed to one or more SNMP hosts selected in a group.
Select Interfaces
Apply to all Interfaces: indicates that the configuration parameters are distributed to all interfaces on the monitored SNMP hosts.
Only active ones: indicates that the configuration parameters are distributed to all the active interfaces on the monitored SNMP hosts.
Apply to interfaces matching: specify the pattern of interface names to which the configuration parameters are distributed on
the monitored SNMP hosts. Refer to Using Regular Expressions for more information.
Selected Interfaces: indicates that the configuration parameters are distributed to one or more interfaces selected on the monitored
SNMP hosts.
Note: If you select one or more interfaces in the probe GUI and then click the Bulk Configuration button, the bulk
configuration applies to the selected interfaces.
Port: specify the port to be used to send SNMP requests to the host.
Default: 161
Timeout: select the timeout value for the SNMP requests.
Default: 1 second
Retries: set the maximum number of SNMP request attempts without a response from the device before the device is considered
not available.
Default: 5
Community/password
Specify the community, if SNMPv1 or SNMPv2c is selected.
Specify the password, if SNMPv3 is selected and enabled when Security is selected as AuthNoPriv or AuthPriv.
Show Community/password: display the string in the Community/password field as plain text, if selected.
Username: specify a username to access the monitored device.
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when SNMPv3 is
selected.
Note: This section is available only when Modify interface values check box is selected.
Activate: apply the settings done in the upper part of the Bulk Configuration dialog on interfaces, if selected.
3. Click OK.
The bulk configuration is applied to the interfaces.
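The Select Agents choices above can be sketched as follows. The function and field names are illustrative, not the probe's actual configuration model:

```python
import re

def select_agents(agents, mode, pattern=None, group=None, selected=()):
    """Resolve a 'Select Agents' choice to a list of host names.

    agents: list of dicts with 'name', 'active', and 'group' keys
    mode: 'all', 'active', 'matching' (regex on host name),
          'group', or 'selected'
    """
    if mode == "all":
        return [a["name"] for a in agents]
    if mode == "active":
        return [a["name"] for a in agents if a["active"]]
    if mode == "matching":
        return [a["name"] for a in agents if re.search(pattern, a["name"])]
    if mode == "group":
        return [a["name"] for a in agents if a["group"] == group]
    if mode == "selected":
        return [a["name"] for a in agents if a["name"] in set(selected)]
    raise ValueError(f"unknown selection mode: {mode}")
```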
Note: You can skip step 1 to specify a host currently not monitored by the probe. The Routing information window appears,
but then the host values must be specified.
2. Click the Get Routing Table button on the toolbar.
The Routing information window appears with the details of the host.
3. Set or modify the following values:
Profile: select the host profile to find the routing information.
Host address: specify the host name or IP address of the host to be monitored.
Version: select the SNMP software version number (SNMPv1, SNMPv2c, or SNMPv3).
Auth.: select the type of authentication strategy (none, MD5, or SHA).
Port: specify the port to be used to send SNMP requests to the host.
Default: 161
Timeout: select the timeout value for the SNMP requests.
Default: 1 second
Retries: set the maximum number of SNMP request attempts without a response from the device before the device is considered
not available.
Default: 5
Community/password
Specify the community, if SNMPv1 or SNMPv2c is selected.
Specify the password, if SNMPv3 is selected and enabled when Security is selected as AuthNoPriv or AuthPriv.
Show password: display the string in the Community/password field as plain text, if selected.
Username: specify a username to access the monitored device.
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when SNMPv3 is selected.
This section provides details about active and inactive interfaces. Click the Get Interface Details button to display the Interface Details
window with the interface statistics of the probe. Inactive interfaces continue to generate fail alarms and must be cleared.
Generate Checksums
The probe (versions 4.35 and earlier) had a checksum issue which was corrected in version 4.36. However, the index of an interface can change
when you upgrade to a later version of the probe. You can generate checksums to verify that the profile remains valid after the upgrade.
Follow these steps:
1. Right-click the host in the main window and select Generate Checksum.
Note: Generate Checksum is only valid if there is a problem with corrupted interface names.
The Generate Interface Checksum dialog appears with the selected host profiles listed.
Note: Click the More Information button in this dialog to learn more about the checksum issue.
Notes:
Regular expressions specified in the Default interface settings (ifName specific) field are case-sensitive.
Regular expressions must be enclosed in an asterisk (*), for example, *regex*.
A regular expression is applied to the interface name and to substrings of the interface name.
If the default settings for a regular expression corresponding to an interface are saved once, other regular expressions for the
same interface that are created later are not applied.
The following table describes some examples of regex and pattern matching for the probe.
Regular expression | Explanation
[A-Z] | Standard (PCRE)
/[A-Z] | Custom
.e* | Standard (PCRE)
/[a-d]* | Custom
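A sketch of the matching behaviour the notes describe: the expression is written enclosed in asterisks, matching is case-sensitive, and the pattern may match any substring of the interface name. The helper is hypothetical, not probe code:

```python
import re

def iface_matches(name, expression):
    """Match an interface name against an expression written as *regex*."""
    pattern = expression
    if pattern.startswith("*"):      # drop the enclosing asterisks only
        pattern = pattern[1:]
    if pattern.endswith("*"):
        pattern = pattern[:-1]
    # re.search is case-sensitive by default and matches any substring.
    return re.search(pattern, name) is not None
```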
Probe GUI
Toolbar
Left Pane
Right Pane
Interface Status Indicators
General Setup
General Tab
Advanced Tab
Host Profile Window
General Settings Tab
Advanced Tab
Message Definitions Tab
Add Agent Range Window
SNMP Query Window
Monitor Interface Window
<InterfaceName> Window
Traffic Tab
Packets Tab
Queue Length Tab
State Tab
Advanced Tab
Rediscover Interfaces
Virtual Interfaces
Probe GUI
The interface_traffic probe is configured by double-clicking the probe in the Infrastructure Manager.
Toolbar
The interface_traffic GUI contains a toolbar, which allows you to configure the interface_traffic probe:
The toolbar contains the following buttons:
General Setup
Displays the Setup window, which allows you to set general properties of a host.
New Folder
Creates a folder for a group in the left pane.
New SNMP Host
Displays the Host Profile window, which allows you to create new host profile to be monitored.
Message Pool Manager
Displays the Message Pool window, which contains a list of all the messages.
Set the default interface parameters
Allows you to set default parameters for all or specific interfaces.
Bulk Configuration
Allows you to set default parameters for multiple hosts.
Get Routing Table
Displays routing information for a specific profile.
Get Interface Details
Displays a count of active and inactive interfaces.
Update interface status
Allows you to update the status of an interface.
Left Pane
The left pane contains various groups and hosts belonging to a specific group. You can right-click in the left pane to display the following options:
New
Displays the Host Profile window, which allows you to define a new host to be monitored.
Rename
Lets you rename the selected group or host.
Delete
Lets you delete the selected group or host from the list of monitored devices.
Expand group
Expands/opens the group folder to show all associated hosts.
Collapse group
Collapses/closes the group folder to hide all associated hosts.
Show all agents
If selected, displays the All Agents folder, which contains all defined agents (SNMP hosts). If not selected, hides the All Agents folder.
Note: If selected, overrides the Show All Agents option in the Setup window.
Properties
Displays the properties window for the selected host or interface, as selected in the left pane.
The properties window enables you to modify the properties of the selected host or interface.
Rename
Allows you to rename a selected interface.
This option is available only when a host is selected in the left pane.
Delete
Deletes the selected host or interface, as selected in the left pane.
Activate
Starts monitoring the selected host or interface, as selected in the left pane.
Deactivate
Stops monitoring the selected host or interface, as selected in the left pane.
Monitor
Displays the interface monitor window and starts monitoring the selected interface.
This option is available only when a host is selected in the left pane.
Refresh
Refreshes the list of hosts or interfaces in the right pane.
Query SNMP Agent
Displays the SNMP Query window showing the hostname, uptime, and SNMP system information.
Generate Checksums
Displays the Generate Interface Checksums window. The interface checksums are used to remap interface indexes when an interface
index shift occurs. This utility should be used for profiles/interfaces added using interface_traffic version 4.35 or earlier.
Rediscover interfaces
Sends an SNMP query to the agent hosting the selected interface, attempting to find all interfaces available on the agent. All interfaces
found are listed in the right pane.
This option is available only when a host/profile is selected in the left pane.
Virtual Interfaces
Used to create a new virtual interface and modify the details of an existing virtual interface.
Interface Status Indicators
General Setup
When you click the General Setup button on the toolbar, the Setup window is displayed.
This window contains the following tabs:
General
Advanced
General Tab
This tab is displayed when the General Setup button is clicked.
This tab is used to configure the general settings for the probe such as setting the log-level, sending the alarms when the interface is down, and
so on.
The fields in the General tab are described as follows:
Send Alarm Once (if Interface is Down)
Sends alarm once when the operational state for the interface is down. This alarm is sent with the Ignore other alarms when
operational state is down/not present/lower layer down field of the Advanced tab.
Send Alarm Once (if Interface is Unavailable/Deleted)
Sends alarm only once for the interface operational state, when an interface is deleted or unavailable. The alarm is sent with the Ignore
other alarms when operational state is down/not present/lower layer down field of the Advanced tab
Show All Agents
Displays a group called All Agents in the left pane. It contains all defined hosts.
Note: You can right-click in the left pane of the user interface and select Show All Agents to override this option.
Polling interval
Specifies the default interval for the probe to poll the selected host. You can override this value in the Host Properties window.
Log-level
Specifies the level of details that are written to the log file.
Default: 0 - Fatal
Note: Log as little as possible during normal operation to minimize disk consumption and increase the amount of detail when
debugging.
Log-size
Defines the size of the probe log file to which probe-internal log messages are written. When this size is reached, the contents of the file
are cleared.
Default: 100 KB
Note: In version 5.42 and later, the maximum logsize is enhanced to 2 GB (2097151 KB).
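The documented log-size behaviour (the file is cleared once it reaches the configured size, default 100 KB) can be illustrated as follows. The probe manages its log internally, so this helper is purely illustrative:

```python
import os

def truncate_log_if_full(path, max_kb=100):
    """Clear the log file once it reaches max_kb kilobytes."""
    if os.path.exists(path) and os.path.getsize(path) >= max_kb * 1024:
        open(path, "w").close()      # clear the contents, keep the file
        return True
    return False
```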
Note: The Use IfDescription for CI Name option is added from version 5.40 onwards of the probe. Refer to interface_traffic
(Interface Traffic Monitoring) Release Notes > Upgrade Considerations for more information.
Use Alias
Displays the alias name of the Configuration Item in the Description column on the probe GUI. By default, this checkbox is not selected.
When you select the checkbox, the probe GUI displays the interface description in the Name column and the interface alias in the
Description column (as in version 5.32 with SNMPv3 support).
Note: The Use Alias option is added from version 5.41 onwards of the probe. Refer to interface_traffic (Interface Traffic
Monitoring) Release Notes > Upgrade Considerations for more information.
Advanced Tab
This tab is used to maintain the connection information and also perform some advanced settings.
Note: This field is enabled only if Save inactive interface definitions is selected.
Note: You can select this and the Send Alarm Once (if Interface is Unavailable or Deleted) to configure the Op State
(operational state) alarm. The alarm is generated once when the interface is down or unavailable and all other alarms except
Admin State are ignored.
When you add a single host, the Host Profile window appears with the General Settings tab displayed. Through this tab, you can specify the host
to be added and configure some general settings for that host.
The fields in the General Settings tab are explained as follows:
Host address
Defines the host name or the IP address of the SNMP host.
Poll interval
Specifies the interval for the probe to poll the selected host.
SNMP Properties
This section defines the connection details for the host profile.
SNMP Version: selects the SNMP software version number (SNMPv1, SNMPv2c, or SNMPv3).
Authentication: selects the type of authentication strategy (none, MD5, or SHA).
Port: specifies the port to be used to send SNMP requests to the host.
Default: 161
Community/password
Represents community, if SNMPv1 or SNMPv2c is selected.
Represents password, if SNMPv3 is selected and enabled when Security is selected as AuthNoPriv or AuthPriv.
Show Community/password: displays the string in the Community/password field as plain text, if selected.
Username: specifies a username to access the monitored device.
Note: The Username, Security, Priv. Protocol, and Priv. Passphrase fields are enabled only when
SNMPv3 is selected.
Timeout
Selects the timeout value for the SNMP requests. Default: 1 second
Retries
Sets the maximum number of SNMP request attempts without a response from the device before the device is considered not
available. Default: 5
Monitoring group
Specifies the name of the group where the host profile is created. Default: Default group
Description
Specifies additional information about the monitored SNMP host in the profile.
SNMP V3 Support
The probe can monitor hosts/agents based on the SNMP V3 protocol. Adhere to the following guidelines when monitoring
SNMP V3 hosts:
If the same probe instance is monitoring multiple SNMP V3 hosts/agents, ensure that the EngineID of every host/agent is
unique. The absence of a unique EngineID causes sporadic connection timeouts and failure alarms.
The probe does not support creating multiple monitoring profiles for one V3 host/agent. Adding such duplicate profiles is disabled in most
of the probe GUI, except the Bulk Configure screen. Do not use the Raw Configure option or edit the configuration file directly to
create multiple profiles for the same V3 host/agent. Doing so causes the probe to produce unpredictable
results.
Advanced Tab
This tab is used to configure some advanced settings for the new host to be added. These settings include alarm and QoS identification settings,
number of interfaces to be monitored (through automatic detection or by specifying manually), and whether to re-map interfaces automatically
after an index shift or not.
The fields in the Advanced tab are explained as follows:
Alarm identification method
Selects alarm messages issued on breached thresholds from the selected host to be identified by the host address or the profile name.
QoS identification method
Selects QoS messages issued from the selected host to be identified by the host address, profile name, and interface description.
Prefix alarm messages with the 'Description'
Prefixes the alarm messages issued on breached thresholds from the selected host with the Description text string (from the General
Settings tab).
Monitor the number of interfaces
Specifies the number of interfaces on the selected host to be monitored. You can set the number of interfaces to be monitored with
the Manual option, or select Automatic Detection to detect the number of interfaces automatically.
Show Multicast/Broadcast Packets
Allows QoS and alarms to be generated for unicast, multicast, and broadcast packets among the processed packets. Refer to the
Monitor Interface Window section for more information.
Default: selected
Automatically re-map interfaces after an index shift
Sends an SNMP query to the host, attempting to rediscover all interfaces when modifications have been made on a monitored host (such
as replacement of interface cards, deleted/added VLANs etc.). The probe maps the active profiles to the new indexes using the interface
names found during the initial interface discovery.
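The re-mapping behavior described above can be sketched as follows. The data structures and names are assumptions for illustration, not the probe's internals: active profiles are keyed by the interface names recorded during the initial discovery, and after a rediscovery each name is re-attached to its new SNMP index.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexRemap {
    // Re-attaches active profiles to new SNMP indexes after an index shift by
    // matching the interface names found during the initial discovery.
    static Map<String, Integer> remap(Map<String, Integer> profilesByName,
                                      Map<Integer, String> rediscovered) {
        Map<String, Integer> updated = new HashMap<>();
        for (Map.Entry<Integer, String> e : rediscovered.entrySet()) {
            if (profilesByName.containsKey(e.getValue())) {
                updated.put(e.getValue(), e.getKey()); // name -> new index
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Integer> profiles = new HashMap<>();
        profiles.put("GigabitEthernet0/1", 2); // index before the shift
        Map<Integer, String> fresh = new HashMap<>();
        fresh.put(3, "GigabitEthernet0/1");    // same name, new index
        System.out.println(remap(profiles, fresh)); // {GigabitEthernet0/1=3}
    }
}
```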
Message Definitions Tab
This tab displays a list of token values and the message IDs.
You can select a message ID for the different message tokens.
Add Agent Range Window
You can add a range of IP addresses as host profiles in the probe. All hosts within the range are queried using a single connection.
The fields in the Add Agent Range window are described as follows:
IP address
Specifies the initial IP address for the range.
Number of agents
Specifies the Number of agents to query as SNMP hosts.
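The Add Agent Range fields can be thought of as expanding the initial IP address plus a count into the list of hosts to query. A hedged sketch under that assumption (the helper name is hypothetical):

```java
public class AgentRange {
    // Expands a starting IPv4 address and an agent count into the list of
    // host addresses the probe would query (illustrative only).
    static String[] expand(String startIp, int numberOfAgents) {
        String[] parts = startIp.split("\\.");
        long base = 0;
        for (String p : parts) base = base * 256 + Integer.parseInt(p);
        String[] hosts = new String[numberOfAgents];
        for (int i = 0; i < numberOfAgents; i++) {
            long ip = base + i;
            hosts[i] = (ip >> 24 & 255) + "." + (ip >> 16 & 255) + "."
                     + (ip >> 8 & 255) + "." + (ip & 255);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // IP address 192.168.1.10 with Number of agents 3:
        for (String h : expand("192.168.1.10", 3)) System.out.println(h);
        // 192.168.1.10, 192.168.1.11, 192.168.1.12
    }
}
```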
SNMP Query Window
The SNMP Query window allows you to send queries to SNMP hosts. This window is displayed when you either add a range of hosts or
drag-and-drop a host record to the probe.
Connection Information
Selects a connection to use for all hosts in the specified range from the drop-down list.
Default: Auto
Notes:
You must select a connection from the Connection Information drop-down before clicking the Start Query button to collect
SNMP data.
You can also select <Add New> from the Connection Information drop-down to add a new connection. Refer to Add
Connection for more information.
Timeout
Selects the timeout value for the SNMP requests.
Default: 1 second
Retries
Sets the maximum number of SNMP requests that are sent without a response before the device is considered unavailable.
Default: 5
Monitoring group
Specifies the name of the group where the host profile is created.
Default: Default group
Activate interface monitor
Automatically activates monitoring of the interfaces detected on the hosts found by the query.
You can select one of the three available monitoring criteria:
The Set Collect Interval option lets you select another collect interval (how often the graph is updated with data from the monitor).
Default: 3 seconds.
The fields in the above window are explained as follows:
Interface Data Type
Indicates the interface name, identity, and interface speed. If the speed has changed when the interface is rediscovered, the updated
speed is shown.
State (indicator)
Indicates the interface status (green = OK, red = error, yellow = unknown).
Scale unit
Sets the units for the scales (B/s, KB/s, etc.).
Auto scale
If selected, the scales will automatically be adjusted to match the current traffic.
Inbound/Outbound
Traffic
Displays the current traffic.
Bandwidth
Indicates the current percentage of the bandwidth (max. speed). Red means that the max threshold is exceeded, Green means OK.
Note: If the Show Multicast/Broadcast Packets option is selected on the Advanced tab, the Broadcast and Multicast traffic is
shown in the above graph instead of the NUcast traffic.
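As a rough illustration, the bandwidth gauge expresses the current traffic as a percentage of the interface's maximum speed. The formula below is an assumption for illustration; the probe's exact computation may differ:

```java
public class BandwidthPercent {
    // Current traffic (bytes/s) as a percentage of the interface speed
    // (bits/s). Above 100 % the gauge would show red (threshold exceeded).
    static double percent(double bytesPerSec, double speedBitsPerSec) {
        return 100.0 * (bytesPerSec * 8) / speedBitsPerSec;
    }

    public static void main(String[] args) {
        // 12.5 MB/s on a 100 Mbit/s link is exactly 100 % utilization.
        System.out.println(percent(12_500_000, 100_000_000)); // 100.0
    }
}
```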
<InterfaceName> Window
Double-click an interface (or right-click the selected interface and choose Edit) to bring up a window that enables you to activate
various monitoring options.
The window contains five tabs - Traffic, Packets, Queue Length, State, and Advanced.
Traffic Tab
Traffic is defined as the number of bytes transmitted over the interface. When you double-click an interface (or select the Edit option), the
Traffic tab is displayed by default. This tab is used to publish QoS using different criteria, set the low and high threshold values, set the alarm
severity level, and configure other alarm-generation settings.
Notes:
You must specify either Low threshold or High threshold; specifying at least one of these values is mandatory.
An error message is displayed if the using values option is selected with both thresholds and you specify only Low
threshold, only High threshold, or neither.
The same error message is displayed if one of the checkboxes is selected and you do not enter any value.
Note: This field is enabled only when the maximum traffic value, Max (Extreme) value, is provided.
Set Default
Displays the Default interface settings dialog.
Each option, when selected, opens a confirmation dialog.
Monitor
Displays the interface traffic in a graphical format.
Packets Tab
This tab is used to configure various settings for Error Packets, Discarded Packets, and Processed Packets.
Note: The alarm severity and message string of Max (Extreme) value can be changed from the Message Pool dialog.
Action Required: selects the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: The zero is used in alarms and QoS.
Send Alarm when Max value is breached: selects this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
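The four Action Required behaviors can be sketched as a single dispatch on the configured action. This is a simplified model, not probe code; the returned value stands for what would be placed in the QoS message, with null standing in for a NULL QoS sample:

```java
public class MaxValueAction {
    // Applies the configured "Action Required" when a sample breaches the
    // Max (Extreme) value; below the maximum the sample passes through.
    static Double apply(String action, double sample, double maxValue) {
        if (sample <= maxValue) return sample;          // no breach
        switch (action) {
            case "No Action":     return sample;        // report actual value
            case "Use max Value": return maxValue;      // clamp to the defined maximum
            case "Discard":       return null;          // NULL QoS, no threshold alarm
            case "Zero":          return 0.0;           // report zero
            default: throw new IllegalArgumentException(action);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("Use max Value", 1500, 1000)); // 1000.0
        System.out.println(apply("Zero", 1500, 1000));          // 0.0
        System.out.println(apply("Discard", 1500, 1000));       // null
    }
}
```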
Discarded Packets
The number of packets that were discarded from the interface. You can select QoS to be published as:
Packets per second
Number of packets
Packets per second and number of packets
Percentage %
Publish Quality of Service (QoS): selects this option to publish the Quality of Service (number of discarded packets) for the selected
interface when checked.
Enable monitoring
Enable monitoring of the number of discarded packets on the interface.
Max. discarded: specifies the maximum number of packets discarded by the interface that are allowed before an alarm is
issued. Alarms can be raised for pkts\s, pkts, and percent.
Alarm severity level: selects the severity level of the alarms issued when the maximum monitoring threshold is breached.
Max (Extreme) value: specifies the threshold for the extreme maximum number of discarded packets from the interface that are
allowed before an alarm is issued.
Extreme severity level: selects the severity level of the alarms issued when the extreme maximum monitoring threshold is
breached.
Note: The alarm severity and message string of Max (Extreme) value can be changed from the Message Pool dialog.
Important! The probe is unable to provide a correct value for the current percentage of interface errors. This happens
when you have enabled the monitoring of errors and discarded packets. For more information about how to enable
the probe to calculate the correct value for current percentage of interface errors, see interface_traffic
Troubleshooting.
Action Required: selects the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: The zero is used in alarms and QoS.
Send Alarm when Max value is breached: selects this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
Processed Packets
The number of packets that were processed by the interface. You can select QoS to be published as:
Packets per second
Number of packets
Packets per second and number of packets
Percentage %
Publish Quality of Service (QoS): selects this option to publish the Quality of Service (number of processed packets) for the
selected interface when checked.
Enable monitoring: enables monitoring of the number of processed packets on the interface. You can specify the maximum number
of packets processed by the interface that are allowed before an alarm is issued. Alarms can be raised for pkts\s and pkts.
Max (Extreme) value: specifies the threshold for the extreme maximum number of processed packets from the interface that are
allowed before an alarm is issued.
Action Required: selects the action to be performed when the maximum value is breached. This field can have the following values:
No Action: The probe generates the alarms and QoS with the actual values, if applicable fields are selected.
Use max Value: The defined maximum value is used in alarms and QoS.
Discard: The defined maximum value is discarded. No alarm related to the threshold is sent. In addition, NULL QoS is sent.
Zero: The zero is used in alarms and QoS.
Send Alarm when Max value is breached: selects this option to send an alarm when defined max value is breached.
Note: This field is enabled only when Max (Extreme) value is provided.
Queue Length Tab
You can use this tab to specify the length of the output packet queue. Queue length is the maximum permissible number of packets in the output
queue before an alarm is raised.
The fields in the Queue Length tab are explained below:
Publish Quality of Service
Publishes Quality of Service (the length of the output packet queue) for the selected interface when checked.
Enable monitoring
Enables monitoring of the length of the output packet queue on the interface.
Note: If the current state of the interface is not in the list of legal states, an alarm with the selected severity level is generated.
Note: This checkbox is enabled only when both Enable monitoring checkboxes are selected.
Advanced Tab
Using this option requires in-depth understanding of the implications to the monitoring profile.
The fields in the Advanced tab are explained as follows:
Interface speed
The valid options are:
Automatic detection: The speed of the interface is automatically detected.
Manual override: You can specify a value and thus override the detected interface speed. The speed can be specified using one of
these units: B/s, Kb/s, Mb/s, and Gb/s. The speed rate specified is equally divided on inbound and outbound traffic.
Manual override per direction: Overrides the speed individually per direction. The speed can be specified using one of these units:
B/s, Kb/s, Mb/s, and Gb/s.
Override Outbound Traffic Monitoring
Allows you to manually override the high and low thresholds for Outbound Traffic monitoring.
QoS Destination
Overrides the QoS target value with interface name, description, and user-specified description. Depending on the option selected in this
field, the same appears in the QoS.
User specified description
Defines a custom description. When listing the interfaces for a host, this description will appear in the User column. This option makes it
easier to distinguish between different interfaces.
Rediscover Interfaces
All interfaces are discovered during the initial SNMP query. There are, however, situations where new interfaces are added to the device after a
period of time. For example, a new subnet is needed and so on.
When a host is selected in the left pane, you can rediscover new or removed interfaces by right-clicking in the right pane and selecting
Rediscover Interfaces.
Notes:
You can use Rediscover Interfaces to "undo" a previous delete.
You can delete the interfaces that are not required by right-clicking the device and selecting Delete from the pop-up menu.
Select the Activate option to activate the interface; the status in the Active column changes to yes. Select the Deactivate
option to deactivate the interface.
Select the Monitor option to display the interface traffic data for the selected interface. Refer to the Monitor Interface Window section.
Virtual Interfaces
A Virtual Interface in this context is an interface that combines inbound and outbound interface statistics from two different physical interfaces. It is
possible to cross-connect the physical interfaces, so that the inbound on the physical interface is treated as the outbound for the Virtual interface.
The Virtual Interface window provides options for doing this.
After creating a Virtual Interface, it will appear in the probe GUI together with the physical interfaces, and it is configured (QoS and Alarms
settings) like any other interface. The virtual interface is automatically configured with a set of standard monitoring parameters, which you should
edit according to your requirements.
The probe will use data collected from the physical interfaces to send QoS and Alarms for the Virtual Interface. When created, the interface can
be treated and configured as any physical interface when it comes to QoS, monitoring, bulk-configuration, and so on.
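The cross-connect option can be modeled as choosing which physical counters feed the virtual interface's inbound and outbound sides. A simplified sketch, with an assumed {inOctets, outOctets} field layout that is not taken from probe sources:

```java
public class VirtualInterface {
    // Combines counters from two physical interfaces into one virtual
    // interface. With crossConnect, the counters are swapped per direction so
    // that inbound on a physical interface feeds the virtual outbound side.
    static long[] combine(long[] physA, long[] physB, boolean crossConnect) {
        // each long[] is {inOctets, outOctets}
        long in  = crossConnect ? physA[1] : physA[0];
        long out = crossConnect ? physB[0] : physB[1];
        return new long[] { in, out };
    }

    public static void main(String[] args) {
        long[] a = { 100, 200 }, b = { 300, 400 };
        long[] v = combine(a, b, true);
        System.out.println(v[0] + " " + v[1]); // 200 300
    }
}
```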
interface_traffic Metrics
Contents
QoS Metrics
Alert Metrics Default Settings
This section describes the metrics for the Interface Traffic Monitoring (interface_traffic) probe.
QoS Metrics
The following table describes the checkpoint metrics that can be configured using the interface_traffic probe.
Monitor Name | Units | Description | Version
QOS_INTERFACE_ADMINSTATUS | State | Displays the administrative state of the interface. | v4.0
QOS_INTERFACE_DISCARDS | Packets/sec | Displays the number of inbound and outbound packets discarded by the interface. | v4.0
QOS_INTERFACE_ERRORS | Packets/sec | Displays the number of inbound and outbound error packets on the interface. | v4.0
QOS_INTERFACE_OPSTATE | State | Displays the operational state of the interface. | v4.0
QOS_INTERFACE_PACKETS | Packets/sec | Displays the number of inbound and outbound packets on the interface. | v4.0
QOS_INTERFACE_QLEN | Packets | Displays the queue length of the outbound packets across the interface. | v4.0
QOS_INTERFACE_TRAFFIC | Bytes/sec | Displays the inbound and outbound traffic on the interface in bytes per second. | v4.0
QOS_INTERFACE_TRAFFIC_KBITS | Kilobits/sec | Displays the inbound and outbound traffic on the interface in kilobits per second. | v4.0
Alert Metrics Default Settings
Alarm | Default Severity | Description
InOctets | profile | -
OutOctets | profile | -
InOctetsExtreme | critical | The max value has been breached by inbound traffic on the interface.
OutOctetsExtreme | critical | The max value has been breached by outbound traffic on the interface.
NoTraffic | profile | -
InErrors | profile | -
OutErrors | profile | -
InErrorsExtreme | profile | The max value has been breached by inbound error packets on the interface.
OutErrorsExtreme | profile | -
InDiscards | profile | -
OutDiscards | profile | -
InDiscardsExtreme | profile | The max value has been breached by inbound discarded packets on the interface.
OutDiscardsExtreme | profile | The max value has been breached by outbound discarded packets on the interface.
OutQLen | profile | -
OpStatus | profile | -
AdminStatus | profile | -
AgentState | critical | -
IndexShift | major | The interface index number has changed for the interface. You need to rediscover the interface.
InterfaceCount | warning | -
InPackets | major | -
OutPackets | major | -
InPacketsExtreme | critical | The max value has been breached by inbound packets on the interface.
OutPacketsExtreme | critical | The max value has been breached by outbound packets on the interface.
NoMaxDefined | critical | The max interface speed of the interface on the agent could not be determined.
VirtualInterface | critical | The physical interface(s) to use for the virtual interface on the agent could not be determined.
InterfaceExist | major | The interface index on the agent does not exist on the MIB.
interface_traffic Troubleshooting
This article contains troubleshooting points for the interface_traffic probe.
Contents
Probe Calculates More Than 100 % for Error and Discarded Packets
Probe Calculates More Than 100 % for Error and Discarded Packets
Symptom:
The probe is unable to provide a correct value for the current percentage of interface errors. This happens when you enable the monitoring of
errors and discarded packets.
Solution:
Do the following:
1. Go to the Raw Configuration interface of the interface_traffic probe.
2. Set the value of the incerrdistototal key to 1.
Default: 0
3. Restart the probe.
When the value of this key is set to 1, the error and discarded packets are added to the total packets, and the probe can calculate the
correct value for the current percentage of interface errors.
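The effect of the incerrdistototal key can be illustrated with a small calculation: when errored and discarded packets are excluded from the total, the ratio's denominator is too small and the percentage can exceed 100 %. The arithmetic below illustrates the principle; it is not the probe's actual code:

```java
public class ErrorPercentage {
    // Computes the error percentage. With incErrDisToTotal set, errored and
    // discarded packets are counted in the total, keeping the ratio <= 100 %.
    static double errorPercent(long errors, long discards, long processed,
                               boolean incErrDisToTotal) {
        double total = processed + (incErrDisToTotal ? errors + discards : 0);
        return 100.0 * errors / total;
    }

    public static void main(String[] args) {
        // 60 errors and 60 discards, but only 50 packets counted as processed:
        System.out.println(errorPercent(60, 60, 50, false)); // 120.0 (over 100 %)
        System.out.println(errorPercent(60, 60, 50, true));  // about 35.3
    }
}
```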
PostgreSQL
IBM DB2
IBM Informix
More information
jdbc_response (Java Database Connectivity SQL Queries Response Monitoring) Release Notes
jdbc_response AC Configuration
This article describes the configuration concepts and procedures to set up the jdbc_response probe. Configure this probe to monitor the
connection to the JDBC database, by executing custom SQL queries. A profile is used to define these queries and also to configure the alarms
and QoS.
The following diagram outlines the process to configure this probe:
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see jdbc_response (Java
Database Connectivity SQL Queries Response Monitoring) Release Notes .
Verify that the exclude_connection_time key in Raw Configuration is set to yes.
Note: Connection time is one of the parameters that is used to calculate the response time. If this connection time is not
excluded from the calculation by setting the exclude_connection_time key to yes, the response time keeps increasing.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Note: The default driver name is used for Microsoft SQL Server. You need to update it for other databases.
Driver Path: specifies the absolute file system path to the JDBC driver. Example: [CA UIM Installation
Directory]/probes/database/jdbc_response/lib/sql_drv.jar
Note: For Microsoft SQL Server and Oracle databases, drivers are already installed with the probe. So, this field is disabled.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Timeout (Unit): defines the unit for measuring the value of timeout.
Default: sec
Connection Error: select the Active check box to generate an alarm when the connection to the database fails. The specified Severity and
Message are sent when the connection fails.
Connection Established: select the Active check box to generate an alarm when the connection to the database is successful. The
specified Severity and Message are sent when the connection is made.
9. Click on Actions and select Test Connection. If the test fails, check the log file for error messages.
The profile is configured.
Note: If you do not want to use this connection, click the Options icon on the connection and click Delete Connection.
Add a Query
You can add a query which, when executed, helps to monitor the database connection.
Follow these steps:
1. Click the Options icon beside the connection name node.
2. Click the Add New Query option.
3. Enter the query name in Add New Query dialog and click Submit.
A success message dialog appears.
4. Click Reload.
5. Navigate to the <Query Name> node.
6. Update the general information of the query:
Active: allows you to use this query for monitoring upon creation.
Description: defines the query description.
Alarm source override, QoS source override: defines values that override the default source value, which is the robot name
where the probe is deployed.
Important! CA does not recommend changing the source override fields after the initial configuration. If you
change the QoS source later, multiple graphs are displayed on the Unified Service Management (USM) Metrics view (one
for every QoS source value). CA also recommends keeping the source identical for both alarm and QoS.
Run Interval (Value): specifies the time interval after which a profile runs automatically.
Default: 5
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Run Interval (Units): specifies the unit for measuring the Run Interval.
Default: min
Simple Query: specifies the query to be executed.
Query Timeout (Value): selects the time interval after which an alarm is generated when the SQL query fails to validate.
Default: 5
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Query Timeout (Units): specifies the unit for the Query Timeout value.
Default: min
Timeout Error: defines the severity level of the alarm to be generated when the SQL query fails to validate.
7. Click on Actions and select Test SQL Query to test if the query is valid.
The profile is saved and you can configure the alarms and QoS for the probe.
Note: If you do not want to use this query, click the Options icon on the query and click Delete Query.
Note: If character or regular expression is selected, only the "=" and "!=" operators are valid. If regular expression is selected, a Java
regular expression, as defined in the Java Pattern class, is used to check the value.
(If you are adding information under Value section) Column: defines the column value to be checked. If multiple rows are returned
from the SQL query, the value is always from the first row.
(If adding information for Row count and Value sections) Operator: defines the operator to use to check the number of rows returned
against the threshold values. For example if the Operator is "<=" and the threshold value is 10 and the number of rows returned is 9,
an alarm will be sent for that threshold. If the next query returns 11 rows, a clear alarm (if configured) will be sent.
High Threshold: select the check box to define the properties for high threshold. Specify the maximum Value (in milliseconds) that
will be compared with the value returned from the SQL query. For example, if the response time exceeds this threshold value, an
alarm is generated. The Severity of the alarm and Message is sent when this threshold value is breached.
Low Threshold: select the check box to define the properties for low threshold. Specify the minimum Value (in milliseconds) that will
be compared with the value returned from the SQL query. For example, if the response time exceeds this threshold value, an alarm is
generated. The Severity of the alarm and Message is sent when this threshold value is breached.
Clear: select the check box to define the clear alarm. Define the Severity of the alarm to clear or maintain history of the alarms. The
Message is sent when the threshold value is not breached.
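The Operator comparison described above can be sketched as a simple dispatch on the configured operator; this is a simplified model of the row-count check, with illustrative names:

```java
public class RowCountThreshold {
    // Returns true when the returned row count matches the operator against
    // the threshold, which raises an alarm for that threshold.
    static boolean breached(String operator, long rows, long threshold) {
        switch (operator) {
            case "<=": return rows <= threshold;
            case ">=": return rows >= threshold;
            case "=":  return rows == threshold;
            case "!=": return rows != threshold;
            case "<":  return rows < threshold;
            case ">":  return rows > threshold;
            default: throw new IllegalArgumentException(operator);
        }
    }

    public static void main(String[] args) {
        // Operator "<=" with threshold 10: 9 rows breaches, 11 rows clears.
        System.out.println(breached("<=", 9, 10));  // true  -> alarm sent
        System.out.println(breached("<=", 11, 10)); // false -> clear alarm (if configured)
    }
}
```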
Alarm Thresholds
The alarm threshold options that are available can vary depending on the probe versions installed at the hub level. The alarm threshold settings to
allow the probe to:
Send alarms when threshold criteria is met
Indicate to baseline_engine to compute baselines
See Configuring Alarm Thresholds for details.
jdbc_response IM Configuration
This article describes the configuration concepts and procedures to set up the jdbc_response probe. Configure this probe to monitor the
connection to the JDBC database, by executing custom SQL queries. A profile is used to define these queries and also to configure the alarms
and QoS.
The following diagram outlines the process to configure this probe:
Verify Prerequisites
Configure Log Properties
Create a Connection
Create a Profile
Configure Alarms and QoS
Run the Profile
Verify Prerequisites
Verify that required hardware and software is available before you configure the probe. For more information, see jdbc_response (Java
Database Connectivity SQL Queries Response Monitoring) Release Notes.
Verify that the exclude_connection_time key in Raw Configuration is set to yes.
Note: Connection time is one of the parameters that is used to calculate the response time. If this connection time is not
excluded from the calculation by setting the exclude_connection_time key to yes, the response time keeps increasing.
Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail
when debugging.
Create a Connection
To enable the monitoring process, create a connection to the JDBC database.
Follow these steps:
1. Right-click on the Connections tab and click New.
2. Enter the name for the connection and click OK.
The New Connection window appears.
3. Complete the following field information:
Win/Domain Authorization: select the check box, if you want to use Windows integrated security for MS SQL Server to connect to
the database.
Database URL: specifies the JDBC URL. Examples:
IBM DB2
jdbc:db2://IP address:PortNumber/Database Name
IBM Informix
jdbc:informix-sqli://IP address:Port Number/Database Name:INFORMIXSERVER=Server Name
Microsoft SQL Server
jdbc:sqlserver://IP address:Port Number;DatabaseName=Database Name
MySQL
jdbc:mysql://IP address:Port Number/Database Name
Oracle
jdbc:oracle:thin:@IP address:Port Number:Database Name
PostgreSQL
jdbc:postgresql://IP address:Port Number/Database Name
Driver Name: specifies the java class name of the JDBC driver. Examples:
com.ibm.db2.jcc.DB2Driver
com.informix.jdbc.IfxDriver
com.microsoft.sqlserver.jdbc.SQLServerDriver
com.mysql.jdbc.Driver
oracle.jdbc.driver.OracleDriver
org.postgresql.Driver
Driver Path: specifies the absolute file system path to the JDBC driver. Example: [CA UIM Installation
Directory]/probes/database/jdbc_response/lib/sql_drv.jar
Note: For Microsoft SQL Server and Oracle databases, this field is disabled as drivers are already installed with the probe.
(If Win/Domain Authorization check box is not selected) User ID: enter the User ID to connect to the database.
(If Win/Domain Authorization check box is not selected) Password: enter the password of the specified user ID.
Timeout: defines the time for which the probe will attempt to connect to the database.
Connection Error Alarm: defines the severity of the alarm when the connection fails.
Connection Error Alarm Message: specifies the alarm message to be sent to the hub when the connection fails.
Connection Established Alarm: defines the severity of the clearing alarm when the connection is successful.
Connection Established Alarm Message: specifies the message to be sent when the connection is successful.
4. Click the Test button.
If a connection is made to the database, the LED turns green. Otherwise the LED remains black. Check the log file for error messages.
Notes:
To modify an existing connection, right click or double-click on a connection and select Edit.
To delete an existing connection, right click on a connection and select Delete.
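The Test button's behavior can be approximated with plain java.sql calls: try to open the configured URL within the timeout and report success (green LED) or failure. The URL and credentials below are placeholders, and with no driver on the classpath the call simply fails; this is a sketch of the check, not the probe's implementation:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ConnectionTest {
    // Attempts to open the JDBC URL with the given credentials within the
    // timeout; true corresponds to the green LED, false to a failed test
    // (check the probe log file for the underlying error in that case).
    static boolean canConnect(String url, String user, String password,
                              int timeoutSec) {
        DriverManager.setLoginTimeout(timeoutSec);
        try (Connection c = DriverManager.getConnection(url, user, password)) {
            return c.isValid(timeoutSec);
        } catch (SQLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; without a matching driver on the classpath this
        // fails fast with "No suitable driver" and returns false.
        System.out.println(canConnect("jdbc:mysql://db.example.com:3306/test",
                                      "monitor", "secret", 5)); // false
    }
}
```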
Create a Profile
Profiles are associated with a configured connection and contain the SQL query to be executed on the database. As a system administrator, you
can create multiple profiles to monitor the database response time.
Follow these steps:
1. Right-click on the Profiles tab and click New from the context menu.
2. Enter the name of the profile and click OK.
The Edit Profile tab appears.
3. Complete the following fields on the General tab:
Connection: defines the database connection to be used by this profile.
Alarm Source Override, QoS Source Override: defines values that override the default source value, which is the robot name
where the probe is deployed.
Important! CA does not recommend changing the source override fields after the initial configuration. If you
change the QoS source later, multiple graphs are displayed on the Unified Service Management (USM) Metrics view (one
for every QoS source value). CA also recommends keeping the source identical for both alarm and QoS.
Timeout error: defines the severity level of the alarm to be generated when the SQL query fails to validate.
Query Timeout: selects the time interval after which an alarm is generated when the SQL query fails to validate.
Note: Reduce this interval to generate alarms frequently. A shorter interval can also increase the system load.
Run Interval: specifies how often the SQL query and tests should run.
4. Navigate to the SQL Query tab.
5. Enter the SQL query in Simple Query or read the query from a file using the From File option.
Note: If you select the From File option, ensure that the query file is located in the probe home directory. Once the file name is
entered, the query can be loaded by clicking on the Read File button.
Notes:
To copy the attributes of a selected profile to another, right click on a profile and select Copy.
To modify an existing profile, right click or double-click on a profile and select Edit. If changes have been made to a profile,
restart the probe for the changes to take effect.
To delete an existing profile, right click on a profile and select Delete.
(If Value tab is selected) Comparison: defines the type of comparison that checks the value returned from the SQL query. For
example, to check the numeric value of the column, select Numeric. Consider the following points:
If character or regular expression is selected, "=" and "!=" operators are valid.
If regular expression is selected, a Java regular expression as defined in the Java Pattern class is used to check the value.
(If Value tab is selected) Column: defines the column value to be checked. If multiple rows are returned from the SQL query, the
value is always from the first row.
(If Row Count tab or Value tab is selected) Operator: defines the operator to use to check the number of rows returned against the
threshold values. For example if the Operator is "<=" and the threshold value is 10 and the number of rows returned is 9, an alarm will
be sent for that threshold. If the next query returns 11 rows, a clear alarm (if configured) will be sent.
High Threshold: specifies the maximum Value (in milliseconds) that will be compared with the value returned from the SQL
query. For example, if the response time exceeds this threshold value, an alarm is generated. The Severity of the alarm and Messag
e is sent when this threshold value is breached.
Low Threshold: specifies the minimum Value (in milliseconds) that will be compared with the value returned from the SQL query. For
example, if the response time exceeds this threshold value, an alarm is generated. The Severity of the alarm and Message is sent
when this threshold value is breached.
Clear Severity: defines the Severity of the alarm to clear or maintain history of the alarms. The Message is sent when the threshold
value is not breached.
Note: By default, the Clear Severity level is set to inactive to maintain alarm history in the alarm console.
Note: For more information about using variables in a message, see Attribute Substitution.
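The character and regular-expression comparisons described above rely on standard java.util.regex semantics, as the reference to the Java Pattern class indicates. A hedged sketch of the check (the method and names are illustrative, not the probe's code):

```java
import java.util.regex.Pattern;

public class ValueComparison {
    // Checks a column value against the expected string: either a plain
    // character comparison or a Java regular expression. Only the "=" and
    // "!=" operators are valid for these comparison types.
    static boolean matches(String operator, String columnValue, String expected,
                           boolean asRegex) {
        boolean eq = asRegex ? Pattern.matches(expected, columnValue)
                             : columnValue.equals(expected);
        if ("=".equals(operator))  return eq;
        if ("!=".equals(operator)) return !eq;
        throw new IllegalArgumentException("only = and != are valid here");
    }

    public static void main(String[] args) {
        System.out.println(matches("=", "ONLINE", "ON.*", true));      // true
        System.out.println(matches("!=", "OFFLINE", "ONLINE", false)); // true
    }
}
```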
jdbc_response Metrics
The following section describes the metrics that can be configured with the Java Database Connectivity SQL Queries Response Monitoring
(jdbc_response) probe.
Contents
QoS Metrics
Alert Metrics Default Settings
QoS Metrics
The following table describes the QoS metrics that can be configured using the jdbc_response probe.
Monitor Name | Units | Description | Version
QOS_JDBC_RESPONSE | Milliseconds | Response time of the SQL query. | 1.1
QOS_JDBC_ROWS | Rows | Number of rows returned by the SQL query. | 1.1
QOS_JDBC_VALUE | Value | Value returned by the SQL query. | 1.1
Alarm | Warning Threshold | Warning Severity | Error Threshold | Error Severity | Description | Version
Response time | NA | Warning | NA | Major | Monitors the alarm value that is sent if the response time of the profile exceeds the Value setting in milliseconds. | 1.1
Rows | NA | Warning | NA | Major | Monitors the alarm value that is sent if the number of rows of the profile exceeds the Condition and Value settings. | 1.1
Value | NA | Warning | NA | Major | Monitors the alarm value that is sent if the value exceeds the threshold value. | 1.1
jdbc_response Troubleshooting
This section contains troubleshooting information for the jdbc_response probe.
-DNIMV_CONTIP=$NIMV_CONTIP -DNIMCPRID=$NIMCPRID
-DNIM_PROBE_KEY=$NIM_PROBE_KEY -DNIMPROBEPORT=$NIMPROBEPORT
-DNIM_CONTROLLER_PORT=$NIM_CONTROLLER_PORT
-DNIM_SPOOLER_PORT=$NIM_SPOOLER_PORT -cp "./*;lib/*"
-Djava.library.path="C:/Program Files/Nimsoft/probes/database/jdbc_response/lib"
com.nimsoft.nimbus.probes.database.jdbc_response.JdbcResponse
d. Launch the probe, right-click on the Connections tab, and click New.
e. Enter all the field information.
f. Click the Test button.
g. Go to the Run command window, specify services.msc, and click OK.
h. Right-click the Nimsoft Robot Watcher service and select the Properties option.
i. Click the Log on tab and select the This account option.
j. Specify the username in the This account field.
k. Define the password to establish the connection and click OK.
l. Restart the Nimsoft Robot Watcher service.
m. If you are still unable to create the connection, check the following points:
If you are running a 32-bit Java Virtual Machine (JVM) on an x64 operating system, use the sqljdbc_auth.dll file in the x86 folder.
If you are running a 64-bit JVM on an x64 processor, you do not need to update the dll file.
If you are running a 64-bit JVM on an Itanium processor, use the sqljdbc_auth.dll file in the IA64 folder.
Note: You can download Microsoft JDBC Drivers 4.2, 4.1, and 4.0 for SQL Server. (sample path for download: C:\Microsoft
SQL Server JDBC Driver\sqljdbc_<version>\enu\auth\x86).
Substitution
$profile
$url
$query
$con
$threshold
$time
$rows
$value
$condition
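The substitution variables listed above are expanded into the alarm or message text at run time. A minimal sketch of that kind of expansion, using Python's string.Template purely for illustration (the probe is Java; the sample values here are hypothetical):

```python
from string import Template

# Hypothetical run-time values, keyed by the substitution variable names
# listed above (without the leading "$").
context = {
    "profile": "orders-check",
    "threshold": "10",
    "time": "42",        # response time in milliseconds
    "rows": "3",
    "condition": "<=",
}

# A message definition using the $variable syntax from the documentation.
message = Template("Profile $profile: $rows rows in $time ms "
                   "(threshold $condition $threshold)")
print(message.substitute(context))
# Profile orders-check: 3 rows in 42 ms (threshold <= 10)
```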
Click the General tab and select the connection you created.
Click the Subscribe tab and select subject nas_transaction and table AlarmTransactionLog.
Activate the profile, save it and watch the table get filled.
Setup Tab
Providers Tab
Connections Tab
Profiles Tab
General Tab
SQL Query Tab
Alarm Definition Tab
Publish Message Tab
Quality of Service Definition Tab
Subscribe Tab
Setup Tab
The fields are explained below:
Log File
The file where the probe logs information about its internal activity.
Log Level
Sets the level of details written to the log file. Log as little as possible during normal operation to minimize disk consumption, and
increase the amount of detail when debugging.
Connection Error Severity
Lets you specify the severity level for the alarms issued when communication errors occur.
Providers Tab
The Providers tab displays the list of jdbc connection providers that can be used by all the profiles.
Connections Tab
The Connections tab contains a pool of database connections that can be used by all the profiles.
When you add a new connection, the Edit Connection dialog opens.
The fields are explained below.
Provider
Select the connection provider from the drop-down list.
Driver
Define the driver for the selected provider.
URL
Type the URL to access the server.
Example:
Microsoft SQL Server
jdbc:sqlserver://IP address:Port Number;DatabaseName=Database Name
MySQL
jdbc:mysql://IP address:Port Number/Database Name
Oracle
jdbc:oracle:thin:@IP address:Port Number:Database Name
User ID
Enter the database User ID.
Password
Enter the valid password for the given user id.
QOS Generation
Enter the QOS Server (Target Host) string
Test button
Click this button to test the defined connection.
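The URL formats listed above follow provider-specific templates. The following sketch assembles them in Python purely for illustration (the host, port, and database names are hypothetical):

```python
# JDBC URL templates for the three example providers documented above.
URL_TEMPLATES = {
    "sqlserver": "jdbc:sqlserver://{host}:{port};DatabaseName={db}",
    "mysql":     "jdbc:mysql://{host}:{port}/{db}",
    "oracle":    "jdbc:oracle:thin:@{host}:{port}:{db}",
}

def jdbc_url(provider: str, host: str, port: int, db: str) -> str:
    """Fill a provider's URL template with connection details."""
    return URL_TEMPLATES[provider].format(host=host, port=port, db=db)

# Hypothetical connection details:
print(jdbc_url("mysql", "10.0.0.5", 3306, "orders"))
# jdbc:mysql://10.0.0.5:3306/orders
```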
Profiles Tab
New profiles can be created by right-clicking and selecting New under the Profiles tab.
A "Profile" is a definition of one specific JDBC Gateway task. The tasks can be one of the following types:
1. Send alarms
2. Publish messages
3. Publish QoS messages
4. Subscribe to messages.
You can create four types of profiles:
1. Alarm
2. Publish
3. Qos
4. Subscribe
When you click the OK button, the Profile dialog appears. The first two tabs of the Profile dialog are common to most of the profile types,
and contain:
General Tab
Messages without any mapping will post a message using all the values in the row and the column names as variable names.
The field is explained below.
Subject
The subject used for publishing.
To create a new variable:
1. Right click in the Variable Mapping frame.
2. Select New from the shortcut menu.
The New Variable Mapping dialog opens.
The fields are explained below.
Variable
The variable name used in the message; it must be unique within the message.
Value
The value of the variable. Supports variable expansion. For example, '$b follows $b'.
Quality of Service Definition Tab
There are three different QoS types supported by the JDBC gateway.
Query Response. How long (in milliseconds) it took to run the SQL query.
Row Count. How many rows the SQL query returned.
Value QoS. Sends the value of selected column (must be a numeric value) returned by the SQL query.
You can send a QoS message and/or send an alarm if the threshold is breached.
The fields are explained below.
Send QoS Message
Enable/disable Quality of Service messages.
Send Alarm
Enable/disable Alarm.
Severity
Alarm severity.
Message
Alarm message text.
Subsystem
Alarm subsystem.
Threshold
Alarm condition and threshold value.
Suppression key
Suppression key is used to create a logical checkpoint that is used for suppression of alarms.
Source of Sender
Used to impersonate a source other than the machine that runs the JDBC Gateway.
Subscribe Tab
A subscriber profile makes it possible to subscribe to a subject and insert the data into a specific table. You must create the target table before
you configure a new Subscribe profile. Make sure that the columns are named corresponding to the message that you want to subscribe to and
that they are of the correct data type. This will make it much easier to create the profile.
The fields are explained below.
Subject
Select the subject you want to subscribe to.
Table
The table you want to insert the data into.
You have to select a connection before the Table pull-down is populated. When you create the profile, a queue is created on the hub as well.
This queue is enabled/disabled corresponding to the profile, and it is deleted when you delete the profile.
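The Subscribe flow described above — a target table whose columns match the message fields — can be sketched as follows. This is an illustrative Python/sqlite3 sketch only; the real probe inserts through JDBC, and the table and field names here are hypothetical:

```python
import sqlite3

# Create the target table up front, with column names matching the
# message fields we intend to subscribe to (as the text advises).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE AlarmTransactionLog "
            "(hostname TEXT, severity INTEGER, message TEXT)")

# A received message represented as a dict of field -> value.
msg = {"hostname": "dbhost01", "severity": 4, "message": "row count high"}

# Because columns and message fields share names, the INSERT can be
# generated directly from the message keys.
cols = ", ".join(msg)
params = ", ".join(":" + k for k in msg)
con.execute(f"INSERT INTO AlarmTransactionLog ({cols}) VALUES ({params})", msg)

print(con.execute("SELECT hostname, severity FROM AlarmTransactionLog").fetchone())
# ('dbhost01', 4)
```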
jdbcgtw Tips
The query
There are some rules that you should follow when you create a query:
1. Use column names in the query, for example, SELECT a, b FROM table1.
This makes it easier to determine the variables available in the profile; in this case $a and $b.
2. Try to limit the number of rows returned by the query. Receiving 12432 alarms every 5 minutes is no fun.
Use selects like this one if possible: SELECT a, b FROM table1 WHERE somedate < DATEADD(n,-10,GETDATE())
3. Use queries that return one row if you can, for example, SELECT count(*) as rows FROM table1.
Remember that each row returned by a query results in one alarm or one message.
Using column variables
It is possible to use variables in most fields in Alarm and Publish profiles. The number of variables available depends on the select statement
used in the query. Always use column names in the query. E.g. SELECT a, b FROM table1. This makes it easier to determine the variables
available in the profile. In this case $a and $b.
In an Alarm profile, the example above could result in a Message definition like this: $a contains $b. Each variable is replaced with the
corresponding value from the select.
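A sketch of how the named columns become per-row variables, using Python's standard sqlite3 module purely for illustration (the probe itself runs SQL through JDBC; the sample data is hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (a TEXT, b TEXT)")
con.execute("INSERT INTO table1 VALUES ('disk /var', '91% used')")

# Use column names in the query so each column is addressable
# as a variable ($a, $b) in the profile's Message definition.
cur = con.execute("SELECT a, b FROM table1")
columns = [d[0] for d in cur.description]        # ['a', 'b']

messages = []
for row in cur:
    values = dict(zip(columns, row))
    # Message definition "$a contains $b" after expansion:
    messages.append("{a} contains {b}".format(**values))

print(messages[0])
# disk /var contains 91% used
```

Note that each returned row produces one message, which is why the tips above recommend limiting the number of rows.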
Using message variables
This applies to the Subscribe profile only. When you create a table that you are going to use to insert Nimsoft messages, try to use the same
names for the columns as the variables used in the message (PDS). You can use a sniffer tool (the hub) to find out what the message contains.
Some data types are treated specially:
1. A number that looks like 1022649974 is most likely an epoch value; this number corresponds to the date Wednesday, May 29, 2002
05:26:14. Epoch time is a system date used by computers and starts on Thursday, January 01, 1970 00:00:00 with the value 0. This value can
be inserted into the database as a datetime value.
2. Formats for number and decimal types must contain only one variable and that value must contain numbers only.
3. The string can contain anything, including multiple variables.
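The epoch conversion in point 1 can be checked with a short sketch (Python, used here for illustration only):

```python
from datetime import datetime, timezone

# 1022649974 seconds after Thursday, January 1, 1970 00:00:00 UTC,
# the epoch value used as the example in the text.
ts = 1022649974
dt = datetime.fromtimestamp(ts, tz=timezone.utc)
print(dt.strftime("%A, %B %d, %Y %H:%M:%S"))
# Wednesday, May 29, 2002 05:26:14
```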
jdbcgtw Metrics
The following section describes the metrics that can be configured with the jdbcgtw (JDBC Gateway) probe.
Contents
Monitor Name      Units         Description                                               Version
QOS_SQL_RESPONSE  Milliseconds  Time (in milliseconds) taken to run the SQL query.        1.0
QOS_SQL_ROWS      Count         Number of rows returned by the SQL query.                 1.0
QOS_SQL_VALUE     Value         Value of the selected column returned by the SQL query.  1.0
This article describes the configuration concepts and procedures for setting up the jdbcgtw probe. The following figure provides an overview of the
process you must follow to configure a working probe.
The Java Database Connectivity Gateway probe is configured to establish a JDBC connection with a database. You can create a connection
using different types of database providers. You can also create profiles for monitoring database transactions.
Add Connection
You can create a JDBC connection of the Java Database Connectivity Gateway probe for monitoring a database.
Follow these steps:
1. Click the Options icon next to the jdbcgtw node in the navigation pane.
2. Click Add New Connection.
3. Update the field information and click Submit.
The new JDBC connection is available for database monitoring and is visible below the jdbcgtw node in the left pane.
Manage Profiles
You can add a profile of the JDBC connection in the Java Database Connectivity Gateway probe for database monitoring.
Follow these steps:
1. Click Options next to the connection name node in the navigation pane.
2. Select Add New Profile.
3. Update the field information.
4. Define the query for accessing database tables and reading data from them under the Sql Query Setup section.
5. Click the Test option under the Actions drop-down to execute the defined query.
The new profile is available for database monitoring.
Delete Connection
You can delete a JDBC connection to stop database monitoring.
Follow these steps:
1. Click the Options icon next to the Connection-connection name node that you want to delete.
2. Select Delete Connection.
3. Click Save.
The JDBC connection is deleted.
Delete Profile
You can delete a custom profile of the connection to stop database monitoring.
Follow these steps:
1. Click the Options icon next to the profile name node that you want to delete.
2. Select Delete Profile.
3. Click Save.
The monitoring profile is deleted from the resource.
jdbcgtw node
This node lets you view the probe information and configure the log properties. You can also view and configure the list of database providers.
Navigation: jdbcgtw
Set or modify the following values as required:
jdbcgtw > Probe Information
This section provides information about the probe name, probe version, start time of the probe, and the probe vendor.
jdbcgtw > General Configuration Setup
This section lets you configure the log properties and severity level of Java Database Connectivity Gateway probe.
Log File: defines the log file name.
Log Level: specifies the detail level of the log file.
Severity: specifies the severity level of the communication error alarm.
jdbcgtw > Providers
This section lets you view the list of database providers. You can also add a custom provider using New option or delete an existing
provider using Delete option.
Path: specifies the .jar file path of the provider.
Driver: specifies the provider driver path.
jdbcgtw > Add New Connection
This section lets you establish a new JDBC connection with the database server.
Connection Name: defines the new connection name.
Providers: specifies the database provider used for establishing connection.
URL: specifies the JDBC URL for connecting to the database.
Example:
Microsoft SQL Server
jdbc:sqlserver://IP address:Port Number;DatabaseName=Database Name
MySQL
jdbc:mysql://IP address:Port Number/Database Name
Oracle
jdbc:oracle:thin:@IP address:Port Number:Database Name
User ID: defines the database user name.
Server: defines the server IP address.
Connection-<Connection Name> Node
This node lets you view and configure the connection information. You can add a monitoring profile for monitoring the database transactions. A
default profile defines a specific Java Database Connectivity Gateway task. The following three types of default profiles are created when you
establish a new connection:
Metric
Publish
Subscribe
Note: This node is referred to as connection name node in the document and is user-configurable. The fields of this section are same as
described in the Add New Connection section in a checkpoint under the jdbcgtw node.
Metric Node
This node represents the Metric type profile for Java Database Connectivity Gateway probe. All custom Metric type profiles are displayed as
child nodes of this node. There are no fields in this section. The Java Database Connectivity Gateway probe generates the following QoS
messages:
Query Response QoS
Row Count QoS
Value QoS
Navigation: jdbcgtw > Connection-connection name > connection name > Metric
<Metric Profile Name> Node
This node lets you view and configure the Metric type profile properties. You can set the QoS and alarm values for monitoring the transaction
execution.
Note: This node is referred to as metric profile name node and is user-configurable.
Navigation: jdbcgtw > Connection-connection name > connection name > Metric > metric profile name
Set or modify the following values as required:
metric profile name > General Setup
This section lets you configure the profile properties and set the timeout values.
Active: activates the database monitoring.
Profile Name: indicates the profile name.
Connection Name: indicates the provider name.
Query Timeout: specifies the Java Database Connectivity Gateway probe waiting time for query execution.
Run Interval: specifies the time interval between each SQL query execution.
Interval Unit: specifies the measurement unit of Run Interval.
metric profile name > SQL Query Setup
This section lets you define the Simple Query for accessing database tables and reading data from them. The Test option under Actions
lets you execute the defined query.
metric profile name > Query Response QoS
This section lets you configure the threshold properties for Query Response QoS value.
Severity: specifies the alarm severity level.
Default: information
Message: defines the alarm message issued.
Subsystem: specifies the alarm subsystem ID that defines the alarm source.
Default: 1.1.13-Database
Threshold: specifies the alarm threshold operator.
Threshold Value (ms): specifies the time interval exceeding which alarms are issued.
Suppression Key: defines the logical checkpoint at which the alarms get suppressed.
Source of Sender: defines the IP of another machine that runs Java Database Connectivity Gateway probe.
Note: Similarly, you can configure the Row Count QoS and Value QoS.
Publish Node
This node represents the Publish type profile for Java Database Connectivity Gateway probe. All custom Publish type profiles are displayed as
child nodes of this node. There are no fields in this section.
Navigation: jdbcgtw > Connection-connection name > connection name > Publish
<Publish Profile Name> Node
This node lets you view and configure the Publish profile properties. You can set the variable value and