
IBM Tivoli Agentless Monitoring Overview

Sara Moggi (sara.moggi@it.ibm.com)

Agenda
Overview
Installation
Configuration
Workspaces Reference
Troubleshooting
  Log Files
  Problem determination
Known Problems/APARS
Q&A

2010 IBM Corporation

Agent and Agentless technology

Agent-based technology resides directly on a managed server

Agentless technology resides primarily on a management server and gets its data via a remote application programming interface (API)


Agent and Agentless technology


The IBM Tivoli Monitoring product uses both agent and agentless technology.
Agent technology:
  Database agents
  Operating system agents, etc.
Agentless technology:
  Customer agent or agentless solution with Universal Agent/Agent Builder
  ITM for Virtual Servers
  ITM for Applications (SAP)
  Operating systems (starting from ITM version 6.2.1)


ITM Agentless
IBM Tivoli Monitoring Agentless provides a way to monitor the availability and performance of all the systems in your enterprise from one or several designated workstations.
Agentless Monitoring for Windows Operating Systems (r2)
Agentless Monitoring for AIX Operating Systems (r3)
Agentless Monitoring for Linux Operating Systems (r4)
Agentless Monitoring for HP-UX Operating Systems (r5)
Agentless Monitoring for Solaris Operating Systems (r6)


Key Features (1/2)


You can monitor an IT environment from a small set of workstations.
You do not need to install an agent on each system you want to monitor.
Agentless monitoring provides a more flexible monitoring solution.


Key Features (2/2)


Tivoli Monitoring Agentless can monitor multiple operating system nodes that do not have standard OS agents running on them.
An agentless monitor obtains data from the monitored nodes via:
  SNMP (Simple Network Management Protocol)
  CIM (Common Information Model)
  WMI (Windows Management Instrumentation)


Agentless Monitoring Installation


Supported Platforms
AIX 5.3 (32/64 bit) or AIX 6.1 (64 bit)
Solaris 9 or higher
HP-UX 11i or higher
Windows:
  Windows 2003 Server SE (32/64 bit)
  Windows Server 2003 Datacenter Edition
  Windows Vista Enterprise, Business and Ultimate (32/64 bit)
  Windows Server 2008 SE (32/64 bit)
  Windows Server 2008 EE (32/64 bit)
  Windows Server 2008 Data Center
  Windows Server 2008 Data Center (64 bit)
Linux:
  Red Hat Enterprise Linux 4 or higher
  SUSE Linux Enterprise Server 9 or higher


Agentless Monitoring Installation


When you use the Agent DVD on a Windows system, select the Agentless features on the feature selection screen.


Agentless Monitoring Installation


On a UNIX system, when you run install.sh, select the agentless monitors from the product menu.


Agentless Monitoring Configuration


Agentless configuration can be performed from several places:
  Manage Tivoli Enterprise Monitoring Services (MTEMS)
  itmcmd command
  tacmd command
  TEP GUI: right-click the agentless icon, then click Configure


Agentless Monitoring Configuration


The Agentless monitors can be configured to collect all data from the monitored system using the SNMP protocol (SNMP Version 1, SNMP Version 2c or SNMP Version 3).
The Agentless for Solaris provides an additional option: CIM.
The Agentless for Windows provides an additional option: WMI.
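
The protocol choices on this slide can be written down as a small lookup table. The following is an illustrative Python sketch only, not product code: the product codes r2 to r6 come from the overview slide, while the names `DATA_PROVIDERS` and `providers_for` are hypothetical.

```python
# Product codes as listed in the overview; provider sets as stated on this
# slide. DATA_PROVIDERS and providers_for are hypothetical helper names.
SNMP_VERSIONS = ("SNMPv1", "SNMPv2c", "SNMPv3")

DATA_PROVIDERS = {
    "r2": ("Windows", {"SNMP", "WMI"}),
    "r3": ("AIX", {"SNMP"}),
    "r4": ("Linux", {"SNMP"}),
    "r5": ("HP-UX", {"SNMP"}),
    "r6": ("Solaris", {"SNMP", "CIM"}),
}


def providers_for(product_code: str) -> set[str]:
    """Return the data providers an agentless monitor can be configured with."""
    _platform, providers = DATA_PROVIDERS[product_code.lower()]
    return set(providers)


print(sorted(providers_for("r6")))
```

Keeping the table keyed by product code makes it easy to check a planned configuration before opening the configuration panels.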


Agentless Monitoring Configuration


For each agentless monitor, a configuration window opens as soon as you start the configuration.


Agentless Monitoring Configuration


Agentless Monitoring for AIX


Agentless Monitoring Configuration


Agentless Monitoring for Solaris: CIM


Agentless Monitoring Configuration


Agentless Monitoring for Windows: WMI


Agentless Monitoring Configuration


Relationship between Managed System Details and TEP gui tree


Workspaces Reference

Agentless Monitoring for Linux OS contains agent instance-level workspaces.

SNMP Linux Systems: LNX subnode.
Each node is an individual server.


Workspaces Reference
Agentless Navigator Item
  Lists the collection status of the managed systems and which systems are being monitored.
  Agentless for Windows: two views that list the Windows systems monitored through the SNMP and the WMI subnodes.
  Agentless for Solaris: three views that list the Solaris systems monitored through the SNMP subnodes (Sun Management Center or System Management Agent) and the CIM subnode.


Workspaces Reference
Metrics collected by Agentless Monitoring:
  Disk utilization
  Physical and virtual memory
  Network interface
  Processes running
  Processor capacity of the system
  System level


Log Files
RAS1 Logs
Windows: %CANDLE_HOME%\TMAITM6\logs\<hostname>_<pc>_k<pc>agent_<instance>_<timestamp>-nn.log
Unix/Linux: $CANDLE_HOME/logs/<hostname>_<pc>_<instance>_<timestamp>-nn.log
where pc is the product code of the specific Agentless monitor.
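
The naming scheme above can be turned into a file-name pattern for locating the logs of one instance. This is a minimal sketch, assuming that `<timestamp>` and the `-nn` sequence number can simply be treated as wildcards; `ras1_log_pattern` and the sample host and file names are hypothetical.

```python
from fnmatch import fnmatchcase


def ras1_log_pattern(hostname: str, pc: str, instance: str,
                     windows: bool = False) -> str:
    """Build a glob for the RAS1 log names described on this slide.

    <timestamp> and the -nn sequence number are left as wildcards.
    """
    if windows:
        return f"{hostname}_{pc}_k{pc}agent_{instance}_*-*.log"
    return f"{hostname}_{pc}_{instance}_*-*.log"


# Hypothetical host, instance and file name, for illustration only.
pattern = ras1_log_pattern("myhost", "r4", "SNMP")
print(pattern)
print(fnmatchcase("myhost_r4_SNMP_4c0e1a2b-01.log", pattern))
```

The pattern can be passed to `glob.glob` together with the log directory for the platform.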


Trace Levels
Startup/Initialization problems
  Windows: ERROR (UNIT:query ALL)
  Unix/Linux: ERROR (UNIT:ct_main ALL)

WMI Data Provider
  ERROR (UNIT:WMI ALL)

Perfmon Data Provider
  ERROR (UNIT:QueryClass ALL)

SNMP Data Provider
  ERROR (UNIT:SNMP ALL)

Windows Event Log Data Provider
  ERROR (UNIT:EventLog ALL) (UNIT:WinLog ALL)

CIM-XML Data Provider
  ERROR (UNIT:CIM ALL)
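
When a problem spans more than one data provider, the unit clauses can be combined in one setting, as the Windows Event Log entry already does with two units. The helper below is a hedged Python sketch of that composition; the dictionary keys and the function name `trace_setting` are illustrative, and where the resulting string is applied is outside the scope of this example.

```python
# Trace units copied from the slide; the dictionary keys and the function
# name are illustrative, not product identifiers.
TRACE_UNITS = {
    "startup_windows": ["query"],
    "startup_unix": ["ct_main"],
    "wmi": ["WMI"],
    "perfmon": ["QueryClass"],
    "snmp": ["SNMP"],
    "eventlog": ["EventLog", "WinLog"],
    "cim": ["CIM"],
}


def trace_setting(*areas: str) -> str:
    """Combine the trace units for several problem areas into one setting."""
    units: list[str] = []
    for area in areas:
        for unit in TRACE_UNITS[area]:
            if unit not in units:  # keep each unit once, in first-seen order
                units.append(unit)
    return " ".join(["ERROR"] + [f"(UNIT:{u} ALL)" for u in units])


print(trace_setting("snmp", "eventlog"))
```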


Problem Scenario 1
Starting the agentless monitor with the itmcmd command returns the following error message:
KCIIN0201E Specified product is not configured.

The command must include the -o option, which names the instance:

itmcmd agent -o SNMP start r4
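
For scripted starts, the same command line can be assembled programmatically so that the -o option is never forgotten. This is a small Python sketch under that assumption; `itmcmd_agent_argv` is a hypothetical helper, and the instance name `SNMP` and product code `r4` are taken from the command above.

```python
def itmcmd_agent_argv(action: str, product_code: str, instance: str) -> list[str]:
    """Build the argument list for: itmcmd agent -o <instance> <action> <pc>."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    if not instance:
        raise ValueError("an instance name is required for agentless monitors")
    return ["itmcmd", "agent", "-o", instance, action, product_code]


argv = itmcmd_agent_argv("start", "r4", "SNMP")
print(" ".join(argv))
# To run it on a host where itmcmd is on the PATH:
#   import subprocess; subprocess.run(argv, check=True)
```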


Problem Scenario 2
Agentless Monitoring is configured to use SNMP
Workspaces are blank
Error messages appear in the agent log

Check whether the snmpd process is running.


Problem Scenario 3
Agentless Monitoring for AIX
Agentless is configured to collect the data using the SNMP data provider
The only blank workspaces are Disk and Process

By default in AIX 5.x and 6.x, access to the MIBs managed by the aixmibd daemon is excluded.
Modify the /etc/snmpdv3.conf file to comment out the line that excludes access:

# exclude aixmibd managed MIBs from the default view
#VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 -excluded-
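
The edit to /etc/snmpdv3.conf can also be scripted. The sketch below is one possible way to do it in Python, working on the file contents as text; the function name is hypothetical, and the exact spacing of the active line in a real file is an assumption, so the match is deliberately loose.

```python
AIXMIBD_OID = "1.3.6.1.4.1.2.6.191"  # subtree named on this slide


def comment_out_aixmibd_exclusion(conf_text: str) -> str:
    """Comment out active VACM_VIEW lines that exclude the aixmibd subtree."""
    out = []
    for line in conf_text.splitlines():
        fields = line.split()
        is_exclusion = (
            len(fields) >= 3
            and fields[0] == "VACM_VIEW"      # already-commented lines start with '#'
            and fields[2] == AIXMIBD_OID
            and "excluded" in line
        )
        out.append("#" + line if is_exclusion else line)
    return "\n".join(out) + ("\n" if conf_text.endswith("\n") else "")


# Sample line; its spacing is an assumption about the real file.
sample = "VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -\n"
print(comment_out_aixmibd_exclusion(sample), end="")
```

Running the function on a copy of the file first makes it easy to compare the result against the original before replacing it.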


Problem Scenario 4 (1/2)


Agentless Monitoring for Linux or Solaris
Agentless is configured to collect the data using the SNMP data provider
All the workspaces are blank
Error messages appear in the agentless trace logs


Problem Scenario 4 (2/2)


Check connectivity with the monitored system:
  Make sure you can ping the remote system.
  Check for firewalls that block communication on the SNMP port (UDP 161).

Check the community string and passwords specified in the Agentless configuration.

Check that the SNMP system is not restricting access to localhost (see the snmpd.conf file).
Run the following command to check connectivity with the SNMP system:
  snmpwalk -c public -v 1 <hostIP>
Check that the MIB branches are not restricted (see the snmpd.conf file).
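
The snmpwalk check can be wrapped for repeated use across many monitored hosts. The following Python sketch only builds the argument list and leaves execution to the caller; `snmpwalk_argv` is a hypothetical helper, and 192.0.2.10 is a documentation-only address used as a placeholder.

```python
def snmpwalk_argv(host: str, community: str = "public",
                  version: str = "1") -> list[str]:
    """Build the argument list for: snmpwalk -c <community> -v <version> <host>."""
    if version not in ("1", "2c"):
        raise ValueError("community-based SNMP uses version 1 or 2c")
    return ["snmpwalk", "-c", community, "-v", version, host]


# 192.0.2.10 is a placeholder address, not a real monitored system.
print(" ".join(snmpwalk_argv("192.0.2.10")))
# To run it where the snmpwalk tool is installed:
#   import subprocess; subprocess.run(snmpwalk_argv("192.0.2.10"))
```

Use the same community string that is specified in the Agentless configuration, so that the test exercises the same credentials as the monitor.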


Problem Scenario 5 (1/2)


Agentless Monitoring for AIX
Agentless is configured to collect the data using the SNMP data provider
Some workspaces are blank

In AIX, the SNMP daemon is composed of 4 processes:
  snmpd: System workspaces
  aixmibd: Disk and File System Capacity, Volume Group, Logical Volume, Physical Volume, Page System, Process Availability, and User Account Information workspaces
  hostmibd: Memory and Processor workspaces
  snmpmibd: Network workspace
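
The relationship between the four AIX processes and the workspaces they feed can be expressed as a lookup, which helps decide which process to check when a given workspace is blank. This is an illustrative Python sketch; the pairing follows the order of the items on this slide and should be treated as an assumption, and `daemons_to_check` is a hypothetical name.

```python
# Pairing of AIX SNMP processes and workspaces as read from this slide.
DAEMON_WORKSPACES = {
    "snmpd": ["System"],
    "aixmibd": [
        "Disk and File System Capacity", "Volume Group", "Logical Volume",
        "Physical Volume", "Page System", "Process Availability",
        "User Account Information",
    ],
    "hostmibd": ["Memory", "Processor"],
    "snmpmibd": ["Network"],
}


def daemons_to_check(blank_workspaces: list[str]) -> list[str]:
    """Return the AIX SNMP processes that feed the given blank workspaces."""
    wanted = {w.lower() for w in blank_workspaces}
    return [
        daemon
        for daemon, workspaces in DAEMON_WORKSPACES.items()
        if any(w.lower() in wanted for w in workspaces)
    ]


print(daemons_to_check(["Memory", "Network"]))
```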


Problem Scenario 5 (2/2)


Check that all the SNMP processes are running.
If the community string is not public, verify that the three SNMP processes aixmibd, hostmibd and snmpmibd are started with the -c <community> command line option.


Problem Scenario 6
Agentless Monitoring for Linux
Agentless is configured to collect the data using the SNMP data provider
On the TEP GUI, data appears only for the Network and System workspaces

For Red Hat operating systems, the /etc/snmpd.conf file must be modified to allow the Host Resources MIB and ucdavis MIB to be viewed by all users.
Add the following system views to the SNMP configuration:
view systemview included .1.3.6.1.2.1.25
view systemview included .1.3.6.1.4.1.2021
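
Before editing the SNMP configuration, it can be useful to check which of the two views are already present. The Python sketch below does an exact-match check on the configuration text; `missing_views` is a hypothetical helper, and it does not recognize a broader view (a parent subtree) that would also cover these branches.

```python
REQUIRED_VIEWS = {
    ".1.3.6.1.2.1.25": "Host Resources MIB",
    ".1.3.6.1.4.1.2021": "ucdavis MIB",
}


def missing_views(conf_text: str, view_name: str = "systemview") -> list[str]:
    """Return the required subtrees that have no 'view ... included' line."""
    included = set()
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 4 and fields[:3] == ["view", view_name, "included"]:
            included.add(fields[3])
    return [oid for oid in REQUIRED_VIEWS if oid not in included]


sample = "view systemview included .1.3.6.1.2.1.25\n"
print(missing_views(sample))
```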


Problem Scenario 7
Agentless Monitoring for Windows
(48E294C2.00011D28:queryclass.cpp,803,"internalCollectData") Authentication failed against host <host> as user <Domain>\<User>, return code = 1326
(48E294C3.000016EC:wmiqueryclass.cpp,757,"internalCollectData") ::collectData==>Could not connect. Error code = 0x80070005, subnode = <name>

These errors indicate an invalid password, an invalid username, or a username without Administrators group membership.
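
Trace entries like the ones above follow a regular header layout, so they can be split into fields when scanning large logs. The parser below is a minimal Python sketch based only on the excerpts in these slides; the field names `stamp`, `seq` and `thread` are guesses at what the header parts mean, not documented names.

```python
import re
from typing import Optional

# Matches entries such as
#   (48E294C2.00011D28:queryclass.cpp,803,"internalCollectData") message...
# and the variant with a "-<id>" suffix, e.g. (4891C694.0066-1558:...).
ENTRY = re.compile(
    r'\((?P<stamp>[0-9A-Fa-f]+)\.(?P<seq>[0-9A-Fa-f]+)'
    r'(?:-(?P<thread>[0-9A-Fa-f]+))?'
    r':(?P<file>[^,]+),(?P<line>\d+),"(?P<func>[^"]+)"\)\s*(?P<msg>.*)'
)


def parse_entry(text: str) -> Optional[dict]:
    """Split one trace entry into its header fields and message."""
    m = ENTRY.match(text.strip())
    if m is None:
        return None
    entry = m.groupdict()
    entry["line"] = int(entry["line"])
    return entry


sample = (r'(48E294C2.00011D28:queryclass.cpp,803,"internalCollectData") '
          r'Authentication failed against host <host>, return code = 1326')
print(parse_entry(sample))
```

Filtering on the `file` and `func` fields is a quick way to separate Perfmon entries (queryclass.cpp) from WMI entries (wmiqueryclass.cpp).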


Problem Scenario 8
Agentless Monitoring for Windows
(4891C694.0066-1558:queryclass.cpp,1006,"start") Error adding query for class PhysicalDisk.
(4891C694.0067-1558:queryclass.cpp,1007,"start") \\<hostname box>\PhysicalDisk(*)\% Disk Write Time - add returned C0000BB8

Check whether the counter exists.
Check whether the Remote Registry service is enabled.
Check whether the counter indexes are corrupted. Look in:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib\009
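
The numeric codes in the Windows log excerpts for Problem Scenarios 7 and 8 correspond to standard Win32 and PDH constants. The lookup below is a small Python sketch that pairs each code with its symbolic name and the check suggested on the slides; `explain` is a hypothetical helper and the table covers only these three codes.

```python
# Codes from the log excerpts; names are the standard Win32/PDH constants,
# hints paraphrase the slides.
KNOWN_CODES = {
    1326: ("ERROR_LOGON_FAILURE",
           "check user name, password and Administrators group membership"),
    0x80070005: ("E_ACCESSDENIED",
                 "check user name, password and Administrators group membership"),
    0xC0000BB8: ("PDH_CSTATUS_NO_OBJECT",
                 "check the counter, the Remote Registry service and the counter indexes"),
}


def explain(code_text: str) -> str:
    """Look up a code written as '1326', '0x80070005' or 'C0000BB8'.

    Strings made only of digits are read as decimal; anything else as hex.
    """
    text = code_text.strip()
    code = int(text) if text.isdigit() else int(text, 16)
    name, hint = KNOWN_CODES.get(code, ("unknown", "not covered by these slides"))
    return f"{text}: {name} ({hint})"


for sample in ("1326", "0x80070005", "C0000BB8"):
    print(explain(sample))
```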


Known Problems/APARS 1 (1/2)


IZ80454: CPU metrics for Solaris SMA should be per interval
Recreation Steps:
  Install the Agentless Monitoring for Solaris (KR6) product.
  Configure it to use "SNMP (System Management Agent)" or SMA.
  In Tivoli Enterprise Portal (TEP), click on the Processor workspace.
  In the "Overall CPU Utilization over Time" view, select to have data viewed in table format.

Symptom: for each polling interval, the values of the following attributes increase over time:
  User CPU, System CPU, Nice CPU, Idle CPU, Total CPU, CPU Used Pct, CPU Idle Pct


Known Problems/APARS 1 (2/2)


The following CPU metrics are collected via SNMP, and the values are cumulative since the machine was started:
  User CPU, System CPU, Nice CPU, Idle CPU

These values are used in the calculations of the following attributes:
  Total CPU, CPU Used Pct, CPU Idle Pct

The CPU metrics were changed to represent the values since the last polling interval instead of a cumulative value since the machine was last rebooted.
Fixed in 6.2.2-TIV-ITM-FP0003


Known Problems/APARS 2
IZ71871: Add ability to monitor services that are down
The following functions have been added to the agent:
  a new query;
  a new workspace named Windows Services under the System navigator item;
  a new attribute group with several attributes that provide information about the services, for example: Display Name, Description, Process ID, Status, State.

Fixed in 6.2.1-TIV-ITM-FP0002 and 6.2.2-TIV-ITM-FP0002


Known Problems/APARS 3 (1/2)


IZ77565: Perfmon data stops collecting when other systems are down
When one or more remote hosts go down, the calls to perfmon for other remote hosts (which are up) fail with a timeout error:
(4BCF113E.0000-400:queryclass.cpp,1001,"internalCollectData") Error collecting query data for class Terminal Services host <host>. Error is 00000102.
(4BCF113E.0001-744:queryclass.cpp,1001,"internalCollectData") Error collecting query data for class Terminal Services host <host> Error is 00000102.


Known Problems/APARS 3 (2/2)


The Microsoft API call that the agent uses is single-threaded. When a system is down, the call to collect the perfmon data times out for that system. All other requests queued at that time (for the same system or for other systems) also time out.
The agent code was updated to ensure the call to the Microsoft API is also single-threaded.
Fixed in 6.2.1-TIV-ITM-FP0003


Known Problems/APARS - Missing Operator


Example: R2 Agentless running on a Linux system and monitoring a remote Windows system
  <instance>:<hostname>:R2
  R2:<WindowsHost>:WIN


Known Problems/APARS - Missing Operator


Define a situation that checks whether a process is MISSING on the monitored Windows system.
When the situation becomes true, the alert is raised on the <instance>:<hostname>:R2 node.
Enhancement Requests:
  MR1126096811
  MR0109086845
  MR0204095945

DCF http://www-01.ibm.com/support/docview.wss?uid=swg21420788


Known Problems/APARS - Missing Operator


Workaround: Agent Builder solution
  Agent that remotely monitors one machine
  Agent built with multi-instance support

The agent monitors these metrics on Windows:
  Computer System including Model and Serial Number, Operating System
  Windows Event Log
  Disk Usage including Logical Disk, Physical Disk
  Processor
  Memory including Physical Memory, Page File Usage
  Network Interfaces
  Windows Terminal Services

Also available for AIX, HP-UX, Solaris and Linux platforms


Known Problems/APARS - Missing Operator


The Agent package contains the following scripts:
  installIra.bat/sh (installs all components on a single machine)
  installIraAgent.bat/sh (installs the Agent)
  installIraAgentTEMS.bat/sh (installs the TEMS application support)
  installIraAgentTEPS.bat/sh (installs the TEPS application support)


Agentless Monitoring Scale Information


Agentless Monitors are multi-instance agents.
Support for up to 10 active instances on a single system.
Each instance supports communication with 100 remote nodes.

10 instances x 100 remote nodes = 1000 monitored systems


Performance Variables
Variable Name | Default Value | Description
CDP_DP_CACHE_TTL | 60 | Time in seconds before a query will trigger a new data collection.
CDP_DP_THREAD_POOL_SIZE | 60 | The number of threads created to perform background data collections. The thread pool is shared among all attribute groups in all remote nodes in an agent.
CDP_DP_REFRESH_INTERVAL | 60 | The interval in seconds at which each attribute group cache is updated in the background.
CDP_DP_IMPATIENT_COLLECTOR_TIMEOUT | 2 | The number of seconds to wait for a data collection to happen before timing out and returning cached data.
CDP_SNMP_RESPONSE_TIMEOUT | (not shown) | The number of seconds to wait for each request to time out. Each row in an attribute group is a separate request.
CDP_SNMP_MAX_RETRIES | 2 | The number of times to retry sending the SNMP request after a response timeout.
CDP_NT_EVENT_LOG_GET_ALL_ENTRIES_FIRST_TIME | NO | Configures whether the Windows Event Log data provider reports old log entries on startup, or only new ones.
CDP_NT_EVENT_LOG_CACHE_TIMEOUT | 3600 | Cache lifetime in seconds of an event from the Windows Event Log.
CDP_PURE_EVENT_CACHE_SIZE | 100 | Number of pure events held in cache at any one time. When a query is made, all events in the cache at that time are reported. When the cache is full, the oldest events are removed to make room for new ones.


QUESTIONS
