
What is System Monitoring?

System monitoring is a routine daily activity. This document provides a systematic, step-by-step procedure for
server monitoring and gives an overview of the technical aspects and concepts of proactive system monitoring.
The main checks are:
Sno  Task
ABAP Stack Checks
1    Check process overview (SM50)
2    Check overall system process overview (SM66)
3    Check application server status (SM51)
4    Check for any pending locks (SM12)
5    Check for dumps in the system (ST22)
6    Check the system log for any errors (SM21)
7    Check for any hung updates or update status (SM13)
8    Check for excessive swapping (ST02)
9    Check critical job status such as backup, updatestats, checkdb etc. (DB13)
10   Check for long-running/failed job status (SM37)
11   Check database alert logs and performance (ST04)
12   Check spool job status (SP01)
13   Check cache status (sxi_cache) for PI system
14   Check SLD functionality (SLDCHECK)
15   Check SXI_MONITOR for PI system
16   Check for database locks (DB01)
Java Stack Checks
1    Check Java portal accessibility using the link
2    Check the server0 log of the Java system for critical errors
3    Check accessibility of the management console
4    Check server node status
5    Check the default trace for critical Java errors
6    Check Java reports for memory consumption/swapping
OS Level Checks
1    Check file system usage (should be < 80%)
2    Check swap space using topas etc.
3    Check the work directory log files at OS level for errors
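
The file system check in the list above (usage below 80%) can be sketched in a few lines of Python. This is only an illustration, not part of the original procedure: the function names and the mount point list are hypothetical, and the percentage is computed as used/total, which may differ slightly from what df reports.

```python
import shutil
from typing import Tuple

USAGE_LIMIT_PERCENT = 80.0  # threshold taken from the checklist above


def usage_percent(total_bytes: int, used_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def check_filesystem(path: str, limit: float = USAGE_LIMIT_PERCENT) -> Tuple[float, bool]:
    """Return (usage percent, True if usage is below the limit) for one path."""
    usage = shutil.disk_usage(path)
    percent = usage_percent(usage.total, usage.used)
    return percent, percent < limit


if __name__ == "__main__":
    # Hypothetical list: replace it with the file systems of your own host.
    for mount_point in ["."]:
        percent, ok = check_filesystem(mount_point)
        print(f"{mount_point}: {percent:.1f}% used, {'OK' if ok else 'ABOVE LIMIT'}")
```

Run on the server, this prints one line per listed file system; anything flagged as above the limit would be reported in the daily check.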

Checking Application Servers [ SM51 ]


This transaction lists all active application servers. It is also a central transaction from which you can select
the instance to be monitored.

Here you can see which services or work processes are configured on each instance.

Monitoring Work Processes for Individual Instances [ SM50 ]:


This transaction shows the work process overview for a single instance. It displays all work processes of that
instance, whether they are running, waiting, stopped, or in PRIV mode. In this step we check all the processes;
the status of a process should always be Waiting or Running. If any process has a different status, we need to
check that particular process and report accordingly.

This transaction displays a lot of information, such as:

1. The status of each work process (whether it is occupied or not).
2. If the work process is running, the action it is performing, shown in the Action column.
3. The table that is being worked on.

Some typical problems:

Users take a long time to log on, cannot log on, or online transactions are very slow. This could be because the
DIA work processes are fully utilized. It could also be the result of long-running jobs (red indicator under
the Time column). If necessary, you can cancel the session by selecting the job and then choosing Process > Cancel
Without Core. This cancels the job and releases the work process for other users or processes.
Some users may have PRIV status under the Reason column. This can happen when the user's transaction is so large
that it requires more memory. When this happens, the DIA work process is owned exclusively by that user and
cannot be used by other users. If this happens, check with the user and, if possible, run the job as a background
job.
If there is a long print job on an SPO work process, investigate the problem. It could be related to the
print server or the printer.

Monitoring System wide Work Processes [SM66]

By checking the work process load in the global work process overview, we can quickly investigate the
potential cause of a system performance problem. The overview lets us monitor the work process load on all
active instances across the system.
Using the Global Work Process Overview screen, we can see at a glance:

The status of each application server
The reason why it is not running
Whether it has been restarted
The CPU and request run time
The user who has logged on and the client that they logged on to
The report that is running

Monitor Application User [ AL08 and SM04 ]


These transactions display all users of the active instances, showing the user IDs and terminal names of the
users who are currently logged on to the system.

Work Process Calculation


SAPinst installs SAP systems with a minimum number of work processes, which
are calculated using the following formula:
- Number of dialog work processes = RAM/256 (min 2, max 18)
- Number of update work processes = RAM/768 (min 1, max 6)
- Number of update2 work processes = RAM/1024 (min 1, max 3)
- Number of batch work processes = RAM/1024 (min 2, max 3)
- Number of enqueue work processes = 1
- Number of spool work processes = 1
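
As a worked illustration of these formulas, the sketch below computes the minimum work process counts for a given amount of RAM. Two points are assumptions not stated in the text: RAM is taken to be in megabytes, and each division is rounded down before the minimum and maximum are applied. The function names are hypothetical.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def minimum_work_processes(ram_mb: int) -> dict:
    """Apply the formulas above; ram_mb is assumed to be RAM in megabytes."""
    return {
        "dialog": clamp(ram_mb // 256, 2, 18),
        "update": clamp(ram_mb // 768, 1, 6),
        "update2": clamp(ram_mb // 1024, 1, 3),
        "batch": clamp(ram_mb // 1024, 2, 3),
        "enqueue": 1,
        "spool": 1,
    }


if __name__ == "__main__":
    for name, count in minimum_work_processes(4096).items():
        print(f"{name}: {count}")
```

Under these assumptions, a host with 4096 MB of RAM would get 16 dialog, 5 update, 3 update2, and 3 batch work processes, plus one enqueue and one spool process.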

Monitoring Update Processes [ SM13 ]


This transaction is used to check for errors in update records. A failed update, or update termination, is an
update to the database that failed. Failed updates occur when a user entry or transaction is not entered or
updated in the database. We should check whether the update is active and whether any update records have been
terminated.
Execute transaction SM13, put * in the User field, and run the selection.

If there are no long-pending update records and no updates are in progress, the queue will be empty.

If the update is not active, find out the following:

Is the update active? If not, was it deactivated by the system or by a user?
o Click on the corresponding button and get the information.
o Click on the corresponding button and get the following information:
Is any update cancelled?
Is there a long queue of pending updates older than 10 minutes?
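
The last question (pending updates older than 10 minutes) amounts to a simple age filter. The sketch below assumes the creation times of the pending update records have been copied out of SM13 by hand; it does not call any SAP interface, and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def overdue_updates(
    created_times: List[datetime],
    now: datetime,
    max_age: timedelta = timedelta(minutes=10),
) -> List[datetime]:
    """Return the creation times of update records pending longer than max_age."""
    return [created for created in created_times if now - created > max_age]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    pending = [datetime(2024, 1, 1, 11, 45), datetime(2024, 1, 1, 11, 55)]
    for created in overdue_updates(pending, now):
        print(f"Update pending since {created:%H:%M} needs attention")
```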

Monitoring Lock Entries [ SM12 ]


This transaction is used to check lock entries. There may be old locks still in place from transactions that did
not release them, or from when a user was cut off from the network. If any lock entries exist, we have to
check how long each lock has existed. If a lock has existed for more than one day, we should check whether the
user who holds it is logged on (using transaction AL08). If the user is not logged on, delete the
lock. Otherwise we should wait 1 or 2 days.
These old locks need to be cleared, or they could prevent access to or changes of the records.
We can use lock statistics to monitor the locks that are set in the system. We record only those lock entries
that have a date and time stamp from the previous day.
Execute transaction SM12 and put * in the User Name field.
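
The decision rule for old locks described above can be written down as a small classifier. This is a sketch under assumptions: the lock list and the set of logged-on users are taken to be copied manually from SM12 and AL08, the LockEntry type and the function name are hypothetical, and the code only sorts entries into "delete" and "wait" without touching the SAP system.

```python
from datetime import datetime, timedelta
from typing import Dict, List, NamedTuple, Set


class LockEntry(NamedTuple):
    user: str
    table: str
    set_at: datetime


def review_locks(
    entries: List[LockEntry],
    now: datetime,
    logged_on_users: Set[str],
    max_age: timedelta = timedelta(days=1),
) -> Dict[str, List[LockEntry]]:
    """Classify locks older than max_age by whether their owner is logged on."""
    result: Dict[str, List[LockEntry]] = {"delete": [], "wait": []}
    for entry in entries:
        if now - entry.set_at <= max_age:
            continue  # recent lock: nothing to do yet
        bucket = "wait" if entry.user in logged_on_users else "delete"
        result[bucket].append(entry)
    return result


if __name__ == "__main__":
    now = datetime(2024, 5, 10, 9, 0)
    locks = [
        LockEntry("ALICE", "MARA", datetime(2024, 5, 8, 8, 0)),
        LockEntry("BOB", "VBAK", datetime(2024, 5, 8, 8, 0)),
        LockEntry("CAROL", "EKKO", datetime(2024, 5, 10, 8, 30)),
    ]
    for action, found in review_locks(locks, now, {"BOB"}).items():
        print(action, [lock.user for lock in found])
```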

Monitoring System Log [ SM21 ]


This transaction is used to check the system log. We can use the log to pinpoint and rectify errors occurring in
the system and its environment. The log is important because unexpected or unknown warnings and errors
could indicate a serious problem.
We check the log for the previous day with the following selections:

Enter the date and time.
Select the radio button Problems and Warnings.
Press Reread System Log.

Tune Summary [ ST02 ]


This transaction is used for buffer tuning and buffer statistics, and to tune the buffer parameters of R/3.
The buffers are important because significant buffer swapping reduces performance. Look under Swaps for red
entries. Check these entries regularly to establish trends and get a feel for the buffer behaviour. In this
transaction we should verify that the hit ratio is between 90 and 100% for all buffers, that the allocated heap
memory is 0, and that Swaps is 0.
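
The three targets in the paragraph above (hit ratio of at least 90%, no swaps, no heap memory allocated) can be checked mechanically once the figures are copied from the ST02 screen. The sketch below assumes exactly that; the BufferStat type, the function name, and the sample values are hypothetical and are not ST02 output.

```python
from typing import List, NamedTuple


class BufferStat(NamedTuple):
    name: str
    hit_ratio: float  # percent
    swaps: int


def tune_summary_findings(
    stats: List[BufferStat], heap_allocated_kb: int, min_hit_ratio: float = 90.0
) -> List[str]:
    """Return one message per value that misses the targets stated above."""
    findings = []
    for stat in stats:
        if stat.hit_ratio < min_hit_ratio:
            findings.append(f"{stat.name}: hit ratio {stat.hit_ratio}% is below {min_hit_ratio}%")
        if stat.swaps > 0:
            findings.append(f"{stat.name}: {stat.swaps} swaps")
    if heap_allocated_kb > 0:
        findings.append(f"heap memory allocated: {heap_allocated_kb} KB")
    return findings


if __name__ == "__main__":
    sample = [
        BufferStat("Program", 99.5, 0),
        BufferStat("Nametab", 85.0, 0),
        BufferStat("CUA", 99.0, 12),
    ]
    for message in tune_summary_findings(sample, heap_allocated_kb=0):
        print(message)
```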

Step 1: Go to ST02 to check the tune summary.

Step 2: If you see any red values under Swaps, double-click them.

Step 3: On the next screen, click the Current Parameters tab.

Step 4: Note down the values and the profile parameters.

Step 5: Go to RZ10 to change the profile parameter values.

Step 6: Save the changes.

Step 7: Restart the server for the new values to take effect.

CPU Utilization [ ST06 ]

The idle CPU rate should stay at around 60-65%. If CPU utilization rises beyond what this allows, start by
checking at least the following:

Run the OS-level command top and check which processes are consuming the most resources.
Go to SM50 or SM66 and check for any long-running jobs or long update queries.
Go to SM12 and check the lock entries.
Go to SM13 and check that the update is active.
Check for errors in SM21.

ABAP Dumps [ ST22 ]


This transaction is used to analyze ABAP dumps (short dumps), to determine why an error occurred, and to take
corrective action. We look for an excessive number of dumps and for dumps of an unusual nature. An ABAP dump
is generated when a report or transaction terminates as a result of a serious error. The system records the
error in the system log (SM21) and writes a snapshot of the terminated program to a special table. Here we
check the previous day's dumps.

Spool Request Monitoring [ SP01 ]


For spool request monitoring, execute SP01 and select as follows:

Put * in the Created By field.
Click the Execute button.

Here we record only those requests that terminated with problems.

Monitoring Batch Jobs [ SM37 ]


This transaction is used for monitoring background jobs. Background jobs are batch jobs scheduled to run at
specific times during the day. Here we should check for cancelled jobs.
To monitor background jobs, execute SM37 and select as follows:

Put * in the User Name and Job Name fields.
Under Job Status, select Scheduled, Cancelled, Released, and Finished.

Transactional RFC Administration [ SM58 ]


Transactional RFC (tRFC, originally also known as asynchronous RFC) is an asynchronous communication
method that executes the called function module in the RFC server only once.

We need to select the display period for which we want to view the tRFCs and then enter * in the User Name
field to view all the calls that have not been executed correctly or are waiting in the queue.

QRFC Administration (Outbound Queue-[ SMQ1 ])


We should specify the client here and check whether there are any outgoing qRFCs in a waiting or error state.

QRFC Administration (Inbound Queue- [ SMQ2 ])

We should specify the client here and check whether there are any incoming qRFCs in a waiting or error state.

Database Administration [ DB02 ]


This transaction is used to check all tablespaces and indexes, and in particular the tablespace sizes. It
provides a way to examine database allocation and allows us to monitor items such as:
a) DB space history
b) DB analysis
With this transaction we can view the database history by date and time.

After selecting Current Sizes on the first screen, we reach a screen that shows the current status of all the
tablespaces in the system.

If any tablespace is more than 95% full and autoextend is off, we need to add a new data file so that the
database does not fill up.
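
The tablespace rule above (more than 95% used with autoextend off) is easy to express as a filter. The sketch below assumes the tablespace figures have been copied from the DB02 screen; the Tablespace type and the function name are hypothetical, and the tablespace names in the example are illustrative only.

```python
from typing import List, NamedTuple


class Tablespace(NamedTuple):
    name: str
    used_percent: float
    autoextend: bool


def tablespaces_needing_datafile(
    tablespaces: List[Tablespace], limit_percent: float = 95.0
) -> List[str]:
    """Return names of tablespaces above the limit that cannot grow on their own."""
    return [
        ts.name
        for ts in tablespaces
        if ts.used_percent > limit_percent and not ts.autoextend
    ]


if __name__ == "__main__":
    sample = [
        Tablespace("TS_DATA", 97.2, False),
        Tablespace("TS_INDEX", 96.0, True),
        Tablespace("TS_TEMP", 60.0, False),
    ]
    print(tablespaces_needing_datafile(sample))
```

Only the first sample entry is reported: the second is above 95% but can still autoextend, and the third is well below the limit.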

We can also determine the history of the tablespaces.

We can select Months, Weeks, or Days here to see the changes that take place in a tablespace.
We can determine the growth of a tablespace by analyzing these values.

Database Backup logs [ DB12 ]


From this transaction we can determine when the last successful backup of the system was taken. We can review
the previous day's backups and see whether everything was fine.
We can also review the redo log files and see whether the redo log backup was successful.

DB Performance Monitor [ ST04 ] :


This transaction is used for database performance analysis. It provides the ability to monitor the database
with respect to the following:

1. Shared Pool: the SQL Area Get Ratio should be 90-95%.
2. Log Buffer: Allocation Retries / Entries should be < 1%.
3. Data Buffer: Quality should be 90-95%.
4. Sorts: Disk / Memory should be < 1%.
5. Table Scans & Fetches: Long Table Scans should be kept to a minimum.

For a detailed analysis, choose the Detailed Analysis menu button on this screen. There we can check the
database overview, for example:

Exclusive Lock Waits
Wait Events
Buffer Busy Waits
File System Requests
Database Message Logs
SQL Requests
Summary Report for all activities, etc.

Time/User call should be less than 20 ms. Note that the value can be much higher because special idle events
are included, which limits the relevance of this figure.
The ratio of busy wait time to CPU time should be close to 60:40. This is an indication of a well-tuned
system. If you see very high values (such as 80:20), system performance can be improved using 'wait event
tuning'. If the CPU time share is significantly higher than 40%, check the CPU utilization on the database server.
The hit ratio (quality) of the data buffer should be more than 94%. A low hit ratio might be due to a data
buffer that is too small. Check SAP Note 619188 for a deeper analysis.
Reads/User Calls should be less than 15. If it is too high, check for expensive SQL statements. Check SAP
Note 766349.
Go to Detail Analysis Menu -> File System Requests and check the average read time (Avg(ms) for Blk
Reads) for individual data files or in total (Total under the column). If the value is very high in comparison to
10 ms, check whether the problem can be solved by improving the data distribution or whether there is a disk I/O
problem at hardware level.
You can analyze disk usage further from ST06/OS07 (-> Detail Analysis Menu -> Disk). If the load on the
database server's disks is more than 80%, you need to redistribute the data files.
DD-cache quality should be more than 80%; pinratio should be more than 95%; reloads/pin should be
lower than 0.04; User/recursive calls should be more than 2. If these indicators deviate too much,
you need to increase the shared pool size.
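
Several of the numeric guidelines above can be collected into one table of rules and checked together. The sketch below is an illustration only: the key names are hypothetical labels for values read manually from the ST04 screen, and the limits are simply the ones quoted in the text.

```python
import operator
from typing import Dict, List

# Each rule: indicator name -> (comparison that must hold, limit from the text above)
RULES = {
    "time_per_user_call_ms": (operator.lt, 20.0),
    "data_buffer_quality_percent": (operator.gt, 94.0),
    "reads_per_user_call": (operator.lt, 15.0),
    "dd_cache_quality_percent": (operator.gt, 80.0),
    "pin_ratio_percent": (operator.gt, 95.0),
    "reloads_per_pin": (operator.lt, 0.04),
    "user_to_recursive_calls": (operator.gt, 2.0),
}


def failed_indicators(values: Dict[str, float]) -> List[str]:
    """Return the names of supplied indicators that miss their guideline."""
    failures = []
    for name, (compare, limit) in RULES.items():
        if name in values and not compare(values[name], limit):
            failures.append(name)
    return failures


if __name__ == "__main__":
    observed = {
        "time_per_user_call_ms": 12.0,
        "data_buffer_quality_percent": 91.5,
        "reads_per_user_call": 30.0,
        "pin_ratio_percent": 99.0,
    }
    print(failed_indicators(observed))
```

Indicators that are not supplied are skipped rather than reported, so a partial reading of the screen can still be evaluated.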

Work Load Monitor [ ST03 / ST03N ]:


This transaction is used for workload analysis, which helps determine system performance. Check the statistics
and record trends to get a feel for the system's behaviour and performance. Understanding the system when it is
running well helps us determine what changes may need to be made when it is not.

General factors indicating good performance:

Wait time < 10% of response time.
Average roll-in time < 20 ms.
Average roll wait time < 200 ms.
Average load (and generation) time < 10% of response time (< 50 ms).
Average database request time < 40% of (response time - wait time).
Average CPU time < 40% of (response time - wait time).
Average CPU time is not much less than processing time.
Average response time: depends on customer requirements; there is no general rule.

Problems with the above factors and their likely causes:

Large roll wait time -> communication problem with the GUI or an external system
Large load time -> program buffer, CUA buffer, or screen buffer too small
Large database request times -> CPU/memory bottleneck on the database server, network problems,
expensive SQL statements, database locks, missing indexes, missing statistics, small buffers
Large CPU times -> expensive ABAP processing, for example processing large tables or frequently accessing
R/3 buffers
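
The numeric indicators in this section can be evaluated together from one set of average times. The sketch below is an illustration with assumptions: all inputs are averages in milliseconds read manually from the workload monitor, the two 40% limits are taken relative to response time minus wait time, and the function name is hypothetical.

```python
from typing import List


def workload_findings(
    response_ms: float,
    wait_ms: float,
    roll_in_ms: float,
    roll_wait_ms: float,
    load_ms: float,
    db_request_ms: float,
    cpu_ms: float,
) -> List[str]:
    """Return the indicators that miss the guideline values listed above."""
    findings = []
    net_ms = response_ms - wait_ms  # response time without the wait time
    if wait_ms >= 0.10 * response_ms:
        findings.append("wait time")
    if roll_in_ms >= 20:
        findings.append("roll-in time")
    if roll_wait_ms >= 200:
        findings.append("roll wait time")
    if load_ms >= 0.10 * response_ms or load_ms >= 50:
        findings.append("load time")
    if db_request_ms >= 0.40 * net_ms:
        findings.append("database request time")
    if cpu_ms >= 0.40 * net_ms:
        findings.append("CPU time")
    return findings


if __name__ == "__main__":
    print(workload_findings(1000, 200, 5, 300, 20, 400, 100))
```

Each reported indicator can then be matched against the likely causes listed above, for example a large roll wait time pointing to a GUI or external communication problem.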

System Trace [ ST01 ]:


This transaction is used for the system trace. Sometimes you may see strange behaviour in DMS functions
that is caused by incorrect authorization customizing, or you may not know how and where authorization objects
are checked by the system. This trace shows all checked authorization objects, the values that are passed
to the check, and which object leads to the missing-authorization behaviour.
