
Database Max processes warning

Created By – Santosh Kumar


Reviewed By -- Tanmay Ghosh
Approved By -- Alan Wang

Customer Name : TELCORDIA TECHNOLOGIES, INC. DBA ICONECTIV


Incident Ticket No :
Database Name : npcmos1
Instance Names : npcmos11/12
Server Names : aur1ttidbchdb01a/02a

Summary: This document describes how to handle the incident ticket triggered when the number of
processes spawned for the database instances reaches a threshold limit. Two process-monitoring
thresholds are set in SiteScope:

Case 1 – Process utilization reaches 60%

Incident summaries:
Severity:4 Team:DBO Oracle_Database_npcmos11_MaxProcesses_Warning-ttid-aur1ttidbchdb01a-10.137.182.84
Severity:4 Team:DBO Oracle_Database_npcmos12_MaxSessions_Warning-ttid-aur1ttidbchdb02a-10.137.182.86

Case 2 – Process utilization reaches 70%

Incident summaries:
Severity:5 Team:DBO Oracle_Database_npcmos11_MaxProcesses_Warning-ttid-aur1ttidbchdb01a-10.137.182.84
Severity:5 Team:DBO Oracle_Database_npcmos12_MaxProcesses_Warning-ttid-aur1ttidbchdb02a-10.137.182.86

Case 3 – Process utilization reaches 100%. No SiteScope monitor is set for this threshold; this
case is included only as a reference in case such an incident occurs.
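The mapping from utilization to case can be sketched as a small shell function. This is only an illustration: the function name classify_case is hypothetical, and the two arguments are assumed to be the current process count and the configured limit as reported by v$resource_limit.

```shell
# Hypothetical helper (not part of the original procedure): map process
# utilization to the cases described in this document.
# $1 = current number of processes, $2 = configured processes limit.
classify_case() {
    current=$1
    limit=$2
    # Integer percentage, rounded down.
    pct=$(( current * 100 / limit ))
    if [ "$pct" -ge 100 ]; then
        echo "case3"
    elif [ "$pct" -ge 70 ]; then
        echo "case2"
    elif [ "$pct" -ge 60 ]; then
        echo "case1"
    else
        echo "below-threshold"
    fi
}
```

For example, with an assumed limit of 1000 processes, classify_case 650 1000 falls under Case 1 and classify_case 700 1000 under Case 2.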

Root Cause – Customer applications get out of control and flood the database with processes by
opening new connections.

Impact Statement – New connection requests to the database will fail once process utilization
reaches 100%. Existing connections will continue to work as usual.

Important Points to consider during handling of ticket –

1. In every case, notify the customer by email and phone call of the current process
utilization and the user sessions.
2. Do not recycle the instance without customer confirmation.
3. Do not kill any process in Case 1 or Case 2 without customer confirmation.
4. In Case 3, kill only a few processes, just enough to log in to the database and gather the
details needed to notify the customer.

Case 1 –

1. Severity:4 Team:DBO Oracle_Database_npcmos11_MaxProcesses_Warning-ttid-aur1ttidbchdb01a-10.137.182.84
2. Severity:4 Team:DBO Oracle_Database_npcmos12_MaxSessions_Warning-ttid-aur1ttidbchdb02a-10.137.182.86

Steps to follow:

1. Log in to the database.

2. Check the current utilization of the database.


SQL> select * from v$resource_limit where resource_name='processes';

3. Check the processes associated with the users.

SQL> select (case NVL(a.username,'Background')
              when 'Background' then 'Background Instance processes'
              else a.username end) "Users",
            count(b.spid) "Processes"
       from v$session a, v$process b
      where a.paddr=b.addr
      group by a.username;

4. Retrieve the detailed records of the active and inactive sessions.

user_session.sql

5. Update the ticket with the details and send a notification email, including the session and
process details, to the customer (Philippe Nguyen-Tan, pnguyentan@iconectiv.com).

6. Call the customer. If the notification matrix does not include the customer's phone number,
ask the service desk for it.

7. The customer should then take action.

Note – We are not going to kill any processes.
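The checks in steps 2 and 3 can be bundled so that their output is captured in one pass for the ticket and the notification email. The following is a minimal sketch, not part of the original procedure: the function name build_report_sql is hypothetical, and it only prints a SQL*Plus script based on the queries above.

```shell
# Hypothetical helper: print a SQL*Plus script that runs the utilization
# checks from steps 2 and 3. The quoted heredoc delimiter ('EOF') stops the
# shell from expanding the $ in the v$ view names.
build_report_sql() {
    cat <<'EOF'
set pagesize 100 linesize 200
select resource_name, current_utilization, max_utilization, limit_value
  from v$resource_limit
 where resource_name = 'processes';
select nvl(a.username, 'Background Instance processes') "Users",
       count(b.spid) "Processes"
  from v$session a, v$process b
 where a.paddr = b.addr
 group by a.username;
exit
EOF
}
```

Assuming OS authentication is available to the oracle user on the database server, the script could be run with build_report_sql | sqlplus -s "/ as sysdba" > processes_report.txt, where the output file name is only an example.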


Case 2 –

1. Severity:5 Team:DBO Oracle_Database_npcmos11_MaxProcesses_Warning-ttid-aur1ttidbchdb01a-10.137.182.84
2. Severity:5 Team:DBO Oracle_Database_npcmos12_MaxProcesses_Warning-ttid-aur1ttidbchdb02a-10.137.182.86

Steps to follow:

1. Log in to the database.

2. Check the current utilization of the database.


SQL> select * from v$resource_limit where resource_name='processes';

3. Check the processes associated with the users.

SQL> select (case NVL(a.username,'Background')
              when 'Background' then 'Background Instance processes'
              else a.username end) "Users",
            count(b.spid) "Processes"
       from v$session a, v$process b
      where a.paddr=b.addr
      group by a.username;

4. Retrieve the detailed records of the active and inactive sessions.

user_session.sql

5. Update the ticket with the details and send a notification email, including the session and
process details, to the customer (Philippe Nguyen-Tan, pnguyentan@iconectiv.com).

6. Escalate the ticket to the Sungard Account Manager.

Note – We are not going to kill any processes.

Case 3 –

Steps to follow:

1. Log in to the server.

2. As the oracle user, run the following command and note the processes with the older
timestamps:

ps -ef|grep LOCAL=NO

3. Find the processes with the highest CPU utilization:

ps -e -o pcpu -o pmem -o pid -o user -o args | sort -n -k 1 | tail -21r

4. From the process list obtained in step 2, exclude any process that also appears in the list
obtained in step 3.
(Note - The exclusion makes sure that processes with high CPU utilization are not killed.)

5. Kill a few processes from the list obtained in step 4, using your judgment, just enough to
be able to log in to the database.

6. After killing a few processes, log in to the database.

7. Check the current utilization of the database.


SQL> select * from v$resource_limit where resource_name='processes';

8. Check the processes associated with the users.

SQL> select (case NVL(a.username,'Background')
              when 'Background' then 'Background Instance processes'
              else a.username end) "Users",
            count(b.spid) "Processes"
       from v$session a, v$process b
      where a.paddr=b.addr
      group by a.username;

9. Retrieve the detailed records of the active and inactive sessions.

user_session.sql

10. Update the ticket with the details and send a notification email, including the session and
process details, to the customer (Philippe Nguyen-Tan, pnguyentan@iconectiv.com).

11. Escalate the ticket to the Sungard Account Manager.

12. Recycle the instance only if the customer asks for it. Without customer confirmation, do not
kill any further processes and do not recycle the instances.
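Steps 2 to 4 of Case 3 amount to a set difference: the LOCAL=NO processes minus the high-CPU processes. A minimal sketch of that exclusion follows, assuming the PIDs from each step have been saved one per line in two files; the function name filter_kill_candidates and the file layout are hypothetical.

```shell
# Hypothetical helper: print the PIDs from the candidate list (step 2)
# that do not appear in the high-CPU list (step 3).
# $1 = file with candidate PIDs, $2 = file with high-CPU PIDs, one PID per line.
filter_kill_candidates() {
    candidates_file=$1
    high_cpu_file=$2
    # -F: fixed strings, -x: match whole lines only (so 10 does not match 101),
    # -v: invert the match, -f: read the patterns from a file.
    grep -vxFf "$high_cpu_file" "$candidates_file"
}
```

The candidate file could be produced with, for example, ps -ef | grep '[L]OCAL=NO' | awk '{print $2}', since the PID is the second column of ps -ef output. The result is meant for review only; which processes to kill remains a manual decision as described in step 5.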
