Assistance Request 1-5531112
Description
Summary
Outage Yes
Severity 1
Priority 1
Sub-Type Software
Category Software
Internal No
Outage Report
Assignment Events
Product
Version LR13.3.W
Product Location
Contact
Contact Info
Dates
Entitlement
People
Copy To rories
Submitter dlempick
Description
Attachments
Task:0x7d550000: UeCallp
DAR: 0x00000000
LR: 0x6b27d844
_ZN20UeCallUeRrmInterface12handoverAlgoEP16Uecall_Context_t +0x114
003a94f0 taskUnlock
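The mangled C++ symbol in the trace above follows the Itanium ABI convention, in which each name component is prefixed by its length. As a minimal sketch (the helper name decode_nested_name is hypothetical and handles only simple nested names of the form _ZN...E, not templates or substitutions), the class and method can be recovered like this:

```python
import re


def decode_nested_name(symbol: str) -> str:
    """Decode the length-prefixed components of a simple _ZN...E nested name."""
    if not symbol.startswith("_ZN"):
        raise ValueError("expected a nested name starting with _ZN")
    pos = 3  # skip the "_ZN" prefix
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError(f"unsupported mangling at offset {pos}")
        length = int(match.group())
        pos += match.end()              # skip the decimal length prefix
        parts.append(symbol[pos:pos + length])
        pos += length                   # skip the component itself
    return "::".join(parts)


print(decode_nested_name(
    "_ZN20UeCallUeRrmInterface12handoverAlgoEP16Uecall_Context_t"
))
```

The remainder of the symbol after E (P16Uecall_Context_t) encodes the parameter list, here a pointer to Uecall_Context_t. For full demangling a standard tool such as c++filt is the safer choice.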
1. 12-Jan-2015 13:19
dlempick
AR Number: 1-5531112
ACT link:
http://umts-er.ca.alcatel.com/activecall.php?callId=1765#live-area
3. 12-Jan-2015 14:48
gmohamed
4. 12-Jan-2015 16:14
stbalko
Update to Current Summary: TMU reset caused by wrong message sent over Iur.
TEC
5. 12-Jan-2015 17:41
stbalko
Update to Current Summary: Iur link which caused the problem identified (Ss7
6. 12-Jan-2015 17:47
aelmidan
outage Timeline:
09:42:55 GMT (10:42:55) : all TMUs except two are down (pending state)
10:02:57 GMT (11:02:57) : some TMUs have been restarted with softwareError
(Lp/2
block 0x53f20
10:11:25 GMT (11:11:25) : new TMU reset on Lp/13 Ap/5 and Lp/12 Ap/1
10:36:11 GMT (11:36:11) : just one TMU reset seen (Lp/12 Ap/1) after RNC
reset
10:55:01 GMT (11:55:01) : two TMU resets seen after RNC restart
10:55:57 GMT (11:55:57) : Remi Burie from TEC joined the bridge
11:05:09 GMT (12:05:09) : new occurrence of the TMU reset, issue is back also
after
RNC reset
11:16:59 GMT (12:16:59) : TEC update: problem is on Iur link which causes
TMU
11:45:17 GMT (12:45:17) : around 8 Iur links exist, other RNC vendor
connected to
12:46:47 GMT (13:46:47) : customer finally agreed to lock Iur links one by one
13:00:12 GMT (14:00:12) : first Iur locked (Ss7 M3ua/1 PMP/53 Assoc/*)
13:10:54 GMT (14:10:54) : Remark: customer was aware that locking Iur links
one by
one may not identify the corrupted Iur if the bad message is sent via two or
more
13:19:05 GMT (14:19:05) : No more TMU reset after locking Ss7 M3ua/1 PMP/53
13:21:05 GMT (14:21:05) : Iur link unlocked to prove that this link was the
one
13:22:15 GMT (14:22:15) : New TMU reset occurred just after Iur link (Ss7
M3ua/1
13:32:31 GMT (14:32:31) : Link (Ss7 M3ua/1 PMP/53 Assoc/*) locked back to
restore
service.
/traces/OUTAGE_RCA_ER_GPS/1-5531112
Outage triggered by IUR instability between ALU RNC and other vendor RNC
14:14:43 GMT (15:14:43) : no more TMU reset till now after Iur locking;
still
Hello,
Hereafter are the statistics produced by Laurent regarding the TMU reset issue.
From the callP point of view it is due to an ALCAP timeout, which means instability
on
manager.
The second one regarding the investigation of this ALCAP issue on IUR:
It appears that this is not new, as we have had resets for at least 4 weeks; local
team has
Regards,
Rémi BURIE
To: IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; RIES, Robert (Robert)
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(TAHER); JACKYRA, GREGORY (GREGORY); MORVAN, FREDERIC (FREDERIC); EL-MIDANY,
AHMED
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Gents,
Just sending short summary of the actions that we have discussed on the
call
earlier today.
1) The fix mentioned below is good to have on RNC in front of HWI RNC,
however
4) Local team to
i. Configuration check
working RNC)
b. Open new ticket for TMU storms due to IuR resource shortage.
Investigation
ii. How the other RNC is different ( pls provide snapshot of impacted
"bad"
& not impacted "good"RNC that is connected to the same HWI RNC
issue ( all TMUs down until RNC restart ).... There are two distinct
issues
7) Local team :
b. Objective is to understand what are KPIs like with unlocked IuR & how
big
is the impact of TMU restarts on KPIs ( issue was unnoticed from KPIs
even
Rgds,
Dusan
It is connected with unexplained TMU pending states. The customer ETC (UAE)
noticed strong KPI degradation on 12th Jan 2015 (he was already asked to
provide
appropriate NPO KPI outputs. Once he provides them, we will update the ticket).
+==+==+------+----------+------+----------+----------+----------+----------
| | | | | | | ll | |
+==+==+------+----------+------+----------+----------+----------+----------
| 2| 4|pc | 0|na | 0| 0| 0| 0
| 3| 4|pc | 1|na | 0| 0| 0| 0
| 4| 4|pc | 2|na | 0| 0| 0| 0
| 5| 4|pc | 3|na | 0| 0| 0| 0
| 5| 5|sOmu | 1|spared| na| na| na| na
| 6| 4|pc | 4|na | 0| 0| 0| 0
| 7| 4|pc | 5|na | 0| 0| 0| 0
| 8| 4|pc | 10|na | 0| 0| 0| 0
| 9| 4|pc | 11|na | 0| 0| 0| 0
ok 2015-01-12 13:40:31.12
During emergency recovery activity we performed TMU reset and PMC master
swact. No
change in TMU states. Only CP switchover brought all TMUs to oper state.
/traces/OUTAGE_RCA_ER_GPS/1-5531112
Thank you
Best Regards
Robert Ries
BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU,
CYRIL
(CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Subject: RE: RNC231 cannot carry any calls and we have a stream of TMUs
reset,
Hello all,
Please find below the summary of the issue and the latest updates:
Problem description:
Actions done:
RNC shelf reset => slight improvement observed (still observing TMU resets
but with
lower frequency)
Locked the IUR towards RNC206: the issue was solved and there were no alarms.
Impact:
Investigation:
TEC is suspecting some rejection occurring on IuR interface leading to TMU
instability
Waiting TEC & TSO feedback with AP and corrective action as per the customer
Additional traces to be collected and shared with TEC, all logs available @
/srv/data202256/data/server/Traces/from_france/ETISALAT_UAE/ 1-5531112
SFTP://172.25.80.47
user : ftraces
password : Elv24trf
Thanks,
--
BR,
Karim
To: RIES, Robert (Robert); BALKO, STANISLAV (STANISLAV); BURIE, REMI (REMI);
Cc: IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; ABDEL-HALIM, SAYED (SAYED);
EL-MIDANY, AHMED (AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR,
NAVNEET
(NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI,
RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD
(RAMY)
Subject: RE: RNC231 cannot carry any calls and we have a stream of TMUs
reset,
Importance: High
Looking for your usual support to find the RCA and solution for the current
Please note that locking the IuR interface is not acceptable as a WA, so we need
to
Best regards,
Mahmoud Khedr
Hi Karim
Issue summary:
Through the TEC investigation carried out by Remi Burie, we found out that the issue is
close
Further investigation found that the issue is known and that there is also a
known fix
Kindly, as the issue has a known RC and a fix is provided, could we set this
ticket
course.
And one more question: in communication on the bridge you (or someone from
your
team) mentioned that you are aware of some configuration issue. What did
you mean
Thanks
Best Regards
Robert Ries
BALKO, STANISLAV (STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED
**; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM,
RNC231
Importance: High
Hi Reis,
[Karim] Does that mean that we need a patch or SW upgrade to solve the issue?
Is there a WA to solve the issue for now? Is there a released alert for this
issue?
[REIS]Kindly, as issue has known RC and there is fix provided, could we set
this
case, of course.
[Karim] As discussed, we cannot restore the outage for now! The customer is
pushing on us and cannot accept to keep the IUR link locked, also if we
unlocked
[REIS]And one more question: in communication on the bridge you (or someone
from
your team) mentioned, that you are aware of some configuration issue. What
did you
[Karim] Ahmed Beridy, who is responsible for the Radio part, will update with more
details
Thanks,
--
BR,
Karim
21. 14-Jan-2015 11:23
rories
To: IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; RIES, Robert (Robert)
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Gents,
Just sending short summary of the actions that we have discussed on the
call
earlier today.
1) The fix mentioned below is good to have on RNC in front of HWI RNC,
TMU storms)
4) Local team to
i. Configuration check
ii. Features activated check ( compatibility with HWI + crosscheck settings
with working RNC)
iii. Utilization check ( there seems to be very high utilization )
b. Open new ticket for TMU storms due to IuR resource shortage.
i. Why TMU storms happen
ii. How the other RNC is different ( pls provide snapshot of impacted "bad" & not
impacted "good" RNC that is connected to the same HWI RNC )
iii. How to avoid defence of TMU
issue ( all TMUs down until RNC restart ).... There are two distinct
issues
b. Objective is to understand what are KPIs like with unlocked IuR &
how
big is the impact of TMU restarts on KPIs ( issue was unnoticed from KPIs
even
Rgds,
Dusan
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); AHMED BERIDY (AHMED)
Importance: High
Dear Daniel,
We have suffered severe TMU resets on RNC231, which is currently live
carrying 25
sites.
The feedback from TEC is that it is related to the IUR link with RNC206, which
suffers from an ALCAP
issue and a resource allocation issue. It has been requested to check with
engineering
i. Configuration check
ii. Features activated check ( compatibility with HWI + crosscheck settings
with working RNC)
iii. Utilization check ( there seems to be very high utilization )
We appreciate your support on who can help with this and what kind of
to be re-engineered.
I'm adding Robert Ries from TSO and Remi Burie from TEC for any further
details
Taher OKASHA
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Hello all
Below is the history of recovery team activities connected to the ETC (UAE) TMU issue
the
09:38 (13:38 RNC local time) we found 11 TMUs out of 14 in Pending state
09:41 waiting for restore - result => no success, without change
09:43 waiting for restore - result => no success, 11 TMUs still Pending
09:44 waiting for restore - result => no success, 11 TMUs still Pending
09:44 waiting for restore - result => no success, 11 TMUs still Pending
09:46 waiting for restore - result => success, all 14 TMUs working
(TMU reset)
12:54 (16:54 RNC local time) Ss7 M3ua/1 PMP/53 Assoc/0&1 locked - command
sent
by customer
13:19 (17:19 RNC local time) Ss7 M3ua/1 PMP/53 Assoc/0&1 unlocked - command
sent by customer
13:30 (17:30 RNC local time) Ss7 M3ua/1 PMP/53 Assoc/0&1 locked - command
sent
by customer
- no present alarms "TMU -- EXCEPTION:Memory Manager error[memPartFree] at
block"
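The recovery log above gives timestamps in GMT with the RNC local time in parentheses (for example 09:38 GMT is 13:38 local). A minimal sketch of that conversion, assuming a fixed GMT+4 offset inferred from those log lines (the function name gmt_to_rnc_local is hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the RNC site uses a fixed GMT+4 offset, inferred from the log lines.
RNC_LOCAL = timezone(timedelta(hours=4))


def gmt_to_rnc_local(day: str, hhmm: str) -> str:
    """Convert a GMT 'HH:MM' timestamp on the given ISO date to RNC local time."""
    gmt = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M").replace(
        tzinfo=timezone.utc
    )
    return gmt.astimezone(RNC_LOCAL).strftime("%H:%M")


print(gmt_to_rnc_local("2015-01-12", "09:38"))  # 13:38
print(gmt_to_rnc_local("2015-01-12", "12:54"))  # 16:54
```

Passing the date as well as the time keeps the conversion correct for entries that cross midnight in local time.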
=====================================================================
Furthermore we inspected the hfb files for "Memory Manager error" alarms. The
history is as follows:
Best Regards
Robert Ries
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
BERIDY, AHMED (AHMED); SUSARRET, Andres (Andres)** CTR **; MERCHAUT, VINCENT
(VINCENT); Berky, Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR
**;
Thanks.
Best Regards
Christophe.
To: RIES, Robert (Robert); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
(RAMY); KHEDR, MAHMOUD (MAHMOUD); BURIE, REMI (REMI); BERIDY, AHMED (AHMED);
(Dusan)
Hello Robert,
I think in our report to customer we will need to explain the reason of "11
TMUs
status?
Regards
Gehad
To: EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **;
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
REMI (REMI); BOSLEY, Tim (Tim)** CTR **; Hall, Gail Culver (Gail)** CTR **;
Hello all,
- KPIs to be checked for any degradation in the period between RNC reset
& IuR
- Next synchro call Wednesday 14-Jan@14:00 Paris time => AP. Local team
to send
the invitation
BR,
Ahmed El-Midany
To: MONNAIE, DANIEL (DANIEL); OKASHA, TAHER (TAHER); MAHER, RAFIK (RAFIK)
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); RIES, Robert (Robert); BURIE, REMI
(REMI); Berky, Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **;
WAGDY
Hi Taher, can you give me a call please? I can't join you from my phone
issue caused by the IuR link between RNC231 & Huawei RNC206
I don't understand why you are referring to ALCAP in your mail below,
however
the network is full IP. ALCAP exists only when ATM exists, but it
is
Can I get the CPU load of all the eDCPS boards for the last 2
weeks, please? Make
sure this period includes the period where the IuR is not locked and
after
According to the TEC guys, it also seems the issue is known and a fix is provided,
as
per the history of this email; can I get it? It may be related to the IuR
mapping of SCTP links where a PDC is used more than others... to be checked
on
a fresh snapshot, so please also provide the new snapshot of the network
Cdt/BR
Soufiane B
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); RIES, Robert (Robert); BURIE, REMI
(REMI); Berky, Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **;
WAGDY
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); RIES, Robert (Robert); BURIE, REMI
(REMI); Berky, Dusan (Dusan); WAGDY MANSOUR, AYMAN (AYMAN)** CTR **; ABDEL-
HALIM,
SAYED (SAYED)
Hello Soufiane,
As per our discussion the issue is that the TMUs were restarting randomly
with
TEC has concluded that it's due to something on the IUR link with RNC206 but
they
are unable to determine the correlation till now. The issue has been escalated
to RNC
design. The TMU restarts stopped after this IUR link was locked.
On the other hand they want a check from engineering side for any
differences
working fine. And check for IUR link utilization if it's OK or not.
The ALCAP topic is discarded, of course; it was already strange, but that was
TEC
feedback, and they discarded this topic since the network is full IP.
The fix mentioned below has been found not to be related to the
issue
of Dusan.
We will provide you with a fresh snapshot and the CPU loads for the last
week as
Taher OKASHA
To: OKASHA, TAHER (TAHER); BENIGHIL, SOUFIANE (SOUFIANE)** CTR **; MONNAIE,
DANIEL
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); RIES, Robert (Robert); BURIE, REMI
(REMI); Berky, Dusan (Dusan); WAGDY MANSOUR, AYMAN (AYMAN)** CTR **; ABDEL-
HALIM,
SAYED (SAYED)
Hello Soufiane,
Kindly find attached network snapshot, required indicator and current TMU
mapping
for RNC231.
Thanks,
--
BR,
Karim
To: IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; OKASHA, TAHER (TAHER);
BENIGHIL,
SOUFIANE (SOUFIANE)** CTR **; MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK)
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); BURIE, REMI (REMI); Berky, Dusan
(Dusan);
Hello Karim/all
Thank you for the data provided; however, to see the complex behavior (degradation)
we are
also waiting for further KPIs:
CSSR
CDR
RRC
RAB
All for the CS & PS domains on a 15-min basis, at least since 11th Jan 2015.
Thank you
Best Regards
Robert Ries
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
BERIDY, AHMED (AHMED); SUSARRET, Andres (Andres)** CTR **; MERCHAUT, VINCENT
(VINCENT); Berky, Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR
**;
Hello Christophe,
I went through the AR description and mail thread below and I think that the
NEA
Therefore I am adding Noelle Jaouani to the loop to check if her team can provide
such
support.
BR,
fx
To: OKASHA, TAHER (TAHER); MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK);
IBRAHIM
Cc: KHEDR, MAHMOUD (MAHMOUD); REDA, RAMY (RAMY); EL-SAEED, AHMED (AHMED);
BERIDY,
AHMED (AHMED); EL-MIDANY, AHMED (AHMED); RIES, Robert (Robert); BURIE, REMI
(REMI); Berky, Dusan (Dusan); WAGDY MANSOUR, AYMAN (AYMAN)** CTR **; ABDEL-
HALIM,
SAYED (SAYED)
Hi Taher,
with Port 7000 instead of 2905 in IuR for the peer (206)
I propose to align with 235-206 IuR by creating the missing associations &
DCPS available 6 of them are free, only slots 2/3/6/7 are used,
8/9/10/11/12/13
Looking at all of these points, I can conclude that the TMU mapping
should be
reviewed as below:
- By mapping the SCTP association to 8/9 (unused till now) instead of
Slot/2 &
- And creating the missing associations with ports 7000 to 7003 & with
PMP IP @
10.241.31.53
2. As a 2nd action: after the outcome of the 1st action, we align all the network
interfaces
as configured on RNC235 for IuR & IuCS & PS with addition of Slot
So, the fact that the IuR 231-206 is locked seems to give some breathing room to
the
loaded TMUs, because the SCTP associations of all IuR links in this RNC are
linked
to TMU/2 & /3. I guess locking any other IuR will have the same effect, I
Let's see the impact of these changes for the first action, then look at the
TMUs'
behavior
Attached is the WO1 to be applied asap, and unlock the IuR 239-206
Cdt
Soufiane B
To: KHEDR, MAHMOUD (MAHMOUD); BENIGHIL, SOUFIANE (SOUFIANE)** CTR **; EL-
SAEED,
AHMED (AHMED)
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BERIDY, AHMED (AHMED); EL-MIDANY, AHMED
(AHMED); RIES, Robert (Robert); BURIE, REMI (REMI); Berky, Dusan (Dusan);
WAGDY
Importance: High
But we can't proceed with changing the IUR interface with Huawei from our
side
One association for IUR with RNC206 is the information shared from Huawei
and
For the IUR creation for RNC235 I see that configuration of IUR with RNC206
is
different.
Maybe we can proceed with the other recommended changes without changing the
IUR
associations till we confirm with Etisalat & Huawei. What do you think?
Taher OKASHA
Hi,
(reduced mib).
But, after the CP switchover, I could see the MIB was in nominal state.
Non-zero MIB build number and MIB State 1 signify a Nominal MIB.
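The rule stated above can be written as a one-line check. This is only an illustrative sketch: the function and parameter names are hypothetical and do not correspond to any RNC command or API.

```python
def is_nominal_mib(build_number: int, state: int) -> bool:
    """Nominal MIB per the rule above: non-zero build number and MIB State 1."""
    return build_number != 0 and state == 1


print(is_nominal_mib(build_number=42, state=1))  # True  (42 is an arbitrary example value)
print(is_nominal_mib(build_number=0, state=1))   # False (zero build number)
```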
And the following errors were seen on the active OMU while accessing the
MIB,
after coming up from switchover.
OMU_0(Lp/4,Ap/5) (PERM): ### DAS WARNING FAULT ###: "MIB has been
deleted"
...
received pointer < minimum expected length. Cannot decode complete message.
Looks like the MIB was not in proper state when the switchover was
triggered.
2) I could also see a couple of NFS errors. These were seen at 13:45:13 on
OMU-1
errstr=S_nfsLib_NFSERR_IO
-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-
_Z9notif_log9Boolean_t15Q3_Event_Type_t15Q3_Prob_Cause_t13Q3_Severity_tiPKciiimjPKvmjS6_mjS6_ +0x154
00a45618
_Z9notif_log9Boolean_t15Q3_Event_Type_t15Q3_Prob_Cause_t13Q3_Severity_tiPKcimjPKvmjS6_mjS6_ +0xcc
00a37768
_Z17fcirc_check_spaceiR18FCIRC_FileHeader_tjPK13FCIRC_mData_tRK14FCIRC_ptFile_t9Boolean_tRS4_RS7_ +0x2b74
00c17298
_Z8gob_maintPK11gob_class_tttPKjtP11rootParam_tPvPFvhjjS6_ttE13gob_startup_tiPKPKc11MSG_scope_t +0x1fa0
DUMP 00
=======
"/OMU/share/rw_data/Assoc_4_9"
DUMP 01
=======
"S_nfsLib_NFSERR_IO"
DUMP 02
=======
574
-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-_+_-
Id: D000CC4
Id: D000CC5
/ccase_rnccn/ControlNode/BaseOS/Error/bexception/src/bexception_vxworks.cc:1231
CP switchover.
of TMU reset, it put the RNC in several defense cases until it got stuck and the TMUs
went
into pending state.
We believe that if we fix the TMU reset (remember up to 400 per day) we will
not
Regards,
Rémi
To: KHEDR, MAHMOUD (MAHMOUD); BENIGHIL, SOUFIANE (SOUFIANE)** CTR **; EL-
SAEED,
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
Hello Soufiane,
The configuration of the IUR link on the Huawei side matches the attached data
we
requested from them based on the design. Do you think it should be changed
for
Is there a reason why only the IUR RNC235<>RNC206 has 4 SCTPs while
all
Taher OKASHA
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
Hello Taher,
For me it should be the same as in 235_206 IuR, since we are pointing to the
same
neighboring RNC (206), if we follow the logic, unless the Huawei RNC is
working
& Customer.
Meanwhile please apply a change only on our side, I mean remap the SCTP
on 235 (4 vs. 1 in 231), this could offload the C-plane of the IuR
Cdt
Soufiane B
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
Dear Taher,
For both IUR interfaces on RNC231 & RNC235 the configuration from ALU RNC
side
is the same
Regards,
Ahmed
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
(AYMAN)** CTR **; KHEDR, MAHMOUD (MAHMOUD); BURIE, REMI (REMI); RIES, Robert
So as long as for both links there are two SCTP endpoints from RNC231 side
then
having associations to 4 SCTP endpoints from Huawei side should not make a
Dear Soufiane,
Do you see from counters that the Cplane of the IUR needs to be offloaded?
Taher OKASHA
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
(AYMAN)** CTR **; KHEDR, MAHMOUD (MAHMOUD); BURIE, REMI (REMI); RIES, Robert
Dear Taher,
From ALU RNC processor load point of view both configurations are the same
Two processors are handling all IUR SCTP messages in both cases
Regards,
Ahmed
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
(AHMED); EL-MIDANY, AHMED (AHMED); Berky, Dusan (Dusan); WAGDY MANSOUR,
AYMAN
(AYMAN)** CTR **; KHEDR, MAHMOUD (MAHMOUD); BURIE, REMI (REMI); RIES, Robert
Dear Soufiane,
The WO you provided will impact only IUR links correct? IUCS/IUPS SCTP
mapping to
Taher OKASHA
Cc: MONNAIE, DANIEL (DANIEL); MAHER, RAFIK (RAFIK); IBRAHIM ABD EL NABY,
Karim
(Karim)** CTR **; REDA, RAMY (RAMY); BALKO, STANISLAV (STANISLAV); BERIDY,
AHMED
I can't tell whether this action will solve the existing IUR issue or not
Regards,
Ahmed
(Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA, TAHER (TAHER); JACKYRA,
AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL
(CYRIL);
ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR,
AYMAN
Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; RIES, Robert
(Robert);
Hi All,
1/ RNC231 is full IP, there is no ALCAP here; by the way, design uses the same
primitive
2/ From the call trace, there are many RNSAP radio link setup failures due to
an SRB
6.8 configuration which is not supported by our product (already seen and
tracked
by AR 1-5254837).
By the way, even though it's not supported, we attempt to allocate resources on the
U-plane for
those radio link attempts, which may lead to the resource exhaustion
observed.
I believe it is the main cause of the faulty scenario that
leads
to the TMU reset.
Regards,
Rémi BURIE
(PHILIPPE)
(STANISLAV); BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA,
TAHER
(TAHER); JACKYRA, GREGORY (GREGORY); MORVAN, FREDERIC (FREDERIC); EL-MIDANY,
AHMED
(AHMED); BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET);
DAOU,
CYRIL (CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY);
WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; RIES, Robert
(Robert);
Hello All,
From the PPC alarms/debug we have the following alarms that are linked to
the
issue:
These alarms show that the Drift RNC receives a Qaal2 Establishment
Request
from the callP to be able to reserve and allocate UDP port ipInf/6208 ( this
port
But the TMB is not able to allocate UDP port on the BandWithPool/ipinf
Comparing the IuR configuration of RNC 231 towards the neighboring HWI RNC,
which is
causing the issue, with that of the other RNC 235, which does not suffer from the
same issue,
there is
In RNC 231, for the neighboring RNC 206, the corresponding IUR
is IUR/8.
In IuR/8, for the U-plane we use UDP port 6208:
RNC 235:
Doing the same mapping for RNC 235, which has the same neighboring
RNC 206
Iur/3'IpIf/6203
From my point of view, having fewer associations makes only some of the
TMUs
linked to these ports more stressed and overloaded than the others, and then, as
the
TMU cannot answer any request, it resets and the RNC is unbalanced
and
The TMU reset happened more than 5268 times since the 18th December!
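To put the reset count in perspective, the average daily rate can be worked out as below. This is a rough sketch under two assumptions not stated in this mail: that "18th December" means 18 Dec 2014, and that the count was taken around 15 Jan 2015 (the date Robert Ries later gives for this analysis). Since 5268 is quoted as a lower bound, the result is a lower bound too.

```python
from datetime import date


def average_resets_per_day(total: int, first_day: date, last_day: date) -> float:
    """Average number of resets per day over the interval [first_day, last_day)."""
    days = (last_day - first_day).days
    if days <= 0:
        raise ValueError("last_day must be after first_day")
    return total / days


# Assumed dates: 18 Dec 2014 to 15 Jan 2015 (28 days).
rate = average_resets_per_day(5268, date(2014, 12, 18), date(2015, 1, 15))
print(f"{rate:.1f} resets per day")  # 188.1 resets per day
```

An average of roughly 190 per day is consistent with the peak of "up to 400 per day" mentioned elsewhere in the thread.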
To conclude this analysis, from our part we suspect that the missing
association on
The SRB 6.8 could be a trigger to bring the problem up and make the TMU
relocate
As RNC 235 does not have the same issue, and as it is a neighbor of the RNC (
206 -
HWI), we suspect that the different IuR configuration missing BWpool/ipflow
could
Philippe Delmas will have a look concerning the missing association and will
Regards,
Nabil
BOSLEY, Tim (Tim)** CTR **; ABDEL-HALIM, SAYED (SAYED); OKASHA, TAHER
(TAHER);
BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU,
CYRIL
(CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
Dusan (Dusan); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; RIES, Robert
(Robert);
hello
You should identify and read the transportMap associated to the Iur bwPool
for the
ALU RNC under discussion and check how many ipFlow are indicated in the
transportMap.
Regards
Philippe
To: ROY, Paul (Paul)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; RATHI, Rajneesh (Rajneesh)** CTR **;
LOUVIER,
Hello All,
Last night the recommendations from the NEA team to add the needed IPflows to
the IUR
U-plane were implemented. The IUR was unlocked this morning and so far
it's
Taher OKASHA
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; ROY, Paul (Paul)** CTR **; EL-MIDANY, AHMED
(AHMED); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; ABDEL-HALIM, SAYED
(SAYED);
BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU,
CYRIL
(CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
STANISLAV (STANISLAV); BOSLEY, Tim (Tim)** CTR **; Hall, Gail Culver
(Gail)** CTR
Dears,
No TMU reset alarms have been observed till now since the IUR IPflows
addition.
Are there any traces or logs you would like to check to confirm proper
behavior?
Taher OKASHA
To: OKASHA, TAHER (TAHER); BURIE, REMI (REMI); ROY, Paul (Paul)** CTR **
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; ROY, Paul (Paul)** CTR **; EL-MIDANY, AHMED
(AHMED); IBRAHIM ABD EL NABY, Karim (Karim)** CTR **; ABDEL-HALIM, SAYED
(SAYED);
BERIDY, AHMED (AHMED); REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU,
CYRIL
(CYRIL); ELHAKIM, Mohamed (Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY
MANSOUR, AYMAN (AYMAN)** CTR **; IBRAHIM, MAHMOUD (MAHMOUD); MOHAMED, GEHAD
STANISLAV (STANISLAV); BOSLEY, Tim (Tim)** CTR **; Hall, Gail Culver
(Gail)** CTR
Hello Taher,
Can I have the RNC CPU load, with the RNC RAB/TMU configuration (counter:
VS_ApCpuUtilizationAvg (U20202))?
I would like to have the counters for the last two days with one-hour
granularity.
Regards,
Nabil
To: OKASHA, TAHER (TAHER); MENJAOUI, NABIL (NABIL); BURIE, REMI (REMI); ROY,
Paul
(Paul)** CTR **
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
Tim (Tim)** CTR **; Hall, Gail Culver (Gail)** CTR **; SUSARRET, Andres
(Andres)**
CTR **; MERCHAUT, VINCENT (VINCENT); BENIGHIL, SOUFIANE (SOUFIANE)** CTR **;
Hello Nabil,
Best Regards,
Martin Nabil
To: NABIL GEORGES, MARTIN (MARTIN)** CTR **; OKASHA, TAHER (TAHER); BURIE,
REMI
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
Tim (Tim)** CTR **; Hall, Gail Culver (Gail)** CTR **; SUSARRET, Andres
(Andres)**
CTR **; MERCHAUT, VINCENT (VINCENT); BENIGHIL, SOUFIANE (SOUFIANE)** CTR **;
Hello all,
Please find hereafter the analysis of the RNC 231 load.
The RNC is not loaded at all; the average CPU per process is under 10%,
this RNC
For the TMU load, one thing that caught my attention is that on board 9 one
TMU is
not reporting data, TMU (Lp9/Ap1).
The rest of the TMUs are all under 25%; most of the TMU loads are
oscillating
load of the TMU is not balanced, it is not that surprising because the RNC
is not
We can also see that a small increase of the average RNC load happened after
The TMU on board 9 (Lp9/Ap1) is not reporting data ( since 11th Jan
2015
Regards,
Nabil
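The two checks described in the analysis above (TMUs that report no data, and TMUs whose load exceeds 25%) could be automated along these lines. This is a minimal sketch: the input layout, the function name summarize_tmu_load and all sample values are hypothetical, and only the 25% threshold and the silent Lp9/Ap1 case come from the mail.

```python
from statistics import mean


def summarize_tmu_load(samples: dict[str, list[float]], threshold: float = 25.0):
    """Return (silent TMUs, TMUs whose average load exceeds threshold, averages)."""
    silent = sorted(tmu for tmu, values in samples.items() if not values)
    averages = {tmu: mean(values) for tmu, values in samples.items() if values}
    over = sorted(tmu for tmu, avg in averages.items() if avg > threshold)
    return silent, over, averages


# Hypothetical hourly CPU samples (percent) keyed by TMU; not real counter data.
hourly = {
    "Lp9/Ap1": [],                 # reports nothing, as noted in the analysis
    "Lp2/Ap1": [12.0, 18.5, 22.0],
    "Lp3/Ap1": [8.0, 9.5, 11.0],
}

silent, over, averages = summarize_tmu_load(hourly)
print(silent)  # ['Lp9/Ap1']
print(over)    # []
```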
To: NABIL GEORGES, MARTIN (MARTIN)** CTR **; OKASHA, TAHER (TAHER)
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
(Rajneesh)** CTR **; BURIE, REMI (REMI); ROY, Paul (Paul)** CTR **;
MENJAOUI,
The following issues resulting from this outage continue to be investigated in other
ARs:
in design
AR 1-5542978 => KPI degradation after unlocking the IUR link between RNC231 & H
RNC206 - TPS L3
As even after a 2-week monitoring period there are no other issues
connected with
Thank you
Best Regards
Robert Ries
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
(Rajneesh)** CTR **; BURIE, REMI (REMI); ROY, Paul (Paul)** CTR **;
MENJAOUI,
NABIL (NABIL); ESCANDE, PHILIPPE (PHILIPPE)
Hello Robert,
For me the correction of provisioning solved the TMU reset issue, but the RC
for
the main outage itself for AR 1-5531112 is not clear yet. Is it the final
confirmation
Taher OKASHA
Cc: DAS, Sunil (Sunil)** CTR **; T, Sreenivas Murthy (Sreenivas Murthy)**
CTR **;
CARLSON, Keith (Keith)** CTR **; EL-MIDANY, AHMED (AHMED); IBRAHIM ABD EL
NABY,
REDA, RAMY (RAMY); KUMAR, NAVNEET (NAVNEET); DAOU, CYRIL (CYRIL); ELHAKIM,
Mohamed
(Mohamed)** CTR **; ZAHABI, RAMSEY (RAMSEY); WAGDY MANSOUR, AYMAN (AYMAN)**
CTR
(Rajneesh)** CTR **; BURIE, REMI (REMI); ROY, Paul (Paul)** CTR **;
MENJAOUI,
Hello Taher
Yes, based on analysis performed by Nabil Menjaoui (sent the 15th Jan
9:23AM) we
know that TMU resets were triggered by insufficient IuR bandwidth (alarm
Failed to
Best Regards
Robert Ries
59. 03-Feb-2015 11:59
rories
Hello Taher
I consulted with TEC points regarding the POA what we talked about via Lync
last
time.
Nabil Menjaoui forwarded to me the attached mail stream (where I was not in
copy
originally). You can see that he already explained the outage RC to you in
detail.
The TMU reset was caused by some missing Ipflows and by the fact that the
Ipflow links were not distributed in a balanced way across all the TMUs.
Some TMU interfaces are handling more traffic than the others. This causes
memory errors on some TMUs. The affected TMU then triggers its own reset as a
defense mechanism in order to recover from the memory issue. But this activity
creates an imbalance, and then all the TMUs of the RNC start falling down one
after the other (a snowball effect).
After several resets (at least 5268 per month), the RNC's self-defense
mechanism decides to reset all the boards to recover from the issue, which
leads to the outage.
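The imbalance described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the flow count, the TMU names and the imbalance_ratio helper are hypothetical, and the only figures taken from the thread are the 14 TMUs and the concentration of associations on two of them.

```python
from collections import Counter


def imbalance_ratio(assignment: dict[str, str], tmus: list[str]) -> float:
    """Busiest TMU's flow count divided by the mean per TMU (1.0 means balanced)."""
    counts = Counter(assignment.values())
    mean_per_tmu = len(assignment) / len(tmus)
    return max(counts[tmu] for tmu in tmus) / mean_per_tmu


tmus = [f"TMU-{i}" for i in range(14)]    # 14 TMUs, as in the recovery log
flows = [f"flow-{i}" for i in range(28)]  # hypothetical number of flows

# All flows concentrated on two TMUs versus spread round-robin over all of them.
concentrated = {flow: tmus[i % 2] for i, flow in enumerate(flows)}
balanced = {flow: tmus[i % len(tmus)] for i, flow in enumerate(flows)}

print(imbalance_ratio(concentrated, tmus))  # 7.0
print(imbalance_ratio(balanced, tmus))      # 1.0
```

In this model the two loaded TMUs carry seven times the average share, which is the kind of skew the explanation points to as the trigger for the first resets.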
Best Regards
Robert Ries
Resolution
The issue is known and a fix for it is delivered in load LR13.3.9