Security Level: internal
Guide to Optimizing LTE Service Drops
www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Huawei Confidential
Change History
Date: 2012.1.10
Version: 1.0
Description: Completed the draft.
Reviewer:
Author:
Abstract
This document:
• Defines the call drop rate.
• Describes how to use the related counters to diagnose a call drop and to analyze the factors influencing the KPI.
• Describes common diagnosis methods and the standard actions that front-line engineers take to handle a call drop problem.
• Describes the deliverables that front-line engineers must submit to R&D engineers if the standard actions do not solve the problem.
Content
• Definition of the Service Drop Rate
• Symptoms of a Service Drop
• Cause Analysis and Data Processing
• Checklist and Deliverables
• Case Study
Calculation of the Call Drop Rate on the UE Side (1/3)
Call Drop Rate = eRAB AbnormRel / eRAB Setup Success x 100%
where eRAB AbnormRel is the number of e-RAB abnormal releases and eRAB Setup Success is the number of successful e-RAB setup events.
Calculation of the Call Drop Rate on the UE Side (2/3)
Huawei Genex PA calculates eRAB AbnormRel as follows:
I. eRAB AbnormRel increments by 1 if the UE:
• Does not receive the DEACTIVATE EPS BEARER CONTEXT REQUEST message, and
• Does not receive the DETACH REQUEST message from the MME, and
• Does not send the DETACH REQUEST message, and
• Receives the RRCConnectionReconfiguration message containing the IE drb-ToReleaseList.
In this case, if the number of e-RABs minus the number of eps-BearerIdentity entries contained in the release list is 0, the UE transitions to RRC_Idle mode.
II. eRAB AbnormRel increments by 1 if the UE:
• Does not receive the DEACTIVATE EPS BEARER CONTEXT REQUEST message, and
• Does not receive the DETACH REQUEST message from the MME, and
• Does not send the DETACH REQUEST message, and
• Receives the RRCConnectionRelease message, and the RLC layer has transmitted data in either direction within the last 4s.
In this case, the UE transitions directly to RRC_Idle mode.
Calculation of the Call Drop Rate on the UE Side (3/3)
III. ERABAbnormalRel increments by 1 for each released e-RAB if the UE has established one or more e-RABs and enters RRC_Idle mode before receiving the RRCConnectionRelease message.
IV. ERABAbnormalRel increments by 1 if the UE initiates an RRC connection setup request without having received the RRC Connection Reconfiguration, Deactivate EPS Bearer Context Request, Detach Request, RRC State, or RRC Connection Release message.
V. ERABAbnormalRel increments by 1 if the event RRCReestablishFail occurs. The two events carry the same timestamp.
Note: The acceptance criteria of some customers may require that all RRC reestablishments initiated by the UE be counted as service drops.
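Rules I and II of the UE-side calculation can be sketched as a small predicate. This is a minimal illustration, not the Genex PA implementation: the UeReleaseObservation record and its field names are hypothetical, and treating exactly 4 s as still inside the activity window is an assumption.

```python
from dataclasses import dataclass


@dataclass
class UeReleaseObservation:
    """Hypothetical summary of what the UE log shows around one release."""
    got_deactivate_bearer_req: bool      # DEACTIVATE EPS BEARER CONTEXT REQUEST received
    got_detach_req_from_mme: bool        # DETACH REQUEST received from the MME
    sent_detach_req: bool                # DETACH REQUEST sent by the UE
    got_drb_to_release_list: bool        # RRCConnectionReconfiguration with drb-ToReleaseList
    got_rrc_connection_release: bool     # RRCConnectionRelease received
    seconds_since_last_rlc_data: float   # time since the RLC layer last carried data


def is_abnormal_release(obs, activity_window_s=4.0):
    """Return True if rule I or rule II would count an abnormal release."""
    # Both rules require that no NAS-level bearer deactivation or detach took place.
    if (obs.got_deactivate_bearer_req
            or obs.got_detach_req_from_mme
            or obs.sent_detach_req):
        return False
    # Rule I: the DRBs are released through an RRC reconfiguration.
    rule_1 = obs.got_drb_to_release_list
    # Rule II: the RRC connection is released while data was still flowing.
    rule_2 = (obs.got_rrc_connection_release
              and obs.seconds_since_last_rlc_data <= activity_window_s)
    return rule_1 or rule_2
```

An RRCConnectionRelease that arrives after a long idle period is therefore not counted, which matches the intent of excluding inactivity releases.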
Calculation of the Call Drop Rate on the Network Side
Call Drop Rate = L.E-RAB.AbnormRel / (L.E-RAB.NormRel + L.E-RAB.AbnormRel) x 100%
where L.E-RAB.AbnormRel is the number of e-RAB abnormal releases and L.E-RAB.NormRel is the number of e-RAB normal releases.
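As a worked example of the network-side formula, the following sketch computes the rate from the two counters. The function name is hypothetical, and returning None when no releases were recorded is an assumption made here to avoid a division by zero.

```python
def call_drop_rate_percent(abnorm_rel, norm_rel):
    """Network-side call drop rate in percent.

    abnorm_rel: value of L.E-RAB.AbnormRel for the period
    norm_rel:   value of L.E-RAB.NormRel for the period
    """
    total = abnorm_rel + norm_rel
    if total == 0:
        # No releases in the period, so the rate is undefined.
        return None
    return abnorm_rel * 100.0 / total
```

For instance, 5 abnormal releases and 995 normal releases give a call drop rate of 0.5%.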
Counters Recorded by the Network
• As shown in point A of Fig1, if the eNodeB sends the E-RAB RELEASE INDICATION message containing a cause value that is not "Normal Release", "User Inactivity", "cs fallback triggered", or "Inter-RAT redirection", L.E-RAB.AbnormRel increments by 1. If the E-RAB RELEASE INDICATION message requests the release of multiple e-RABs, L.E-RAB.AbnormRel increments by 1 for each e-RAB.
• As shown in point A of Fig2, when the eNodeB sends the UE CONTEXT RELEASE REQUEST message to the MME, the eNodeB releases all e-RABs of the UE. If the release cause is not "Normal Release", "User Inactivity", "cs fallback triggered", or "Inter-RAT redirection", L.E-RAB.AbnormRel increments by 1 for each released e-RAB.
Note: The E-RAB Release procedure releases one or multiple e-RABs. After the procedure, at least the default bearer is maintained. The UE Context Release procedure releases all connections. No bearer is maintained after this procedure.
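The counting rule can be sketched as follows. This is an illustration under assumptions, not eNodeB code: the input is a hypothetical list of (cause, number of e-RABs released) pairs extracted from a trace, and the abnormal cause strings in the usage example are placeholders.

```python
# Cause values that the rule above treats as normal releases.
NORMAL_CAUSES = {
    "Normal Release",
    "User Inactivity",
    "cs fallback triggered",
    "Inter-RAT redirection",
}


def count_abnormal_releases(releases):
    """Sum the e-RABs released with a cause outside NORMAL_CAUSES.

    releases: iterable of (cause, num_erabs_released) pairs taken from
    E-RAB RELEASE INDICATION or UE CONTEXT RELEASE REQUEST messages.
    """
    return sum(num_erabs for cause, num_erabs in releases
               if cause not in NORMAL_CAUSES)
```

Each message contributes one count per released e-RAB, so a UE context release that tears down three e-RABs with an abnormal cause adds 3.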
Counters That Count Abnormal Releases by the Network (1/4)
Currently, five counters count e-RAB abnormal releases by the network:
• L.E-RAB.AbnormRel.Radio (Number of e-RAB abnormal releases caused by the eNodeB)
• L.E-RAB.AbnormRel.TNL (Number of e-RAB abnormal releases caused by the transmission network)
• L.E-RAB.AbnormRel.Cong (Number of e-RAB abnormal releases caused by network congestion)
• L.E-RAB.AbnormRel.HOFailure (Number of e-RAB abnormal releases caused by handover failures)
• L.E-RAB.AbnormRel.MME (Number of e-RAB abnormal releases caused by the EPC)
Counters That Count Abnormal Releases by the Network (2/4)
• Abnormal releases caused by the EPC
As shown in point A of Fig1 and Fig2, if the eNodeB receives the E-RAB RELEASE COMMAND or UE CONTEXT RELEASE COMMAND message from the MME containing a cause value that is not “Normal Release”, “Detach”, “User Inactivity”, “cs fallback triggered”, or “Inter-RAT redirection”, L.E-RAB.AbnormRel.MME increments by 1.
Note: L.E-RAB.AbnormRel does not include L.E-RAB.AbnormRel.MME. A release initiated by the EPC is not counted as a call drop in eRAN2.1SPC400 and later versions.
Counters That Count Abnormal Releases by the Network (3/4)
• Abnormal releases not caused by the EPC
As shown in point A of Fig3, if the eNodeB sends the E-RAB RELEASE INDICATION message to the MME with a cause value indicating a radio error, L.E-RAB.AbnormRel.Radio increments by 1. If the cause value indicates a transmission network error, L.E-RAB.AbnormRel.TNL increments by 1. If the cause value indicates network congestion, L.E-RAB.AbnormRel.Cong increments by 1. If the E-RAB RELEASE INDICATION message requests the release of multiple e-RABs, the concerned counter increments by 1 for each e-RAB.
Counters That Count Abnormal Releases by the Network (4/4)
• Abnormal releases not caused by the EPC
As shown in point A of Fig4, the eNodeB sends the UE CONTEXT RELEASE REQUEST message to the MME to release all e-RABs of the UE. If the cause value indicates a radio error, L.E-RAB.AbnormRel.Radio increments by 1. If the cause value indicates a transmission network error, L.E-RAB.AbnormRel.TNL increments by 1. If the cause value indicates network congestion, L.E-RAB.AbnormRel.Cong increments by 1. This counter measures the abnormal releases caused by preemption and resource congestion. If the cause value indicates a handover failure, L.E-RAB.AbnormRel.HOFailure increments by 1. The concerned counter increments by 1 for each e-RAB. The counters do not increment again when the MME subsequently sends the UE CONTEXT RELEASE COMMAND message.
Symptoms of a Service Drop
Symptoms of a Call Drop as Observed in a Drive Test
A drive test uses a Huawei test UE with UE Probe, or other commercial UEs with their signaling trace software. The traffic monitoring software installed on the drive test computer shows the following symptoms:
• The throughput suddenly falls to a low value or to zero.
• The UE begins to receive system information when a handover is not complete or when the UE is not in a re-establishment scenario.
Figures: Low throughput; UE receives system information.
Symptoms of a Call Drop as Observed from the Traffic Statistics
On a commercial network, the call drop problem is observed from the traffic statistics and is reflected by the call drop rate and the call drop count. The traffic statistics exported from the M2000 show the following:
• Global call drop rate, call drop count, and number of successful service setups
• Call drop rate, call drop count, and time segments of the top cells
• Top cells that account for a high percentage of the call drops
Figures: High global call drop rate; Time segment of call drops.
Cause Analysis and Data Processing
Steps in Analyzing a Call Drop Problem (1/2)
Step 1: Determine the scope of the call drop problem: Analyze the traffic statistics and CHR to determine whether the problem is a top-cell or top-site problem, an entire-network problem, a comprehensive problem, or a top-terminal/top-UE problem.
Note: The analysis method varies with the scenario. If the performance degrades after an upgrade, compare the performance before and after the upgrade to determine the scope of the degradation. In an inventory optimization scenario, where the call drop performance is below expectation or needs to be improved, determine the region of performance degradation.
Step 2: Classify the causes of the call drop problem: Analyze the data sources to classify the causes of the call drop problem.
Steps in Analyzing a Call Drop Problem (2/2)
Step 3: Follow the checklist: Perform the actions in the checklist to determine the root cause and the closing action.
Note: The checklist is described in the next chapter.
Step 4: Close the problem: Close the problem and evaluate the result. If the result is unsatisfactory, repeat the preceding steps. If the closing actions are reproducible, consider applying them to the entire network.
Determining the Scope of a Call Drop Problem – Principles of Selecting Top Cells (1/2)
The principles of selecting top cells vary with the scenario.
Scenario 1: Performance degradation in the time dimension: The call drop performance degrades after an upgrade, or degrades suddenly for unknown reasons.
Principles of selecting top cells
• For each cell, calculate the difference of the counters (call drop rate and number of e-RAB abnormal releases) before and after the upgrade.
• Sort the cells by the difference of the call drop rate and by the difference of the number of e-RAB abnormal releases to obtain the top cells by degraded call drop rate and the top cells by number of e-RAB abnormal releases.
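The before/after comparison for Scenario 1 can be sketched as a simple ranking. The per-cell dictionaries and key names below are hypothetical; real input would come from the traffic statistics exported for the two periods.

```python
def top_degraded_cells(before, after, key, top_n=10):
    """Rank cells by how much a counter grew between two periods.

    before, after: dict mapping cell ID to a dict of counter values
    key: counter to compare, for example "abnorm" or "drop_rate"
    Only cells present in both periods are compared.
    """
    deltas = {cell: after[cell][key] - before[cell][key]
              for cell in after if cell in before}
    return sorted(deltas, key=deltas.get, reverse=True)[:top_n]
```

Running the function once with the call drop rate and once with the number of abnormal releases yields the two top-cell lists described above.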
Determining the Scope of a Call Drop Problem – Principles of Selecting Top Cells (2/2)
Scenario 2: Performance degradation in an inventory optimization: The call drop performance of the live network is below expectation and needs to be optimized to the target value.
Principles of selecting top cells
• Sort the cells by the call drop rate and by the number of e-RAB abnormal releases to obtain the top cells by call drop rate and the top cells by number of e-RAB abnormal releases.
Determining the Scope of a Call Drop Problem – Criteria (1/2)
Top-cell problem
After one-fifth of the top cells with a high call drop rate and a large number of e-RAB abnormal releases are removed from the calculation of the entire-network call drop performance, if the performance is significantly improved to the expected value, the call drop problem is defined as a top-cell problem.
Entire-network problem
After one-fifth of the top cells with a high call drop rate and a large number of e-RAB abnormal releases are removed from the calculation of the entire-network call drop performance, if the performance is not significantly improved, the call drop problem is defined as an entire-network problem.
Determining the Scope of a Call Drop Problem – Criteria (2/2)
Comprehensive problem
After one-fifth of the top cells with a high call drop rate and a large number of e-RAB abnormal releases are removed from the calculation of the entire-network call drop performance, if the call drop performance is improved a little to a value slightly below the expected value, the problem is defined as a comprehensive (top-cell plus entire-network) problem.
Top-UE problem
After one-fifth of the top UEs are removed from the calculation of the entire-network call drop performance, if the performance is significantly improved to the expected value, the problem is defined as a top-UE problem.
Note: Currently, the CHR of the LTE system provides no information about the terminal type. The terminal type is provided by complaining users or inferred from the symptoms. Due to security concerns, the eNodeB does not provide IMSI information. Therefore, top UEs can be inferred only from the TMSI, not from the IMSI.
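The cell-based criteria can be sketched as follows. Several points are assumptions made for illustration: cells are ranked by abnormal release count only, "reaches the expected value" is read as the recomputed rate being at or below target_rate, and near_margin is a hypothetical tolerance for "slightly below expectation". The dictionary keys are also hypothetical.

```python
def network_drop_rate(cells):
    """Aggregate call drop rate, in percent, over the given cells."""
    abnorm = sum(c["abnorm"] for c in cells)
    total = abnorm + sum(c["norm"] for c in cells)
    return abnorm * 100.0 / total if total else 0.0


def classify_scope(cells, target_rate, near_margin=0.1):
    """Classify the problem after removing the worst one-fifth of cells.

    cells: list of dicts with "abnorm" and "norm" release counts
    target_rate: expected call drop rate in percent
    near_margin: assumed tolerance, in percentage points, for a result
                 that only narrowly misses the target
    """
    ranked = sorted(cells, key=lambda c: c["abnorm"], reverse=True)
    remaining = ranked[len(ranked) // 5:]   # drop the top one-fifth
    rate = network_drop_rate(remaining)
    if rate <= target_rate:
        return "top-cell"
    if rate <= target_rate + near_margin:
        return "comprehensive"
    return "entire-network"
```

The same approach applies to the top-UE criterion if the input records are grouped by TMSI instead of by cell.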
Classifying the Causes of Call Drop Problems – Obtaining Data Source
After determining the scope of the call drop problem, analyze the following data sources to infer the causes of the problem:
• Traffic statistics
Traffic statistics can be obtained from the M2000/PRS. For details, see section 2.3.3 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
• Signaling trace on the network side
Signaling trace can be performed on the M2000. For details, see section 2.2.2 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
• Drive test data
The drive test data can be obtained by performing a drive test. For details, see section 2.1.3 of LTE Service Drop Troubleshooting and Optimization Guide.doc.
Classifying the Causes of Call Drop Problems – Acquiring Tools
The following table lists the available tools, their usage, and how to acquire them.

Tool Name: TraceViewer
Usage: Plays back signaling messages traced on the LMT.
Acquisition Method: Released together with the product version and integrated in the OfflineTool file package.

Tool Name: Probe
Usage: Installed on Huawei UE and traces signaling, scheduling, and signal quality information.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001099409&colID=ROOTENWEB|CO0000000174

Tool Name: Assistant
Usage: Installed on Huawei UE, counts and analyzes signaling, scheduling, and signal quality information.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001099389&colID=ROOTENWEB|CO0000000174

Tool Name: NIC
Usage: Batch data collection tool
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001468041&colID=ROOTENWEB|CO0000000174

Tool Name: PRS
Usage: Parses and analyzes traffic statistics of the eNodeB.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001430110&colID=ROOTWEB|CO0000000065

Tool Name: OMstar
Usage: Parses and analyzes original traffic statistics and CHR. Compares parameters.
Acquisition Method: http://support.huawei.com/support/pages/editionctrl/catalog/ShowVersionDetail.do?actionFlag=clickNode&node=000001470066&colID=ROOTENWEB|CO0000000174
Classifying the Causes of Call Drop Problems – Interfaces of the Tracing Tools
Figures: Signaling Trace Management interface of the M2000; Huawei UE Probe.
Classifying the Causes of Call Drop Problems – Interfaces of the Analysis Tools
Figures: Huawei UE Probe; eNodeB TrafficReview.
Classifying the Causes of Call Drop Problems – Identifying Reconfiguration Messages
Identifying the RRC CONNECTION RECONFIGURATION message
Start the Message Browser to view the details of the message.
• If the message contains the IE cqiReportConfig, the message is a CQI reconfiguration message.
• If the message contains the IE measConfig, the message is a measurement control message.
• If the message contains the IE targetPhysCellId, the message is a handover command.
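The identification rule can be sketched as a lookup on the IE names found in a decoded message. The function and its input format are hypothetical, and the check order (handover command first) is an assumption for the case where a message carries more than one of these IEs.

```python
def classify_rrc_reconfiguration(ie_names):
    """Classify an RRC CONNECTION RECONFIGURATION message by its IEs.

    ie_names: iterable of IE names present in the decoded message
    """
    ies = set(ie_names)
    # Assumed priority: a handover command may also carry measConfig.
    if "targetPhysCellId" in ies:
        return "handover command"
    if "measConfig" in ies:
        return "measurement control"
    if "cqiReportConfig" in ies:
        return "CQI reconfiguration"
    return "other reconfiguration"
```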
Analyzing Traffic Statistics to Obtain Causes of Call Drop Problems
• Trend analysis
Obtain the call drop KPI of the global network for at least one to two weeks or, if an upgrade has been performed, for two weeks before and one week after the upgrade. An example is shown in the upper right figure.
• Cause analysis
The counters indicate whether an abnormal release is caused by the Uu interface or by cell resource congestion, as shown in the lower left figure.
• Top analysis
Analysis of the traffic statistics can show the top cells and top time segments that have the highest numbers of RRC connection setup failures and e-RAB setup failures, as shown in the lower right figure.
Analyzing Signaling Trace to Obtain Causes of a Call Drop
The signaling trace clearly shows the signaling procedure that causes the call drop and is effective for diagnosing reproducible problems or problems found during a drive test. The disadvantages are that the trace must be started before the problem is triggered and that manual analysis is required. The signaling trace cannot be used for non-reproducible or low-probability problems.
• Standard interface trace (a major means): Obtain the top cells and top time segments by analyzing the traffic statistics, start the standard interface trace on the top cells during the top time segments, and check which signaling procedure causes the call drop.
• Single-UE global-network trace (a minor means): Query the IMSI corresponding to a TMSI from the EPC and start the global-network trace of this IMSI. This method is effective for ensuring VIP service.
Analyzing Drive Test Data to Obtain Causes of a Call Drop
The advantage of a drive test is that the downlink signal strength, uplink transmit power, bit error rate, and scheduling information can be obtained, depending on the drive test software and UE capability. The disadvantage is that only the signaling of the Uu interface (including the RRC and NAS messages) is traced. Therefore, it is desirable to combine a drive test with the signaling trace on the eNodeB.
• Determine whether a call drop is caused by an uplink or a downlink problem.
The drive test can show whether the UE or the eNodeB fails to receive a signaling message. The downlink RSRP/SINR obtained from the drive test indicates the downlink channel quality. The uplink transmit power indicates whether the uplink is insufficient.
• Determine whether a call drop is caused by the UE.
The UE log shows whether the UE correctly processes the received signaling messages and whether the UE suddenly stops sending data.
Checklist and Deliverables
Checklist for the Entire-Network Problem (1/2)

Standard Action: Preliminary analysis of traffic statistics
Analysis Action:
1. Analyze the traffic statistics to determine the range and cause of the call drop.
2. Analyze the trend of the call drop rate to determine how the call drop rate has changed.
Deliverables:
1. Distribution of the causes and top causes
2. Actions that affect the call drop rate
Closing Action:
1. Optimize the network according to the top causes of the call drop problem.
2. Describe the actions that affect the call drop rate and their impact.

Standard Action: Version check
Analysis Action:
1. Check whether the eNodeB version is upgraded or a new patch is installed.
2. Check whether the EPC version is upgraded or a new patch is installed.
Deliverables: New and old version numbers
Closing Action: Describe the changes that may affect the call drop rate based on the Release Notes.

Standard Action: Equipment and transmission alarms
Analysis Action:
1. Global alarm check
Deliverables: Critical and major alarms
Closing Action:
1. Analyze the impact of alarms on the call drop rate.
2. Clear the alarms and check whether the call drop KPI is restored.

Standard Action: Parameter configuration check
Analysis Action:
1. Global parameter configuration check
2. Inspection of EPC parameter change
Deliverables:
1. Difference of parameters before and after the upgrade
2. Difference of parameters compared with the baseline
3. Purpose and impact of the change of EPC parameters
Closing Action:
1. Determine whether the parameter change affects the call drop KPI.
2. Roll back the parameters and check whether the call drop KPI is restored.
Checklist for the Entire-Network Problem (2/2)

Standard Action: Operation record check
Analysis Action: Check whether batch operations affecting the global network are recorded and whether neighboring cells and PCI are re-planned.
Deliverables: Records of batch operations affecting the global network
Closing Action: Analyze the impact of batch operations on the call drop rate. Determine whether the batch operations can be rolled back.

Standard Action: Neighbor relationship check
Analysis Action: Check for missed configuration of neighbor relationship. Deployment of scattered sites causes incorrect neighbor relationship.
Deliverables: Missed configuration of neighbor relationship
Closing Action: Add neighboring cells that are not configured in the neighbor relationship. Check whether the call drop KPI is restored.

Standard Action: Major event check
Analysis Action: Check for allocation of a large quantity of phone numbers and major activity (such as ceremonies, holidays, and games).
Deliverables:
1. Check the terminal type involved in the number allocation, quantity of number allocation, and subscription policy.
2. Determine the range and time segment of the major event.
Closing Action: Check whether the major event coincides in time with the deterioration of the call drop rate.

Note: The standard actions for a comprehensive problem (entire-network plus top-cell problem) are a combination of the checklist for the entire-network problem and the checklist for the top-cell problem.
Checklist for the Top-cell Problem (1/2)

Standard Action: Preliminary analysis of traffic statistics of top sites
Analysis Action:
1. Analyze the traffic statistics to determine the range and cause of the call drop.
2. Analyze the trend of the call drop rate to determine how the call drop rate has changed.
Deliverables:
1. Distribution of the causes and top causes
2. Actions that affect the call drop rate
Closing Action:
1. Optimize the network according to the top causes of the call drop problem.
2. Describe the actions that affect the call drop rate and their impact.

Standard Action: Version check of top sites
Analysis Action: Check whether the eNodeB version is upgraded or a new patch is installed.
Deliverables: New and old version numbers
Closing Action: Describe the changes that may affect the call drop rate based on the Release Notes.

Standard Action: Equipment and transmission alarms of top sites
Analysis Action: Alarm check of top sites
Deliverables: Critical and major alarms
Closing Action: Analyze the impact of alarms on the call drop rate. Clear the alarms and check whether the call drop KPI is restored.

Standard Action: Parameter configuration check of top sites
Analysis Action: Parameter configuration check of top sites
Deliverables:
1. Difference of parameters before and after the upgrade
2. Difference of parameters compared with the baseline
Closing Action:
1. Determine whether the parameter change affects the call drop KPI.
2. Roll back the parameters and check whether the call drop KPI is restored.
Checklist for the Top-cell Problem (2/2)

Standard Action: Operation record check of top sites
Analysis Action: Check whether batch operations affecting the global network are recorded and whether neighboring cells and PCI are re-planned.
Deliverables: Records of batch operations affecting the global network
Closing Action: Analyze the impact of batch operations on the call drop rate. Determine whether the batch operations can be rolled back.

Standard Action: Neighbor relationship check of top cells
Analysis Action: Check for missed configuration of neighbor relationship. Scattered site deployment or network optimization leads to incorrect neighbor relationship.
Deliverables: Missed configuration of neighbor relationship
Closing Action: Add neighboring cells that are not configured in the neighbor relationship. Check whether the call drop KPI is restored.

Standard Action: Coverage check of top cells
Analysis Action: Analyze the MCS and CQI contained in the traffic statistics, CHR, and drive test data to check for coverage overlap or weak coverage of the top cells.
Deliverables: Coverage evaluation report of top cells
Closing Action: Perform network optimization to optimize the coverage.

Standard Action: Interference check of top cells
Analysis Action: Analyze the real-time trace data of the top cells to check for inter-modulation interference and external interference.
Deliverables: Interference evaluation report of top cells
Closing Action: Find and remove the interference.

Standard Action: Major event check
Analysis Action: Check for allocation of a large quantity of phone numbers and major activity (such as ceremonies, holidays, and games) in the vicinity of top cells.
Deliverables:
1. Check the terminal type involved in the number allocation, quantity of number allocation, and subscription policy.
2. Determine the range and time segment of the major event.
Closing Action: Check whether the major event coincides in time with the deterioration of the call drop rate.
Diagnosing Radio Problems
• Fault Description
If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.Radio, the abnormal release is caused by the Uu interface and occurs in a non-handover scenario.
• Possible Cause
The abnormal release is caused by weak coverage, uplink interference, or an abnormal UE, any of which can lead to the maximum number of RLC retransmissions being reached, loss of synchronization, or failure of signaling interactions. For details about diagnosing the interference problem, see LTE RF Channel Check and Troubleshooting Guide.
• Fault Handling Procedure
Analyze the CHR to check whether a few top UEs account for most of the abnormal releases.
Analyze the cause values recorded in the CHR.
If the call drop is caused by a factor other than the signaling procedures, analyze the DRB scheduling at layer 2 to determine whether the call drop is caused by weak coverage or interference.
If the call drop is caused by signaling procedures, observe the last ten signaling messages to determine the faulty signaling procedure. Determine whether the signaling procedure fails because the UE or the eNodeB does not receive or does not process the signaling messages.
The cause values recorded in the CHR are UEM_UECNT_REL_UE_RLC_UNRESTORE_IND, UEM_UECNT_REL_UE_RESYNC_TIMEROUT_REL_CAUSE, UEM_UECNT_REL_UE_RESYNC_DATA_IND_REL_CAUSE, UEM_UECNT_REL_UE_RLF_RECOVER_FAIL_REL_CAUSE, UEM_UECNT_REL_RRC_REEST_SRB1_FAIL, and UEM_UECNT_REL_RB_RECFG_FAIL_RRC_CONN_RECFG_CMP_FAIL.
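The first step of the procedure, checking whether a few UEs dominate the abnormal releases, can be sketched as a frequency count by TMSI. The record layout (a dict with a "tmsi" key) is hypothetical; actual CHR exports need to be parsed into this form first.

```python
from collections import Counter


def top_ues_by_abnormal_release(chr_records, top_n=3):
    """Return (tmsi, count, share) for the UEs with the most abnormal releases.

    chr_records: iterable of dicts, one per abnormal release, each with
    a "tmsi" key (hypothetical layout)
    """
    counts = Counter(record["tmsi"] for record in chr_records)
    total = sum(counts.values())
    return [(tmsi, n, n / total) for tmsi, n in counts.most_common(top_n)]
```

A single TMSI with a large share suggests a UE-specific problem rather than a cell-wide coverage or interference problem.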
Diagnosing Handover Failures
• Fault Description
If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.HOFailure, the abnormal release is caused by an outgoing handover failure.
• Fault Handling Procedure
Obtain the top cells that have the highest value of the counter L.E-RAB.AbnormRel.HOFailure, and analyze the pairs of source and target cells to obtain the top target cells that have the highest failure rate.
Analyze the CHR of the source and target cells to determine whether the handover failure is caused by failure to receive the handover command or by random access failure. Examples of the cause values are UEM_UECNT_REL_HO_OUT_X2_REL_BACK_FAIL and UEM_UECNT_REL_HO_OUT_S1_REL_BACK_FAIL.
Optimize the handover parameters and neighbor relationship and check whether the call drop KPI is improved.
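The source/target pair analysis can be sketched as a failure-rate ranking per target cell. The input tuples are a hypothetical summary of handover attempts, and min_attempts is an assumed threshold that keeps rarely used targets from dominating the ranking.

```python
from collections import defaultdict


def top_failing_targets(attempts, min_attempts=20, top_n=5):
    """Rank target cells by outgoing handover failure rate.

    attempts: iterable of (source_cell, target_cell, succeeded) tuples
    Returns a list of (target_cell, failure_rate) pairs, worst first.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for _source, target, succeeded in attempts:
        totals[target] += 1
        if not succeeded:
            failures[target] += 1
    rates = {target: failures[target] / totals[target]
             for target in totals if totals[target] >= min_attempts}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:top_n]
```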
Diagnosing the Transmission Network Problem
• Fault Description
If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.TNL, the abnormal release is caused by the transmission network.
• Possible Cause
This call drop is caused by abnormal transmission between the eNodeB and the MME, such as an S1 interface break.
• Fault Handling Procedure
Check for alarms about the transmission network. Clear the alarms and check whether the problem of abnormal release is solved.
Observe the M2000 and check whether alarms about the transmission network are recorded in the M2000.
Clear the alarms.
If abnormal releases are still recorded in the counter L.E-RAB.AbnormRel.TNL, collect the logs and submit them to R&D engineers for further analysis.
Diagnosing the Congestion Problem
• Fault Description
If the abnormal release is recorded in the counter L.E-RAB.AbnormRel.Cong, the call drop is caused by resource congestion.
• Possible Cause
This call drop is caused by radio resource congestion, such as exceeding the maximum number of users.
• Fault Handling Procedure
If long-term congestion of a top cell leads to call drops, a short-term solution is to enable the MLB algorithm or interoperation to reduce the load of the local cell.
The long-term solution is to expand the capacity.
Enable the MLB algorithm and check whether the congestion problem is alleviated.
Diagnosing MME Faults
• Fault Description
If an abnormal release is recorded in the counter L.E-RAB.AbnormRel.MME, the abnormal release is initiated by the EPC. However, this abnormal release is not recorded in the counter L.E-RAB.AbnormRel.
• Fault Handling Procedure
Analyze the information of the EPC.
The cause value recorded in the CHR is UEM_UECNT_REL_MME_CMD. Analyze the last ten signaling messages recorded in the CHR. If these messages show that the problem is not caused by the eNodeB, focus on analysis of the EPC.
Analyze the S1 interface trace of the top cells to obtain the distribution of the cause values.
Discuss the analysis result and signaling messages with the EPC engineers.
Deliverables
• Output of the activities in the checklist
• If the front-line engineers fail to solve a difficult problem, they collect the following information and submit it to R&D engineers for further analysis:
One-click log (Mandatory)
Standard interface signaling (Mandatory)
Signaling trace of the S1, X2, and Uu interfaces
Network configuration (Mandatory)
Logs of the LMPT and LBBP of the top cells
Topology information, engineering parameters, and configuration files of the top sites
TTI trace (Optional)
IFTS trace and cell trace. These traces generate a large amount of data, so only the data of the top cells and top time segments is collected.
Single-UE trace (Optional)
The single-UE trace is used for in-depth diagnosis of top UEs. The entire-network single-UE trace can be performed by using the IMSI that is queried from the EPC based on the TMSI.
Case Study
Case 1: RRC Reestablishment Failure of a UE
As shown in the upper right figure, the cause value of the abnormal release is RRC_REEST_SRB1_FAIL. As shown in the middle right figure, this problem occurs repeatedly from 11:51 o'clock to 18:49 o'clock in cell 0. As shown in the lower right figure, the TMSI column shows that this problem is contributed by a single UE whose TMSI is C2 B0 B0 40 and the cause value is "Reconfiguration Failure". As shown in the lower left figure, the message type indicates that this reconfiguration message is not a handover command or measurement control. This message is probably for reconfiguration of the CQI, SRS, or transmission mode (TM). Upon reception of the RRC CONN REESTAB message, the UE does not respond. Therefore, the eNodeB releases the UE in 5s.
Case 2: UE Exception
Analysis of the CHR shows that the cause value of the abnormal release is RLC_UNRESTORE_IND. This cause value indicates that the maximum number of DRB RLC retransmissions is exceeded.
This problem occurs repeatedly from 10:51 to 13:49 in cell 2.
The TMSI column indicates that a single UE, whose TMSI is C2 7F 20 56, accounts for the problem.
The DRB scheduling records for the last sixteen 64 ms periods show similar symptoms in each release: data transmission terminates suddenly shortly after the UE accesses the network. The duration from access to release ranges from tens of seconds to 2 minutes, indicating that the problem is not caused by a scripted test. The access type is MO-DATA, indicating that the user is performing a service.
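The per-UE check used in Cases 1 and 2 (does one TMSI account for most releases with a given cause?) can be sketched as follows. This is a minimal illustration only: the record layout, the field names, and the sample rows are assumptions made up for the example and do not reflect the actual CHR export format.

```python
from collections import Counter

# Hypothetical CHR rows: one dict per abnormal release.
# Field names and the second TMSI are invented for illustration.
sample_records = [
    {"cell": 2, "tmsi": "C2 7F 20 56", "cause": "RLC_UNRESTORE_IND"},
    {"cell": 2, "tmsi": "C2 7F 20 56", "cause": "RLC_UNRESTORE_IND"},
    {"cell": 2, "tmsi": "C2 7F 20 56", "cause": "RLC_UNRESTORE_IND"},
    {"cell": 2, "tmsi": "C2 7F 20 56", "cause": "RLC_UNRESTORE_IND"},
    {"cell": 2, "tmsi": "00 00 00 01", "cause": "RLC_UNRESTORE_IND"},
    {"cell": 2, "tmsi": "00 00 00 01", "cause": "RRC_REEST_SRB1_FAIL"},
]


def top_contributor(records, cause):
    """Return (tmsi, count, share) for the UE with the most releases for `cause`.

    Returns None if no record carries that cause value.
    """
    counts = Counter(r["tmsi"] for r in records if r["cause"] == cause)
    if not counts:
        return None
    tmsi, count = counts.most_common(1)[0]
    return tmsi, count, count / sum(counts.values())


print(top_contributor(sample_records, "RLC_UNRESTORE_IND"))
```

A share close to 1 for one TMSI suggests a UE-specific problem rather than a cell-wide one; the cut-off to use is a judgment call and is not defined in this guide.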
Case 3: Poor Uplink Quality
• As shown in the right figure, the uplink RSRP and SINR received by the eNodeB are poor from the last four 512 ms periods to the last sixteen 64 ms periods: The uplink RSRP is below –135 dBm and the SINR of the SRS and DMRS is below –3 dB, indicating that the service drop is caused by uplink weak coverage.
• As shown in the left figure, from the last four 512 ms periods to the last sixteen 64 ms periods, the uplink RSRP is about –130 dBm but the SINR of the uplink SRS and DMRS is below –3 dB, indicating that the service drop is due to weak coverage compounded by weak uplink interference.
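The two patterns in Case 3 can be separated with a simple threshold check. The sketch below uses the –135 dBm and –3 dB values quoted in the case; treating them as hard decision boundaries, and the function and label names, are assumptions for illustration rather than a Huawei-defined rule.

```python
# Thresholds quoted in Case 3. Using them as hard boundaries is an assumption.
RSRP_WEAK_DBM = -135.0
SINR_POOR_DB = -3.0


def classify_uplink(rsrp_dbm, sinr_db):
    """Classify one uplink sample from the CHR (hypothetical helper)."""
    if sinr_db >= SINR_POOR_DB:
        # SINR is acceptable, so uplink quality is not the obvious suspect.
        return "ok"
    if rsrp_dbm < RSRP_WEAK_DBM:
        # Very low received power and poor SINR: uplink weak coverage.
        return "weak_coverage"
    # Poor SINR although received power is above the weak-coverage threshold.
    return "interference_suspected"


print(classify_uplink(-137.0, -5.0))
print(classify_uplink(-130.0, -5.0))
```

In practice the check would be applied to every sample in the last four 512 ms and last sixteen 64 ms periods, not to a single value.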
Case 4: Target Cell Reconfiguration Failure
• Release cause TGT_ENB_RB_RECFG_FAIL is the cause value contained in the RB reconfiguration failure message during a handover. The symptom is that after the UE is successfully handed over to the target cell, the target eNodeB receives the PATH SWITCH REQ ACK message from the MME and, within 100 ms, sends the UE CONTEXT REL REQ message containing the cause value "unspecified". The lower left figure shows the last ten signaling messages.
• Fault diagnosis
During the handover procedure, the EPC delivers the PATH SWITCH REQ ACK message containing a downlink AMBR value that is inconsistent with the downlink AMBR contained in the S1/X2 handover request. Analysis shows that this is a defect of the RR module. The upper-layer control module of the RR module sends the AMBR Update message to the RB module, which determines that there is no need to deliver a reconfiguration message to the UE. Therefore, the RB module returns a null value to the upper-layer control module. However, the upper-layer control module regards this return value as an exception and releases the UE. This problem is solved in eRAN2.1SPC430.
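When checking a trace for the trigger condition of Case 4, the comparison itself is simple. The sketch below assumes that the downlink AMBR values have already been decoded from the two messages into plain integers (bit/s); the function name and the sample values are hypothetical.

```python
def dl_ambr_mismatch(handover_request_dl_ambr, path_switch_ack_dl_ambr):
    """Return True if the path switch acknowledgement carries a downlink AMBR
    that differs from the one received in the S1/X2 handover request.

    None means the acknowledgement did not carry an AMBR value (assumed here
    to be possible), in which case no mismatch is reported.
    """
    if path_switch_ack_dl_ambr is None:
        return False
    return path_switch_ack_dl_ambr != handover_request_dl_ambr


# Hypothetical values in bit/s, for illustration only.
print(dl_ambr_mismatch(100_000_000, 50_000_000))
print(dl_ambr_mismatch(100_000_000, 100_000_000))
```

A mismatch followed by a UE CONTEXT REL REQ within about 100 ms matches the symptom described in this case.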
Case 5: Service Drop Caused by Inter-RAT Redirection
• Release cause: Inter-RAT redirection
IRHO_REDIRECTION_TRIGER is the cause value of a release triggered by inter-RAT redirection. In eRAN2.1SPC400/SPH401, this cause value is counted as a call drop, as shown in the following figure.
This problem is solved in eRAN2.1SPC420, as shown in the right figure.
Case 6: Service Drop Caused by Abnormal Transmission
• On December 11, the service drop rate of the entire network deteriorated for the Tele2 900M, Telenor 900M, and Tele2 2.6G bands, as shown in the following figure.
• Huawei field engineers discussed the problem with the customer and suspected the EPC, but received no confirmation.
Case 7: Service Drop Caused by Abnormal Uu Interface
• Release cause
UE_RESYNC_TIMEROUT_REL_CAUSE indicates that the abnormal release is caused by expiry of the resynchronization timer. The same problem is recorded by the standard interface trace as "Radio Connection With UE Lost".
UE_RLC_UNRESTORE_IND indicates that the abnormal release is caused by a restoration failure after the maximum number of RLC retransmissions is exceeded. The same problem is recorded by the standard interface trace as "Radio resources not available".
UE_RESYNC_DATA_IND_REL_CAUSE indicates that the abnormal release is caused by a resynchronization triggered by data reported by L2. The same problem is recorded by the standard interface trace as "Unspecified".
• Cause analysis
The DRB scheduling information for the last four 512 ms periods and the last sixteen 64 ms periods shows that most abnormal releases follow suddenly terminated data transmission, possibly caused by unplugging the data card or by a UE fault. The following figure shows the CHR information.
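When correlating CHR records with the standard interface trace, the three release causes listed in Case 7 can be kept in a small lookup table. The sketch below contains only the pairs stated in this case; the dictionary and function names are illustrative, and other cause values are deliberately left unmapped.

```python
# Internal CHR release cause -> cause recorded by the standard interface trace.
# Only the three pairs stated in Case 7 are included.
CHR_TO_TRACE_CAUSE = {
    "UE_RESYNC_TIMEROUT_REL_CAUSE": "Radio Connection With UE Lost",
    "UE_RLC_UNRESTORE_IND": "Radio resources not available",
    "UE_RESYNC_DATA_IND_REL_CAUSE": "Unspecified",
}


def trace_cause_for(chr_cause):
    """Return the trace cause for a CHR cause, or None if it is not listed."""
    return CHR_TO_TRACE_CAUSE.get(chr_cause)


print(trace_cause_for("UE_RLC_UNRESTORE_IND"))
```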
Case 8: RRC Connection Reestablishment Failure
• Release cause ("Radio Connection With UE Lost" recorded in the standard interface trace)
RRC_REEST_SRB1_FAIL indicates a failure to restore SRB1 during RRC reestablishment.
The last 10 signaling messages, as shown in the following figure, indicate that after sending the RRC_CONN_REESTAB message, the eNodeB fails to receive the RRC_CONN_REESTAB_CMP message from the UE before the 5 s timer on the Uu interface expires.
The L2 scheduling information shows that the UE sends an ACK upon reception of the RRC_CONN_REESTAB message.
We suspect that the problem is caused by the failure of some UEs to send the RRC_CONN_REESTAB_CMP message. Some Samsung UEs have such a problem.
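The signature described in Case 8 (RRC_CONN_REESTAB sent, no RRC_CONN_REESTAB_CMP within 5 s) can be searched for in an exported message list. The following is a minimal sketch that assumes the trace has been reduced to time-ordered (timestamp in seconds, message name) pairs; that layout and the sample messages are assumptions, not the actual trace format.

```python
REESTAB_TIMEOUT_S = 5.0  # Uu timer value stated in Case 8


def reestab_timed_out(messages):
    """Return True if a reestablishment was sent and never completed in time.

    `messages` is a time-ordered list of (timestamp_s, name) tuples.
    """
    sent_at = None
    for timestamp_s, name in messages:
        if name == "RRC_CONN_REESTAB":
            sent_at = timestamp_s
        elif name == "RRC_CONN_REESTAB_CMP" and sent_at is not None:
            if timestamp_s - sent_at <= REESTAB_TIMEOUT_S:
                return False  # completed before the timer expired
    # True only if a reestablishment was sent without a timely completion.
    return sent_at is not None


# Hypothetical message list for illustration.
trace = [(0.0, "RRC_CONN_REESTAB_REQ"), (0.1, "RRC_CONN_REESTAB"), (5.1, "UE_CONTEXT_REL_REQ")]
print(reestab_timed_out(trace))
```

Grouping the flagged releases by TMSI or UE model then shows whether the failures concentrate on particular terminals, as suspected in this case.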