LTE RAN troubleshooting guideline
Contents
1 Revision History
2 Purpose
3 RBS Troubleshooting Tools
3.1 Tools overview
3.2 Element Manager
3.3 COLI
3.4 NCLI
3.5 Moshell
4 OSS-RC TOOLS
4.1 Overview of OSS-RC
4.2 CEX
4.3 NSA Cabinet Viewer
4.4 SMO
4.5 AMOS
4.6 Performance Management
5 LTE RBS STATUS
5.1 RBS status overview
5.2 O&M
5.3 Alarms
5.4 Node synchronization
5.5 MO status
5.6 Cell availability
5.7 S1-MME status
5.8 Traffic status
5.9 Node performance
5.10 Event history
6 Ericsson internal analysis tools
6.1 Internal tools overview
6.2 Teviewer/terouter
6.3 Viewer/router
6.4 Baseband tracing
6.5 TET.pl
6.6 Decode
6.7 Lteflowfox
6.8 LTELogTool
6.9 Cdae
6.10 Bbfilter
7 RBS observability
7.1 T&E monitoring
7.2 LTE exceptions
7.3 Data collect
7.4 RBS recovery
8 S1 connection Setup and Status
8.1 S1 overview
8.2 Check IP configuration
8.3 Check Connectivity from the LTE RBS
8.4 Check S1 RBS Attributes
8.5 S1 Setup Failures
8.6 Check SCTP Host info
Revision History
Revision   Date         Responsible   Comment
PA1        2013-06-26                 Preliminary version
Purpose
As a troubleshooter in cutting-edge LTE technology, you need to:
Be proficient with the LTE RBS tools
Be efficient in handling the LTE RBS
Be precise in data collection
Be confident in handling Ericsson's LTE RBS product
Enhance your customer support interaction
This document is written for:
Ericsson internal Service and Field Engineers working with the
LTE RAN
References to Ericsson internal tools will be made to show a
complete picture of the available tools used to troubleshoot the
RBS
RBS Troubleshooting Tools
1. Tools overview
Ericsson engineers use both official and unofficial tools to
perform their troubleshooting tasks. Unfortunately, no single
tool, official or unofficial, can handle all the observability
requirements of the LTE RBS, so troubleshooters need to be aware
of a number of tools to successfully support the node.
This chapter aims to show a complete set of tools used to
troubleshoot the LTE RBS. The use of unofficial or internal tools
is a complicated matter when the end customer is involved (or is
aware).
Internal tools are often unsupported by design and have little in
the way of management (updates, feature requirements, etc). End
customers are usually not allowed (or are restricted) to use
them. This creates problems when these tools offer better
observability than the officially supported ones.
Moshell is an example of an internal tool that was so useful it
found its way into an official product as a feature (namely OSS-
RC). The time for this to occur was lengthy, and many customers
did not have access to this powerful tool to support their
products.
Our support tools could be logically split into local (on-site)
and remote domains (example tools, some omitted). Connectivity in
the LTE RAN is generally provided by IP O&M interfaces.
By local domain we usually mean:
Tools can be connected directly to the site LAN of the node
We have direct access to the nodes interfaces (like the serial
port)
By remote domain we usually mean:
Connection to the node is performed via the OSS-RC COMINF (or a
server connected to the COMINF) with direct O&M connectivity to
the node
Note that some tools fit both categories and can be used locally
and remotely (e.g. Moshell, COLI, NCLI). Most tools are IP based
hence the minimum requirement is O&M connectivity (via the COMINF
cloud shown).
The Tools Server shown here is usually installed to provide
additional essential support or troubleshooting capabilities.
Such a server (or servers) is common in most WCDMA networks. It
is not an officially supported node/product delivered by
Ericsson, so configurations vary.
Local tools:
EM – Element Manager
NCLI – Node Command Line Interface
COLI – Command Line Interface
Remote tools:
ITK – ISP Tool Kit
AMOS – Advanced MO Shell
CEX – Common Explorer
SMO – Software Management Organizer
CPP Management Protocols:
The Management Protocols offer the operator methods to connect to
the node (for log collection, alarm handling, configuration
tasks, licenses, etc). The CPP platform provides both insecure
(e.g. Telnet) and secure (e.g. SSH) connection protocols,
depending on the operator's node security policies.
Most tools we use to troubleshoot the node require access to the
Service Layer via IIOP (e.g. Moshell, EM, etc).
CPP Management Architecture:
The CPP Management Arch uses a client/server model, where a
higher layer is a client to a lower layer. This means that each
layer receives orders from a higher layer and either handles
these locally or translates these into orders towards the next
lower layer.
The management structure is divided into four layers:
Presentation Layer – handles the representation of the underlying
MOM and the related operator interaction. Provides the
appropriate user interaction depending on the managing system.
This is where some of our troubleshooting tools will interact.
Service Layer – Provides generic access to the Management
Adaptation Layer (via IIOP communication – a protocol in CORBA).
Management Adaptation Layer – Abstracts the lower
hardware/software layer implementation to provide a level
suitable for managing systems.
Resource Layer – Handles the traffic related functionality (it
contains the actual managed resources in terms of hardware and
software).
The CPP management structure:
MS – Management Services terminate the management interfaces on
the node. MS include Configuration Service, Alarm Service, etc.
MAO – The task of a Management Adaptation Object is to adapt the
managed object specification by which a managing system sees the
actual resource(s) to a given resource interface. One MAO can
have relationships with one or more FROs.
FRO – As seen from the Management Adaptation Layer, the FROs
provide a managed resource view of the Resource Layer. FRO(s)
hide the behaviour of RO(s). The purpose of the FRO is to act as
an interface between the MAO and the RO, by handling the
configuration transactions and storing configuration data for the
RO.
RO – A Resource Object (RO) is the smallest software entity that
represents a logical or physical system resource. In the Resource
Layer the RO instances interact with each other to perform the
actual task(s) of the system.
MO – A Managed Object is a way of modelling the CPP node. An MO
is implemented by a hierarchy of MAO(s), FRO(s) and RO(s). All
MAOs together comprise the MOM (Managed Object Model) that
defines and communicates the manageable interface of the node.
The MOM is delivered with the RBS and is used by several of the
tools we will mention shortly.
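The MO/MAO/FRO/RO layering above can be illustrated, purely conceptually, with a small tree model. This is not Ericsson code, just a sketch of how an MO hierarchy with attributes (create/get/set) behaves, mirroring the way management tools navigate the MOM:

```python
# Conceptual model of an MO tree (illustrative only, not Ericsson code):
# each MO has an identity such as "ENodeBFunction=1", a set of
# attributes, and children, mirroring how tools navigate the MOM.
class MO:
    def __init__(self, rdn, **attrs):
        self.rdn = rdn              # relative distinguished name
        self.attrs = dict(attrs)
        self.children = []

    def create(self, rdn, **attrs):
        # Create a child MO under this MO.
        child = MO(rdn, **attrs)
        self.children.append(child)
        return child

    def get(self, name):
        # Read an attribute value.
        return self.attrs[name]

    def set(self, name, value):
        # Write an attribute value.
        self.attrs[name] = value

root = MO("ManagedElement=1", userLabel="LTE01")
enb = root.create("ENodeBFunction=1", eNBId=101)
enb.set("userLabel", "site-A")
print(enb.get("eNBId"), [c.rdn for c in root.children])
```

Tools such as EM, NCLI and Moshell all operate on this kind of containment hierarchy, which is why their command sets (get, set, create, action) look so similar.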
1 Element Manager
------GUI for managing the LTE RBS
The Element Manager is used to interact with the system's MOs.
The EM application delivered with the RBS runs on the Microsoft
Windows OS and multiple browsers.
OSS-RC can also launch the EM.
The Element Manager (EM) is a Java based GUI used to manage RBS
nodes through their Managed Objects (MOs).
The EM provides the following functionality:
Provides multiple views organised by area (radio, software, etc)
Presents detailed information about the MO structure of a node
Allows the creation and deletion of MOs, setting of attributes
and calling of actions on MOs
Provides MO/attribute search capabilities
Provides detailed alarm and event data
Provides a graphical method of troubleshooting the LTE RBS (based
on views).
Connects to a single LTE RBS.
To initially download the LTE RBS EM for installation, use the
link:
http:///em/index.html
Once installed, the EM is launched via the Windows Start Menu as
shown below:
The MO structure is visible under the MO Tree section:
The Radio Network view is selected in the drop down menu.
The MOs with operationalState, availabilityStatus and
administrativeState attributes are displayed automatically.
2 COLI
------Board level command line shell
COLI is a fundamental tool in supporting CPP based nodes. Its
commands operate on other Ericsson products including the WCDMA
RBS, WCDMA RNC and MGW (all based on CPP). COLI is used, in many
cases, to provide non-graphical access to configuration, alarm
and status commands. Other tools, like AMOS/Moshell, make
substantial use of COLI commands in their command sets.
The RUL and Baseband module in the LTE RBS provide COLI because
they also implement OSE (the Enea operating system). COLI is a
board level command line shell provided as part of the CPP
platform. The application (LTE) can extend the CPP COLI commands
with application level commands. COLI connects to a node via
serial (local console) or by using an IP client application
(Telnet, SSH).
COLI command:
COLI provide access to some fundamental node troubleshooting
commands:
Trace and Error log output and commands (te log read…)
IP configuration and support commands (ping, ifconfig, …)
Hardware/software status commands (vii, pm_lminfo, …)
Some example LTE RBS COLI commands include:
ue - used for UE selective tracing
hicap - provides baseband traces via TN
mtd - access to the MTD baseband trace function
3 NCLI
------Board level command line shell with access to MOs
Another command line shell provided as part of the CPP platform,
it interacts with the node's MOs. The Node Command Line Interface
(NCLI) is another command shell environment provided by CPP (via
the COLI). NCLI is used to manage and configure nodes, as it can
directly manipulate MOs in the system (like EM). It allows
operators to script up manual procedures that would otherwise be
lengthy to perform manually. It can also aid in Integration and
Configuration tasks, as NCLI takes command file input.
NOTE: The EM, COLI and NCLI are all delivered with the LTE RBS.
The operator tab-completes the Equipment MO:
The NCLI commands are mostly related to MO interaction and
navigation. Bold commands are direct MO interaction commands.
Following are some examples:
action    Execute an action on a Managed Object
alarms    Print active alarm list
create    Create Managed Object
delete    Delete Managed Object
rm        alias for delete
exit      Terminates an NCLI session
get       Get attributes and children of a particular Managed Object
ls        alias for get
group     Manage group of Managed Objects
history   Display command history
info      Print information from the Managed Object Model (MOM)
jump      Change working Managed Object
cd        alias for jump
man       Display manual pages
help      alias for man
pwldn     Print working Managed Object
pwd       alias for pwldn
search    Search for Managed Objects
set       Set Managed Object attribute values
tx        Transaction management
diff      Compare Managed Objects
By default, the NCLI is not aware of the MOM structure, which
forces commands such as 'get . userLabel', so the operator needs
a copy of the MOM for reference. It is possible to make NCLI
MOM-aware, however. An EKB KO describes how to activate this
functionality (which can be used for the LTE RBS). Once
activated, a plain 'get' on an MO returns all attribute values
(with tab completion supported) as well as printing the MO
children.
To call the manualRestart action on ManagedElement MO:
Get the parameter eNBId in the ENodeBFunction MO:
To search for Upgrade packages that are only deleteable:
To search for LoadModule MOs that match a name:
To print the active alarms on a node:
To set then get an MO's userLabel parameter:
4 Moshell
------Command line tool for LTE RBS management
An external command line tool that interacts with the MOs and CPP
services of a node. It provides tools/commands that interact with
the node on multiple levels (such as the file service, web
service, COLI, etc). Additional features like scripting, KPI
analysis, etc, make this a powerful support tool.
Moshell is an Ericsson internally developed tool that originates
from support of the WCDMA product line. It has expanded to
support multiple CPP based nodes like the MGW, WCDMA RNC & RBS
and now the LTE RBS. Moshell is an advanced command line node
management tool that provides access to:
Configuration Service
Performance Management Service
Log Service
Alarm Service
COLI and File transfer
Moshell commands can be divided into the following categories:
Basic MO commands: Load, Get, Set, Delete, Create, Action, MOM, etc
Other MO commands: SW/HW inventory, DCG, fetch/process node logs, etc
Performance Management commands: Read counter MOM, suspend/resume/delete/set scanner, etc
Other commands: Scripting, file transfer, target monitor, file tree, etc
Opening and quitting Moshell towards a node:
Connecting Moshell to multiple nodes:
Viewing and querying the MOM:
Load and Print MOs:
Get, Set and Action MOs and their attributes:
Moshell has several commands that fall into an "other" category
like:
wait / print / return / if / for / else : Moshell scripting
ftget / ftput / ftdel : FTP/SFTP file transfers
pgu : Program upgrade (loading black load modules)
mon : Setting up the Target Monitor
sql : Start/stop/check the SQL client on the node
bo / ba / br / be / bp : Manage board groups
ftree : Recursive listing of a node directory
OSS-RC TOOLS
1 Overview of OSS-RC
The OSS tools allow us to manage nodes remotely on a network
level. It is important to note that many tools have
functionality/commands that overlap (displaying alarms,
performing restart actions, etc). We usually also find additional
servers connected to the COMINF with other Ericsson tools/scripts
to aid in troubleshooting.
These OSS-RC tools can be used effectively to collect data from
LTE RBS nodes.
The following chapter presents the OSS-RC tools:
2 CEX
----- Operation, Maintenance and Network Status
CEX can be used for: Network Topology and Configuration Viewing;
Updating Network Topology; Launching Applications; Network Status
Monitoring and Troubleshooting.
CEX works on a network level and allows easy LTE RBS node
management. CEX, in general (for all RATs), is used for:
1. Network Topology and Configuration Viewing
Detailed information relating to node configuration
2. Updating Network Topology
Access to configuration management tools (e.g. BSIM)
3. Launching Applications
Other LTE RAN support tools can be launched from CEX (EM, SMO,
AMOS, etc)
4. Network Status Monitoring and Troubleshooting
Provides node health (cell status, synchronisation status, etc)
and performance indications
3 NSA Cabinet Viewer
----- Hardware fault analysis
Node Status Analyzer (NSA) is a radio network troubleshooting
tool (used mainly for WCDMA networks). NSA does not offer full
support for the LTE RAN; only the Cabinet Viewer application is
useful for LTE RAN. NSA can be launched from CEX. The
collectTraces action will be described in detail in the Data
Collection & Node Recovery section.
From the Cabinet Viewer a user can:
1. Restart an RBS
2. Restart an object (board, radio hw, etc)
3. Lock or unlock an object (board, etc)
4. Test an object (board, etc)
5. Save logs (T&E, alarm, event, availability)
6. Collect traces (calls action collectTraces)
4 SMO
----- Software and Hardware handling
One key aspect of SMO is that it works towards multiple network
elements. The tool is designed to handle CPP based nodes
(including the LTE RBS). Some of SMO's main uses for the LTE RAN
include network upgrades and configuration version management.
From a troubleshooting aspect, SMO can be used to restore
previously backed up CVs, roll back nodes to previous software
levels, restart multiple nodes, etc.
SMO provides:
1. SW Inventory - listing LMs per Upgrade Package
2. License Inventory - listing license key information (individual
node licenses can be viewed). This includes import and
installation of license key files.
3. Hardware data - information on the installed hardware (subracks,
boards, fans, etc)
4. Software distribution - downloading of upgrade package software
or other software/files
5. Remote software upgrade - upgrading nodes centrally from OSS-RC
6. Monitor of jobs - any job/activity can be monitored, for example
remote software upgrades
7. Node backup administration - creating node CVs, uploading CVs,
etc
5 AMOS
----- Command line tool for LTE RBS management
AMOS is a licensed OSS-RC feature. Unlike Moshell, AMOS does not
require a separate workstation with COMINF connectivity, and no
installation restrictions exist (it is delivered intact).
To start AMOS, type moshell in an OSS-RC terminal shell. AMOS can
also be launched from CEX.
6 Performance Management
----- CELL and UE Trace features
WCDMA makes use of the UETR, CTR and GPEH functions that are
initiated from OSS-RC for performance investigations. LTE also
has similar Performance Management Recordings that include:
1. UE Trace (similar to UETR)
2. CELL Trace (similar to both CTR and GPEH)
These functions log events and radio environment measurements
that are selected by the operator. With UE Trace, the operator
selects one UE to monitor; with Cell Trace, the RBS records all,
or a subset, of the UEs. All data is stored in 15 minute
recording periods to file(s).
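Since all recordings are written in 15 minute periods, a given timestamp maps to exactly one recording period. The boundary arithmetic can be sketched as follows (illustrative only, not the actual file-naming logic):

```python
from datetime import datetime, timedelta

# Recording output periods are 15 minutes long; given an event
# timestamp, compute the period it falls into.  Illustrative
# arithmetic only, not the actual recording-file naming scheme.
def rop_bounds(ts):
    start = ts.replace(minute=(ts.minute // 15) * 15,
                       second=0, microsecond=0)
    return start, start + timedelta(minutes=15)

start, end = rop_bounds(datetime(2013, 6, 26, 10, 37, 42))
print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))  # 10:30 - 10:45
```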
Internal events are generated by the RBS. They include: Events,
procedures and periodic reports on RBS, Cell and UE level.
Example:
INTERNAL_PROC_DEFAULT_EPS_BEARER_SETUP
INTERNAL_PER_CAP_LICENSE_UTIL_REP
External events are events external to the RBS. They include:
Layer 3 protocol messages in ASN.1 format. Example:
RRC_RRC_CONNECTION_RECONFIGURATION
S1_INITIAL_CONTEXT_SETUP_RESPONSE
Cell Trace is a powerful troubleshooting tool for analysing
faults on a cell or RBS.
Cell Trace is used for:
1. Analysing performance degradation
2. Traffic/Call scenario troubleshooting
3. Detailed performance monitoring
4. Support of network planning, optimisation and expansion
activities
UE Trace is used for individual UEs. UE Trace data is stored and
passed on to the target RBS in the case of X2 handover (shown in
detail later). 16 simultaneous UE Trace profiles per RBS are
supported. UE Trace activation is similar to Cell Trace
activation and won't be shown.
It provides support in:
1. Optimisation activities (like handover analysis)
2. Troubleshoot problem areas (drops, etc)
3. Site acceptance during rollout
LTE RBS STATUS
1 RBS status overview
One important first step in any troubleshooting activity is to
assess the current and past status of the node. As indicated
before, the location of the operator usually determines the tool-
set available to them. If we are investigating remotely or via
the OSS-RC, then tools such as AMOS, CEX, decoding scripts and
the EM could be used. If we are investigating locally, then local
tools such as a Serial terminal client, Moshell and EM could be
used.
The aspect of status (health, alarms, events, error, etc) can
easily be determined using the previously presented tools. In
emergency situations, tools like AMOS/Moshell provide the fastest
methods to collect complete logs and perform detailed
troubleshooting (GUI based tools increase support times).
When a problem arises, we usually look at different aspects of
RBS status. Investigations occur in an escalating order of
interest.
Some of the RBS status we investigate includes:
O&M
Can we connect to the node (using Telnet, ssh,
other)? Is the node reachable?
Alarms
Are there critical alarms? What are they related
to (cell availability, disabled devices, etc)
Node Synchronization
MO Status
Are there disabled MOs? Are there manually
locked MOs? Is an MO dependency failed?
Cell availability
Is the site transmitting (on air)?
S1-MME Status
Is the RBS-Core interface (S1) up?
Traffic Status
Is the RBS handling users (traffic)?
Node Performance
Is the node performing as expected (KPI)?
Event History
Is there a history of re-occurring
events/alarms?
2 O&M
The LTE RBS provides three levels of O&M security access. Each
level defines what access is granted and how. For example, only
SSH connections for COLI are supported at Level 3 security.
In this example, we note that the Telnet and FTP servers
(insecure) are ON (Level 1 security). However, the Debug Server
(used for the Target Monitor, described later) is OFF. This can
be altered on Level 1 with the secmode command. The Security for
O&M Node Access Description (1551-CXA 110 3235) provides detailed
information on the LTE RBS O&M security access.
The following details node level O&M connectivity commands.
1. For O&M connectivity, we use the acl command on MO IpOam=1. Then
we see two actions available, ping (test layer 3 IP connectivity)
and traceRoute (show the routes taken by packets across an IP
network).
2. COLI also provides the same two network tools.
3. The ping action is called on IpOam=1. The result is returned
immediately to the user (no RTT is provided in either the MO or
the COLI command). This is not so critical from an O&M
perspective.
The COLI command dumpcap can be used to monitor the O&M interface
(uses libpcap format):
$ dumpcap -i any -o
The output is stored under: /d/logfiles/sniffer/default.
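Because dumpcap writes standard libpcap files, the captures can be post-processed off-node with any pcap-aware tool. As an illustrative sketch (not part of the RBS toolset), the 24-byte libpcap global header can be parsed like this:

```python
import struct

# Parse the 24-byte libpcap global header: magic, major/minor
# version, timezone offset, sigfigs, snaplen, link-layer type.
def read_pcap_header(data):
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":      # little-endian capture
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":    # big-endian capture
        endian = ">"
    else:
        raise ValueError("not a libpcap capture file")
    vmaj, vmin, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24])
    return {"version": (vmaj, vmin), "snaplen": snaplen,
            "linktype": linktype}

# Example: a header as a capture tool would write it
# (format v2.4, snaplen 65535, link type 1 = Ethernet).
hdr = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(read_pcap_header(hdr))
```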
3 Alarms
The alarm status provides a real time view of the alarms
generated by the node (MOs). Most tools (EM, NCLI, Moshell, CEX)
provide a way of viewing node alarms. OSS-RC also provides the
Alarm List Viewer that can work across the network.
EM and Moshell provide node level alarms:
Moshell's Multinode mode also provides alarm status:
In the OSS-RC the Alarm List Viewer application provides detailed
network alarm status. Alarms can be acknowledged (ceased),
emailed, saved and filtered in this tool.
4 Node synchronization
RBS synchronization (via a clock source) is essential for the
node to provide its services (radio interface frames). 3 types of
RBS synchronisation are supported:
1. GPS
2. NTP (SoIP - Synchronization over IP)
3. SASE (Stand-Alone Synchronization Equipment)
For SoIP, the transport network design must ensure the NTP
packets are adequately handled (QoS).
5 MO status
Some critical MOs in the system have additional status
information (attributes that model the "health" of the MO).
Disabled MOs could indicate problems in the node. For an MO in a
bad state (like administrativeState unlocked and operationalState
disabled), an alarm would generally be raised. Moshell and the EM
can be used to quickly determine MO health status.
Some MO operational states have a dependency on other MOs (seen
in the availabilityStatus attribute).
The EM contains an Attention view that summarises some failed
states (unlocked and disabled MOs or MOs with a failed flag – red
board LED lit).
Moshell provides a fast summary of the MO status (cell dependent
devices are shown):
6 Cell availability
To quickly assess, the following tools are recommended:
1. CEX, NSD Status View
2. EM, Radio Network View
3. Moshell, st command
OSS-RC CEX Network Status perspective Status view:
In the EM, we determine Cell status by using the Radio Network
View:
Moshell can be used in Multi-node mode to provide detailed cell
status for multiple sites. If the network or cluster is defined
in a list, this can be performed quite efficiently when
troubleshooting:
7 S1-MME status
The S1 interface is logically split into the S1-MME (MME) and S1-
UP (S-GW). The S1-MME status is available via the TermPointToMme
MO. To verify that the S1 connection is up:
1. CEX NSD cell status will show disabled if the TermPointToMme MO
is disabled
2. EM Radio Network View provides the TermPointToMme status
3. Moshell can be used in single or Multinode mode to determine the
status of the TermPointToMme MO(s)
8 Traffic status
This would seem simple to determine, but most tools show very
little information on current traffic status. The LTE RBS
specific COLI command, 'ue', is one dynamic printout that can
show this information.
Moshell ordering the COLI ue command to determine number of
connected users:
9 Node performance
The performance counters provide additional observability
including call handling (successful setups, releases, etc) and
other node internal info (licensed limitations, hardware
limitations, etc). OSS-RC's CEX NSD feature should provide KPI
data (currently seems to only show WCDMA performance).
OSS-RC also offers ENIQ.
Moshell provides powerful performance counter analysis via the PM
commands, which allow direct analysis of the node's counter files
(or offline files); pmr provides pre-defined KPI reports. These
commands can also present PDU defined KPIs on a node level (e.g.
Mobility, Accessibility, Retainability, etc).
COLI will also support dynamic counter printouts. The LTE RBS
will introduce the COLI command getstat to provide dynamic peg
counter information. The command will support resetting,
filtering counters and per cell counter printouts.
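KPI reports such as Accessibility are essentially ratios over the raw peg counters collected per recording period. The following sketch is illustrative only; the counter names pmRrcConnEstabAtt and pmRrcConnEstabSucc are assumptions for the example, not taken from the node MOM:

```python
# Illustrative only: KPI aggregation over per-period peg counters.
# The counter names below are assumptions for the sketch, not
# confirmed names from the node MOM.
rops = [
    {"pmRrcConnEstabAtt": 120, "pmRrcConnEstabSucc": 118},
    {"pmRrcConnEstabAtt": 95,  "pmRrcConnEstabSucc": 92},
]

def accessibility(periods):
    # Sum attempts and successes over all periods, then form the
    # success-rate percentage (0.0 when there were no attempts).
    att = sum(p["pmRrcConnEstabAtt"] for p in periods)
    succ = sum(p["pmRrcConnEstabSucc"] for p in periods)
    return 100.0 * succ / att if att else 0.0

print(f"RRC setup success rate: {accessibility(rops):.2f}%")
```

Tools like Moshell's pmr do this kind of aggregation over many counters at once, according to pre-defined KPI formulas.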
10 Event history
Often the current status of the node is dependent on past events
(board restarts, application crashes, manual O&M actions, etc).
The LTE RBS provides system logs that can be used to assess the
impact of events on the node status.
The following logs are of importance here:
1. System Log – Node/board/program restarts.
2. Alarm Log – Alarms that were raised and ceased.
3. Event Log – MO events.
4. Availability Log – Node/board/program restarts plus cell/node
availability.
5. Security Event Log – O&M connection setups (socket info)
6. Shell Audit Trail Log – COLI commands (with user/IP info).
7. Corba Audit Trail Log – MO write commands (set, action, create,
delete).
8. Exception Logs – Unexpected application events (call drops, etc).
9. GCPU/ULMA Core dumps – Baseband crash logs.
10. Post Mortem Dumps (PMDs) – MP crash logs.
All of these logs are found under /c/logfiles/ except for PMDs
which are found under /c/pmd/
CEX's NSD Logs View collects and parses node logs:
Moshell provides the lg command that automatically collects and
parses most of the logs mentioned:
As shown there are usually many ways to obtain node status
information using the presented tools. However, for deeper
analysis we must use internal tools and scripts.
Ericsson internal analysis tools
1 Internal tools overview
Ericsson staff make use of many internal tools and scripts to
analyse LTE RBS data (traces, logs, dumps, etc). These tools and
scripts provide flexibility over the official, customer centred,
support tools (e.g. EM, CEX, etc).
The internal tools and scripts are designed to meet some common
goals:
1. To enhance the LTE RBS observability for Ericsson internal
engineers required to perform I&V, support or troubleshooting
tasks.
2. To help provide fast and efficient support of the LTE RBS.
The following picture gives an overview of the internal tools
environment:
The following are the primary analysis tools:
1. teviewer/terouter
-----Captures T&E and HiCap trace messages
2. viewer/router
-----Captures T&E and HiCap trace messages
-----Can capture more traces without overflow than
teviewer/terouter
3. TET.pl
-----Translates binary baseband messages to trace
and error messages
4. decode
-----Supports TET.pl with additional decoding
capabilities
5. ltedecoder
-----Decodes L3 ASN.1 messages and event data
6. lteflowfox
-----Produces visual flow of ASN.1 messages (i.e.
S1, X2, etc)
7. LTELogTool
-----Multifunctional LTE observability tool
8. Cdae
-----Decodes baseband dsp crash dumps
9. bbfilter
-----Formats baseband traces into troubleshooting
friendly format
2 Teviewer/terouter
One of CPP's limitations is the circular (wrap-around) T&E log,
which cannot capture all trace messages. Capturing trace logs
directly on the node using the T&E log is thus insufficient for
our data collection needs. Hence, the LTE RBS provides methods to
forward the trace and error messages to a remote workstation (a
client listening for the trace messages). The methods to forward
T&E messages include the Target Monitor and HiCap.
To remotely monitor node T&E and HiCap trace messages we use a
handler (terouter) and a viewer (teviewer) installed on a remote
workstation(s):
Terouter:
1. Provides a function to receive T&E messages from multiple LTE RBS
nodes
2. Forwards traces to subscribing log clients (teviewer, other)
3. Nodes use Target Monitor or HiCap to specify the terouter
workstation IP address (UDP payload).
teviewer
1. Connects to a terouter process and converts received binary data
into text display messages for one or multiple nodes.
2. The teviewer workstation can be different to the terouter
workstation, but default is localhost (i.e. --router=127.0.0.1).
Example teviewer usage:
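The forwarding concept (a node streaming trace records as UDP payloads to a listening workstation) can be sketched in a few lines of Python. This is illustrative only; the real terouter uses its own binary record format and fans traces out to subscribing viewers:

```python
import socket

# Illustrative sketch of the terouter idea: trace records arrive
# as UDP datagrams at a workstation that collects them.  The real
# terouter has its own record format and forwards to subscribers.
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
collector.settimeout(5)
port = collector.getsockname()[1]

# Simulated node side: Target Monitor / HiCap sends trace records
# as UDP payloads to the configured workstation address.
node = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
node.sendto(b"T&E: proc=demo trace1 hello", ("127.0.0.1", port))

data, _addr = collector.recvfrom(65535)
print(data.decode())
node.close()
collector.close()
```

A viewer process (like teviewer) would then subscribe to the collector and turn the binary records into readable text.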
3 Viewer/router
Later versions of terouter and teviewer have been released,
called router and viewer.
1. terouter and teviewer can only handle a maximum of 2 trace
events per TTI
2. Maximum of 2000 events per second without overflow
3. The new viewer and router can handle many more events per
second without overflow
4. Exactly the same COLI commands as with terouter/teviewer
4 Baseband tracing
Baseband traces are output in a binary format from the RBS. The
format of the baseband traces correlates directly to the software
version on the RBS. Files enabling correct decoding are stored on
the RBS, along with additional files that allow further
decoding/formatting.
All files required for correct baseband decoding can be fetched
using the Moshell script fetchLteTraceConversionFiles.mos. All
files will be stored in the current directory.
5 TET.pl
Baseband traces are provided in a binary format to reduce the
message size (and allow for more traces). TET.pl (Trace and Error
Translator) allows us to translate the binary baseband output
into readable T&E messages. The lteRbsBbTraceTranslation.xml file
related to the SW package must be used.
Example usage via Moshell:
6 Decode
To support decoding of additional baseband signals (LPP messages,
MAC/RLC/PDCP PDUs and NC messages), we use the decode tool. Note
that TET.pl must be run on the log/input first.
Full baseband observability is provided with TET.pl and decode.
decode requires the following additional config files related to
the SW package:
1. ifModelLm.xml
2. lteRbsBbMtdInfoDul3.xml
The directory path to the config files can be passed to decode.
ltedecoder
To be able to decode the L3 ASN.1 LTE messages (e.g. RRC, S1, X2)
and binary stored event files (exceptions, events, UE/Cell trace)
we use the ltedecoder tool.
ltedecoder is a Java application that can be downloaded together
with lteflowfox (shown later). Layer 3 traces must be activated
for decoding.
Trace log not already decoded by TET.pl:
Trace log already decoded by TET.pl:
Example decoding for a Cell Trace event file:
Example decoding for ASN.1 messages:
7 Lteflowfox
Decoded L3 ASN.1 messages are sometimes large and difficult to
read using text editors alone (for some messages we only require
a subset of the information elements).
lteflowfox provides a way to produce a flow (text or HTML) of the
decoded messages with pre-selected content.
An online user manual is provided for reference.
Following is an example text based flow output:
8 LTELogTool
LTELogTool is an Ericsson open source, multi-platform protocol
analyzer framework developed in Java. It can be used to connect
directly to RBSs to perform real time analysis (like teviewer).
It supports multiple file types:
PCAP, NethawkM5, binary, CPP T&E, etc
It supports IP protocols:
IP, UDP, TCP, SCTP, RTP, etc.
And it supports LTE protocols:
RRC, S1AP, X2AP, NAS, baseband text, baseband signals, etc
Example of ASN.1 L3 message decoding using LTELogTool:
9 Cdae
Core dumps from the baseband module are important for LTE
troubleshooting. The dumps provide very detailed information
about baseband SW errors. Cdae (Core Dump Analyser Extended) is
able to unpack/decode the contents of a baseband dump. The dumps
are visible via T&E messages and are stored under
/c/logfiles/dspdumps/00_01/.
To unpack/decode crash dumps:
To recursively unpack/decode all files in a directory:
Baseband crashes can be seen on the MP using the COLI llog
command:
Example unpacking of a baseband crash dump log:
10 Bbfilter
bbfilter is a parser that is useful for filtering baseband traces
into a readable format. Formats include:
1. Transmission Mode, MCS, Number PRBs, TBS, Assignable Bits
2. Outer Loop Adjustment, Measured RX power, SINR
3. See bbfilter help system for full list
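A plausible way to use bbfilter on a captured trace log is sketched below; the parser selection options are deliberately omitted rather than guessed, so see the bbfilter help system for them.

```shell
# Assumed pipeline; see the bbfilter help for its actual options.
bbfilter < baseband_trace.log > readable_trace.txt
```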
RBS observability
Detailed LTE RBS observability is provided using node traces and
the internal analysis tools.
Trace support is provided by the CPP platform. Macros are
provided to designers to place trace and error messages into
their code. The output text messages are visible via a wrap-around
trace log or towards a remote server (monitor). There is
no overload protection designed into the T&E, so users have to be
careful with the number of or types of traces set. LTE RBS
introduces the High Capacity (HiCap) trace, which is part of the
CPP trace & error package. HiCap provides an alternative way to
send trace output from the baseband: via the transport network.
The default trace groups are ERROR, INFO and CHECK. These are
visible in the T&E log without specifically activating them.
The COLI te command is used to steer the trace settings:
1. Enable tracing
2. Disable tracing
3. Display tracing status/settings
4. Set defaults
5. Save enabled trace groups
6. Set trace groups to be enabled after a restart
7. Set filtering expressions on a trace group
The following are the LTE trace groups:
LTE RAN includes Trace Objects for function (capability or
feature) level observability. The Trace Objects are divided into
three main categories: UE specific, CELL specific and Global. The
UE and CELL specific traces require enabling the selective UE
tracing function (covered in detail later).
Example Trace Object names:
1. Ft_CELL_ANR_MEAS
2. Ft_UE_RAC_MEASUREMENTS
3. Ft_RRC_CONN_SETUP
'te e' is a COLI shortcut for 'te enable'. Other T&E subcommands
have the same property. The following are some T&E examples:
Enable binary send/receive trace group on Trace Object S1AP_ASN
(L3 messages) in the MP:
Enable trace2 (timers) and trace5 (decisions) trace groups on
process Scc_SctpHost_proc in the MP:
Enable trace group trace6 (send signal) on a process in ULMA4:
Enable all trace groups on a process (sw install) in the MP:
Read the T&E log on the MP:
Read last 30 seconds of the T&E log on one RUL:
Default trace groups (INFO, CHECK, ERROR) on a process:
Disable all trace groups on a process:
Print trace group status on all processes:
Search for a trace group on a ULMA (with Moshell):
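The command screenshots for the examples above are not reproduced here; the sketch below shows plausible COLI forms. The trace group and process names come from the surrounding text, while the exact syntax (group naming, board addressing via Moshell's lhsh) is an assumption that may differ between releases.

```shell
# Assumed COLI syntax; names are taken from the text, exact forms may vary.
te e bus_send bus_receive S1AP_ASN     # binary send/receive on Trace Object S1AP_ASN (MP)
te e trace2 trace5 Scc_SctpHost_proc   # timers + decisions on the SCTP host process (MP)
lhsh <board> te e trace6 <process>     # send-signal group on a board (e.g. ULMA4) via Moshell
te e all <process>                     # all trace groups on a process
te log read                            # read the T&E log on the MP
te default <process>                   # back to the default groups (INFO, CHECK, ERROR)
te d all <process>                     # disable all trace groups on a process
te s                                   # print trace group status on all processes
```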
Example T&E log output from DUL:
It is common to have a monitor connected to the LTE RBS when
troubleshooting to capture trace messages to file.
CPP's mechanism for sending T&E logs to a remote server relies on
the Target Monitor function (CPP). This provides the operator
with the ability to forward T&E logs using either TCP or UDP to a
remote workstation or T&E client (host waiting to receive UDP T&E
packets).
Baseband traces can also make use of this mechanism by using the
link handlers towards the MP, so that all traces are output via
the MP. The trace settings need to be controlled as the MP is
limited in trace buffer space. Also, remember that LTE schedules
on a TTI level. Therefore baseband traces are immediately
considered to be high intensity.
The HiCap feature allows us to bypass the CPP link handler and
directly send T&E logs to a remote T&E Client. HiCap allows for
more messages to be forwarded using the capacity of the TN.
1 T&E monitoring
Several options are available for remote monitoring of T&E logs.
Manual setup using the Target Monitor tm COLI command:
1. TCP mode
2. UDP mode
3. Connect client to node/TCP port
4. Connect UDP client (e.g. teviewer) to node/UDP port
Automated setup using mon Moshell/AMOS command (TCP only):
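The setup screenshots are not reproduced here. The 'tm' parameter forms below are assumptions (consult the COLI help on the node for the exact syntax); 'mon' is the Moshell/AMOS monitor command mentioned above.

```shell
# Assumed 'tm' forms; check the COLI help for the exact parameters.
tm tcp 10001               # start Target Monitor in TCP mode on a chosen port
telnet <node-ip> 10001     # then attach a TCP client from a workstation
tm udp <client-ip> 10001   # UDP mode towards a listening T&E client (e.g. teviewer)
# Automated alternative from Moshell/AMOS (TCP only):
mon
```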
2 Lte exceptions
LTE exceptions are events triggered when traffic or procedural
use cases encounter an abnormal condition. Typical examples of
exceptions include:
1. Timeout of a procedure (e.g. RRC Connection Setup)
2. Transport network problems (e.g. Congestion)
3. Air interface issues (Admission, RLC retransmissions exceeded,
etc.)
LTE Exceptions can be used to quickly identify problem areas that
are affecting traffic.
The following are LTE exception examples:
The following fields exist for all exceptions:
1. TIMESTAMP
2. CELL_ID
3. EVENT_ID
4. EVENT_DESCRIPTION
5. OPTIONAL_DATA
The normal procedure is to configure the target monitor so that
exceptions can be logged to a workstation. An example of an
exception from the T&E log is shown below:
Exceptions can also be logged to disk on the RBS:
1. High severity exceptions are active by default
2. Medium/low severity exceptions have to be enabled
3. One log is produced each ROP period
The 'exception' COLI command is used for activation and
configuration of exception storage:
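The configuration screenshot is not reproduced here. The subcommands below are illustrative assumptions only; run the command without arguments on the node for its real usage.

```shell
# Illustrative only; the real 'exception' subcommands may differ.
exception                 # show current exception storage settings/usage
exception enable medium   # assumed: enable storage of medium severity exceptions
```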
The following steps enable RBS UE/Cell selective tracing:
1. Enable the UE/CELL Trace Objects
2. Select the UEs and cells to be traced
3. Monitor the trace and error buffer output
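The steps above can be sketched as follows, using Trace Object names from the text; the UE/cell selection step is release dependent, so only a comment marks it here.

```shell
# Sketch using Trace Object names from the text; group naming is assumed.
te e trace1 Ft_UE_RAC_MEASUREMENTS   # 1. enable a UE specific Trace Object
te e trace1 Ft_CELL_ANR_MEAS         #    ...and a CELL specific Trace Object
# 2. select the UE(s)/cell(s) via the selective UE tracing function
te log read                          # 3. monitor the trace and error buffer output
```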
3 Data collect
The purpose of the LTE RBS DCG document is to:
1. Assist the troubleshooter in data collection for fault scenarios
2. Enable consistent data collection for TR / CSR
3. Ensure that enough information is collected to help the design
organization correct the faults
The DCG consists of the following workflow:
1. Collection of Mandatory Data for TR / CSR
2. Collection of specific data based on the type of fault identified
3. Collection of additional info that may be useful given time
constraints
We can use Amos/Moshell, OSS/EM, or telnet/ftp to collect data.
The following commands are called for CPP boards:
te log read */ Trace & error log /*
llog -l */ Restart log, long /*
dumpelg */ HW log /*
ps -w */ process status /*
listloaded */ Loaded SW modules /*
pdr 30
a50 30
ps ose_lnh
pboot sh pa */ Pboot flash parameters including HW PID /*
mmu */ Information on Memory Manager /*
rmmconf -f */ Information of memory configuration /*
segment */ Segments created by memory manager /*
poolinfo -l */ List information about pools /*
ss */ process stack usage /*
vii */ Visual interface (LED status) /*
spashwinfo all */ HW Driver information /*
spashwinfo ingrp */ ingress parameters /*
spashwinfo egrp */ egress parameters /*
spaspccinfo */ Plane change control info /*
spastopologyinfo
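One way to batch-collect the CPP data above is a non-interactive Moshell run with a command file, as sketched below; the node address and file name are placeholders, and the command file holds whichever of the listed commands are needed.

```shell
# Sketch of batch collection with Moshell; <node-ip> is a placeholder.
cat > collect.mos <<'EOF'
te log read
llog -l
dumpelg
ps -w
listloaded
EOF
moshell <node-ip> collect.mos
```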
4 RBS recovery
RBS Recovery may be necessary due to:
1. unrecoverable software faults
2. customer allows no time for troubleshooting
3. all avenues of troubleshooting are exhausted
Restart of the RBS on the currently loaded CV is one of the RBS
Initial Basic Recovery Actions.
If restarting the RBS on the current CV does not restore the
node, or no backup CV exists on an FTP server, begin the RBS
rollback recovery actions to roll back to the most recent known
working CV.
When the node is partially operational or a backup archive has
previously been taken, RBS CV restore recovery actions can be
initiated.
The action 'getFromFtpServer' initiates the download of the
backup CV archive. Note that when the archive is downloaded, it
will show up in the CVs that are currently on the node, however
it is not possible to just set this CV to startable as with other
CVs created and existing on the node. The CV is of type
"downloaded", which means that the CV has to go through a restore
(another configurationVersion action) procedure before being
taken into operation.
S1 connection Setup and Status
1 S1 overview
The S1 setup procedure is initiated from the RBS. The SCTP
association is established first (SCTP is carried by the IP
protocol), and the S1 application protocol (S1AP) setup is then
performed.
The following is the S1 setup procedure:
2 Check IP configuration
IpAccessHostEt=1 MO contains the RBS IP address for control and
user plane traffic
IpInterface=2 MO contains the default router for the subnet
TermPointToMme MO contains information on the S1 link to the MME
It will contain:
IP addresses of the MME
MME name
Ping can be used as a basic check for MME IP connectivity
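The MO reads above can be done from Moshell; the sketch below uses the MO names given in the text and omits attribute names rather than guessing them.

```shell
# Moshell reads of the MOs mentioned above.
get IpAccessHostEt=1    # RBS IP address for control and user plane traffic
get IpInterface=2       # default router for the subnet
get TermPointToMme      # MME IP addresses and MME name
```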
3 Check Connectivity from the LTE RBS
Call the action "ping" on the IpAccessHostEt MO
It is a good idea to try to ping the default routers as well
No round trip time is given when calling this action
ipac_traceroute COLI command
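The connectivity checks above might be run as sketched below; the 'acc' action invocation is a Moshell assumption, and the traceroute argument form is not confirmed.

```shell
# Assumed forms; 'acc' prompts for the ping parameters interactively.
acc IpAccessHostEt=1 ping    # MO action ping towards the MME (no round trip time shown)
ipac_traceroute <mme-ip>     # COLI traceroute from the node (argument form assumed)
```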
4 Check S1 RBS Attributes
S1 Setup can fail if there is a mismatch of parameters between
the MME and the RBS.
5 S1 Setup Failures
The S1 control interface is represented by the MO TermPointToMme
Basic Checks:
Check MME IP connectivity
Check for S1AP protocol setup errors
Advanced Checks:
SCTP status
MME configuration
Wireshark analysis
Example of an S1 setup (S1 Setup procedure) failure:
6 Check SCTP Host info
The COLI sctphost_stat command shows the SCTP communication
between endpoints:
SCTP Trace Info
Additional information on the SCTP handling is available via the
Scc_SctpHost_proc process
As an example, trace group trace7 is activated below for an
association that has failed:
SCTP protocol messages can be traced using BUS_SEND and
BUS_RECEIVE trace groups
Note: CPP trace group meanings follow different conventions than
those mentioned for the LTE RBS
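The SCTP checks above can be summarised in the sketch below; the commands and names come from the text, while the exact group naming on this CPP process follows the CPP conventions noted above and may differ from what is shown.

```shell
# Commands assembled from the text; exact group naming may vary (CPP conventions).
sctphost_stat                                  # SCTP communication between endpoints
te e trace7 Scc_SctpHost_proc                  # association state tracing
te e bus_send bus_receive Scc_SctpHost_proc    # SCTP protocol message trace
te log read                                    # inspect the captured trace output
```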
MME Configuration
Show which RBSs are connected to the MME
Note that it is not necessary to define RBSs in the MME; however,
the RBSs and cells must belong to a valid PLMN and Tracking Area.