Slide 1
Troubleshooting Common Ericsson GSM Alarms A step by step guide.
Slide 2
Common Ericsson GSM Alarms • MO FLT= OML FAULT • DIGITAL PATH FAULT SUPERVISION • CELL LOGICAL CHANNEL AVAILABILITY SUPERVISION • MO FLT= PERMANENT FAULT • MO FLT= BTS INTERNAL
Slide 3
Common Ericsson GSM Alarms Cont. • EXTERNAL ALARMS; see Additional Info • PWR COMMERCIAL;RECT 24V MAINS;BATTERY • CORRELATED LIKE: MO FLT= TS SYNC FAULT • CELL LOGICAL CHANNELS SEIZURE SUPERVISION • CP AP COMMUNICATION FAULT
Slide 4
Common Ericsson GSM Alarms Cont. • MO FLT= LOOP TEST FAILED • LOCAL MODE & OPERATOR CONDITION
Slide 5
MO FLT= OML FAULT • A fault exists in the communications link between the BSC and the BTS. • Alarm can come in as either Major or Critical. • Can come in on the CF, TRX, or both. • 90% of the time is due to a down or faulty T1. • Can also result from a faulty TRX or loss of communication from the BSC to the TRX.
NOTE: Before troubleshooting this or any other Ericsson alarm, always right-click the alarm in Netcool and choose the “View alarms at this Location” option to view all alarms at the site. This shows whether the alarm could be a secondary effect caused by power, maintenance, etc. Also always search CTS to see if there is an existing ticket on the site for this or any related issue.
Slide 6
Recommended Troubleshooting Steps SKCAB07
SCRMCAT021
MO=RXOCF-55 SLOGAN=OML FAULT
1. Log into the appropriate Complex/BSC and retrieve an alarm list for the parent RXOTG by running the RXASP:MO=RXOTG-__; command for the CF listed.
The TG number will be the same as the CF listed in the alarm.
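As a rough illustration, the sketch below derives the RXASP command from the MO named in the alarm, together with the DTSTP and DTQUP commands used in the steps that follow. It assumes the common case described in this guide, where the primary DIP carries the same number as the TG and is an RB3 DIP. The function name is hypothetical and not part of any Ericsson tool.

```python
import re

def status_commands(alarm_mo: str) -> list[str]:
    """Build RXASP/DTSTP/DTQUP command strings for the MO named in an alarm.

    Assumption: the primary DIP has the same number as the TG
    (for example TG 55 -> DIP 55RB3). This is usual but not guaranteed.
    """
    match = re.fullmatch(r"RXO[A-Z]+-(\d+)(?:-\d+)*", alarm_mo.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised MO name: {alarm_mo!r}")
    tg = match.group(1)  # first number after the MO type is the TG number
    return [
        f"RXASP:MO=RXOTG-{tg};",
        f"DTSTP:DIP={tg}RB3;",
        f"DTQUP:DIP={tg}RB3;",
    ]

for command in status_commands("RXOCF-55"):
    print(command)
```

If the primary DIP turns out to be clean, the DIP number has to be found from the device range instead, as covered later in the OML fault example.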
Slide 7
RXASP:MO=RXOTG-55; Connecting to SKCAB07... (Use 'quit' to logoff)
RSITE       ALARM SITUATION
SCRMCAT021
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
END
As we can see the MO’s in this TG show multiple OML faults indicating a down or dirty T1. The next step is to status the primary DIP for the site to determine if it is blocked.
Slide 8
2. Status the primary DIP
STATE LOOP TSLOTL DIPEND FAULT ABL AIS
SECTION
END
Use the DTSTP:DIP=55rb3; command to status the DIP. This printout shows the state of the DIP is ABL or Automatically Blocked and has an AIS (Alarm Indication Signal). This indicates that this site has a dead T1 and a Ticket should be created and escalated to Telco to test the circuit.
Slide 9
3. Create a CTS Ticket Connecting to SKCAB07... (Use 'quit' to logoff)
RSITE       ALARM SITUATION
SCRMCAT021
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
SCRMCAT021  OML FAULT
END
SECTION
Open a CTS ticket on this site from the OML fault alarm. Paste this information along with the RXASP printout and the Granite (ED) path into the Ticket and send it to Telco to have the circuit tested. No further actions will be needed at this point. If the DTSTP printout shows the DIP as WO or Working, the DIP may be working but taking errors. In this case proceed on to step 4, checking the quality of the DIP.
NOTE: The primary DIP is usually numbered the same as the RXOTG. However, sometimes the OML links are placed onto the secondary DIP at the site. If this is the case, the primary DIP may show WO when a portion of the site is down (when in reality it is the second DIP that has faulted out). You may need to determine what the secondary DIP is (if it exists) and status it. It is important to get the correct DIP information to the field / telco group so the correct circuit will be tested.
Slide 10
4. Status the DIP Quality
The next step is to check the quality of the DIP by running the DTQUP:DIP=___; command to see if it is taking errors. The quality printout shows this DIP is taking errors in the N-UAS section, which counts the unavailable seconds for the unacceptable and degraded performance levels at the near end (incoming direction). In this case a ticket needs to be generated and sent to telco with this printout, the RXASP printout and the ED path. Any errors in these fields indicate a faulty T1 that needs to be ticketed and tested by telco.
If the DIP counters read all zeros, the DIP appears clean, and you have verified the secondary DIP’s, create a CTS ticket to send to telco with your notes and troubleshooting steps and have them test the circuit to verify. If the circuit tests clean and the OML faults are still present, forward the ticket to the field to have them check the TRX’s. If the circuit tests clean and the OML faults clear, wait about 20 minutes and status the site again. If the OML faults remain clear, the site has restored and you can close the TT. If the faults return, send the ticket to the field tech with your troubleshooting notes so the issue can be investigated.
It is important to note at this point that the CF/TG number is not always the same as the DIP number. Secondary T1’s will have a number different from that of the TG. There can be 2 or more DIP’s assigned to a CF/TG. In these cases you will need to run a few extra commands to find the correct DIP number to status and include in your ticket. Consider the following example using the same commands as in the previous example.
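Before moving on to that example, the decision points in steps 2 through 4 can be summarized in a small sketch. This is only an illustration of the flow described above; the function name, its parameters and the returned wording are hypothetical, and the error count stands in for whatever non-zero counters appear in the DTQUP printout.

```python
def next_action(dip_state: str, quality_errors: int = 0,
                secondary_checked: bool = False) -> str:
    """Suggest the next troubleshooting step for an OML fault.

    dip_state         -- STATE column from DTSTP ("ABL" or "WO")
    quality_errors    -- total of the error counters read from DTQUP
    secondary_checked -- True once any secondary DIP's have been statused
    """
    state = dip_state.strip().upper()
    if state == "ABL":
        return "ticket telco: DIP is automatically blocked"
    if state != "WO":
        return "unknown DIP state: investigate manually"
    if quality_errors > 0:
        return "ticket telco: DIP is working but taking errors"
    if not secondary_checked:
        return "look for a secondary DIP and status it"
    return "ticket telco to verify the circuit; include notes"

print(next_action("ABL"))
print(next_action("WO", quality_errors=12))
print(next_action("WO"))
```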
Slide 11
MO FLT= OML FAULT MO= RXOCF-90 SLOGAN=OML FAULT
RSITE   ALARM SITUATION
LAD400
LAD400  OML FAULT
LAD400  OML FAULT
LAD400  OML FAULT
LAD400  OML FAULT
LAD400  OML FAULT
END
Status the TG using the RXASP command. As we can see there are OML faults on this CF so the next step is to status the DIP as we did in the last example.
Slide 12
Status the DIP using the DTSTP command
STATE LOOP TSLOTL DIPEND FAULT WO
SECTION
END
As we see here the DTSTP printout indicates the DIP is in the WO or working state so we will check the DIP quality printout for errors using the DTQUP command.
Slide 13
Status DIP Quality DTQUP
As shown in the DIP quality printout there are no reported errors on this DIP. The next step would be to look for a secondary DIP on this TG and then status it to see if it is down or taking errors.
Slide 14
RXAPP Command
APSTATE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE IDLE
64K TEI YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES YES
The first command to run when looking for a secondary DIP is RXAPP:MO=RXOTG-__;. This will show you the range of RBLT3 device numbers attached to each DIP, listed in sequential order. Look at the result printout and find the range of RBLT or RBLT3 devices listed. Usually it is easy to tell if there is more than one DIP assigned because the RBLT range will have a noticeable skip in the numbering. In this case, the devices start with RBLT3-1560 through 1583 (first DIP), then jump to RBLT3-2160 through 2183 (second DIP), then jump again to RBLT3-5496 through 5519 (third DIP). To find the DIP number from these device ranges, simply divide any number in the device range by 24 and drop any remainder; that gives you the DIP number. 1560/24=65, so 65RB3 is your first DIP. Then 2160/24=90, so 90RB3 is your second DIP. 5496/24=229, so 229RB3 will be your third DIP. Now we can status all three DIP’s to see which one is faulty.
Slide 15
RXAPP Command Cont.
DEV         DCP  APUSAGE  APSTATE         64K  TEI
RBLT3-2167  8    UNDEF    IDLE            YES
RBLT3-2168  9    UNDEF    IDLE            YES
RBLT3-2169  10   UNDEF    IDLE            YES
RBLT3-2170  11   UNDEF    IDLE            YES
RBLT3-2171  12   UNDEF    IDLE            YES
RBLT3-2172  13   UNDEF    IDLE            YES
RBLT3-2173  14   UNDEF    IDLE            YES
RBLT3-2174  15   UNDEF    IDLE            YES
RBLT3-2175  16   UNDEF    IDLE            YES
RBLT3-2176  17   UNDEF    IDLE            YES
RBLT3-2177  18   UNDEF    IDLE            YES
RBLT3-2178  19   UNDEF    IDLE            YES
RBLT3-2179  20   UNDEF    IDLE            YES
RBLT3-2181  22   UNDEF    IDLE            YES
RBLT3-2183  24   UNDEF    IDLE            YES
RBLT3-5496  287  UNDEF    IDLE            YES
RBLT3-5497  288  UNDEF    IDLE            YES
RBLT3-5498  289  UNDEF    IDLE            YES
RBLT3-5499  290  UNDEF    IDLE            YES
RBLT3-5500  291  UNDEF    IDLE            YES
RBLT3-5501  292  UNDEF    IDLE            YES
RBLT3-5502  293  UNCONC   SPEECH/DATA     YES
RBLT3-5503  294  UNDEF    IDLE            YES
RBLT3-5504  295  UNDEF    IDLE            YES
RBLT3-5505  296  UNDEF    IDLE            YES
RBLT3-5506  297  UNDEF    IDLE            YES
RBLT3-5507  298  UNDEF    IDLE            YES
RBLT3-5508  299  UNDEF    IDLE            YES
RBLT3-5509  300  UNDEF    IDLE            YES
RBLT3-5510  301  UNDEF    IDLE            YES
RBLT3-5511  302  UNDEF    IDLE            YES
RBLT3-5512  303  UNDEF    IDLE            YES
RBLT3-5513  304  UNDEF    IDLE            YES
RBLT3-5514  305  UNDEF    IDLE            YES
RBLT3-5515  306  UNDEF    IDLE            YES
RBLT3-5516  307  UNDEF    IDLE            YES
RBLT3-5517  308  CONC     TRXC SIGNAL     NO   2 3 4 5
RBLT3-5518  309  CONC     TRXC SIGNAL     NO   0 1
RBLT3-5519  310  CONC     CF/TRXC SIGNAL  NO   62 6 8 9 10
Look at the result printout and find the range of RBLT or RBLT3 devices listed. Usually it is easy to tell if there is more than one DIP assigned because the RBLT range will have a noticeable skip in the numbering. In this case, the devices start with RBLT3-1560 through 1583 (first DIP), then jump to RBLT3-2160 through 2183 (second DIP), then jump again to RBLT3-5496 through 5519 (third DIP). To find the DIP number from these device ranges, simply divide any number in the device range by 24 and drop any remainder; that gives you the DIP number. 1560/24=65, so 65RB3 is your first DIP. Then 2160/24=90, so 90RB3 is your second DIP. 5496/24=229, so 229RB3 will be your third DIP. Now we can status all three DIP’s to see which one is faulty.
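The divide-by-24 rule can be checked with a few lines of code. The sketch below is only an illustration (the function names are made up for this example); it uses integer division so that any device in a DIP's range maps to the same DIP number, and it also goes the other way to give the 24-device range for a DIP.

```python
DEVICES_PER_DIP = 24  # one device per DS0 on a T1

def dip_number(device_number: int) -> int:
    """Return the DIP number that an RBLT3 device number belongs to."""
    return device_number // DEVICES_PER_DIP

def device_range(dip: int) -> tuple[int, int]:
    """Return the first and last device number carried by a DIP."""
    first = dip * DEVICES_PER_DIP
    return first, first + DEVICES_PER_DIP - 1

for dev in (1560, 2160, 5496):
    print(f"RBLT3-{dev} -> {dip_number(dev)}RB3")

print(device_range(229))
```

The last line gives the full range for DIP 229, which is a quick way to confirm where a skip in the RXAPP numbering should begin and end.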
Slide 16
Status DIP 65RB3
SECTION
Looking at both the status and the quality printouts for 65RB3 we can tell that this DIP is up and clean with the state as WO and the error counters at all zeros. Since we already know that 90RB3 is clean from the previous troubleshooting steps, we will now status 229RB3.
Slide 17
Status DIP 229RB3
SECTION
As we can see this DIP is ABL (auto blocked) and the counters are registering several errors. This will be the DIP that you will put in the ticket and retrieve the ED path for. From here Create a Ticket in CTS and put the RXASP printout with this DIP information and the ED path for this DIP and send it to telco.
Slide 18
RXASP:MOTY=RXOCF; For Large Outages
RSITE   ALARM SITUATION
AR2158  OML FAULT
AR2424  OML FAULT
AR2154  OML FAULT
AR2421  OML FAULT
AR2138  OML FAULT
AR2170  OML FAULT
AR2201  OML FAULT
AR2180  OML FAULT
AR2183  OML FAULT
AR2174  OML FAULT
AR2174  OML FAULT
AR2174  OML FAULT
AR2152  OML FAULT
AR2447  OML FAULT
AR2171  OML FAULT
AR2181  OML FAULT
AR2153  OML FAULT
AR2153  OML FAULT
AR2156  OML FAULT
AR2156  OML FAULT
AR2155  OML FAULT
END
During a possible outage situation where multiple sites off the same BSC are showing OML faults, you can run the RXASP:MOTY=RXOCF; command (you can replace CF with TG, TRX, TS, etc.). This command will give you a printout of all the CF’s that are in alarm, with the site and condition, for a quick count of how many sites are affected.
Slide 19
DTSTP:DIP=ALL,STATE=ABL;
SECTION
Along with the RXASP:MOTY=RXOCF; command you can run the DTSTP:DIP=ALL,STATE=ABL; command during an outage situation to help correlate the DIP’s that are down with the sites that are in alarm from the previous command. This will give you a printout of all the DIP’s at the BSC that are blocked. You can also use this information to cross reference your DIP’s in Granite (ED) for your circuit information. It is important to note that this printout will give you all the DIP’s at the BSC that are ABL. Some of these may be switch DIP’s or secondary circuits not related to cell sites.
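One way to picture the correlation is the sketch below. It is a hypothetical helper, not an Ericsson tool: it assumes you have already read the TG numbers and site names off the RXASP:MOTY=RXOCF; printout and the blocked DIP names off the DTSTP printout, and it only proposes a match when an RB3 DIP has the same number as an alarmed TG. Anything left over (switch DIP's, secondary circuits) still has to be traced by hand. The sample values are illustrative only.

```python
import re

def correlate(alarmed_tgs: dict[int, str], blocked_dips: list[str]):
    """Pair blocked DIP's with alarmed sites where DIP number == TG number.

    Returns (candidates, unmatched): candidates maps a DIP name to a site,
    unmatched lists DIP's that need a manual lookup.
    """
    candidates, unmatched = {}, []
    for dip in blocked_dips:
        match = re.match(r"(\d+)RB3$", dip.upper())
        tg = int(match.group(1)) if match else None
        if tg in alarmed_tgs:
            candidates[dip] = alarmed_tgs[tg]
        else:
            unmatched.append(dip)
    return candidates, unmatched

# Illustrative input only; not taken from a single real printout.
sites = {55: "SCRMCAT021", 90: "LAD400"}
blocked = ["55RB3", "229RB3", "79RA3"]
print(correlate(sites, blocked))
```

A match here is only a starting point; confirm it in the switch or in ED before putting the DIP in a ticket.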
Slide 20
DIGITAL PATH FAULT SUPERVISION DIGITAL PATH FAULT SUPERVISION
DIP=81RB3; FAULT=AIS
• Indicates a T1 is down or is above the error threshold. • Can be either a T1 to a cell site or a “Switch DIP”. • Usually accompanied by an OML Fault alarm. • Comprised of 24 devices, 1 for each DS0.
Slide 21
Recommended Troubleshooting Steps 1. Log into the appropriate Complex/BSC and status the DIP using the DTSTP:DIP=____; (81RB3 in this case) Command to check and see if the DIP is down or taking errors.
Slide 22
Status DIP Connecting to RVCAB08... (Use 'quit' to logoff)
STATE LOOP TSLOTL DIPEND FAULT ABL AIS
As we can see here the DIP is ABL and the fault is AIS (Alarm Indication Signal). When troubleshooting this alarm you can also begin, as done here, by running the RXASP command with the DIP number as the TG number, since the DIP number and the TG number are often the same. This may be a shortcut to correlating a site and TG to your DIP. You can cross reference the info in the switch (which we will cover next) or in ED to check accuracy. In this case the DIP number is not the TG number, so we will have to find the site/devices the DIP is connected to the long way, by checking the devices and the SNT (Switching Network Terminal) it is connected to.
Slide 23
2. Find the SNT with the DTDIP Command
SNT ETM3-2
DIPP DIPNUM SDIP DIPOWNER 54 303 2ETM3 DIPM3
TYPE IEX
3. Status the SNT for the DIP's and their Device ranges
The first step in researching the DIP is to find and status the SNT. The first command you need to run is the DTDIP:DIP=____; command to find the SNT. From its printout we can see that the SNT is ETM3-2. With that in mind we can now query the SNT to see all the DIP’s connected to it and the devices connected to them.
Slide 24
3. Status the SNT for the DIP's and their Device ranges NTCOP:SNT=ETM3-2;
SNTV  SNTP      DIP     DEV                SNTINL
1     XM-0-0-6  234RB3  RBLT3-5616&&-5639  0
                235RB3  RBLT3-5640&&-5663  1
                236RB3  RBLT3-5664&&-5687  2
                237RB3  RBLT3-5688&&-5711  3
                238RB3  RBLT3-5712&&-5735  4
                239RB3  RBLT3-5736&&-5759  5
                240RB3  RBLT3-5760&&-5783  6
                241RB3  RBLT3-5784&&-5807  7
                79RA3   RALT3-1896&&-1919  13
                80RA3   RALT3-1920&&-1943  14
                81RA3   RALT3-1944&&-1967  15
                82RA3   RALT3-1968&&-1991  16
                83RA3   RALT3-1992&&-2015  17
                84RA3   RALT3-2016&&-2039  18
                85RA3   RALT3-2040&&-2063  19
                86RA3   RALT3-2064&&-2087  20
                87RA3   RALT3-2088&&-2111  21
                88RA3   RALT3-2112&&-2135  22
                89RA3   RALT3-2136&&-2159  23
                90RA3   RALT3-2160&&-2183  24
                91RA3   RALT3-2184&&-2207  25
                92RA3   RALT3-2208&&-2231  26
                93RA3   RALT3-2232&&-2255  27
                94RA3   RALT3-2256&&-2279  28
                95RA3   RALT3-2280&&-2303  29
                96RA3   RALT3-2304&&-2327  30
                97RA3   RALT3-2328&&-2351  31
                98RA3   RALT3-2352&&-2375  32
                99RA3   RALT3-2376&&-2399  33
                100RA3  RALT3-2400&&-2423  34
                101RA3  RALT3-2424&&-2447  35
                102RA3  RALT3-2448&&-2471  36
                103RA3  RALT3-2472&&-2495  37
To find all the DIP’s connected to this SNT and the ranges of devices connected to them, run the NTCOP:SNT=ETM3-2; command. Scroll through the connected DIP's until you find the one you are looking for, in this case 81RB3. Note: Some data (DIPS) were removed from this printout for spacing reasons.
Slide 25
3. Status the SNT for the DIP's and their Device ranges NTCOP:SNT=ETM3-2; Cont.
SNT ETM3-2
DIP     DEV                SNTINL
109RA3  RALT3-2616&&-2639  43
78RB3   RBLT3-1872&&-1895  51
79RB3   RBLT3-1896&&-1919  52
80RB3   RBLT3-1920&&-1943  53
81RB3   RBLT3-1944&&-1967  54
82RB3   RBLT3-1968&&-1991  55
83RB3   RBLT3-1992&&-2015  56
84RB3   RBLT3-2016&&-2039  57
85RB3   RBLT3-2040&&-2063  58
86RB3   RBLT3-2064&&-2087  59
87RB3   RBLT3-2088&&-2111  60
88RB3   RBLT3-2112&&-2135  61
89RB3   RBLT3-2136&&-2159  62
90RB3   RBLT3-2160&&-2183  63
91RB3   RBLT3-2184&&-2207  64
92RB3   RBLT3-2208&&-2231  65
93RB3   RBLT3-2232&&-2255  66
94RB3   RBLT3-2256&&-2279  67
95RB3   RBLT3-2280&&-2303  68
96RB3   RBLT3-2304&&-2327  69
97RB3   RBLT3-2328&&-2351  70
98RB3   RBLT3-2352&&-2375  71
99RB3   RBLT3-2376&&-2399  72
6RTG3   RTGLT3-144&&-167   81
7RTG3   RTGLT3-168&&-191   82
8RTG3   RTGLT3-192&&-215   83
EQLEV PROT SDIP SUBSNT DEFPST SNTP 1 2ETM3 0 XM-0-0-6 1 XM-0-0-17
MODE 2176 0
END
Now that we have found our DIP and its device range (located to the right of the DIP number in the printout) we can status this device range. The device range numbers are the numeric values configured in the switch and the site for all 24 devices connected to that DIP.
Note: Some data (DIPS) were removed from this printout for spacing reasons.
Slide 26
4. Status the DIP’s Device Range STDEP:DEV=RBLT3-1944&&-1967;
STATE BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC BLOC
BLS ADM ABS CONFIG ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 PC ABL H'00 P ABL H'00 P ABL H'00 P ABL H'00 P
This shows the status of the device range, which we can see is ABL. It also shows us that the devices are provisioned for those time slots: the CONFIG column lists PC (Provisioned and Configured) or P (Provisioned) next to them. If they were not provisioned it would show NP (Not Provisioned). It is important to note the CONFIG status, because if the devices were listed as NP then this DIP has not yet been provisioned in the switch, and the only further action would be to send a Minor TT to the switch to have them turn this DIP down until it is provisioned so we don’t keep getting alarms on it. From here we can now find the TG and site connected to the DIP.
Slide 27
5. Find the Site and TG. RXMDP:MOTY=RXOTS,DEV=RBLT3-1952;
DEVT RBLT3-1952
SDEV 1111
END
To find the TG and the site from the previous info, run the RXMDP:MOTY=RXOTS,DEV=RBLT3-1952; command. This can be run on any provisioned device in the range. You can also replace TS (time slot) with TRX. What this output shows us is that there is a timeslot allocated to this RBLT3 device and it is RXOTS-79-9-4. 79 is your TG number, 9 is your TRX number and 4 is the timeslot number. With this in mind we can now run the RXASP:MO=RXOTG-79; command since we know from the printout that this is on TG 79. This will of course give us the site ID and the alarm situation for that TG.
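The MO naming scheme described above (TG, then TRX, then timeslot) is easy to take apart programmatically. The following is a minimal sketch with hypothetical names, assuming MO names always follow the `TYPE-TG[-TRX[-TS]]` pattern seen in this guide.

```python
from typing import NamedTuple, Optional

class MoAddress(NamedTuple):
    mo_type: str
    tg: int
    trx: Optional[int]
    ts: Optional[int]

def parse_mo(name: str) -> MoAddress:
    """Split an MO name such as RXOTS-79-9-4 into type, TG, TRX and TS."""
    mo_type, *numbers = name.strip().upper().split("-")
    if not numbers or len(numbers) > 3 or not all(n.isdigit() for n in numbers):
        raise ValueError(f"unrecognised MO name: {name!r}")
    # Pad missing levels (for example RXOTG-79 has no TRX or TS) with None.
    values = [int(n) for n in numbers] + [None] * (3 - len(numbers))
    return MoAddress(mo_type, *values)

mo = parse_mo("RXOTS-79-9-4")
print(mo)
print(f"RXASP:MO=RXOTG-{mo.tg};")
```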
Slide 28
6. Status the TG to get the site ID
RSITE LA8041
ALARM SITUATION
END
Our status here shows that this TG appears to be alarm free. However we know this is not the case. With Ericsson, when you have a secondary DIP that goes down, it may not always show up in the alarm printout for the RXASP command but that doesn’t mean it’s not in alarm. We can view all the radios and the timeslots and their status by running the RXMSP:MO=RXOTG-79,SUBORD; command to verify that this TG does in fact have a bad T1.
Slide 29
7. Verifying the TG has a down T1. RXMSP:MO=RXOTG-79,SUBORD;
BLO BLA LMO BTS 0000 0000 STA 0000 0000 STA 0000 0000 DIS 0000 0000 DIS 0000 0000 STA 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 STA 0000 0000 0000 ENA 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0840 DIS 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 0000 ENA 0000 0000 STA 0000 0000 STA
CONF CONF CONF ENA ENA ENA UNCONF UNCONF UNCONF UNCONF ENA ENA ENA ENA UNCONF UNCONF UNCONF UNCONF UNCONF UNCONF ENA ENA ENA
From this printout we can see the TRX’s, subordinate timeslots, and their status. In the BTS column there are several timeslots that are DIS (disabled), the CONF column lists the same timeslots as UNCONF (unconfigured), and the LMO column shows a HEX code indicating an issue. At this point, with all the information we have, we would create a CTS ticket to send to Telco with the DTSTP, RXASP, and RXMSP printouts along with the correct ED path for the circuit to be tested. Looking at this we see that not all the timeslots are in order or on the same radio. This is done to split the resources from the different T1’s across each sector for redundancy purposes.
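For a long RXMSP printout it can help to filter for the disabled timeslots rather than read every row. The sketch below is a hypothetical helper that assumes the printout has been captured as text with one MO per line, in the column order MO, STATE, BLO, BLA, LMO, BTS, CONF. The sample rows are made up for illustration and are not copied from the printout in this guide.

```python
def disabled_timeslots(printout: str) -> list[str]:
    """Return the RXOTS MO names whose row contains a DIS state."""
    down = []
    for line in printout.splitlines():
        fields = line.split()
        # Only timeslot rows are of interest; DIS appears in the BTS column.
        if fields and fields[0].startswith("RXOTS-") and "DIS" in fields[1:]:
            down.append(fields[0])
    return down

# Illustrative rows only (assumed layout: MO STATE BLO BLA LMO BTS CONF).
sample = """\
RXOTS-79-9-3 OPER 0000 0000 0000 ENA ENA
RXOTS-79-9-4 OPER 0000 0000 0840 DIS UNCONF
RXOTX-79-9 OPER 0000 0000 0000 ENA ENA
"""
print(disabled_timeslots(sample))
```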
Slide 30
7. RXMSP:MO=RXOTG-79,SUBORD; Cont. RXOTS-79-6-6 OPER RXOTS-79-6-7 OPER RXOTX-79-6 OPER RXOTRX-79-7 OPER RXORX-79-7 OPER RXOTS-79-7-0 OPER RXOTS-79-7-1 OPER RXOTS-79-7-2 OPER RXOTS-79-7-3 OPER RXOTS-79-7-4 OPER RXOTS-79-7-5 OPER RXOTS-79-7-6 OPER RXOTS-79-7-7 OPER RXOTX-79-7 OPER RXOTS-79-9-0 OPER RXOTS-79-9-1 OPER RXOTS-79-9-2 OPER RXOTS-79-9-3 OPER RXOTS-79-9-4 OPER RXOTS-79-9-5 OPER RXOTS-79-9-6 OPER RXOTS-79-9-7 OPER RXOTX-79-9 OPER RXOTRX-79-10 OPER RXORX-79-10 OPER RXOTS-79-10-0 OPER RXOTS-79-10-1 OPER RXOTS-79-10-2 OPER RXOTS-79-10-3 OPER RXOTS-79-10-4 OPER RXOTS-79-10-5 OPER RXOTS-79-10-6 OPER RXOTS-79-10-7 OPER RXOTX-79-10 OPER END
0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0000 ENA ENA 0000 0000 STA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0000 ENA ENA 0000 0000 0000 ENA ENA 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0000 ENA ENA 0000 0000 STA 0000 0000 0000 ENA ENA 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0840 DIS UNCONF 0000 0000 0000 ENA ENA
Slide 31
MO FLT= BTS INTERNAL MO FLT= BTS INTERNAL 2 alarms: RXOTX-195-0 RXOTX-195-1
• A fault has occurred in the CF or in one or more of the MO’s • Can occur in the CF, TRX, RXOTX, RXORX • Usually presents in Netcool as a Major • Can sometimes be restored remotely
Slide 32
Recommended Troubleshooting Steps 1.
Retrieve alarm printout from parent RXOTG
RADIO X-CEIVER ADMINISTRATION
MANAGED OBJECT ALARM SITUATIONS
MO           RSITE   ALARM SITUATION
RXOTG-195    LA0166
RXOCF-195    LA0166  BTS INT UNAFFECTED
RXOTX-195-0  LA0166  BTS INT AFFECTED
RXOTX-195-1  LA0166  BTS INT AFFECTED
END
Determine if the fault is service-affecting or if any other MO’s are in alarm by looking at the alarm list for the parent RXOTG. If it shows “BTS INT UNAFFECTED”, this will generally not be service-affecting and will not warrant a critical ticket on the RXOCF itself. If any other MO’s show “AFFECTED” alarm situations, then troubleshoot them appropriately. The fault codes reflected by the RXOCF will aid in determining the problems on any subordinate MO’s. When the RXOTX or RXORX shows “BTS INT AFFECTED”, it is out of service (blocked). Usually the RXOCF or the parent RXOTRX will show a “BTS INT UNAFFECTED” but will not be blocked. However, because the TX and RX are blocked, from an operational standpoint the TRX is out of service as well, since it has no operating transmit or receive function. Look the fault code up in the maintenance manual fault list to determine what it means. The manual may also direct you to pull the fault codes from the parent MO’s that are showing unaffected faults to help further isolate the problem.
2) Retrieve the fault code(s) for the RXOCF
Slide 33
2. Retrieve the fault code(s) RXMFP:MO=RXOTG-195,SUBORD,FAULTY;
BTSSWVER ERA-G04-R08-V01
4 KRY 101 1856/1
R 3C
RU RUREVISION 5 BGM1361001/3
R3A
RUPOSITION C:0 R:C SH: 1 SL: 37 RU RUREVISION 7 SEB1121095/1
TR41515198 RUSERIALNO B991781311 RULOGICALID FC FCU_01 0
R5B
RUPOSITION C:0 R:C SH: 8 SL:---
RUSERIALNO TU85160324 RULOGICALID CABI 2206 0
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON OPER 00031 FAULT CODES CLASS 2A 8 REPLACEMENT UNITS
2) Retrieve the fault code(s) for the MO Note: The actual faults are listed at the end of the printout.
Slide 34
2. Retrieve the fault code(s) Cont. 40 MO RXOTX-195-0
BTSSWVER ERA-G04-R08-V01
RU RUREVISION 0 RUPOSITION
RUSERIALNO
RULOGICALID
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON NOOP BLO 00000 FAULT CODES CLASS 1B 4
MO BTSSWVER RXOTX-195-1 ERA-G04-R08-V01 RU RUREVISION 0 RUPOSITION
RUSERIALNO
RULOGICALID
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON NOOP BLO 00000 FAULT CODES CLASS 1B 4 END
2) Retrieve the fault code(s) for the MO Note: The actual faults are listed at the end of the printout. From here we can see that the fault is on RXOTX-195-0 and 195-1. The fault code is 1B4. We can now reference our documentation to see what the fault is and what actions are required.
Slide 35
3. Reference the Fault code
Fault No. AO TX I1B:4
Fault name: TX antenna VSWR limits exceeded
Related fault: SO CF I2A:8 – VSWR limits exceeded; SO CF RU:40 – Antenna
Description: When VSWR at CDU output exceeds the class 2 limit defined in IDB with OMT (default value: 1.8), the fault SO CF I2A:8 arises with RU map "Antenna". When VSWR exceeds the class 1 limit (default value: 2.2), the fault AO TX I1B:4 arises on TX.
Possible reasons: Faulty IDB. Faulty CDU. TX antenna/feeder faulty or disconnected. Pfwd/Prefl cables faulty. Measurement receiver in TRU/CU (in some cases) faulty.
Action
Look up the fault code in our documentation to see what the fault is and what the next steps are. From our documentation we can see that this is a VSWR alarm, Fault Internal 1B 4 “TX Antenna VSWR Limits Exceeded”. This fault means that the VSWR (voltage standing wave ratio) on the transmit antenna or antenna feeder is too high. To keep the reflected power from burning up the transmitter or power amplifiers, the transmitter is shut down. The RXOCF will usually also show unaffected fault internal 2A 8. This fault is NOT remote-repairable. Even if it is a bogus alarm, blocking/deblocking the TX, TRX, CF, etc. will generally only remove the alarm for a short period of time, and it will return on the next periodic VSWR measurement.
Cut a ticket to the field with RXASP and RXMFP printout and explanation that TX has high VSWR. The radio is out of service.
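The two VSWR thresholds quoted from the fault description can be expressed as a small check. This is only a sketch to make the class 2 / class 1 relationship concrete: the function name is hypothetical, the default limits are the documented defaults (1.8 and 2.2), and the actual limits at a site are whatever is defined in its IDB.

```python
def vswr_faults(vswr: float, class2_limit: float = 1.8,
                class1_limit: float = 2.2) -> list[str]:
    """Return the fault codes expected for a measured VSWR value."""
    faults = []
    if vswr > class2_limit:
        faults.append("SO CF I2A:8")   # unaffected fault on the CF
    if vswr > class1_limit:
        faults.append("AO TX I1B:4")   # TX is taken out of service
    return faults

for value in (1.5, 2.0, 2.5):
    print(value, vswr_faults(value))
```

This matches what the printouts showed: the TX carries the class 1 fault (1B 4) while the CF shows the class 2 fault (2A 8) at the same time.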
Slide 36
CELL LOGICAL CHANNEL AVAILABILITY SUPERVISION CELL LOGICAL CHANNEL AVAILABILITY SUPERVISION CELL=SD0379Y; CHTYPE=TCH
• Generated to inform you that a number of available channels have fallen below defined limits. • Usually accompanied or correlated with additional alarms for the MO’s that have caused the channel failures. • Can present as a Critical or Major in Netcool. • Can also be caused by a down T1.
Slide 37
Recommended Troubleshooting Steps 1. Right-click on the alarm in Netcool and choose the “View Alarms At This Location” option to see if there are secondary or correlated faults that may be causing the alarm. 1A. If there are no secondary or correlated alarms, search CTS on the site for previously opened tickets.
Usually these are accompanied by secondary and/or correlated alarms. These can come in as Major or Critical alarms. If you do find a secondary alarm, no matter the severity, troubleshoot that alarm as instructed before moving on to the next step. If there are no other issues at the site then we will proceed to find the correct TG for the site in the alarm.
Slide 38
2. Find the correct TG correlated with the alarm. CELL LOGICAL CHANNEL AVAILABILITY SUPERVISION CELL=SD0379Y; CHTYPE=TCH
Connecting to SDCAB09... (Use 'quit' to logoff)
CELL SD0379Y SD0379Y
CHGR 0 1
END
From looking at the alarm we do not have the TG number to status the alarm situation. However, we can see the Cell ID in the alarm. From that we can get the TG number. By looking at the alarm we see in the text that the alarm is on sd0379 on sector ―Y‖. We can query the cell with the site and sector information. To find the correct TG to a cell in these alarms you must first run the RXTCP:CELL=SD0379Y,MOTY=RXOTG; command. This will give you the TG ID from the cell ID and the sector. We can now see from the printout that the correct TG to status is 91 and status the TG with the RXASP command.
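Reading the site and sector out of the cell ID can be sketched as follows. The helper name is hypothetical, and it assumes the convention seen in this alarm, where the cell ID is the site ID followed by a single sector letter.

```python
def split_cell_id(cell: str) -> tuple[str, str]:
    """Split a cell ID such as SD0379Y into (site, sector)."""
    cell = cell.strip().upper()
    if len(cell) < 2 or not cell[-1].isalpha():
        raise ValueError(f"unexpected cell ID: {cell!r}")
    return cell[:-1], cell[-1]

site, sector = split_cell_id("sd0379y")
print(site, sector)
# Command used above to find the TG for this cell:
print(f"RXTCP:CELL={site}{sector},MOTY=RXOTG;")
```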
Slide 39
3. Status the Alarm Situation on the TG.
RSITE   ALARM SITUATION
SD0379
SD0379  BTS INT UNAFFECTED
SD0379  BTS INT AFFECTED
SD0379  BTS INT AFFECTED
END
We can now see from the RXASP printout that there are internal faults on this TG in 2 of the radios. They are on RXOTX-91-4 and 91-5. From here we can check the status of the sectors and get our fault codes for the problem, basically working this issue like any other internal fault problem. You will need to run the RXMFP:MO=RXOTG-91,SUBORD,FAULTY; command to get the fault codes for these devices. We will run this command next.
Slide 40
4. Run RXMFP command to get the fault codes.
BTSSWVER ERA-G04-R08-V01
RU RUREVISION 0 BOE 602 17/1
R1B/A
RUPOSITION C:0 R:C SH: 6 SL: 10 RU RUREVISION 1 BFL 119 437/1
RULOGICALID DX DXU_22 0
R1E
RUPOSITION C:1 R:C SH: 7 SL: 0 RU RUREVISION 2 SEB 112 1147/1
RUSERIALNO TU86267283
RUSERIALNO TR42305300 RULOGICALID CD CDU_L8 0
R2A
RUPOSITION C:0 R:C SH: 9 SL:---
RUSERIALNO BK41017805 RULOGICALID CABI 2250 TRX 0
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON OPER 00000 FAULT CODES CLASS 2A 9 REPLACEMENT UNITS 59
Slide 41
4. Run RXMFP command to get the fault codes. Cont. MO BTSSWVER RXOTX-91-4 ERA-G04-R08-V01 RU RUREVISION 0 RUPOSITION
RUSERIALNO RULOGICALID
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON NOOP BLO 00000 FAULT CODES CLASS 1B 2 MO BTSSWVER RXOTX-91-5 ERA-G04-R08-V01 RU RUREVISION 0 RUPOSITION
RUSERIALNO
RULOGICALID
STATE BLSTATE INTERCNT CONCNT CONERRCNT LASTFLT LFREASON NOOP BLO 00000 FAULT CODES CLASS 1B 2 END <
We already know from the RXASP printout that the faults are internal on RXOTX-91-4 and 91-5. From here we can see that the fault codes on RXOTX-91-4 and 91-5 are fault code class 1B2. We can now refer to our documentation to see what this fault is and if it can be cleared remotely or if it will need a ticket to be dispatched to the field.
Slide 42
5. Reference our documentation to find the proper fault code procedure.
Fault No. AO TX I1B:2
Fault name: CDU output power limits exceeded
Related fault: SO CF I2A:9 – Power limits exceeded
Description: When TX power at CDU output is 7 dB lower than expected, fault SO CF I2A:9 arises. When the difference is 10 dB, fault AO TX I1B:2 arises.
Possible reasons: There is probably a fault on the TX path. Other reason: TX high temperature or saturation (see AO TX I1B:12 and AO TX I1B:14).
Action: Try the following actions until the fault is corrected: Check all TX cables, both inside and outside cabinet. Check the CDU - CDU Pfwd/Prefl cables. Check the RU logs to see which TRU is emitting the fault. Switch positions between TRUs/CDUs to find out if it is the units or the RF cables that are faulty. Reinstall the IDB.
From our documentation we can see that this is a CDU output power limits exceeded fault. As the procedure indicates, there is no remote action that can fix this issue. At this point you would create a ticket in CTS and send it to the field with the RXASP and RXMFP printouts and a statement describing the problem. It is important to note that the RXASP printout will not always display active alarms, but there could still be resources down in the sector. To check this we would run the RXMSP command for this TG.
Slide 43
6. If needed, Status the resources in the sector.
BLO BLA LMO 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0002 0014 0140 0000 0000 0000 0000 0000 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0000 0000 0840 0002 0014 0140
BTS CONF STA STA DIS CONF DIS CONF STA ENA ENA DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF STA ENA ENA DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF DIS UNCONF
END
Again, if there are no alarms displayed by the RXASP command, we check the sectors to see if the resources are available. In this case we can see the sector is OOS and needs attention. Often, if a secondary DIP goes down, the Cell Logical Channel Availability alarm will present itself with no OML fault or other alarms in Netcool or in the RXASP printout.
Slide 44
MO FLT= PERMANENT FAULT CORRELATED LIKE: MO FLT= PERMANENT FAULT RXOTRX-32-9 RXOTRX-32-8
• Can Present in Netcool as Critical or Major • Can come in on the CF, TRX, TX, or RX • Arises on an MO when the MO has attempted self-recovery from a transient fault several times and has failed. • May be related to a faulty T1.
Slide 45
Recommended Troubleshooting Steps
1. Log into the appropriate BSC and verify the alarm by running the RXASP command on the parent RXOTG.
RSITE TX0060 TX0060 TX0060
ALARM SITUATION PERMANENT FAULT PERMANENT FAULT
END
From the RXASP printout we can see that the permanent fault is on RXOTRX-32-8 and RXOTRX-32-9. If desired, we can status the sector with the RXMSP command to confirm the radio's status. You can always attempt to restore the radio (or radio function, depending on the MO: TX or RX) by blocking the TRX, testing it, and then unblocking it. This gives the radio another chance to load and restore. It may restore for 10 or 15 minutes and then go back into fault. If this happens, send a ticket to the field explaining that you attempted unsuccessfully to restore it. This is the case more often than not, and it will require either hardware replacement or a hard reset at the site. We will now go over the procedure to restore the TRX. NOTE: If the permanent faults are accompanied by OML faults in this printout, investigate the issue as a bad T1.
Slide 46
2. Block the TRX and subordinate devices.
To attempt to restore the TRX you must first block it, along with all of its subordinate devices. Use the RXBLI:MO=RXOTRX-32-8,SUBORD; command to accomplish this. The SUBORD parameter at the end of the command makes sure that all devices within the TRX are blocked. When you run this command you will notice that the switch capitalizes the command before any other output is given. It is asking you to confirm your request to block. Enter a ; and press return to confirm. It will then give you the printout you see here confirming that the TRX is blocked. The next step is to test the TRX with the RXTEI command.
Slide 47
3. Test the TRX with the RXTEI command.
RESULT ORDERED
END
Run the RXTEI command to test the TRX. This usually takes a minute or two to run, and then it gives you the result: either a pass or a fail.
Slide 48
3. Test the TRX with the RXTEI command. Cont. RADIO X-CEIVER ADMINISTRATION TEST OF MANAGED OBJECT RESULT MO RXOTRX-32-8
RESULT LOADING FAILED
BTSSWVER ERA-G04-R08-V01
END
RADIO X-CEIVER ADMINISTRATION TEST OF MANAGED OBJECT RESULT MO RXOTRX-32-8
RESULT TEST WAS PERFORMED
BTSSWVER ERA-G04-R08-V01
NO FAULT INDICATIONS END
When the test is complete it will print out one of these two responses. From these we can see the pass or fail indications. If the test fails, the next step is to create a CTS ticket with the RXASP and test result printouts and send it to the field. If it passes, wait a couple of minutes, then status the sector with the RXMSP command and status the TG with the RXASP command to ensure that the sector has restored. It is important to note that the alarm in Netcool may clear after blocking and unblocking even though the problem has not gone away. Also, if the test passes and the sector clears, there is a good chance that the alarm and the alarm condition will return in 15 or 20 minutes. In that case, create a ticket and send it to the field with the printouts and all troubleshooting steps taken.
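The pass/fail reading of the two RXTEI result printouts can be sketched as a simple text check. This is a minimal illustration that matches only the strings visible in the printouts above; the function names are hypothetical and other result texts may exist that the slides do not show.

```python
def trx_test_passed(printout: str) -> bool:
    """True when an RXTEI result printout shows a clean test."""
    # Collapse whitespace so terminal column wrapping does not matter.
    text = " ".join(printout.upper().split())
    return "TEST WAS PERFORMED" in text and "NO FAULT INDICATIONS" in text


def next_step(printout: str) -> str:
    """Map the test result to the follow-up described in the notes."""
    if trx_test_passed(printout):
        return "unblock, wait a couple of minutes, then recheck with RXMSP and RXASP"
    return "create CTS ticket with RXASP and test result printouts"
```

Anything that is not a clean pass, including the LOADING FAILED result, is treated here as a fail and routed to a ticket.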
Slide 49
EXTERNAL ALARMS; see Additional Info EXTERNAL ALARMS; see Additional Info RECTIFIER 24V+ MINOR;RECT -48V MINO
• Can present in Netcool as Critical or Major. • Can cover a wide range of situations, many of which don't affect performance. • In general, these faults are not remotely repairable and will just need to be ticketed. • Alarm fault information will be in the "Additional Information" field.
Typically, these are just "Rip and Ship" alarms. They are usually on equipment either outside of the cabinet (e.g., generators, tower lights and amps, antennas, etc.) or separate from the BTS equipment (rectifiers, fuse panels, door alarms, etc.). They can be viewed in a printout with the ALLIP:ALCAT=EXT; command. NOTE: Although they are not considered External faults, temperature alarms (High Temp or Low Temp) can be treated much the same way. Since there is nothing we can do remotely for them, they just need to be ticketed and sent straight out to the field. They can also be viewed in the ALLIP printouts.
Slide 50
Recommended Troubleshooting Steps Connecting to SDCAB08... (Use 'quit' to logoff)
RSITE SD0366
CLASS 2
EXTERNAL ALARM RECTIFIER 24V+ MINOR RECT -48V MINOR A2/EXT "SDCAB08_L000403" 350 071212 1619 RADIO X-CEIVER ADMINISTRATION BTS EXTERNAL FAULT MO RXOCF-68
RSITE SD0339
CLASS 2
EXTERNAL ALARM RECTIFIER 24V+ MINOR END
Run the ALLIP:ALCAT=EXT; command to view all the external alarms in the BSC. You will have to scroll through to find the site for which the Netcool alarm was generated. Since we already know the fault from the Additional Information in Netcool, we can match it in the alarm printout. Since these alarms cannot be fixed or restored remotely, you would now create a CTS ticket, paste the ALLIP information into the ticket, and send it to the field tech. If the alarm is not present in the external alarm list, it can also be viewed in the A2 (major) alarm list by running ALLIP:ACL=A2;. This gives you all the major alarms on the BSC and will probably be a much longer list.
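Scrolling a long ALLIP printout for one site can be sketched as a text filter over a saved copy of the printout. This is a rough illustration under a stated assumption: each alarm is taken to start on a line beginning with a header such as A2/EXT, as in the printout above. The function name is hypothetical and real printouts may be laid out differently.

```python
import re
from typing import List

# Assumption: every alarm block starts with a class/category header, e.g. A2/EXT.
HEADER = re.compile(r"^A\d/\w+ ", re.MULTILINE)


def alarms_for_site(printout: str, site: str) -> List[str]:
    """Return the alarm blocks of a saved ALLIP printout that mention a site."""
    starts = [m.start() for m in HEADER.finditer(printout)]
    ends = starts[1:] + [len(printout)]
    blocks = [printout[a:b].strip() for a, b in zip(starts, ends)]
    # Match the site as a whole token so SD0366 does not match SD03660.
    return [block for block in blocks if site in block.split()]
```

The matching blocks can then be pasted into the CTS ticket as described above.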
Slide 51
PWR COMMERCIAL;RECT 24V MAINS;BATTERY CORRELATED LIKE: MO FLT= MAINS FAILURE 2 alarms: RXOCF-181 RXOCF-180
• Can present as Critical or Major in Netcool • Can have several different text formats in Netcool • May present as Internal or external fault in Netcool • Indicates a loss of commercial power to the site
Slide 52
Recommended Troubleshooting Steps 1. Run the RXASP:MO=RXOTG-181&-180;
RSITE LA3109 LA3109 LA3109 LA3109
ALARM SITUATION MAINS FAILURE MAINS FAILURE
END
There is no remote action to recover from Mains Fail or commercial power alarms. These require a CTS ticket to be dispatched to the field. Typically, this is done after waiting about 20 minutes from the alarm presentation time. The waiting period is to ensure that the power does not come back shortly after the alarm presents itself, which is common in regions that are experiencing extreme weather events. The alarm printout can be viewed by running the RXASP command on the affected CF as shown in this slide. Paste this printout in the CTS ticket to send to the field tech.
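The 20-minute hold before dispatch can be sketched as a timestamp comparison. This is only an illustration; the function name is hypothetical, and the hold time is taken from the "about 20 minutes" guideline above rather than from any system setting.

```python
from datetime import datetime, timedelta

# "About 20 minutes" per the guideline above; adjust to local practice.
HOLD_TIME = timedelta(minutes=20)


def ready_to_dispatch(alarm_raised: datetime, now: datetime,
                      hold: timedelta = HOLD_TIME) -> bool:
    """True once the mains alarm has been standing for the full hold time."""
    return now - alarm_raised >= hold
```

If the alarm clears on its own during the hold, no ticket is needed, which is the point of waiting.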
Slide 53
2. Check the ALLIP Printout A2/APT "ANCAB11_L000403" 719 071201 1203 RADIO X-CEIVER ADMINISTRATION MANAGED OBJECT FAULT MO RXOCF-181
RSITE ALARM SLOGAN LA3109 MAINS FAILURE
A2/APT "ANCAB11_L000403" 720 071201 1204 RADIO X-CEIVER ADMINISTRATION MANAGED OBJECT FAULT MO RXOCF-180
RSITE ALARM SLOGAN LA3109 MAINS FAILURE
END
It is possible that the alarm condition will not appear in the RXASP printout. This is especially true of alarms that have "Commercial Power Failure" in the text. In this case you may need to view the ALLIP printout. Depending on the configuration, the alarm can appear under ALLIP:ALCAT=POWER;, ALLIP:ALCAT=EXT;, ALLIP:ACL=A1;, or ALLIP:ACL=A2;.
Slide 54
Summary: CORRELATED LIKE: MO FLT= TS SYNC FAULT • Single or multiple timeslots on one or several TRXs are OOS • Often a side effect of a down or faulty T1 • Can be caused by congestion • Can be restored individually if it is not a T1 issue
Slide 55
Recommended Troubleshooting Steps 1.
Retrieve alarm printout from parent RXOTG to determine which RXOTS MO(s) are affected
RSITE LAC312 LAC312
ALARM SITUATION TS SYNC FAULT
END
RXOTS (timeslot) MOs can generate TS SYNC FAULT alarms when the span takes slips or the path between the transcoder and the TRX is disturbed (e.g., congestion in the subrate switch). Generally, the RXOTS MOs will show OPER, but the corresponding TCHs will show BLOC. To clear the TS faults, you must block, loop test, and deblock the affected RXOTS MO(s). It is also helpful to check the DIP for errors to find the origin of the problem. Here we see that RXOTS-25-3-2 has faulted. If the RXOIS is in alarm PERMANENT FAULT, proceed directly to the document "RXOIS: PERMANENT FAULT". A faulty RXOIS can cause RXOTS problems.
In general, if a site only has 2 or 3 TS in fault, you should be able to clear them. If all the TS under a TRX are down or multiple (more than 7) are down on the site, additional troubleshooting will be required.
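The rule of thumb above can be sketched as a count over the faulted RXOTS MOs from one site. This is a minimal illustration: the function name is hypothetical, the MO names are assumed to follow the RXOTS-TG-TRX-TS pattern seen in the printouts, and a TRX is taken to carry eight timeslots (0-7). Counts between 4 and 7 are not addressed by the guideline and are treated here as clearable.

```python
from collections import Counter
from typing import Iterable

TIMESLOTS_PER_TRX = 8   # a GSM TRX carries timeslots 0-7
SITE_LIMIT = 7          # "more than 7" down on the site


def needs_additional_troubleshooting(faulted_ts: Iterable[str]) -> bool:
    """faulted_ts holds MO names such as 'RXOTS-25-3-2' from a single site."""
    per_trx = Counter()
    unique = set(faulted_ts)
    for mo in unique:
        _, tg, trx, _ts = mo.split("-")
        per_trx[(tg, trx)] += 1
    whole_trx_down = any(n >= TIMESLOTS_PER_TRX for n in per_trx.values())
    return whole_trx_down or len(unique) > SITE_LIMIT
```

A single faulted timeslot such as RXOTS-25-3-2 returns False, meaning the block, test, and unblock sequence is worth trying first.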
Slide 56
2. Block the affected RXOTS MO(s)
STATE RESULT
RXOTS-25-3-2
COM
EXECUTED
You must block the affected timeslots prior to testing them.
Slide 57
3. Test the affected RXOTS MO
RESULT ORDERED
END < COMMAND SESSION SUSPENDED RADIO X-CEIVER ADMINISTRATION TEST OF MANAGED OBJECT RESULT MO RXOTS-25-3-2
RESULT TEST WAS PERFORMED
BTSSWVER ERA-G04-R08-V01
NO FAULT INDICATIONS END
Test the affected RXOTS MO with the RXTEI command. After a few seconds, a result printout should appear with "NO FAULT INDICATIONS", as seen here.
Slide 58
4. Run a Loop-Test on the affected RXOTS MO RECONN EXECUTED
RESULT TEST SUCCESSFUL
TRCATERDEV
BSCATERDEV
ABISDEV RBLT3-616
END
Run a loop test on the affected RXOTS MO. This tests continuity between the transcoder and the TRX/TS. The result should be "TEST SUCCESSFUL". Then unblock the TS and check its status. If the result comes back as loop test failed, try the test one more time; sometimes it will pass on the second attempt. If it fails again, unblock the TS and then cut a ticket to the field for a possible hard reset or hardware replacement.
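The retry guidance above can be sketched as a short loop. This is a minimal illustration: run_loop_test is a hypothetical callable standing in for however the loop test is actually issued, and it is assumed to return the RESULT text from the printout.

```python
from typing import Callable


def loop_test_with_retry(run_loop_test: Callable[[], str],
                         max_attempts: int = 2) -> str:
    """Run the loop test, retrying once on failure as the notes suggest."""
    for _ in range(max_attempts):
        # run_loop_test is a hypothetical stand-in, not a real API.
        if run_loop_test().strip().upper() == "TEST SUCCESSFUL":
            return "unblock TS and check status"
    return "unblock TS and cut ticket to field"
```

Either way the TS is unblocked at the end; the only difference is whether a field ticket follows.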
Slide 59
5. Unblock the affected RXOTS
STATE RESULT PREOP EXECUTED
Unblock the affected TS. It will show ORDERED, then after a few seconds, EXECUTED.
Slide 60
6. Status the affected RXOTS
STATE BLSTATE OPER
BLO BLA LMO BTS CONF 0000 0000 0000 ENA ENA
Wait a few seconds after unblocking so the TS can reset, and then status the TS. It should read operational (OPER). If the TS does not come back after a few attempts to status it and all the DIPs on the CF are clean, create a TT with all of your troubleshooting steps and printouts and send it to the field.
Slide 61
CELL LOGICAL CHANNELS SEIZURE SUPERVISION CELL LOGICAL CHANNELS SEIZURE SUPERVISION
• BSC threshold alarm for busy or blocked traffic channels • Can be due to site outages, congestion, hardware failure • Will need to be cleared manually • Does not require a CTS ticket
A threshold alarm from the BSC, triggered when a certain number of traffic channels are continuously busy or unavailable. These are usually due to secondary faults or conditions at the equipment or site they originate from: T1 failure, radio issues, etc. These are not alarm conditions within the BSC; rather, they are informational alarms telling you that a certain number of traffic channels are down. There is no remote troubleshooting remedy for this condition. The condition can be cleared manually by clearing the traffic channel alarm data in the BSC.
Slide 62
Recommended Troubleshooting Steps 1. Retrieve the traffic channel seizure information.
Use the RLVAP; command to view the counter for the traffic channels in alarm. From the printout we can see which site and sector they belong to. In this case SF0323Z. Again these are usually from a secondary condition which has most likely alarmed at the cell site and equipment level. The next step is to clear the channel data from the counter.
Slide 63
2. Clear the Channel Data
To clear the alarm you must clear the counters for the channels. Use the RLVAP:CHTYPE=TCH; command. You can check your work, if you like, by running the RLVAP command again to see whether the counters have cleared, as shown here. The alarm should clear from Netcool after a few minutes.
Slide 64
CP AP COMMUNICATION FAULT CP AP COMMUNICATION FAULT
• AP/CP can occur for a wide range of issues • Can present in Netcool as Critical or Major • Can sometimes require reporting to Outage Management for a SIR • Should always be ticketed after verification
AP/CP faults can come in for a wide range of issues, as both Critical and Major. They can be verified by running ALLIP for both A1 and A2. The severity of these alarms is not always represented well by Netcool: an alarm that presents as a Major in Netcool may still be a critical issue. CP issues tend to be more critical and less common than AP faults.
Slide 65
Recommended Troubleshooting Steps 1. Verify the alarm using ALLIP on A1 and A2
NODE B
NODENAME DNCOB08APGB
RESOURCE GROUP PROCESS LBB clusSvc CAUSE DATE TIME Process death 20071211 102325 A1/APZ "DNCOB08_L000403" 396 071211 1215 AP SYSTEM ANALYSIS AP APNAME 1 DNCOB08
NODE B
OBJECT COUNTER LogicalDisk % Free Space
NODENAME DNCOB08APGB INSTANCE C:
LIMIT <6
VALUE 2.45
A1/APZ "DNCOB08_L000403" 397 071211 1218 AP PROCESS STOPPED AP APNAME 1 DNCOB08
NODE NODENAME B DNCOB08APGB
RESOURCE GROUP PROCESS LBB ACS_PRC_ClusterControl CAUSE DATE TIME Process death 20071211 121850 END
From the ALLIP A1 printout we can see that the AP process has stopped. This requires an immediate ticket to the Switch and a call as well. Also, report it to the Outage Management team. If the AP or CP has gone single sided, as in this case, they will need to send a SIR notification for lack of redundancy. Also, check the ALLIP A2 printout, as some of the alarms will only present as Majors even though they are critical.
Slide 66
2. Check ALLIP A2 A2/APZ "DNCOB08_L000403" 859 071213 1105 CP AP COMMUNICATION FAULT FAULT NETWORK FAULT DEV DEVIP NETWORKIP REMOTEIP OCITS-1 192.168.170.128 192.168.170.0 192.168.170.2 RP 0
EM 0
RPTYPE
TYPE
A2/APZ "DNCOB08_L000403" 860 071213 1105 CP AP COMMUNICATION FAULT FAULT NETWORK FAULT DEV DEVIP NETWORKIP REMOTEIP OCITS-0 192.168.169.128 192.168.169.0 192.168.169.2 RP 0
EM 0
RPTYPE
TYPE
A2/APZ "DNCOB08_L000403" 878 071213 1114 AP NOT REDUNDANT AP APNAME 1 DNCOB08
NODE A
NODENAME DNCOB08APGA
NODE NOT AVAILABLE DNCOB08APGB CAUSE DATE TIME Node is down 20071213 111438 END
From here we can see that CP AP communications are down, that the AP is not redundant, and that node DNCOB08APGB is down. Paste these in the ticket along with the A1 printout to send to the Switch. It is important to note that not all AP alarms are this serious. Most are Majors with process errors or faults that can just be sent to the Switch without a callout. However, anytime you lose redundancy with either the AP or CP, a callout will need to be made and the Outage Management team advised.
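The escalation rule above can be sketched as a check on the ALLIP text. This is only an illustration: the function name is hypothetical, and it looks for the AP NOT REDUNDANT slogan seen in the A2 printout. The slides do not show the matching slogan for a CP that has lost redundancy, so that case is not covered.

```python
from typing import List


def apcp_actions(allip_printout: str) -> List[str]:
    """List follow-up actions for an AP/CP alarm based on ALLIP text."""
    # Collapse whitespace so wrapped printout columns do not break matching.
    text = " ".join(allip_printout.upper().split())
    actions = ["ticket to Switch with A1 and A2 printouts"]
    # Only the AP slogan shown in the slides is matched; CP slogan unknown.
    if "AP NOT REDUNDANT" in text:
        actions.append("call out the Switch")
        actions.append("advise Outage Management")
    return actions
```

Alarms without a redundancy loss fall through to a ticket only, matching the "sent to the switch without callout" case.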
Slide 67
MO FLT= LOOP TEST FAILED MO FLT= LOOP TEST FAILED
• Occurs when a manual or automatic loop test has been performed while the DIP takes errors, or the path between the transcoder and the TRX is disturbed • Presents in Netcool as a Major • Can sometimes be restored if it is not a DIP issue
Slide 68
Recommended Troubleshooting Steps 1. Retrieve alarm printout from parent RXOTG
RADIO X-CEIVER ADMINISTRATION MANAGED OBJECT ALARM SITUATIONS MO RXOTG-63 RXOTS-63-0-0
RSITE DLLSTXE049 DLLSTXE049
ALARM SITUATION LOOP TEST FAILED
END
As we can see from the printout, RXOTS-63-0-0 is affected. If the RXOIS is in alarm PERMANENT FAULT, proceed directly to the document "RXOIS: PERMANENT FAULT". A faulty RXOIS can cause RXOTS problems. Also, always check the DIP for errors, since this is likely caused by a faulty DIP. If the DIP shows errors, create a ticket and treat this like a regular DIP issue. If the DIPs are clean, you can try to block the affected TS, run the RXTEI test, run the loop test, and then unblock the TS as we did with the TS SYNC faults earlier.
Slide 69
2. Block the TS and test RXBLI:MO=RXOTS-63-0-0; RADIO X-CEIVER ADMINISTRATION MANUAL BLOCKING OF MANAGED OBJECT COMMAND RESULT MO STATE RESULT RXOTS-63-0-0 COM EXECUTED END
BSCATERDEV ABISDEV RBLT24-1519
Block the affected RXOTS MO(s). Test the affected RXOTS MO(s) (not required, but recommended). After a few seconds, a result printout should appear with "NO FAULT INDICATIONS". Run a loop test on the affected RXOTS MO(s). This tests continuity between the transcoder and the TRX/TS. The result should be "TEST SUCCESSFUL". If RESULT = "ABIS PATH UNAVAILABLE", you may have a faulty DIP, or the DIP resources have not been assigned correctly. If RESULT = "TEST SUCCESSFUL", the fault may have just been transient. Proceed with unblocking.
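The two loop test results named above can be sketched as a lookup table. This is a minimal illustration with a hypothetical function name; it covers only the RESULT strings mentioned in these notes and returns None for anything else, since no guidance is given here for other results.

```python
from typing import Optional

# Only the RESULT values named in the notes; others are left unmapped.
LOOP_TEST_GUIDANCE = {
    "TEST SUCCESSFUL": "fault may have been transient; proceed with unblocking",
    "ABIS PATH UNAVAILABLE": "suspect a faulty DIP or incorrectly assigned DIP resources",
}


def interpret_loop_test(result: str) -> Optional[str]:
    """Return the guidance for a loop test RESULT value, or None if unknown."""
    key = " ".join(result.upper().split())
    return LOOP_TEST_GUIDANCE.get(key)
```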
Slide 70
3. Unblock the TS RXBLE:MO=RXOTS-63-0-0; RADIO X-CEIVER ADMINISTRATION MANUAL DEBLOCKING OF MANAGED OBJECT COMMAND RESULT MO RXOTS-63-0-0
STATE RESULT PREOP ORDERED
END
< RADIO X-CEIVER ADMINISTRATION MANUAL DEBLOCKING OF MANAGED OBJECT RESULT MO RXOTS-63-0-0
STATE RESULT OPER EXECUTED
END
Unblock the affected RXOTS MO(s). It will show ORDERED, then after a few seconds, EXECUTED if you are successful. Otherwise it will just return a NOOP result. At that point you would have to create a ticket and send it to the field.
Slide 71
LOCAL MODE & OPERATOR CONDITION CORRELATED LIKE: MO FLT= LOCAL MODE 1 alarms: RXOCF-2 CORRELATED LIKE: MO FLT= OPERATOR CONDITION 1 alarms: RXOCF-54
A2/APT "BDEC02E_D000100" 157 020805 0036 RADIO X-CEIVER ADMINISTRATION MANAGED OBJECT FAULT MO RXOTRX-2-8
RSITE DNVRCO1250
ALARM SLOGAN LOCAL MODE
A2/APT "BSC02_E000O0056" 827 020318 1232 RADIO X-CEIVER ADMINISTRATION MANAGED OBJECT FAULT MO RXOCF-63
RSITE DLLSTXL009
ALARM SLOGAN OPERATOR CONDITION
LOCAL MODE: The RXOCF is currently in a maintenance mode, normally set locally by a tech at the site. 1) If a tech has reported for maintenance on the site or there is a ticket cut against this device or other devices in the site, notate the alarm as LOCAL MODE. 2) If no tech has called in and no relevant ticket exists against the site, cut a ticket and note this information. There are times that devices go into local mode due to problems.
OPERATOR CONDITION: The RXOCF is currently reporting that the cabinet door is open, hopefully by a tech at the site.
1) If a tech has reported for maintenance on the site or there is a ticket cut against this device or other devices in the site, notate the alarm as OPERATOR CONDITION. 2) If no tech has called in and no relevant ticket exists against the site, cut a minor ticket and note this information. There are times that the door alarm contacts fail, or the weather may be affecting the door seal. Note that the RXMFP printout on the RXOCF will show an External Fault 2B 9 (operator condition).
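The handling of both alarms follows the same two-way decision, which can be sketched as below. This is only an illustration with a hypothetical function name; the two boolean inputs stand for what the operator has already checked (a tech call-in and an existing CTS ticket).

```python
def handle_site_condition(slogan: str, tech_reported: bool,
                          ticket_exists: bool) -> str:
    """Decide between annotating the alarm and cutting a ticket.

    slogan is 'LOCAL MODE' or 'OPERATOR CONDITION'.
    """
    if tech_reported or ticket_exists:
        return "notate the alarm as " + slogan
    # Per the notes, an unexplained OPERATOR CONDITION gets a minor ticket.
    if slogan == "OPERATOR CONDITION":
        return "cut a minor ticket and note this information"
    return "cut a ticket and note this information"
```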