Nokia OSS-Based 5G KPI Troubleshooting

🔧 Field & LAB Proven  |  🧠 Interview-Ready  |  📊 OSS-Driven

Scenario 1: Sudden RRC Setup Success Rate Degradation

OSS Symptoms & Alarms:

PM Counters:
NRRC.ConnEstabSucc.Sum drops 30% in 2 hours

Example (Hourly):

  • 06:00–07:00 β†’ 12,500

  • 07:00–08:00 β†’ 12,100

  • 08:00–09:00 β†’ 8,400

  • 09:00–10:00 β†’ 8,200


Alarms:
N/A (no hardware alarms)


Correlation:
NNGAP.InitCtxtSetupFail.Sum increases, with cause
“radioNetwork-resource-not-available”

Example:

  • Normal hour β†’ 220

  • Degraded hour β†’ 1,150


Hourly Trend:
Degradation starts daily at 08:00


Expert Troubleshooting Steps:


Step 1: Root Cause Analysis via PM Counters

Check RRC failure breakdown

Counters analyzed (hourly):

  • NRRC.ConnEstabFail.Sum

    • Normal hour β†’ 1,050

    • Degraded hour β†’ 4,200

  • NRRC.ConnFail_Congestion.Sum

    • Normal hour β†’ 320

    • Degraded hour β†’ 3,150

  • NRRC.ConnFail_Radio.Sum

    • Normal hour β†’ 410

    • Degraded hour β†’ 450

  • NRRC.ConnFail_Terminal.Sum

    • Normal hour β†’ 320

    • Degraded hour β†’ 350

  • NPRACH.SuccTotal

    • Normal hour β†’ 9,800

    • Degraded hour β†’ 9,750

Observation:

  • Majority of RRC failures are from NRRC.ConnFail_Congestion.Sum

  • NRRC.ConnFail_Radio.Sum remains stable

  • NPRACH.SuccTotal remains stable

Conclusion:
RRC degradation is not due to radio conditions or PRACH failures.
Root cause points to control-plane congestion.
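The breakdown above can be automated when counters are exported from the OSS. A minimal sketch, assuming the hourly counters arrive as a plain dict keyed by the PM counter names listed above; the helper itself is illustrative, not a Nokia OSS API:

```python
# Illustrative sketch: find the dominant RRC setup failure cause from an
# hourly PM counter export. Counter names match the ones analyzed above;
# the dict layout and helper are assumptions, not a Nokia interface.

def dominant_failure_cause(counters):
    """Return (failure sub-counter with the largest share, share 0..1)."""
    total = counters["NRRC.ConnEstabFail.Sum"]
    causes = {k: v for k, v in counters.items()
              if k.startswith("NRRC.ConnFail_")}
    top = max(causes, key=causes.get)
    return top, causes[top] / total

# Degraded-hour values from the analysis above
degraded_hour = {
    "NRRC.ConnEstabFail.Sum": 4200,
    "NRRC.ConnFail_Congestion.Sum": 3150,
    "NRRC.ConnFail_Radio.Sum": 450,
    "NRRC.ConnFail_Terminal.Sum": 350,
}
cause, share = dominant_failure_cause(degraded_hour)  # congestion, 0.75
```

Here congestion carries 75% of failures, matching the conclusion above.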


Step 2: Resource Saturation Analysis

Analyze control plane resource utilization

Counters analyzed during 08:00–10:00:

  • NCCE.UtilDL.P95

    • Normal hour β†’ 68%

    • Degraded hour β†’ 92%

  • NPRACH.AttTotal

    • Normal hour β†’ 10,400

    • Degraded hour β†’ 14,800

  • NRRC.ConnRej.Sum

    • Normal hour β†’ 480

    • Degraded hour β†’ 2,900

  • NGAP.UECtxtRelReq.Sum

    • Normal hour β†’ 310

    • Degraded hour β†’ 1,780

Observation:

  • PDCCH CCE utilization crosses 90%

  • RRC rejections rise sharply

  • Core network context releases increase

Conclusion:
Control-plane resource saturation is confirmed during busy hours.
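The saturation check applied here reduces to a simple rule. A sketch with thresholds taken from the judgment above (90% CCE utilization, rejections at 3× the baseline); the function name and exact factors are illustrative assumptions:

```python
def control_plane_saturated(cce_util_p95, rrc_rej, rrc_rej_baseline,
                            util_threshold=0.90, rej_factor=3.0):
    """Flag control-plane saturation when PDCCH CCE P95 utilization
    crosses the threshold AND RRC rejections exceed rej_factor x baseline."""
    return (cce_util_p95 >= util_threshold
            and rrc_rej >= rej_factor * rrc_rej_baseline)

# Degraded hour (92% CCE, 2,900 rejects vs. 480 baseline) trips the rule;
# the normal hour (68% CCE) does not.
```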


Step 3: Parameter Configuration Audit

Check current mobility and access parameters

Parameters audited:

  • acBarringFactor

    • Current value β†’ 0.95

  • rrcConnectionRejectWaitTimer

    • Current value β†’ 1000 ms

  • maxConnectedUsers

    • Current value β†’ 200

  • prachConfigurationIndex

    • Current value β†’ 98

Audit window:
Last 7 days

Observation:

  • No parameter change detected

  • Values remained constant before degradation

Conclusion:
Issue is traffic-driven, not configuration-driven.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| acBarringFactor | 0.95 | 0.65 | Reduce access attempts during congestion |
| rrcConnectionRejectWaitTimer | 1000 ms | 500 ms | Faster retry for rejected UEs |
| maxConnectedUsers | 200 | 180 | Protect existing connections |
| prachFreqOffset | 0 | 12 | Spread RACH attempts across resources |
| ssbPerRACHOccasion | 8 | 16 | Better beam correspondence for initial access |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| RRC Setup Success Rate | 85.2% | 96.8% | +11.6% | Improved |
| RRC Reject Rate | 12.5% | 2.3% | −10.2% | Reduced |
| PDCCH CCE Utilization (P95) | 92% | 78% | −14% | Reduced |
| Average RRC Setup Time | 128 ms | 89 ms | −39 ms | Reduced |
| Initial Context Setup Failures | 8.2% | 1.1% | −7.1% | Reduced |

Final Technical Conclusion

The sudden RRC Setup Success Rate degradation is caused by control-plane congestion during predictable busy hours.
By optimizing access control, retry timing, and RACH distribution, signaling load is stabilized and RRC performance is restored without hardware expansion.

Scenario 2: High RLF Rate in Specific Beam Directions

OSS Symptoms & Alarms:

PM Counters:
NRLF.Detected.Sum spikes in beams 2, 5, 8

Example (Hourly):

  • Normal beams β†’ 120–180
  • Beam 2 β†’ 980
  • Beam 5 β†’ 1,120
  • Beam 8 β†’ 1,050

Alarms:
No RF alarms, but beam-specific failures observed

Correlation:
High NBFI.Count.Sum in same beams

Example:

  • Normal beams β†’ 90–140
  • Beam 2 β†’ 860
  • Beam 5 β†’ 940
  • Beam 8 β†’ 910

Pattern:
Occurs during specific hours 18:00–22:00


Expert Troubleshooting Steps:


Step 1: Beam Failure Pattern Analysis

Analyze beam failure patterns using the following counters:

  • NBFI.Count.Sum
  • NRLF.Detected.Sum
  • NRSRP.Beam.Avg
  • NRSRQ.Beam.Avg

Example observations (18:00–22:00):

Beam 2

  • NBFI.Count.Sum β†’ 860
  • NRLF.Detected.Sum β†’ 980
  • NRSRP.Beam.Avg β†’ βˆ’96 dBm
  • NRSRQ.Beam.Avg β†’ βˆ’15 dB

Beam 5

  • NBFI.Count.Sum β†’ 940
  • NRLF.Detected.Sum β†’ 1,120
  • NRSRP.Beam.Avg β†’ βˆ’97 dBm
  • NRSRQ.Beam.Avg β†’ βˆ’16 dB

Beam 8

  • NBFI.Count.Sum β†’ 910
  • NRLF.Detected.Sum β†’ 1,050
  • NRSRP.Beam.Avg β†’ βˆ’95 dBm
  • NRSRQ.Beam.Avg β†’ βˆ’15 dB

Observation:

  • RLF and beam failure instances spike only in specific beams
  • RSRP and RSRQ remain within acceptable range
  • Issue is beam-specific, not cell-wide

Conclusion:
High RLF is not caused by coverage loss but by beam instability.
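Beam-level outliers like beams 2, 5, and 8 can be flagged automatically by comparing each beam's RLF count against the cell median. A sketch assuming per-beam counts are available as a dict; the 3×-median factor is an illustrative choice:

```python
from statistics import median

def rlf_outlier_beams(rlf_by_beam, factor=3.0):
    """Beams whose RLF count exceeds factor x the median across all beams."""
    med = median(rlf_by_beam.values())
    return sorted(b for b, v in rlf_by_beam.items() if v > factor * med)

# Per-beam NRLF.Detected.Sum from the observations above (normal beams
# filled in with values in the reported 120-180 range).
counts = {0: 150, 1: 130, 2: 980, 3: 160, 4: 120,
          5: 1120, 6: 140, 7: 170, 8: 1050}
```

Applied to these counts, the rule isolates exactly beams 2, 5, and 8.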


Step 2: Beam Management Configuration Audit

Check beam management parameters for affected beams:

Parameters audited:

  • beamFailureRecoveryTimer
  • beamFailureInstanceMaxCount
  • beamReportingPeriodicity
  • ssbPeriodicity
  • csiRsDensity

Observed configuration (Pre-Optimization):

  • beamFailureRecoveryTimer β†’ 100 ms
  • beamFailureInstanceMaxCount β†’ 5
  • beamReportingPeriodicity β†’ 160 ms
  • ssbPeriodicity β†’ 20 ms
  • csiRsDensity β†’ one

Observation:

  • Recovery timer too high for fast beam dynamics
  • Beam reporting periodicity too slow
  • CSI-RS density insufficient for accurate beam tracking

Conclusion:
Beam management configuration is not optimized for high-mobility or interference-prone scenarios.


Step 3: Inter-beam Interference Analysis

Analyze inter-beam interference using:

  • serving_beam
  • interfering_beam
  • interference_events
  • interference_level_db

Example observations:

  • Serving beam 2 interfered by beam 7
    • interference_events β†’ 420
    • avg_interference β†’ βˆ’6 dB
  • Serving beam 5 interfered by beam 9
    • interference_events β†’ 510
    • avg_interference β†’ βˆ’5 dB
  • Serving beam 8 interfered by beam 3
    • interference_events β†’ 470
    • avg_interference β†’ βˆ’6 dB

Conclusion:
Significant inter-beam interference exists, leading to frequent beam failures and RLF.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| beamFailureRecoveryTimer | 100 ms | 50 ms | Faster beam recovery |
| beamFailureInstanceMaxCount | 5 | 3 | More sensitive beam failure detection |
| beamReportingPeriodicity | 160 ms | 80 ms | Faster beam reporting |
| ssbPeriodicity | 20 ms | 10 ms | More frequent beam sweeping |
| csiRsDensity | one | three | Denser CSI-RS for better beam management |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| Beam Failure Rate | 15.2% | 3.8% | −11.4% | Reduced |
| RLF due to Beam Failure | 8.5% | 1.2% | −7.3% | Reduced |
| Beam Switch Delay | 45 ms | 22 ms | −23 ms | Reduced |
| Beam Measurement Accuracy | 78% | 92% | +14% | Improved |
| User Throughput (affected beams) | 65 Mbps | 142 Mbps | +77 Mbps | Improved |

Final Technical Conclusion

The high RLF rate was caused by beam instability combined with inter-beam interference during peak hours.
By optimizing beam recovery timing, reporting periodicity, CSI-RS density, and sweeping frequency, beam robustness improved significantly, resulting in reduced RLF and enhanced user throughput.

Scenario 3: UL Throughput Degradation During Peak Hours

OSS Symptoms & Alarms:

PM Counters:
NTHP.UlMacCellVol drops 40% during 18:00–21:00

Example (Hourly UL Throughput):

  • 16:00–17:00 β†’ 330 Mbps
  • 17:00–18:00 β†’ 310 Mbps
  • 18:00–19:00 β†’ 190 Mbps
  • 19:00–20:00 β†’ 180 Mbps
  • 20:00–21:00 β†’ 195 Mbps

Alarms:
No hardware alarms


Correlation:
NULInterference.Avg rises while NPUSCH.PowerHeadroom.Avg turns negative

Example:

  • NULInterference.Avg
    • Normal hour β†’ βˆ’98 dBm
    • Peak hour β†’ βˆ’92 dBm
  • NPUSCH.PowerHeadroom.Avg
    • Normal hour β†’ 4.2 dB
    • Peak hour β†’ βˆ’2.5 dB

Pattern:
Coincides with UL interference increase during peak hours


Expert Troubleshooting Steps:


Step 1: UL Interference Analysis

Analyze UL interference patterns using the following counters:

  • NULInterference.Avg
  • NULInterference.Max
  • NTHP.UlMacCellVol
  • NPUSCH.TxPower.Avg
  • NBLER.UL.Avg

Example observations (hourly):

18:00–19:00

  • NULInterference.Avg β†’ βˆ’92 dBm
  • NULInterference.Max (P95) β†’ βˆ’88 dBm
  • NTHP.UlMacCellVol β†’ 190 Mbps
  • NPUSCH.TxPower.Avg β†’ 22.8 dBm
  • NBLER.UL.Avg β†’ 15.6%

19:00–20:00

  • NULInterference.Avg β†’ βˆ’91 dBm
  • NULInterference.Max (P95) β†’ βˆ’87 dBm
  • NTHP.UlMacCellVol β†’ 180 Mbps
  • NPUSCH.TxPower.Avg β†’ 23.1 dBm
  • NBLER.UL.Avg β†’ 16.2%

Observation:

  • UL interference increases significantly during peak hours
  • UL throughput drops sharply in the same period
  • UE transmit power approaches maximum
  • UL BLER increases beyond acceptable threshold

Conclusion:
UL throughput degradation is driven by high uplink interference.
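The three symptoms above (noise rise, negative power headroom, elevated BLER) can be combined into one interference-limited test. A sketch with illustrative thresholds derived from the figures above (−98 dBm quiet-hour floor, 5 dB rise, 10% BLER); the helper is an assumption, not an OSS feature:

```python
def ul_interference_limited(noise_dbm, phr_db, bler,
                            noise_floor_dbm=-98.0, rise_db=5.0,
                            bler_max=0.10):
    """True when the noise rise over the quiet-hour floor, a negative
    power headroom, and UL BLER together point to interference-limited UL."""
    noise_rise = noise_dbm - noise_floor_dbm
    return noise_rise >= rise_db and phr_db < 0 and bler > bler_max

# Peak hour: -92 dBm noise, -2.5 dB PHR, 15.6% BLER -> limited.
# Normal hour: -98 dBm, +4.2 dB PHR, ~5% BLER -> not limited.
```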


Step 2: Power Control Parameter Audit

Check UL power control configuration parameters:

Parameters audited:

  • p0NominalPUSCH
  • alpha
  • deltaMCS-Enabled
  • ulTargetBLER
  • srsPeriodicity
  • bsrTimer

Observed configuration (Pre-Optimization):

  • p0NominalPUSCH β†’ βˆ’76 dBm
  • alpha β†’ 0.8
  • deltaMCS-Enabled β†’ FALSE
  • ulTargetBLER β†’ 10%
  • srsPeriodicity β†’ 20 ms
  • bsrTimer β†’ 20 ms

Observation:

  • Target PUSCH power too low for interference conditions
  • Partial path loss compensation applied
  • MCS-based power adjustment disabled
  • UL BLER target too relaxed

Conclusion:
UL power control configuration is not optimized for high-interference peak hours.


Step 3: Scheduling Analysis for UL

Analyze UL scheduler behavior using:

  • NPRB.UtilUL.Avg
  • NPUSCH.Scheduled.UEs
  • NBSR.Received.Sum
  • NUL.Sched.Delay.Avg

Example observations (18:00–21:00):

  • NPRB.UtilUL.Avg β†’ 88%
  • NPUSCH.Scheduled.UEs β†’ 64
  • NBSR.Received.Sum β†’ 18,500
  • NUL.Sched.Delay.Avg β†’ 14.2 ms

Observation:

  • High UL PRB utilization
  • Increased scheduling delay
  • Large number of buffer status reports indicating backlog

Conclusion:
UL scheduler is stressed due to interference-driven retransmissions and power limitations.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| p0NominalPUSCH | −76 dBm | −70 dBm | Increase target power to overcome interference |
| alpha | 0.8 | 1.0 | Full path loss compensation |
| deltaMCS-Enabled | FALSE | TRUE | Enable MCS-based power adjustment |
| ulTargetBLER | 10% | 5% | More conservative MCS selection to cut retransmissions |
| srsPeriodicity | 20 ms | 40 ms | Reduce SRS overhead for more PUSCH |
| bsrTimer | 20 ms | 10 ms | Faster BSR reporting |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| UL Throughput (Peak Hour) | 125 Mbps | 320 Mbps | +195 Mbps | Improved |
| UL PRB Utilization | 88% | 75% | −13% | Reduced |
| UL BLER | 15.2% | 6.8% | −8.4% | Reduced |
| PUSCH Tx Power Headroom | −2.5 dB | 3.8 dB | +6.3 dB | Improved |
| UL Interference | −92 dBm | −98 dBm | −6 dB | Improved |

Final Technical Conclusion

The UL throughput degradation was driven by elevated uplink interference combined with conservative power control during peak hours.
After raising the PUSCH power target, enabling full path-loss compensation and MCS-based power adjustment, and tuning SRS and BSR timing, UL throughput recovered while BLER and interference both fell.

Scenario 4: Intra-Frequency Handover Failure Increase

OSS Symptoms & Alarms:

PM Counters:
Intra-frequency HO failure rate (from NHO.FailIntraFreq.Sum vs. HO attempts) increases from 2% to 12%

Example (Daily Average):

  • Normal period β†’ 2.1%
  • Degraded period β†’ 11.8%

Alarms:
No neighbor relation alarms


Correlation:
Failures concentrated in specific neighbor pairs

Example:

  • CELL_A β†’ CELL_B β†’ 420 failures
  • CELL_C β†’ CELL_D β†’ 390 failures
  • Other pairs β†’ < 50 failures

Pattern:
Affects cells with overlapping coverage areas


Expert Troubleshooting Steps:


Step 1: HO Failure Analysis by Neighbor Pair

Analyze HO failures between specific cell pairs using:

  • NHO.FailIntraFreq.Sum
  • ho_success
  • failure_cause = too-late
  • failure_cause = too-early
  • failure_cause = wrong-cell
  • rsrp_source
  • rsrp_target

Example observations (Top failing pairs):

CELL_A β†’ CELL_B

  • total_ho_attempts β†’ 3,200
  • successful_ho β†’ 2,760
  • too-late β†’ 290
  • too-early β†’ 95
  • wrong-cell β†’ 55
  • avg_source_rsrp β†’ βˆ’103 dBm
  • avg_target_rsrp β†’ βˆ’110 dBm

CELL_C β†’ CELL_D

  • total_ho_attempts β†’ 2,850
  • successful_ho β†’ 2,420
  • too-late β†’ 260
  • too-early β†’ 110
  • wrong-cell β†’ 60
  • avg_source_rsrp β†’ βˆ’104 dBm
  • avg_target_rsrp β†’ βˆ’111 dBm

Observation:

  • Majority of failures are too-late
  • Target cell RSRP is significantly weaker at HO execution
  • Failures are localized to overlapping coverage regions

Conclusion:
HO triggering occurs too late in overlapping coverage scenarios.
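The per-pair breakdown above reduces to a failure rate plus a dominant-cause pick. A sketch using the CELL_A → CELL_B figures; the helper name is hypothetical:

```python
def ho_failure_profile(attempts, success, too_late, too_early, wrong_cell):
    """Return (failure rate, dominant failure cause) for a neighbor pair."""
    causes = {"too-late": too_late, "too-early": too_early,
              "wrong-cell": wrong_cell}
    return (attempts - success) / attempts, max(causes, key=causes.get)

# CELL_A -> CELL_B figures from above: 3,200 attempts, 2,760 successes.
rate, cause = ho_failure_profile(3200, 2760, 290, 95, 55)
```

The 13.75% failure rate with "too-late" dominant matches the observation above.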


Step 2: Mobility Parameter Configuration Audit

Compare mobility parameters between problematic cells:

Parameters analyzed:

  • a3Offset
  • hysteresis
  • timeToTrigger
  • cellIndividualOffset
  • qOffsetCell

Example comparison (CELL_A vs CELL_B):

  • a3Offset β†’ CELL_A: dB3, CELL_B: dB2, difference: 1 dB
  • hysteresis β†’ CELL_A: dB2, CELL_B: dB1, difference: 1 dB
  • timeToTrigger β†’ CELL_A: 480 ms, CELL_B: 320 ms, difference: 160 ms
  • cellIndividualOffset β†’ CELL_A: 0 dB, CELL_B: +3 dB, difference: 3 dB
  • qOffsetCell β†’ CELL_A: 0 dB, CELL_B: 0 dB, difference: 0 dB

Observation:

  • HO triggering thresholds are misaligned between neighbor cells
  • Longer timeToTrigger delays HO execution
  • No bias applied toward the stronger target cell

Conclusion:
Mobility parameter mismatch is contributing to late HO execution.
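A parameter-mismatch audit like this one is essentially a dict diff between the two cells. A sketch using the CELL_A/CELL_B values above, with values kept as plain strings for illustration (real exports would carry typed values):

```python
def mobility_mismatches(cell_a, cell_b):
    """Parameters whose configured values differ between two neighbor cells."""
    return {p: (cell_a[p], cell_b[p])
            for p in cell_a if cell_a[p] != cell_b.get(p)}

cell_a = {"a3Offset": "dB3", "hysteresis": "dB2", "timeToTrigger": "480ms",
          "cellIndividualOffset": "0dB", "qOffsetCell": "0dB"}
cell_b = {"a3Offset": "dB2", "hysteresis": "dB1", "timeToTrigger": "320ms",
          "cellIndividualOffset": "+3dB", "qOffsetCell": "0dB"}
diff = mobility_mismatches(cell_a, cell_b)  # 4 mismatched parameters
```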


Step 3: Measurement Reporting Analysis

Analyze measurement report quality using:

  • mr_quality
  • mr_delay
  • ue_id

Example observations (last 6 hours):

CELL_A

  • median_mr_quality β†’ 62
  • avg_reporting_delay β†’ 145 ms
  • unique_reporting_ues β†’ 380

CELL_B

  • median_mr_quality β†’ 58
  • avg_reporting_delay β†’ 162 ms
  • unique_reporting_ues β†’ 360

Observation:

  • Measurement quality degrades near HO region
  • Reporting delay increases during mobility
  • Fewer UEs report timely measurements

Conclusion:
Delayed and filtered measurements worsen late HO behavior.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| a3Offset | dB3 | dB2 | Earlier handover trigger |
| hysteresis | dB2 | dB1 | Lower hysteresis for earlier HO triggering |
| timeToTriggerA3 | 480 ms | 320 ms | Faster reaction to changing conditions |
| cellIndividualOffset | 0 dB | +3 dB (for target) | Boost target cell attractiveness |
| filterCoefficientRSRP | fc4 | fc2 | Faster RSRP filtering |
| reportAmountA3 | infinity | 4 | Limit excessive reporting |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| Intra-Freq HO Success Rate | 87.5% | 98.2% | +10.7% | Improved |
| HO Failure (Too Late) | 6.2% | 0.8% | −5.4% | Reduced |
| HO Failure (Too Early) | 3.1% | 0.5% | −2.6% | Reduced |
| Ping-Pong HOs | 8.5% | 2.1% | −6.4% | Reduced |
| Average HO RSRP | −112 dBm | −105 dBm | +7 dB | Improved |

Final Technical Conclusion

The intra-frequency HO failure increase was caused by late HO triggering due to mobility parameter mismatch and delayed measurement reporting in overlapping coverage areas.
After aligning A3 thresholds, reducing filtering, and optimizing reporting behavior, HO performance improved significantly with reduced failures and ping-pong events.

Scenario 5: PDU Session Establishment Failures for URLLC Slice

OSS Symptoms & Alarms:

PM Counters:
NPDU.SessEstabFail.Sum for SNSSAI 010203 increases

Example (Hourly):

  • Normal hour β†’ 120
  • Degraded hour β†’ 1,450

Alarms:
Slice resource utilization alarms observed


Correlation:
Failures occur when NSlice.RB.Util.SNSSAI_010203 > 80%

Example:

  • NSlice.RB.Util.SNSSAI_010203 (normal) β†’ 65%
  • NSlice.RB.Util.SNSSAI_010203 (peak) β†’ 92%

Pattern:
Affects only URLLC slice (SNSSAI 010203)
eMBB slice (SNSSAI 010101) remains unaffected


Expert Troubleshooting Steps:


Step 1: Slice Resource Analysis

Analyze slice resource utilization and failures using the following counters:

  • NSlice.RB.Util.SNSSAI_010203
  • NSlice.UE.Count.SNSSAI_010203
  • NPDU.SessEstabAtt.SNSSAI_010203
  • NPDU.SessEstabFail.SNSSAI_010203
  • NPDU.SessEstabAtt.SNSSAI_010101
  • NPDU.SessEstabFail.SNSSAI_010101

Example observations (18:00–21:00):

URLLC Slice – SNSSAI 010203

  • NSlice.RB.Util.SNSSAI_010203 β†’ 92%
  • NSlice.UE.Count.SNSSAI_010203 β†’ 86
  • NPDU.SessEstabAtt.SNSSAI_010203 β†’ 2,350
  • NPDU.SessEstabFail.SNSSAI_010203 β†’ 1,450

eMBB Slice – SNSSAI 010101

  • NSlice.RB.Util.SNSSAI_010101 β†’ 58%
  • NPDU.SessEstabAtt.SNSSAI_010101 β†’ 3,200
  • NPDU.SessEstabFail.SNSSAI_010101 β†’ 95

Observation:

  • High PDU session failures observed only for URLLC slice
  • eMBB slice shows normal behavior
  • Failures strongly correlate with URLLC RB utilization crossing 80%

Conclusion:
PDU session failures are caused by URLLC slice resource exhaustion.
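The correlation used here (failures appear once slice RB utilization crosses 80%) can be written as a per-slice rule. A sketch with the 80% utilization and 5% failure-rate thresholds as illustrative values:

```python
def slice_overloaded(rb_util, attempts, failures,
                     util_threshold=0.80, fail_rate_max=0.05):
    """Flag a slice whose RB utilization exceeds the threshold while the
    PDU session failure rate is above fail_rate_max."""
    return rb_util > util_threshold and failures / attempts > fail_rate_max

# URLLC (92% util, 1,450/2,350 failed) trips the rule; eMBB does not.
```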


Step 2: QoS Policy Configuration Audit

Check URLLC slice QoS configuration using:

  • param_name
  • param_value
  • expected_value
  • compliance_status

Example audit results (SNSSAI 010203):

  • guaranteedFlowBitRateUL
    • configured β†’ 10 Mbps
    • expected β†’ 50 Mbps
    • compliance_status β†’ NON_COMPLIANT
  • packetDelayBudget
    • configured β†’ 20 ms
    • expected β†’ 10 ms
    • compliance_status β†’ NON_COMPLIANT
  • preemptionCapability
    • configured β†’ may-not-preempt
    • expected β†’ may-preempt
    • compliance_status β†’ NON_COMPLIANT
  • preemptionVulnerability
    • configured β†’ preemptable
    • expected β†’ not-preemptable
    • compliance_status β†’ NON_COMPLIANT

Observation:

  • URLLC QoS policies are not aligned with strict latency and priority requirements
  • URLLC traffic cannot preempt lower-priority traffic

Conclusion:
QoS misalignment contributes to session establishment failures under load.


Step 3: Admission Control Analysis

Analyze admission control decisions for URLLC slice using:

  • requested_snssai
  • requested_5qi
  • decision
  • rejection_reason
  • available_resources
  • required_resources

Example observations (last 2 hours):

  • UE-A
    • requested_snssai β†’ 010203
    • requested_5qi β†’ 6
    • decision β†’ REJECT
    • rejection_reason β†’ INSUFFICIENT_RB
    • available_resources β†’ 18 RBs
    • required_resources β†’ 30 RBs
  • UE-B
    • requested_snssai β†’ 010203
    • requested_5qi β†’ 6
    • decision β†’ REJECT
    • rejection_reason β†’ SLICE_CAPACITY_LIMIT
    • available_resources β†’ 15 RBs
    • required_resources β†’ 28 RBs

Observation:

  • URLLC session requests rejected due to lack of guaranteed resources
  • Admission control blocks URLLC when slice utilization is high

Conclusion:
Admission control thresholds are too restrictive for URLLC traffic.
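The two rejection reasons observed can be reproduced with a toy admission decision. The ordering (RB check first, then slice-capacity check) and the 80% cap are assumptions for illustration, not the actual gNB algorithm:

```python
def admit(required_rbs, available_rbs, slice_util, slice_cap=0.80):
    """Simplified slice admission decision returning (decision, reason)."""
    if required_rbs > available_rbs:
        return "REJECT", "INSUFFICIENT_RB"
    if slice_util >= slice_cap:
        return "REJECT", "SLICE_CAPACITY_LIMIT"
    return "ADMIT", None

# UE-A above: needs 30 RBs with 18 available -> INSUFFICIENT_RB.
# A request that fits into a moderately loaded slice is admitted.
```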


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| sliceMaxRBPercentage | 20% | 30% | Increase resource allocation for URLLC |
| guaranteedFlowBitRateUL | 10 Mbps | 50 Mbps | Increase guaranteed rate for URLLC |
| packetDelayBudget | 20 ms | 10 ms | Tighter delay budget for URLLC |
| preemptionCapability | may-not-preempt | may-preempt | Allow URLLC to preempt eMBB |
| preemptionVulnerability | preemptable | not-preemptable | Protect URLLC from preemption |
| 5qi6MaxRetxThreshold | 4 | 2 | Fewer retransmissions for lower latency |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| URLLC PDU Session Success Rate | 71.5% | 99.2% | +27.7% | Improved |
| URLLC Slice RB Utilization | 92% | 75% | −17% | Reduced |
| URLLC Latency (5QI 6) | 28 ms | 12 ms | −16 ms | Reduced |
| URLLC Packet Loss Rate | 1.8% | 0.1% | −1.7% | Reduced |
| eMBB Impact (Throughput) | 0% | −8% | −8% | Acceptable |

Final Technical Conclusion

The PDU session establishment failures were caused by URLLC slice resource exhaustion combined with misaligned QoS and admission control policies.
After increasing URLLC resource allocation, enabling preemption, and tightening QoS parameters, URLLC session success rate and latency improved significantly with minimal acceptable impact on eMBB traffic.

Scenario 6: DL Throughput Degradation with High MCS but Low Rank

OSS Symptoms & Alarms:

PM Counters:
High NMCS.Avg (24–27) but low NMIMO.Rank.Avg (1.2–1.5)

Example (Affected UE Categories):

  • NMCS.DL.Avg β†’ 25.8
  • NMIMO.Rank.Avg β†’ 1.3

Alarms:
No MIMO hardware alarms


Correlation:
Occurs when NUL.SRS.SNR.Avg < 5 dB

Example:

  • Normal condition β†’ 8.1 dB
  • Degraded condition β†’ 4.2 dB

Pattern:
Affects specific UE categories (e.g., Category X)


Expert Troubleshooting Steps:


Step 1: MIMO Performance Analysis

Analyze MIMO and SRS performance correlation using the following counters:

  • NMIMO.Rank.Avg
  • NMCS.DL.Avg
  • NUL.SRS.SNR.Avg
  • NCQI.Avg
  • NTHP.DlUeVol

Example observations (per UE category):

UE Category X

  • NMIMO.Rank.Avg β†’ 1.3
  • NMCS.DL.Avg β†’ 26.2
  • NUL.SRS.SNR.Avg β†’ 4.2 dB
  • NCQI.Avg β†’ 11.8
  • NTHP.DlUeVol β†’ 185 Mbps
  • sample_size β†’ 420 UEs

UE Category Y

  • NMIMO.Rank.Avg β†’ 2.4
  • NMCS.DL.Avg β†’ 25.1
  • NUL.SRS.SNR.Avg β†’ 7.6 dB
  • NCQI.Avg β†’ 13.9
  • NTHP.DlUeVol β†’ 310 Mbps
  • sample_size β†’ 380 UEs

Observation:

  • High MCS values are maintained
  • MIMO rank selection remains low for Category X UEs
  • DL throughput is limited despite good MCS
  • Low SRS SNR correlates strongly with low rank selection

Conclusion:
DL throughput degradation is caused by poor uplink channel sounding quality, not modulation limitation.
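The Category X signature (low rank despite high MCS, tied to poor SRS SNR) can be tested directly. A sketch with the rank-2 floor and 5 dB SNR threshold taken from the correlation noted in the symptoms; the helper is illustrative:

```python
def srs_limited_rank(rank_avg, srs_snr_db, rank_floor=2.0, snr_min_db=5.0):
    """True when average MIMO rank stays low while SRS SNR sits below the
    sounding-quality threshold (the Category X pattern above)."""
    return rank_avg < rank_floor and srs_snr_db < snr_min_db

# Category X (rank 1.3, SRS SNR 4.2 dB) matches; Category Y does not.
```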


Step 2: SRS Configuration Audit

Check SRS configuration for different UE categories:

Parameters audited:

  • srs_bandwidth
  • srs_periodicity
  • srs_max_ports
  • srs_power_control

Example configuration (Category X):

  • srs_bandwidth β†’ BW4
  • srs_periodicity β†’ 20 ms
  • srs_max_ports β†’ 2
  • srs_power_control β†’ Enabled
  • ue_count β†’ 420

Observation:

  • SRS bandwidth too narrow for accurate channel estimation
  • SRS periodicity too long for fast channel variations
  • Limited SRS ports restrict MIMO rank estimation

Conclusion:
SRS configuration is insufficient to support higher MIMO ranks.


Step 3: Channel Correlation Analysis

Analyze channel correlation metrics using:

  • correlation_level
  • rank_selected
  • throughput_mbps
  • mcs

Example observations (last 6 hours):

High Correlation

  • sample_count β†’ 1,250
  • avg_rank β†’ 1.2
  • avg_throughput β†’ 190 Mbps
  • avg_mcs β†’ 26

Medium Correlation

  • sample_count β†’ 980
  • avg_rank β†’ 2.1
  • avg_throughput β†’ 315 Mbps
  • avg_mcs β†’ 25

Low Correlation

  • sample_count β†’ 760
  • avg_rank β†’ 3.4
  • avg_throughput β†’ 420 Mbps
  • avg_mcs β†’ 24

Observation:

  • High channel correlation results in rank-1 or rank-2 selection
  • Lower correlation enables higher MIMO layers and throughput
  • MCS remains high across all correlation levels

Conclusion:
High channel correlation combined with poor SRS quality limits rank adaptation.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| srsBandwidth | BW4 | BW2 | Wider SRS for better channel estimation |
| srsPeriodicity | 20 ms | 5 ms | More frequent SRS for fast-changing channels |
| srsMaxPorts | 2 | 4 | Enable more SRS ports for better MIMO |
| codebookSubsetRestriction | fully-restricted | partially-restricted | Allow more precoding flexibility |
| pmiRiReportPeriodicity | 80 ms | 20 ms | Faster PMI/RI reporting |
| csiRsDensity | one | three | Denser CSI-RS for better channel estimation |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Trend |
| --- | --- | --- | --- | --- |
| Average Rank | 1.3 | 2.8 | +1.5 | Improved |
| DL Throughput (Category X UEs) | 185 Mbps | 420 Mbps | +235 Mbps | Improved |
| SRS SNR | 4.2 dB | 8.5 dB | +4.3 dB | Improved |
| MIMO Layer Utilization | 32% | 68% | +36% | Improved |
| CQI Reporting Accuracy | 65% | 88% | +23% | Improved |

Final Technical Conclusion

The DL throughput degradation occurred due to poor uplink sounding reference quality, which limited accurate MIMO rank estimation despite high MCS values.
After optimizing SRS bandwidth, periodicity, reporting frequency, and CSI-RS density, MIMO rank utilization improved significantly, resulting in substantial DL throughput gains.

Scenario 7: Latency Spikes for Gaming / AR Services (5QI = 79)

OSS Symptoms & Alarms:

PM Counters:
NDelay.UP.E2E.5QI_79.P95 spikes from 25 ms to 65 ms during evening hours

Example (Hourly P95 Latency):

  • 16:00–17:00 β†’ 24 ms
  • 17:00–18:00 β†’ 26 ms
  • 18:00–19:00 β†’ 52 ms
  • 19:00–20:00 β†’ 61 ms
  • 20:00–21:00 β†’ 65 ms

Alarms:
“Packet Delay Threshold Exceeded” for 5QI = 79


Correlation:
High NRLC.ReasTimeout.Sum and NHARQ.Retx.Avg

Example:

  • NRLC.ReasTimeout.Sum
    • Normal hour β†’ 180
    • Peak hour β†’ 1,250
  • NHARQ.Retx.Avg
    • Normal hour β†’ 1.2
    • Peak hour β†’ 3.8

Pattern:
Coincides with peak gaming traffic during 18:00–23:00


Troubleshooting Steps:


Step 1: Latency Component Analysis

Decompose E2E latency by protocol layer using:

  • NDelay.PDCP.Tx.Avg
  • NDelay.RLC.Proc.Avg
  • NDelay.MAC.Sched.Avg
  • NDelay.HARQ.RTT.Avg
  • NDelay.UP.E2E.Avg

Example observations (per minute, peak hour):

  • NDelay.PDCP.Tx.Avg β†’ 3.5 ms
  • NDelay.RLC.Proc.Avg β†’ 18.2 ms
  • NDelay.MAC.Sched.Avg β†’ 14.8 ms
  • NDelay.HARQ.RTT.Avg β†’ 12.0 ms
  • NDelay.UP.E2E.Avg β†’ 64.5 ms
  • gaming_sessions (5QI=79) β†’ 420 active sessions

Observation:

  • RLC processing delay and HARQ RTT dominate E2E latency
  • PDCP delay remains low
  • MAC scheduling delay increases during congestion

Conclusion:
Latency spike is mainly caused by RLC retransmissions and HARQ retries under peak load.
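The decomposition above is a simple share calculation over the per-layer delay counters. A sketch with the peak-hour values; the residual bucket name `other` is an assumption covering delay not captured by these counters:

```python
def latency_shares(components_ms, e2e_ms):
    """Share of E2E latency per measured component; the remainder goes
    under 'other' (transport/core delay not captured by these counters)."""
    shares = {k: v / e2e_ms for k, v in components_ms.items()}
    shares["other"] = 1.0 - sum(shares.values())
    return shares

# Peak-hour per-layer delays from the observations above (ms).
peak = {"pdcp": 3.5, "rlc": 18.2, "mac_sched": 14.8, "harq_rtt": 12.0}
s = latency_shares(peak, 64.5)  # RLC holds the largest measured share
```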


Step 2: Gaming Traffic Pattern Analysis

Analyze gaming traffic characteristics using:

  • five_qi
  • packet_size (P95)
  • packets_per_second
  • inter_arrival_time_ms
  • ue_id

Example observations:

5QI = 79 (Gaming / AR)

  • p95_packet_size β†’ 120 bytes
  • avg_packet_rate β†’ 920 packets/sec
  • avg_inter_arrival β†’ 1.1 ms
  • active_gamers β†’ 420 UEs

5QI = 80

  • p95_packet_size β†’ 220 bytes
  • avg_packet_rate β†’ 410 packets/sec
  • avg_inter_arrival β†’ 3.5 ms
  • active_gamers β†’ 180 UEs

5QI = 6

  • p95_packet_size β†’ 1,200 bytes
  • avg_packet_rate β†’ 95 packets/sec
  • avg_inter_arrival β†’ 12 ms
  • active_gamers β†’ 90 UEs

Observation:

  • 5QI=79 traffic consists of very small packets at very high frequency
  • Highly sensitive to buffering, retransmissions, and scheduling delay

Conclusion:
Default QoS handling is not optimal for bursty, latency-critical gaming traffic.


Step 3: QoS Policy Verification

Check gaming QoS policy configuration using:

  • pdcp_sn_size
  • rlc_mode
  • dl_data_split_threshold
  • scheduling_priority
  • preemption_capability
  • preemption_vulnerability

Example configuration (5QI = 79):

  • pdcp_sn_size β†’ 18 bits
  • rlc_mode β†’ AM
  • dl_data_split_threshold β†’ 100 bytes
  • scheduling_priority β†’ Medium
  • preemption_capability β†’ disabled
  • preemption_vulnerability β†’ preemptable

Observation:

  • RLC AM introduces retransmission delays
  • PDCP SN size adds unnecessary overhead for small packets
  • No semi-persistent scheduling configured

Conclusion:
QoS policy is not tuned for ultra-low latency gaming services.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| pdcpSnSize (5QI=79) | 18 bits | 12 bits | Reduced SN overhead for gaming packets |
| rlcMode (5QI=79) | AM | UM | Eliminate RLC retransmission delay |
| dlDataSplitThreshold | 100 bytes | 50 bytes | Faster transmission of small gaming packets |
| harqMaxRetx (5QI=79) | 4 | 2 | Fewer retransmissions for latency-sensitive traffic |
| spsInterval (5QI=79) | disabled | 10 ms | Semi-persistent scheduling for periodic gaming traffic |
| drxInactivityTimer | 20 ms | 5 ms | Shorter inactivity for responsive gaming |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Impact |
| --- | --- | --- | --- | --- |
| 95th Percentile Latency (5QI=79) | 65 ms | 28 ms | −37 ms | Significant Improvement |
| Packet Delay Variation (Jitter) | 22 ms | 8 ms | −14 ms | Excellent |
| Gaming Packet Loss Rate | 2.1% | 0.4% | −1.7% | Excellent |
| RLC Reassembly Timeouts | 8.5% | 1.2% | −7.3% | Excellent |
| HARQ Round Trip Time | 12 ms | 8 ms | −4 ms | Good |
| Overall Cell Throughput | – | −2% | −2% | Minor Impact |

Final Technical Conclusion

Latency spikes for gaming and AR services (5QI=79) were caused by RLC retransmissions, excessive HARQ retries, and non-optimized QoS policies during peak gaming hours.
After switching to RLC UM, reducing retransmissions, enabling SPS, and optimizing PDCP and DRX parameters, latency and jitter were significantly reduced with only a minor, acceptable impact on overall cell throughput.

Scenario 8: Persistent High DL BLER in Macro Cell

OSS Symptoms & Alarms:

PM Counters:
NBLER.DL.Avg consistently > 15% (threshold: 10%)

Example (Hourly Average):

  • Normal period β†’ 8.5%
  • Degraded period β†’ 16.8%
  • Peak hour β†’ 18.2%

Alarms:
“Radio Link Quality Degraded” alarm active


Correlation:
High NRLC.RetxDL.Sum and low NCQI.Avg

Example:

  • NRLC.RetxDL.Sum
    • Normal β†’ 4,800
    • Degraded β†’ 18,900
  • NCQI.Avg
    • Normal β†’ 10.8
    • Degraded β†’ 8.2

Pattern:
Affects all UEs in sector 2, not localized to specific users or locations


Troubleshooting Steps:


Step 1: BLER Analysis by UE Category & Location

Analyze BLER patterns across UE categories using:

  • NBLER.DL.Avg
  • NCQI.Avg
  • NMCS.DL.Avg
  • NRSRP.DL.Avg
  • NSINR.DL.Avg
  • azimuth_degrees

Example observations (Sector 2):

UE Category 4

  • affected_ues β†’ 180
  • NBLER.DL.Avg β†’ 17.2%
  • NCQI.Avg β†’ 8.0
  • NMCS.DL.Avg β†’ 22.5
  • NRSRP.DL.Avg β†’ βˆ’96 dBm
  • NSINR.DL.Avg β†’ 11.2 dB
  • median_azimuth β†’ 210Β°

UE Category 6

  • affected_ues β†’ 240
  • NBLER.DL.Avg β†’ 16.4%
  • NCQI.Avg β†’ 8.4
  • NMCS.DL.Avg β†’ 23.1
  • NRSRP.DL.Avg β†’ βˆ’95 dBm
  • NSINR.DL.Avg β†’ 10.8 dB
  • median_azimuth β†’ 208Β°

Observation:

  • High BLER across all UE categories
  • BLER not dependent on UE type or specific location
  • RSRP and SINR are moderate, not severely degraded

Conclusion:
Issue is cell-wide link adaptation, not UE-specific radio coverage.


Step 2: Link Adaptation Performance Analysis

Check link adaptation effectiveness using:

  • NBLER.DL.Avg
  • NBLER.Target.Avg
  • NMCS.DL.Avg
  • NCQI.Avg
  • NCQI.ReportingDelay.Avg

Example observations (hourly):

18:00–19:00

  • actual_bler β†’ 17.8%
  • target_bler β†’ 10%
  • mcs_used β†’ 23.0
  • reported_cqi β†’ 8.1
  • cqi_reporting_delay β†’ 95 ms
  • high_bler_samples β†’ 420

19:00–20:00

  • actual_bler β†’ 18.2%
  • target_bler β†’ 10%
  • mcs_used β†’ 23.4
  • reported_cqi β†’ 8.0
  • cqi_reporting_delay β†’ 98 ms
  • high_bler_samples β†’ 460

Observation:

  • Actual BLER significantly exceeds target BLER
  • MCS selection is aggressive despite low CQI
  • CQI feedback is delayed

Conclusion:
Link adaptation loop is not reacting fast enough to channel degradation.
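The slow reaction observed here is normally handled by an outer-loop link adaptation (OLLA) offset on top of CQI. A one-step sketch of the standard update rule; the step sizes and the 10% target mirror the configuration above, and this is a textbook formulation, not Nokia's internal algorithm:

```python
def olla_step(offset_db, ack, target_bler=0.10, step_up_db=0.01):
    """One OLLA update: nudge the CQI/SINR offset up on ACK, down on NACK.
    With step_down = step_up * (1 - target) / target, the long-run BLER
    converges to target_bler."""
    step_down_db = step_up_db * (1 - target_bler) / target_bler
    return offset_db + step_up_db if ack else offset_db - step_down_db

# At a 10% target: +0.01 dB per ACK, -0.09 dB per NACK.
```

If the offset cannot move fast enough (e.g., delayed CQI at ~95 ms), actual BLER overshoots the target exactly as observed.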


Step 3: RF Configuration and Beam Analysis

Analyze RF parameters and beam performance using:

  • param_name
  • current_value
  • recommended_value
  • deviation_percentage
  • impact_on_bler

Example audit findings (Top contributors):

  • pdschTargetBlerDl
    • current β†’ 10%
    • recommended β†’ 5%
    • deviation β†’ +100%
    • impact_on_bler β†’ High
  • cqiTableIndex
    • current β†’ 1 (256QAM)
    • recommended β†’ 2 (64QAM)
    • deviation β†’ Mismatch
    • impact_on_bler β†’ High
  • dlAlpha (OLPC)
    • current β†’ 0.8
    • recommended β†’ 0.6
    • deviation β†’ +33%
    • impact_on_bler β†’ Medium

Observation:

  • BLER target is too aggressive
  • CQI and MCS tables favor high throughput over reliability
  • Power control loop insufficiently conservative

Conclusion:
RF and link adaptation parameters are tuned too aggressively for macro coverage.


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
| --- | --- | --- | --- |
| pdschTargetBlerDl | 10% | 5% | More conservative target for better reliability |
| cqiTableIndex | 1 (256QAM) | 2 (64QAM) | Use more robust CQI table |
| mcsTable | 256QAM | 64QAM | Conservative MCS for better BLER |
| dlAlpha (OLPC) | 0.8 | 0.6 | More conservative outer loop power control |
| initialMcsDl | 20 | 15 | Start with lower MCS for new connections |
| csiReportPeriodicity | 80 ms | 40 ms | Faster CSI feedback for better adaptation |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Δ | Impact |
| --- | --- | --- | --- | --- |
| Average DL BLER | 16.8% | 7.2% | −9.6% | Excellent |
| RLC DL Retransmissions | 18.5% | 8.2% | −10.3% | Excellent |
| Average CQI | 8.2 | 10.5 | +2.3 | Good |
| DL Throughput | 320 Mbps | 280 Mbps | −40 Mbps | Acceptable Trade-off |
| User Experience (MOS) | 3.2 | 3.9 | +0.7 | Improved |
| RLF Rate | 5.2% | 2.1% | −3.1% | Excellent |

Final Technical Conclusion

The persistent high DL BLER in the macro cell was caused by over-aggressive link adaptation and RF parameter configuration, not by poor coverage or UE limitations.
After adopting more conservative BLER targets, robust CQI/MCS tables, faster CSI feedback, and tuned power control, DL reliability improved significantly with an acceptable throughput trade-off.

Scenario 9: VoNR MOS Score Degradation in Dense Urban

OSS Symptoms & Alarms:

PM Counters:
NMOS.Avg.5QI_1 drops from 4.1 to 3.2

Example (Hourly Average):

  • Normal period β†’ 4.1
  • Degraded period β†’ 3.4
  • Peak degradation β†’ 3.2

Alarms:
β€œVoice Quality Degradation” alarm active for multiple cells


Correlation:
High NPDV.5QI_1.StdDev (>20 ms) and NPacketLoss.5QI_1.Avg (>2%)

Example:

  • NPDV.5QI_1.StdDev
    • Normal β†’ 9 ms
    • Degraded β†’ 25 ms
  • NPacketLoss.5QI_1.Avg
    • Normal β†’ 0.4%
    • Degraded β†’ 2.5%

Pattern:
Affects handover regions between CELL_12, CELL_13, CELL_14


Troubleshooting Steps:


Step 1: VoNR Quality Metrics Correlation

Correlate MOS with underlying metrics using:

  • NMOS.Avg
  • NPacketLoss.5QI_1.Avg
  • NPDV.5QI_1.StdDev
  • NDelay.UP.E2E.5QI_1.Avg
  • NBLER.UL.5QI_1.Avg
  • NROHC.CompressionRatio.Avg

Example observations (last 2 hours):

CELL_12 β†’ CELL_13

  • avg_mos β†’ 3.3
  • packet_loss β†’ 2.4%
  • jitter β†’ 24 ms
  • latency β†’ 38 ms
  • ul_bler β†’ 6.5%
  • rohc_ratio β†’ 1.9:1
  • call_count β†’ 420

CELL_13 β†’ CELL_14

  • avg_mos β†’ 3.2
  • packet_loss β†’ 2.7%
  • jitter β†’ 26 ms
  • latency β†’ 42 ms
  • ul_bler β†’ 7.2%
  • rohc_ratio β†’ 1.8:1
  • call_count β†’ 390

Observation:

  • MOS degradation correlates strongly with jitter and packet loss
  • UL BLER increases during mobility
  • ROHC compression efficiency is low

Conclusion:
VoNR quality degradation is driven by packet loss, jitter, and inefficient header compression, especially during handovers.
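The way loss, jitter, and delay combine into a MOS figure can be sketched with a simplified E-model-style estimate, loosely following ITU-T G.107: start from a clean-channel R-factor, subtract impairments, then map R to MOS. The loss and jitter weights below are assumptions for illustration, not calibrated values:

```python
def approx_mos(packet_loss_pct, jitter_ms, latency_ms):
    """Illustrative MOS estimate in the spirit of the ITU-T G.107
    E-model. Loss/jitter impairment weights are assumptions."""
    r = 93.2                          # default clean-channel R-factor
    r -= 2.5 * packet_loss_pct        # assumed loss impairment weight
    r -= 0.30 * jitter_ms             # assumed jitter impairment weight
    r -= 0.024 * latency_ms           # one-way delay slope from the E-model
    r = max(0.0, min(100.0, r))
    # Standard R-to-MOS mapping for 0 <= R <= 100
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

Feeding in the normal vs. degraded figures from the correlation above (0.4% loss / 9 ms jitter vs. 2.4% loss / 24 ms jitter) reproduces the qualitative drop: MOS falls as loss and jitter rise, which is why those two metrics track the NMOS.Avg.5QI_1 degradation so closely.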


Step 2: Handover Impact on VoNR Quality

Analyze VoNR quality degradation during handovers using:

  • ho_type
  • mos_before_ho
  • mos_after_ho
  • mos_drop
  • ho_interruption_time
  • packet_loss_during_ho

Example observations:

Intra-Freq HO

  • pre_ho_mos β†’ 4.0
  • post_ho_mos β†’ 3.3
  • avg_mos_drop β†’ 0.7
  • interruption_time β†’ 85 ms
  • ho_packet_loss β†’ 2.1%
  • sample_count β†’ 320

Inter-gNB HO

  • pre_ho_mos β†’ 4.1
  • post_ho_mos β†’ 3.2
  • avg_mos_drop β†’ 0.8
  • interruption_time β†’ 110 ms
  • ho_packet_loss β†’ 2.6%
  • sample_count β†’ 280

Observation:

  • MOS drop occurs primarily during HO execution
  • Longer interruption time leads to higher packet loss

Conclusion:
Handover execution time and interruption directly impact VoNR MOS.
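Aggregating the per-HO-type figures above into one cell-level number is a simple sample-weighted average. A sketch (dict keys mirror the fields listed in this step; the helper itself is illustrative):

```python
def weighted_mos_drop(ho_stats):
    """Sample-weighted average MOS drop across handover types,
    the aggregate behind the per-HO-type figures in Step 2."""
    total = sum(s["sample_count"] for s in ho_stats)
    return sum(s["avg_mos_drop"] * s["sample_count"] for s in ho_stats) / total

# With 0.7 drop over 320 intra-freq HOs and 0.8 over 280 inter-gNB HOs,
# the cell-level average drop is ~0.75 MOS per handover.
```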


Step 3: ROHC Performance Analysis

Check ROHC compression efficiency and failures using:

  • NROHC.CompressionRatio.Avg
  • NROHC.FailureRate.Avg
  • NPDU.HeaderSize.Avg
  • NPDU.PayloadSize.Avg

Example observations:

UE Category 3

  • compression_ratio β†’ 1.8:1
  • failure_rate β†’ 6.5%
  • avg_header_size β†’ 42 bytes
  • avg_payload_size β†’ 33 bytes
  • ue_count β†’ 260

UE Category 6

  • compression_ratio β†’ 1.9:1
  • failure_rate β†’ 5.8%
  • avg_header_size β†’ 40 bytes
  • avg_payload_size β†’ 34 bytes
  • ue_count β†’ 310

Observation:

  • Header size comparable to payload size
  • Compression failure rate high for VoNR
  • ROHC contexts insufficient for concurrent calls

Conclusion:
ROHC inefficiency contributes to packet loss and jitter during mobility.
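The compression ratios above are packet-level: only the header shrinks, while the small voice payload passes through untouched, so header cost dominates. A sketch of the arithmetic (the 3-byte compressed header below assumes a healthy ROHC context, which is an illustrative figure):

```python
def rohc_packet_ratio(header_bytes, payload_bytes, compressed_header_bytes):
    """Packet-level ROHC compression gain: original packet size over
    compressed packet size. The payload is untouched by header
    compression, so small voice payloads magnify header overhead."""
    original = header_bytes + payload_bytes
    compressed = compressed_header_bytes + payload_bytes
    return original / compressed

# 42-byte header + 33-byte payload compressed to a ~3-byte header
# would yield about 2.1:1 -- the observed 1.8:1 implies contexts are
# frequently falling back to larger header formats.
```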


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
|---|---|---|---|
| rohcMaxCid | 5 | 15 | More compression contexts for concurrent VoNR calls |
| rohcProfile | 0x0001 | 0x0006 | Use optimized profile for voice traffic |
| ttiBundling (5QI=1) | disabled | enabled | TTI bundling for better UL coverage in voice |
| ulTargetBler (5QI=1) | 10% | 1% | Ultra-low BLER target for voice |
| spsInterval (5QI=1) | disabled | 20 ms | SPS for consistent voice packet scheduling |
| hoExecutionTimer | 1000 ms | 500 ms | Faster handover execution for voice |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Ξ” | Impact |
|---|---|---|---|---|
| Average MOS Score | 3.2 | 4.0 | +0.8 | Excellent |
| Packet Loss Rate (5QI=1) | 2.5% | 0.3% | βˆ’2.2% | Excellent |
| Jitter (Packet Delay Variation) | 25 ms | 8 ms | βˆ’17 ms | Excellent |
| ROHC Compression Ratio | 1.8:1 | 3.5:1 | +1.7 | Excellent |
| Handover MOS Drop | 0.8 | 0.2 | βˆ’0.6 | Excellent |
| VoNR Call Drop Rate | 1.8% | 0.4% | βˆ’1.4% | Excellent |

Final Technical Conclusion

The VoNR MOS degradation in dense urban areas was caused by handover-induced packet loss, high jitter, UL BLER, and inefficient ROHC compression.
By optimizing ROHC contexts, enabling SPS and TTI bundling, tightening UL BLER targets, and reducing HO execution time, VoNR quality was restored to near-ideal levels across all affected cells.

Scenario 10: Latency Optimization for Industrial IoT (URLLC)

OSS Symptoms & Alarms:

PM Counters:
NDelay.UP.E2E.5QI_80.P99 > 50 ms (requirement: 20 ms)

Example (Latency Distribution):

  • Normal period β†’ 18 ms
  • Degraded period β†’ 52 ms
  • Peak violation β†’ 58 ms

Alarms:
β€œURLLC Service Level Agreement Violation”


Correlation:
High NPDCP.ReorderingDelay.Avg and increased scheduling delays

Example:

  • NPDCP.ReorderingDelay.Avg
    • Normal β†’ 2.5 ms
    • Degraded β†’ 12.8 ms
  • Scheduling delay (5QI=80)
    • Normal β†’ 3.2 ms
    • Degraded β†’ 14.5 ms

Pattern:
Affects specific time-critical industrial applications (robot control, motion control)


Troubleshooting Steps:


Step 1: URLLC Traffic Pattern Analysis

Analyze URLLC traffic characteristics using:

  • packet_size_bytes
  • packets_per_second
  • e2e_delay (P99 / P99.9)
  • reliability_percentage
  • transaction_count

Example observations (last 1 hour):

Motion Control Application

  • avg_packet_size β†’ 64 bytes
  • packet_rate β†’ 1,200 packets/sec
  • p99_latency β†’ 55 ms
  • p999_latency β†’ 82 ms
  • reliability β†’ 99.92%
  • transaction_count β†’ 18,500

PLC Control Application

  • avg_packet_size β†’ 72 bytes
  • packet_rate β†’ 980 packets/sec
  • p99_latency β†’ 48 ms
  • p999_latency β†’ 74 ms
  • reliability β†’ 99.94%
  • transaction_count β†’ 15,200

Observation:

  • Very small packets with extremely high frequency
  • Tail latency (P99, P99.9) violates URLLC SLA
  • Reliability slightly below URLLC target

Conclusion:
Latency spikes are driven by tail latency accumulation, not average delay.
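Since the SLA is defined on P99/P99.9 rather than the mean, the relevant statistic is a percentile read off the per-packet delay distribution. A nearest-rank percentile sketch (one common convention; OSS tools may interpolate differently):

```python
import math

def nearest_rank_percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at
    least p% of samples are at or below it -- the usual way P99/P99.9
    tail latency is read from a delay distribution."""
    ordered = sorted(samples)
    k = max(0, math.ceil(p / 100.0 * len(ordered)) - 1)
    return ordered[k]
```

On a distribution like the one above, the mean can sit comfortably under 20 ms while P99 violates the SLA, which is why averaging hides the problem entirely.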


Step 2: Scheduling Priority Analysis

Check scheduling behavior for URLLC traffic using:

  • scheduling_delay_5qi_80
  • scheduling_delay_5qi_9
  • priority_weight_5qi_80
  • preemption_count_5qi_80

Example observations (INDUSTRIAL_CELL_01):

Scheduler: Proportional Fair

  • urllc_sched_delay β†’ 14.2 ms
  • embb_sched_delay β†’ 6.8 ms
  • urllc_priority β†’ 0.35
  • urllc_preemptions β†’ 2
  • unique_ues_scheduled β†’ 46

Scheduler: QoS-Aware

  • urllc_sched_delay β†’ 6.1 ms
  • embb_sched_delay β†’ 8.9 ms
  • urllc_priority β†’ 0.75
  • urllc_preemptions β†’ 9
  • unique_ues_scheduled β†’ 44

Observation:

  • URLLC traffic not consistently prioritized
  • Insufficient preemption of eMBB traffic
  • Scheduler behavior contributes to latency tail

Conclusion:
Scheduling priority for URLLC is insufficient during congestion.
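The difference between the two schedulers above can be sketched as a scheduling metric: a proportional-fair term scaled by a per-5QI priority weight, with the highest-metric UE winning the PRBs each TTI. Field names and weights are illustrative assumptions, not the gNB scheduler's actual internals:

```python
def pick_scheduled_ue(candidates):
    """QoS-aware scheduling sketch: proportional-fair term
    (instantaneous over average rate) scaled by a per-5QI weight;
    the UE with the highest metric is scheduled this TTI."""
    def metric(ue):
        pf = ue["inst_rate"] / max(ue["avg_rate"], 1e-9)
        return pf * ue["qos_weight"]
    return max(candidates, key=metric)
```

With pure proportional fair (all weights equal), a favorable-channel eMBB UE beats a URLLC UE every time; raising the URLLC weight (0.35 β†’ 0.75 in the observations above) flips that outcome, which is exactly what the QoS-aware figures show.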


Step 3: End-to-End Delay Breakdown

Break down URLLC latency components using:

  • delay_component
  • delay_ms (Avg / P95 / P99)

Example observations (last 30 minutes):

PDCP Reordering

  • avg_delay β†’ 10.5 ms
  • p95_delay β†’ 18 ms
  • p99_delay β†’ 25 ms
  • sample_count β†’ 9,800

MAC Scheduling

  • avg_delay β†’ 12.8 ms
  • p95_delay β†’ 22 ms
  • p99_delay β†’ 31 ms
  • sample_count β†’ 9,800

HARQ Processing

  • avg_delay β†’ 6.2 ms
  • p95_delay β†’ 10 ms
  • p99_delay β†’ 14 ms
  • sample_count β†’ 9,800

Observation:

  • PDCP reordering and MAC scheduling dominate tail latency
  • Combined delays exceed URLLC SLA at P99

Conclusion:
End-to-end URLLC latency violation is caused by scheduler delay + PDCP reordering.
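A quick way to confirm this conclusion from the breakdown above is a latency-budget check. Summing per-component P99s only upper-bounds the end-to-end P99 (the worst cases rarely align perfectly), but it is a useful conservative screen; the half-SLA flag below is an illustrative heuristic:

```python
def latency_budget(components_p99_ms, sla_ms):
    """Conservative E2E check: sum per-component P99 delays and flag
    any single component consuming more than half the SLA budget."""
    total = sum(components_p99_ms.values())
    heavy = {c: d for c, d in components_p99_ms.items() if d > sla_ms / 2}
    return total, total <= sla_ms, heavy
```

With PDCP reordering at 25 ms and MAC scheduling at 31 ms against a 20 ms SLA, each of those two components alone already exceeds the entire budget, confirming where the optimization effort must go.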


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
|---|---|---|---|
| pdcpDuplication (5QI=80) | disabled | enabled | Packet duplication for ultra-reliability |
| maxHarqTx (5QI=80) | 4 | 8 | More HARQ retransmissions for reliability |
| logicalChannelGroup (5QI=80) | 1 | 0 | Highest scheduling priority |
| prioritisedBitRate (5QI=80) | 0 | 1000 kbps | Guaranteed bit rate for URLLC |
| bucketSizeDuration (5QI=80) | 100 ms | 10 ms | Smaller bucket for bursty URLLC traffic |
| schedulingRequestId (5QI=80) | 1 | 0 | Highest priority SR |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Ξ” | Impact |
|---|---|---|---|---|
| 99th Percentile Latency (5QI=80) | 52 ms | 18 ms | βˆ’34 ms | Excellent |
| 99.9th Percentile Latency | 85 ms | 25 ms | βˆ’60 ms | Exceptional |
| Reliability (1 βˆ’ Packet Loss) | 99.9% | 99.999% | +0.099% | Excellent |
| PDCP Duplication Overhead | 0% | 100% | +100% | High Cost |
| eMBB Throughput Impact | 0% | βˆ’15% | βˆ’15% | Acceptable |
| URLLC SLA Compliance | 65% | 98% | +33% | Excellent |

Final Technical Conclusion

The URLLC latency SLA violation was caused by scheduler prioritization gaps and PDCP reordering delays, which primarily impacted tail latency (P99 / P99.9).
By enabling PDCP duplication, enforcing strict scheduling priority, increasing HARQ reliability, and optimizing bucket and SR parameters, URLLC latency and reliability were restored to industrial-grade requirements with an acceptable trade-off on eMBB throughput.

Scenario 5: BLER Optimization for Massive MIMO Cells

OSS Symptoms & Alarms:

PM Counters:
Sector-specific high BLER in NBLER.DL.Beam_X.Avg

Example (Top Impacted Beams):

  • Beam 7 β†’ 18.6%
  • Beam 11 β†’ 17.9%
  • Beam 14 β†’ 16.8%
  • Other beams β†’ < 9%

Alarms:
β€œBeam Quality Degradation” on specific beams


Correlation:
Low NMIMO.Rank.Avg and poor NCQI.Beam_X.Avg

Example:

  • NMIMO.Rank.Avg
    • Normal beams β†’ 3.2
    • Affected beams β†’ 1.9
  • NCQI.Beam_7.Avg β†’ 7.4
  • NCQI.Beam_11.Avg β†’ 7.1

Pattern:
Affects users located in specific angular sectors


Expert Troubleshooting Steps:


Step 1: Beam-Specific Performance Analysis

Analyze performance by beam index using:

  • NBLER.DL.Beam.Avg
  • NRSRP.Beam.Avg
  • NSINR.Beam.Avg
  • NMIMO.Rank.Beam.Avg
  • NTHP.DL.Beam.Avg

Example observations (MIMO_CELL_03):

Beam 7

  • azimuth_degrees β†’ 110Β°
  • elevation_degrees β†’ 6Β°
  • avg_bler β†’ 18.6%
  • avg_beam_rsrp β†’ βˆ’98 dBm
  • avg_beam_sinr β†’ 10.5 dB
  • avg_beam_rank β†’ 1.8
  • served_ues β†’ 95
  • beam_throughput β†’ 85 Mbps

Beam 11

  • azimuth_degrees β†’ 165Β°
  • elevation_degrees β†’ 7Β°
  • avg_bler β†’ 17.9%
  • avg_beam_rsrp β†’ βˆ’97 dBm
  • avg_beam_sinr β†’ 11.0 dB
  • avg_beam_rank β†’ 2.0
  • served_ues β†’ 102
  • beam_throughput β†’ 92 Mbps

Observation:

  • High BLER is beam-specific, not cell-wide
  • SINR is moderate but rank selection is conservative
  • Throughput per beam is significantly degraded

Conclusion:
BLER degradation is linked to beam-level MIMO behavior, not RF coverage.
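The "beam-specific, not cell-wide" finding above can be automated by flagging beams whose BLER sits well above the cell-wide mean. A sketch (the 3-point margin is an illustrative threshold, not a Nokia default):

```python
def outlier_beams(beam_bler_pct, margin_pct=3.0):
    """Flag beams whose DL BLER exceeds the cell-wide mean by a
    margin, separating beam-specific faults from cell-wide
    degradation."""
    mean = sum(beam_bler_pct.values()) / len(beam_bler_pct)
    return sorted(b for b, v in beam_bler_pct.items() if v > mean + margin_pct)
```

Run against the top-impacted beams in the symptoms (7, 11, 14 near 17–19% against other beams under 9%), this isolates exactly those three indices.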


Step 2: MIMO Configuration Audit

Check MIMO and beamforming configuration using:

  • config_parameter
  • current_value
  • recommended_value
  • compliance_status

Example audit results:

  • codebookSubsetRestriction
    • current β†’ fully-restricted
    • recommended β†’ partially-restricted
    • compliance_status β†’ NON-COMPLIANT
  • csiRsDensity
    • current β†’ one
    • recommended β†’ three
    • compliance_status β†’ NON-COMPLIANT
  • beamReportingPeriodicity
    • current β†’ 160 ms
    • recommended β†’ 40 ms
    • compliance_status β†’ NON-COMPLIANT
  • rankIndicatorRestriction
    • current β†’ rank-4-allowed
    • recommended β†’ rank-2-only
    • compliance_status β†’ NON-COMPLIANT

Observation:

  • CSI and beam reporting too sparse for fast channel variation
  • Precoding flexibility is restricted
  • Rank selection not optimized for BLER stability

Conclusion:
MIMO configuration is over-optimized for peak throughput, causing BLER instability.
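The audit above is a straight comparison of live values against a golden baseline. A sketch of the compliance check (parameter values taken from this step; the helper itself is illustrative):

```python
def compliance_report(live_config, golden_config):
    """Compare live MIMO/beamforming parameters against a golden
    baseline and mark deviations NON-COMPLIANT, as in the Step 2
    audit."""
    return {
        param: "COMPLIANT" if live_config.get(param) == expected else "NON-COMPLIANT"
        for param, expected in golden_config.items()
    }
```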


Step 3: Channel Correlation Analysis

Analyze channel correlation for MIMO performance using:

  • correlation_level
  • selected_rank
  • throughput_mbps
  • bler_percentage
  • ue_speed_kmh

Example observations (last 6 hours):

High Correlation

  • sample_count β†’ 1,150
  • avg_selected_rank β†’ 1.8
  • avg_throughput β†’ 210 Mbps
  • avg_bler β†’ 18.2%
  • avg_ue_speed β†’ 12 km/h

Medium Correlation

  • sample_count β†’ 980
  • avg_selected_rank β†’ 2.4
  • avg_throughput β†’ 295 Mbps
  • avg_bler β†’ 11.5%
  • avg_ue_speed β†’ 18 km/h

Low Correlation

  • sample_count β†’ 760
  • avg_selected_rank β†’ 3.1
  • avg_throughput β†’ 380 Mbps
  • avg_bler β†’ 6.2%
  • avg_ue_speed β†’ 25 km/h

Observation:

  • High channel correlation leads to low rank and high BLER
  • Rank selection improves as correlation decreases

Conclusion:
Channel correlation directly impacts MIMO efficiency and BLER.
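The trend in the three correlation buckets above can be captured as a simple correlation-to-rank mapping. The thresholds below are assumptions chosen only to match that trend, not values from any rank-selection algorithm:

```python
def usable_rank(correlation):
    """Illustrative mapping from spatial channel correlation to the
    MIMO rank the channel can realistically support. Thresholds are
    assumptions, not gNB rank-selection logic."""
    if correlation > 0.7:   # highly correlated paths: almost no spatial diversity
        return 1
    if correlation > 0.4:   # medium correlation: dual-layer still viable
        return 2
    return 4                # low correlation: full spatial multiplexing
```

Forcing rank above what the correlation supports is exactly what drives the 18.2% BLER seen in the high-correlation bucket, which motivates the rank restriction in the optimization below.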


Parameter Optimization Strategy:

| Parameter | Pre-Optimization | Post-Optimization | Rationale |
|---|---|---|---|
| codebookSubsetRestriction | fully-restricted | partially-restricted | More precoding flexibility |
| csiRsDensity | one | three | Denser CSI-RS for better channel estimation |
| beamReportingPeriodicity | 160 ms | 40 ms | Faster beam reporting for mobility |
| rankIndicatorRestriction | rank-4-allowed | rank-2-only | Conservative rank for better BLER |
| pmiRiReportPeriodicity | 80 ms | 20 ms | Faster PMI/RI reporting |
| srsBandwidth | BW4 | BW8 | Wider SRS for better UL channel estimation |

Pre vs Post Optimization Impact:

| KPI | Pre-Optimization | Post-Optimization | Ξ” | Impact |
|---|---|---|---|---|
| Average DL BLER | 14.2% | 6.8% | βˆ’7.4% | Excellent |
| MIMO Rank Utilization | 2.8 | 2.2 | βˆ’0.6 | Acceptable |
| Beam Switching Success Rate | 88% | 96% | +8% | Good |
| CSI Reporting Accuracy | 72% | 89% | +17% | Excellent |
| Cell Throughput | 850 Mbps | 720 Mbps | βˆ’130 Mbps | Trade-off |
| User Consistency Index | 65% | 82% | +17% | Excellent |

Final Technical Conclusion

The high BLER in the Massive MIMO cell was caused by beam-specific MIMO misconfiguration and high channel correlation, not by coverage or hardware faults.
By increasing CSI-RS density, improving reporting periodicity, relaxing precoding restrictions, and enforcing conservative rank selection, BLER and user consistency improved significantly with an acceptable throughput trade-off.