Building resilience and boosting observability for a telecoms giant

A global multimedia entity powers over 40 million connections across broadband, mobile, TV and home phone. Service downtime was hurting the business and needed to be addressed.

The challenge of rapid growth

Major outages to sales and service channels were too frequent. Customers couldn’t pay bills or make changes to their accounts, often at key transaction times. The client was unable to identify the cause of the problems quickly, and some of the outages were very public and damaging to the brand itself. With coverage at this scale, it’s business critical that disruptions like these are managed quickly and effectively.

The client needed an experienced partner with enterprise software expertise to review how they monitor, respond to and prevent major incidents with their core online channels (collectively, their Site Reliability Engineering, or SRE, practices), give them a clear view of what was happening, and put them in a position to act.

By focusing on both operations and technology at company-wide scale, we built capability within the team for greater resilience before and after incidents.

A holistic diagnosis

First, we needed an overview of the business structure and operations. An initial diagnostic phase gave us a holistic understanding of the system issues and where the problems lay, meaning we could objectively assess not just the technology, but the resources and processes the client had in place to tackle the service downtime.

Using our proprietary Accelerate framework, we identified core issues with monitoring and alerting, incident management and observability. The diagnostic phase revealed an over-reliance on manual operations, and an inability to effectively prioritise tasks, making the team reactive instead of proactive.

Thinking long, acting fast

Our solution was to produce a scalable and actionable roadmap that aligned processes with technical interventions for greater observability and responsiveness. Our team divided to conquer: the Team Lead stayed immersed in the wider operational context, feeding information to the engineers, who then developed solutions for both technological and operational resilience.

Alerts were automated, streamlining monitoring and incident reporting, and onboarding for digital services was standardised and simplified. Custom-built dashboards captured metrics and data, giving the client the observability they needed. This unlocked the team’s capacity to respond instead of react.

The true test of Nearform’s collaboration came during the first product launch after the engagement. The business suffered zero downtime and the average time to recovery was reduced by 93%. With new tools and processes in place, the client’s capabilities expanded and their SRE practices gradually improved.

Our impact

More resilience, better business

The client now has the resilience it needs to protect against revenue loss from major outages, leveraging technology in service of business continuity.

Greater confidence

By improving their review and response practices, the team gained confidence in deploying specific technical interventions to services when and where they were needed. We empowered the client’s team to enhance both operational efficiency and customer experience.

Teamwork. Unlocked.

The experience and expertise of the Nearform team delivered a solution that quickly enabled the internal team to align and prioritise tasks, giving them control over their operations.

40+ million

connections across broadband, mobile, TV and home phone

6 weeks

concept ideation to dashboard implementation

Our capabilities

ENGINEERING

Site Reliability Engineering
Developer experience
Nearform's Accelerate framework
Digital resilience
