Streamlining Escalation: A Modular System to Reduce Alert Fatigue and Accelerate Innovation @AlertOps


SETTING THE SCENE


People weren’t confused by the alerts - they were overwhelmed by how they were wired.


When I joined the AlertOps team, customers were telling us a consistent story: configuring alerts was complex. The product’s escalation policies lumped automation, outbound integrations and notifications into a single object. If you wanted to route a critical alert differently after business hours or send a custom message to stakeholders, you had to tangle with nested rules and dependencies. Many users gave up and relied on support engineers to manage configurations.


This problem isn’t unique to AlertOps. In complex IT environments, alert noise overwhelms teams and makes it difficult to spot serious problems. The lack of modularity also slowed down our ability to innovate. Adding new types of notifications or pre‑packaged playbooks required careful untangling of existing logic. We realized we needed to rethink how AlertOps handled notifications, both to reduce alert fatigue and to unlock future growth.

HUMAN IMPACT


On‑call engineers told us they felt “pinged for everything.” At 2 AM they’d wake up to multiple notifications about the same incident, all triggered from different parts of the workflow. Operations managers expressed frustration that they couldn’t hand off configuration to their teams, which slowed onboarding and sales cycles. When we mapped these pain points on an empathy map, we saw a common thread:

People wanted clarity and control - the ability to decide who gets notified, when and how, without wrestling with automation logic.



VISION & STRATEGY

From the outset, we set ambitious but focused goals:


  1. Reduce complexity by separating notifications from automation. Splitting responsibilities would allow users to build modular playbooks and notification policies.


  2. Accelerate innovation. A modular system would simplify the addition of new functionality and shorten the sales cycle by enabling pre‑packaged templates.


  3. Minimize alert fatigue and improve response times. By delivering fewer but more relevant notifications, we aimed to reduce mean time to acknowledge (MTTA) and mean time to resolve (MTTR). MTTA measures the average time it takes for a team to acknowledge an incident, while MTTR tracks how long it takes to fully resolve it.

We defined success metrics up front: a 30 % reduction in notifications sent per incident, a 20 % improvement in MTTA, a 10 % improvement in MTTR, and adoption by at least 80 % of administrators within three months.
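To make these targets concrete, here is a minimal sketch of how MTTA, MTTR and the relative improvement between two measurement periods could be computed. The `Incident` record and its field names are illustrative assumptions, not the AlertOps data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions for illustration.
    opened: datetime
    acknowledged: datetime
    resolved: datetime


def mtta(incidents: list) -> timedelta:
    """Mean time to acknowledge: average of (acknowledged - opened)."""
    return timedelta(
        seconds=mean((i.acknowledged - i.opened).total_seconds() for i in incidents)
    )


def mttr(incidents: list) -> timedelta:
    """Mean time to resolve: average of (resolved - opened)."""
    return timedelta(
        seconds=mean((i.resolved - i.opened).total_seconds() for i in incidents)
    )


def improvement(before: timedelta, after: timedelta) -> float:
    """Relative reduction between two periods, e.g. 0.2 for a 20 % improvement."""
    return (before - after) / before
```

Under this definition, an MTTA that drops from 5 minutes to 4 minutes is a 20 % improvement, which is how the MTTA target above would be read.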


OUR SOLUTION



Modular Architecture


We split the single escalation policy into two pieces:


  • Playbooks house workflows, outbound integrations and actions - automation lives here. Each playbook lists its name, service‑level agreement (SLA) and status. Tabs for Settings, Workflows, Outbound Integrations, Outbound Actions, Usage and Access provide structure. Within a playbook, administrators can build workflows (e.g., Open Alert – High), assign reusable actions (Send Webhook, Update Status), and choose which notification policy to invoke.


  • Notification Policies focus solely on who gets notified and when. A table lists policies by name, type (User Preference or Centralized), priority and status. The editor presents the escalation chain - Primary → Secondary → Manager → Stakeholder - and allows administrators to pick contact methods (email, phone, SMS), set retry intervals and define how long to wait before escalating. Tabs for Rules, Communication, Usage and Access control more advanced behavior. Policies are reusable across playbooks, breaking the circular dependency between automation and notifications.

This separation allows us to pre‑package playbooks and notification templates as product tiers, shortens onboarding and simplifies the UI. It also opens the door to visual tools like playbook maps and “what‑if” simulations.
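The split can be pictured as two small data structures, with the playbook holding only a reference to a notification policy. This is a rough sketch under assumed names (`EscalationStep`, `NotificationPolicy`, `Playbook`, `step_at`); it is not the shipped schema, and the wait times are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    role: str            # e.g. "Primary", "Secondary", "Manager", "Stakeholder"
    methods: tuple       # contact methods, e.g. ("sms", "phone")
    wait_minutes: int    # how long to wait for an acknowledgement before escalating


@dataclass
class NotificationPolicy:
    """Who gets notified and when. Holds no automation logic."""
    name: str
    steps: list

    def step_at(self, minutes_unacknowledged: int) -> EscalationStep:
        """Return the step that is active after the given unacknowledged time."""
        elapsed = 0
        for step in self.steps:
            elapsed += step.wait_minutes
            if minutes_unacknowledged < elapsed:
                return step
        return self.steps[-1]  # the chain ends at the last step


@dataclass
class Playbook:
    """Automation lives here; notifications are delegated to a policy."""
    name: str
    workflows: list
    notification_policy: NotificationPolicy


# One policy reused by two playbooks (illustrative values).
after_hours = NotificationPolicy(
    name="After hours - critical",
    steps=[
        EscalationStep("Primary", ("sms", "phone"), 5),
        EscalationStep("Secondary", ("sms", "phone"), 10),
        EscalationStep("Manager", ("phone",), 15),
        EscalationStep("Stakeholder", ("email",), 0),
    ],
)
database_playbook = Playbook("Database", ["Open Alert – High"], after_hours)
network_playbook = Playbook("Network", ["Open Alert – High"], after_hours)
```

Because both playbooks point at the same policy object, changing the escalation chain in one place updates every playbook that uses it, and the policy never needs to know which playbook invoked it.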




CONSTRAINTS & TRADEOFFS


  • Migration: Existing escalation policies had to be cloned into a playbook and a notification policy to preserve workflows and communication settings.


  • RBAC: We rethought permissions to let administrators manage playbooks and notification policies separately.


  • Product tiers: Basic notification features are available to everyone, while advanced options (like dynamic recipient groups) remain premium.


  • Resource limits: We delivered a solid foundation first and postponed more ambitious ideas, such as drag‑and‑drop builders and AI‑generated policies.
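The migration constraint in the first bullet can be sketched as a function that clones one legacy escalation policy into a playbook plus a notification policy. All keys and the naming scheme below are hypothetical; the real legacy object and migration logic are not described in this case study.

```python
import copy

# Assumed field names for the legacy object (illustrative only).
AUTOMATION_KEYS = ("workflows", "outbound_integrations", "outbound_actions")
NOTIFICATION_KEYS = ("escalation_chain", "contact_methods", "retry_interval_minutes")


def split_escalation_policy(legacy: dict) -> tuple:
    """Clone a legacy escalation policy into (playbook, notification_policy).

    The legacy object is left untouched; both results are deep copies so
    later edits to one side cannot leak into the other.
    """
    policy = {"name": f"{legacy['name']} (notifications)"}
    for key in NOTIFICATION_KEYS:
        if key in legacy:
            policy[key] = copy.deepcopy(legacy[key])

    # The playbook refers to the policy by name instead of embedding it.
    playbook = {"name": legacy["name"], "notification_policy": policy["name"]}
    for key in AUTOMATION_KEYS:
        if key in legacy:
            playbook[key] = copy.deepcopy(legacy[key])

    return playbook, policy
```

Keeping the original object unmodified means a migration like this could be run and checked before the legacy policy is retired.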





IMPACT & REFLECTIONS

The new architecture was still pending release at the time of writing, so the figures here are targets rather than measured results. We expect it to cut notifications per incident by roughly 30 % and to reduce MTTA and MTTR by giving teams precise control over escalation paths. Modularity lets us roll out integrations and templates faster and gives customers the confidence to manage their configurations without expert help.


Looking ahead, we see opportunities to build playbook visualization and what‑if tools, offer curated templates as part of product packages, and explore AI‑assisted recommendations. Personally, this project taught me to balance big‑picture architecture with day‑to‑day constraints. By disentangling automation and notifications, my design makes AlertOps easier to use today and more flexible for tomorrow.