Case Study

SENTINEL/OPS

Production API health monitoring & incident management dashboard

Live monitoring dashboard deployed on Railway + Vercel, processing 1,200+ endpoint checks per minute with sub-millisecond aggregate queries and real-time WebSocket event streaming.

Architecture

Data flow from user interface through API layer to persistence and cloud deployment

The problem

Support engineers too often discover incidents through customer reports rather than internal monitoring. SENTINEL/OPS was built to demonstrate what production-grade API health tooling looks like: real polling, real alerting, real SLA tracking, not a mocked dashboard.

Architecture decisions

The core insight was separating concerns into four discrete BullMQ queues: scheduling, HTTP probing, rule evaluation, and notification. This means the notifier never sees raw HTTP responses and the checker never writes alert logic.

Scheduler enqueues check jobs at configurable intervals per endpoint
Checker performs HTTP probe, writes result to TimescaleDB hypertable
Checker also calls broadcast() for low-latency WebSocket push
Evaluator reads continuous aggregate (not raw rows) for rule evaluation
Notifier fires Twilio SMS and/or SendGrid email based on alert config
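
The stage boundaries above can be sketched in a few lines. This is a minimal in-memory model, not the project's code: `InMemoryQueue`, `buildPipeline`, the payload types, and the injected `probe` function are all hypothetical names, and the hand-off from checker to evaluator is an assumption made for the sketch. A real deployment would use BullMQ queues and workers backed by Redis. What it shows is that each stage only sees the payload type of its own queue, so the notifier receives alert events and never a raw HTTP response.

```typescript
// Hypothetical payload types, one per queue (names are assumptions).
type ScheduleTick = { endpointId: string; url: string };
type CheckJob = { endpointId: string; url: string };
type EvaluateJob = { endpointId: string };
type AlertEvent = { endpointId: string; message: string };
type CheckResult = { endpointId: string; ok: boolean; latencyMs: number };

// Tiny synchronous stand-in for a queue with a single worker.
class InMemoryQueue<T> {
  private handler: ((payload: T) => void) | null = null;
  process(handler: (payload: T) => void): void {
    this.handler = handler;
  }
  add(payload: T): void {
    if (this.handler) this.handler(payload);
  }
}

function buildPipeline(
  probe: (url: string) => { ok: boolean; latencyMs: number }, // injected HTTP probe
  errorRateThreshold: number,
) {
  const scheduleQueue = new InMemoryQueue<ScheduleTick>();
  const checkQueue = new InMemoryQueue<CheckJob>();
  const evaluateQueue = new InMemoryQueue<EvaluateJob>();
  const notifyQueue = new InMemoryQueue<AlertEvent>();

  const results: CheckResult[] = []; // stands in for the hypertable
  const broadcasts: CheckResult[] = []; // stands in for WebSocket pushes
  const sent: AlertEvent[] = []; // stands in for SMS / email delivery

  // Scheduler: turns a tick into a check job, nothing else.
  scheduleQueue.process((tick) => {
    checkQueue.add({ endpointId: tick.endpointId, url: tick.url });
  });

  // Checker: probes, stores, broadcasts. It contains no alert logic.
  checkQueue.process((job) => {
    const result: CheckResult = { endpointId: job.endpointId, ...probe(job.url) };
    results.push(result);
    broadcasts.push(result);
    evaluateQueue.add({ endpointId: job.endpointId });
  });

  // Evaluator: works from a summary (standing in for the continuous aggregate).
  evaluateQueue.process((job) => {
    const rows = results.filter((r) => r.endpointId === job.endpointId);
    const errorRate = rows.filter((r) => !r.ok).length / rows.length;
    if (errorRate > errorRateThreshold) {
      notifyQueue.add({
        endpointId: job.endpointId,
        message: `error rate ${errorRate.toFixed(2)} exceeds ${errorRateThreshold}`,
      });
    }
  });

  // Notifier: only ever receives AlertEvent payloads.
  notifyQueue.process((alert) => {
    sent.push(alert);
  });

  return { scheduleQueue, results, broadcasts, sent };
}
```

Injecting `probe` keeps the sketch free of network calls; swapping in a failing probe is enough to exercise the alert path end to end.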

TimescaleDB as the backbone

TimescaleDB's continuous aggregates pre-compute 1-minute windowed averages of latency, uptime, and error rate. Dashboard queries read the `check_results_1min` materialized view instead of raw rows, so they stay sub-millisecond as raw data volume grows. The generated `mttr_minutes` column calculates each incident's time to recovery as soon as `resolved_at` is written, so MTTR needs no application-layer math.
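
To make the pre-computed values concrete, here is a rough TypeScript sketch of the arithmetic that a 1-minute aggregate and a recovery-time column would perform. In the project this work happens inside the database; the sketch only mirrors the math in application code for illustration. The row shape, the function names `aggregateByMinute` and `mttrMinutes`, the `openedAt` field, and the definition of uptime as the share of successful checks are all assumptions.

```typescript
// Hypothetical shape of one raw check row.
type CheckRow = { endpointId: string; checkedAt: Date; latencyMs: number; ok: boolean };

// Hypothetical shape of one row in a 1-minute rollup.
type MinuteBucket = {
  endpointId: string;
  bucket: string; // ISO timestamp of the start of the minute
  avgLatencyMs: number;
  uptimePct: number; // assumed here to be the share of successful checks
  errorRate: number; // failed checks divided by total checks
};

const MS_PER_MINUTE = 60000;

function aggregateByMinute(rows: CheckRow[]): MinuteBucket[] {
  const groups = new Map<string, { endpointId: string; bucket: string; rows: CheckRow[] }>();

  for (const row of rows) {
    // Truncate the timestamp to the start of its minute.
    const startMs = Math.floor(row.checkedAt.getTime() / MS_PER_MINUTE) * MS_PER_MINUTE;
    const bucket = new Date(startMs).toISOString();
    const key = `${row.endpointId}|${bucket}`;
    const group = groups.get(key) ?? { endpointId: row.endpointId, bucket, rows: [] };
    group.rows.push(row);
    groups.set(key, group);
  }

  const out: MinuteBucket[] = [];
  groups.forEach((group) => {
    const total = group.rows.length;
    const failures = group.rows.filter((r) => !r.ok).length;
    const latencySum = group.rows.reduce((sum, r) => sum + r.latencyMs, 0);
    out.push({
      endpointId: group.endpointId,
      bucket: group.bucket,
      avgLatencyMs: latencySum / total,
      uptimePct: ((total - failures) / total) * 100,
      errorRate: failures / total,
    });
  });
  return out;
}

// Recovery time for one incident, in minutes; null while the incident is open.
function mttrMinutes(openedAt: Date, resolvedAt: Date | null): number | null {
  if (resolvedAt === null) return null;
  return (resolvedAt.getTime() - openedAt.getTime()) / MS_PER_MINUTE;
}
```

The dashboard benefit comes from reading one small row per endpoint per minute rather than scanning every raw check, which is why the rollup is maintained ahead of query time.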

Testing strategy

The 64-test suite covers the full stack: Vitest unit tests for SLA utility functions, Zustand store action tests, and API route tests driven through supertest against mocked BullMQ and TimescaleDB clients. The goal was to test contract boundaries, not implementation details.
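
As an illustration of the kind of pure SLA utility such unit tests might target, here is a small sketch. The function names `allowedDowntimeMinutes` and `isSlaBreached` and the error-budget formula are assumptions, not the project's actual helpers. A contract-level test would assert only on inputs and outputs like these, never on how the value is computed internally.

```typescript
// Hypothetical SLA helpers; names and formulas are assumptions for illustration.

// Minutes of downtime that an uptime target permits over a window of days.
function allowedDowntimeMinutes(targetPct: number, windowDays: number): number {
  if (targetPct < 0 || targetPct > 100) {
    throw new RangeError("targetPct must be between 0 and 100");
  }
  const windowMinutes = windowDays * 24 * 60;
  return (windowMinutes * (100 - targetPct)) / 100;
}

// True when observed downtime has used up more than the permitted budget.
function isSlaBreached(downtimeMinutes: number, targetPct: number, windowDays: number): boolean {
  return downtimeMinutes > allowedDowntimeMinutes(targetPct, windowDays);
}

// A 99.9% target over 30 days allows roughly 43.2 minutes of downtime,
// so 50 minutes is a breach and 10 minutes is not.
const breached = isSlaBreached(50, 99.9, 30); // true
const withinBudget = isSlaBreached(10, 99.9, 30); // false
```

Because the helpers are pure, a Vitest suite could cover them with plain input and output assertions and no mocks.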
