IsUp - Infrastructure Recommendations

UptimeRobot's Approach (What They Do)

Based on IP analysis and their recent architecture migration:

┌─────────────────────────────────────────────────────────────────────┐
│                    UptimeRobot Infrastructure                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   Multi-Cloud VM Strategy (Traditional)                              │
│                                                                      │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │
│   │     AWS      │  │ DigitalOcean │  │   Hetzner    │             │
│   │  (Premium)   │  │   (Value)    │  │   (Budget)   │             │
│   ├──────────────┤  ├──────────────┤  ├──────────────┤             │
│   │ US-East      │  │ NYC, AMS     │  │ Germany      │             │
│   │ US-West      │  │ SGP, SYD     │  │ Finland      │             │
│   │ EU-Frankfurt │  │              │  │              │             │
│   │ AP-Tokyo     │  │              │  │              │             │
│   └──────────────┘  └──────────────┘  └──────────────┘             │
│                                                                      │
│   ~110 checker IPs across 4 regions                                 │
│   Traditional VMs running custom monitoring software                │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Their Stack:

  • PHP + Node.js backend
  • MySQL database
  • Redis caching
  • Multi-cloud VMs for monitoring nodes
  • Moved FROM dedicated servers TO cloud (2024-2025)

What Should IsUp Use?

Option 1: Cloudflare Workers

Best for: Fast time-to-market, lowest ops burden, global edge by default

┌─────────────────────────────────────────────────────────────────────┐
│                    Cloudflare Workers Architecture                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │              Cloudflare Edge (300+ locations)                │   │
│   │                                                              │   │
│   │   ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐          │   │
│   │   │ Worker  │ │ Worker  │ │ Worker  │ │ Worker  │  ...     │   │
│   │   │ US-East │ │ EU-West │ │ Asia    │ │ Oceania │          │   │
│   │   └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘          │   │
│   │        │           │           │           │                │   │
│   │        └───────────┴─────┬─────┴───────────┘                │   │
│   │                          │                                  │   │
│   └──────────────────────────┼──────────────────────────────────┘   │
│                              │                                      │
│                              ▼                                      │
│   ┌──────────────────────────────────────────────────────────────┐ │
│   │                     Central Services                          │ │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐     │ │
│   │  │  Vercel  │  │   Neon   │  │ Upstash  │  │ Tinybird │     │ │
│   │  │  (App)   │  │ (Postgres)│  │ (Redis)  │  │(Analytics)│    │ │
│   │  └──────────┘  └──────────┘  └──────────┘  └──────────┘     │ │
│   └──────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

Pros:

  • 300+ edge locations (vs UptimeRobot's ~4 regions)
  • Zero server management
  • Built-in cron triggers for scheduled checks
  • Extremely low latency globally
  • $5/month for 10M requests
  • Durable Objects for state management
  • KV storage for configuration

Cons:

  • 50ms CPU time limit per request (fine for HTTP checks, since time spent waiting on the network does not count as CPU time)
  • Can't do raw TCP/ICMP (need workarounds for ping/port)
  • Vendor lock-in to Cloudflare
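
The cron-trigger approach listed under Pros implies a scheduler tick that decides which monitors to check on each run. Below is a minimal sketch of that selection logic. The Monitor record and its field names are hypothetical and not part of the original design. A Worker would normally be written in JavaScript or TypeScript; Python is used here only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    """Hypothetical monitor record; field names are illustrative only."""
    url: str
    interval_seconds: int
    last_checked_at: float  # Unix timestamp of the previous check, 0 if never


def due_monitors(monitors: list[Monitor], now: float) -> list[Monitor]:
    """Return the monitors whose check interval has elapsed as of `now`."""
    return [m for m in monitors if now - m.last_checked_at >= m.interval_seconds]


monitors = [
    Monitor("https://example.com", interval_seconds=60, last_checked_at=1_000.0),
    Monitor("https://example.org", interval_seconds=300, last_checked_at=1_000.0),
]

# One minute after the last run, only the 60-second monitor is due.
due = due_monitors(monitors, now=1_060.0)
print([m.url for m in due])
```

Running the tick once per minute and filtering this way lets a single cron schedule serve monitors with different intervals.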

Cost Estimate:

Component               Monthly Cost
----------------------  ------------
Workers (10M requests)  $5
KV Storage              $5
Durable Objects         $5-10
Total                   $15-20
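
To see how far the 10M included requests go, the arithmetic can be sketched as follows. It assumes one Worker request per check, a 5-minute check interval, and a 30-day month; none of these assumptions come from the estimate above.

```python
INCLUDED_REQUESTS = 10_000_000  # from the "$5/month for 10M requests" figure


def monthly_requests(monitors: int, interval_minutes: int, days: int = 30) -> int:
    """Requests per month, assuming one Worker request per check."""
    checks_per_day = (24 * 60) // interval_minutes
    return monitors * checks_per_day * days


for monitors in (1_000, 10_000):
    used = monthly_requests(monitors, interval_minutes=5)
    print(monitors, used, used <= INCLUDED_REQUESTS)
```

Under these assumptions, 1,000 monitors use 8,640,000 requests per month and fit within the included quota, while 10,000 monitors use 86,400,000 and do not.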

Option 2: Fly.io

Best for: Need TCP/UDP/ICMP, want containers, more control

┌─────────────────────────────────────────────────────────────────────┐
│                       Fly.io Architecture                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                  Fly.io Global Network                       │   │
│   │                                                              │   │
│   │   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │   │
│   │   │   iad       │ │   ams       │ │   nrt       │           │   │
│   │   │ (Virginia)  │ │ (Amsterdam) │ │  (Tokyo)    │           │   │
│   │   │             │ │             │ │             │           │   │
│   │   │ 2x checker  │ │ 2x checker  │ │ 2x checker  │           │   │
│   │   │ containers  │ │ containers  │ │ containers  │           │   │
│   │   └─────────────┘ └─────────────┘ └─────────────┘           │   │
│   │                                                              │   │
│   │   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │   │
│   │   │   syd       │ │   gru       │ │   lhr       │           │   │
│   │   │  (Sydney)   │ │ (Sao Paulo) │ │  (London)   │           │   │
│   │   └─────────────┘ └─────────────┘ └─────────────┘           │   │
│   │                                                              │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Pros:

  • Full Docker containers (any language, any protocol)
  • Can do ICMP ping, TCP port checks natively
  • 30+ regions available
  • Easy horizontal scaling
  • Persistent volumes for local state
  • Built-in private networking

Cons:

  • More ops overhead than serverless
  • Containers run 24/7 (vs pay-per-invocation)
  • Need to manage scaling yourself
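
As an illustration of the native TCP port checks a container makes possible, here is a minimal sketch using only the Python standard library. The function name and the shape of the result are assumptions for illustration. ICMP ping is not covered, because it typically needs raw sockets and elevated privileges.

```python
import socket
import time


def check_tcp_port(host: str, port: int, timeout: float = 5.0) -> dict:
    """Attempt a TCP connect; report success, latency, and any error text."""
    started = time.monotonic()
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            up = True
            error = None
    except OSError as exc:  # refused, timed out, unreachable, or DNS failure
        up = False
        error = str(exc)
    latency_ms = (time.monotonic() - started) * 1000
    return {"up": up, "latency_ms": round(latency_ms, 1), "error": error}
```

A call such as check_tcp_port("example.com", 443) returns a dictionary whose "up" field says whether the port accepted a connection within the timeout.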

Cost Estimate:

Component                    Monthly Cost
---------------------------  ------------
6 regions x 2 shared-cpu-1x  $30-60
Fly Postgres                 $15
Total                        $45-75

Option 3: Hybrid (Best of Both Worlds)

Best for: Production-ready, handles all monitor types

┌─────────────────────────────────────────────────────────────────────┐
│                     Hybrid Architecture                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   HTTP/HTTPS/SSL Checks          TCP/ICMP/Port Checks               │
│   ┌───────────────────┐          ┌───────────────────┐              │
│   │ Cloudflare Workers│          │     Fly.io        │              │
│   │                   │          │                   │              │
│   │ • Fast & cheap    │          │ • Full protocol   │              │
│   │ • 300+ locations  │          │   support         │              │
│   │ • HTTP only       │          │ • 6-10 regions    │              │
│   └─────────┬─────────┘          └─────────┬─────────┘              │
│             │                              │                        │
│             └──────────────┬───────────────┘                        │
│                            │                                        │
│                            ▼                                        │
│             ┌──────────────────────────────┐                        │
│             │     Central API (Vercel)     │                        │
│             │     + Neon + Upstash         │                        │
│             └──────────────────────────────┘                        │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Detailed Comparison

Factor          Cloudflare Workers  Fly.io       AWS/DO/Hetzner (UptimeRobot style)
--------------  ------------------  -----------  ----------------------------------
Setup Time      Hours               Days         Weeks
Ops Burden      Minimal             Low          High
Regions         300+                30+          DIY (3-10 typically)
HTTP Checks     ✅ Excellent        ✅ Good      ✅ Good
TCP/Port        ⚠️ Workarounds      ✅ Native    ✅ Native
ICMP Ping       ❌ No               ✅ Native    ✅ Native
Cost (MVP)      $15-30/mo           $50-100/mo   $100-300/mo
Cost (Scale)    $100-500/mo         $200-500/mo  $500-2000/mo
Scaling         Auto                Manual       Manual
Vendor Lock-in  High                Medium       Low

My Recommendation

Phase 1 (MVP): Cloudflare Workers Only

Start with Cloudflare Workers for everything and accept the limitations:

  • HTTP/HTTPS monitoring: ✅ Perfect
  • Keyword monitoring: ✅ Perfect
  • SSL monitoring: ✅ Can check certs via HTTPS
  • DNS monitoring: ⚠️ Use DNS-over-HTTPS APIs
  • Ping monitoring: ❌ Skip for MVP
  • Port monitoring: ❌ Skip for MVP
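
The DNS-over-HTTPS workaround in the list above can be sketched as follows. The choice of Cloudflare's public JSON resolver endpoint is an assumption, and the function names and pass criterion are hypothetical. The parsing step is kept separate from the network call so it can be tested offline.

```python
import json
import urllib.parse
import urllib.request

DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"  # assumed resolver choice


def build_doh_url(name: str, record_type: str = "A") -> str:
    """Build the JSON-API query URL for one name and record type."""
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return f"{DOH_ENDPOINT}?{query}"


def parse_doh_answers(payload: dict) -> list[str]:
    """Extract record data from a DoH JSON reply; Status 0 means NOERROR."""
    if payload.get("Status") != 0:
        return []
    return [answer["data"] for answer in payload.get("Answer", [])]


def resolve(name: str, record_type: str = "A", timeout: float = 5.0) -> list[str]:
    """Perform the lookup over HTTPS (needs network access)."""
    request = urllib.request.Request(
        build_doh_url(name, record_type),
        headers={"Accept": "application/dns-json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_doh_answers(json.load(response))


def dns_check_passes(answers: list[str], expected: set[str]) -> bool:
    """Pass when at least one expected value appears in the answers."""
    return bool(expected & set(answers))
```

A monitor would call resolve() for the configured name and feed the result to dns_check_passes() together with the values the user expects.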

Why?

  • Ship faster
  • Lowest cost
  • 90% of users only need HTTP monitoring anyway
  • Add Fly.io later for ping/port

Phase 2 (Growth): Add Fly.io for Advanced Checks

When users request ping/port monitoring:

  • Deploy Fly.io containers in 6 key regions
  • Route ping/port checks to Fly.io
  • Keep HTTP checks on Cloudflare Workers
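
The routing rule in this phase can be sketched as a small dispatch function. The monitor type names and backend labels are hypothetical; the split follows the Phase 1 list (HTTP, keyword, SSL, and DNS on Workers) and this phase (ping and port on Fly.io).

```python
WORKER_TYPES = {"http", "https", "keyword", "ssl", "dns"}
FLY_TYPES = {"ping", "port"}


def route_check(monitor_type: str) -> str:
    """Pick the checker backend for a monitor type."""
    kind = monitor_type.lower()
    if kind in WORKER_TYPES:
        return "cloudflare-workers"
    if kind in FLY_TYPES:
        return "fly"
    raise ValueError(f"unsupported monitor type: {monitor_type}")


for kind in ("https", "keyword", "ping", "port"):
    print(kind, "->", route_check(kind))
```

Keeping the mapping in one place means a new monitor type only needs an entry in one of the two sets.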

Phase 3 (Scale): Evaluate Multi-Cloud

At scale (100k+ monitors), consider:

  • Adding Hetzner for cost optimization in EU
  • Adding more regions based on customer demand
  • Potentially moving to Kubernetes for flexibility

Infrastructure Stack Summary

Component          Service                       Why
-----------------  ----------------------------  -------------------------------------
App Hosting        Vercel                        Easy deploys, great DX, auto-scaling
Database           Neon (Postgres)               Serverless, scales to zero, branching
Cache/Queue        Upstash Redis                 Serverless, per-request pricing
HTTP Monitors      Cloudflare Workers            300+ locations, dirt cheap
TCP/Ping Monitors  Fly.io                        Full protocol support
Time-Series        Tinybird or ClickHouse Cloud  Fast analytics at scale
Email              Resend                        Modern, great API
SMS                Twilio                        Reliable, global
Secrets            Vercel/Infisical              Secure env management
Monitoring         Axiom + Sentry                Logs + errors

Alternative: Self-Hosted Stack

If you prefer more control or lower cost at scale:

Component    Service                   Why
-----------  ------------------------  ---------------------------------
App Hosting  Fly.io or Railway         Full control, predictable pricing
Database     Fly Postgres or Supabase  Managed, good DX
Cache/Queue  Fly Redis or Dragonfly    Self-managed but cheap
Monitors     Fly.io (all regions)      Single platform
Time-Series  Self-hosted ClickHouse    Cheapest at scale

Cost Projections

Recommended Stack

Scale   Monitors  Checks/day  Monthly Cost
------  --------  ----------  ------------
MVP     1,000     300k        $50-100
Growth  10,000    3M          $150-300
Scale   100,000   30M         $500-1,500
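
Two figures can be derived from the first cost projections table above: the check interval implied by the checks/day column, and the cost per million checks at each scale. This is a small worked sketch assuming a 30-day month; the function names are illustrative.

```python
# (scale, monitors, checks_per_day, (low_usd, high_usd)) copied from the table
PROJECTIONS = [
    ("MVP", 1_000, 300_000, (50, 100)),
    ("Growth", 10_000, 3_000_000, (150, 300)),
    ("Scale", 100_000, 30_000_000, (500, 1_500)),
]


def implied_interval_minutes(monitors: int, checks_per_day: int) -> float:
    """Minutes between checks for one monitor."""
    return 24 * 60 / (checks_per_day / monitors)


def cost_per_million_checks(checks_per_day: int, monthly_cost_usd: float,
                            days: int = 30) -> float:
    """Monthly cost divided by millions of checks per month."""
    return monthly_cost_usd / (checks_per_day * days / 1_000_000)


for scale, monitors, per_day, (low, high) in PROJECTIONS:
    interval = implied_interval_minutes(monitors, per_day)
    low_rate = cost_per_million_checks(per_day, low)
    high_rate = cost_per_million_checks(per_day, high)
    print(f"{scale}: every {interval:.1f} min, ${low_rate:.2f}-${high_rate:.2f} per 1M checks")
```

Every row implies a check roughly every 4.8 minutes per monitor, and the cost per million checks falls from about $5.56-$11.11 at MVP to about $0.56-$1.67 at Scale.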

Self-Hosted Stack

Scale   Monitors  Monthly Cost
------  --------  ------------
MVP     1,000     $100-150
Growth  10,000    $200-400
Scale   100,000   $800-1,500

Key Differences from UptimeRobot

Aspect            UptimeRobot          IsUp (Recommended)
----------------  -------------------  --------------------------
Edge Locations    ~4 regions           300+ (Cloudflare)
Architecture      Traditional VMs      Serverless edge
Check Latency     Higher (fewer nodes) Lower (edge)
Scaling           Manual               Automatic
Ops Burden        High                 Minimal
Protocol Support  Full                 HTTP-first (add TCP later)
Cost Efficiency   Medium               High

Next Steps

  1. Set up Cloudflare Workers project for monitoring
  2. Deploy Vercel app with Neon database
  3. Implement HTTP monitoring first
  4. Add Fly.io when ping/port is needed
  5. Scale regions based on customer demand
