IsUp - Infrastructure Recommendations

UptimeRobot's Approach (What They Do)

Based on IP analysis and their recent architecture migration:

┌─────────────────────────────────────────────────────────────────────┐
│                    UptimeRobot Infrastructure                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   Multi-Cloud VM Strategy (Traditional)                              │
│                                                                      │
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │
│   │     AWS      │  │ DigitalOcean │  │   Hetzner    │             │
│   │  (Premium)   │  │   (Value)    │  │   (Budget)   │             │
│   ├──────────────┤  ├──────────────┤  ├──────────────┤             │
│   │ US-East      │  │ NYC, AMS     │  │ Germany      │             │
│   │ US-West      │  │ SGP, SYD     │  │ Finland      │             │
│   │ EU-Frankfurt │  │              │  │              │             │
│   │ AP-Tokyo     │  │              │  │              │             │
│   └──────────────┘  └──────────────┘  └──────────────┘             │
│                                                                      │
│   ~110 checker IPs across 4 regions                                 │
│   Traditional VMs running custom monitoring software                │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Their Stack:

PHP + Node.js backend
MySQL database
Redis caching
Multi-cloud VMs for monitoring nodes
Moved FROM dedicated servers TO cloud (2024-2025)

What Should IsUp Use?

Option 1: Cloudflare Workers (Recommended for MVP)

Best for: Fast time-to-market, lowest ops burden, global edge by default

┌─────────────────────────────────────────────────────────────────────┐
│                    Cloudflare Workers Architecture                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │              Cloudflare Edge (300+ locations)                │   │
│   │                                                              │   │
│   │   ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐          │   │
│   │   │ Worker  │ │ Worker  │ │ Worker  │ │ Worker  │  ...     │   │
│   │   │ US-East │ │ EU-West │ │ Asia    │ │ Oceania │          │   │
│   │   └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘          │   │
│   │        │           │           │           │                │   │
│   │        └───────────┴─────┬─────┴───────────┘                │   │
│   │                          │                                  │   │
│   └──────────────────────────┼──────────────────────────────────┘   │
│                              │                                      │
│                              ▼                                      │
│   ┌──────────────────────────────────────────────────────────────┐ │
│   │                     Central Services                          │ │
│   │  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐    │ │
│   │  │  Vercel   │ │   Neon    │ │  Upstash  │ │ Tinybird  │    │ │
│   │  │   (App)   │ │(Postgres) │ │  (Redis)  │ │(Analytics)│    │ │
│   │  └───────────┘ └───────────┘ └───────────┘ └───────────┘    │ │
│   └──────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

Pros:

300+ edge locations (vs UptimeRobot's ~4 regions)
Zero server management
Built-in cron triggers for scheduled checks
Extremely low latency globally
$5/month for 10M requests
Durable Objects for state management
KV storage for configuration

Cons:

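Per-invocation CPU time limit (10 ms on the free plan, much higher on the paid plan); time spent waiting on the network does not count, so HTTP checks fit comfortably
No ICMP, and TCP only through the Workers connect() socket API (ping needs another platform; port checks need workarounds)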
Vendor lock-in to Cloudflare

Cost Estimate:

Component	Monthly Cost
Workers (10M requests)	$5
KV Storage	$5
Durable Objects	$5-10
Total	$15-20
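
The scheduled HTTP checks that Option 1 relies on might be structured roughly as below. This is a minimal sketch, not a definitive implementation: the names Monitor, CheckResult, evaluateResponse and runHttpCheck are hypothetical, and the fetch function is passed in as a parameter so the logic is not tied to the Workers runtime.

```typescript
// Hypothetical shapes for illustration; not part of any Cloudflare API.
interface Monitor {
  url: string;
  expectedStatus: number; // e.g. 200
  keyword?: string;       // optional text that must appear in the body
  maxResponseMs: number;  // slower responses are reported as down
}

interface CheckResult {
  up: boolean;
  reason: string;
  elapsedMs: number;
}

// Minimal subset of a fetch response that the check needs.
interface MinimalResponse {
  status: number;
  text(): Promise<string>;
}
type Fetcher = (url: string) => Promise<MinimalResponse>;

// Pure decision logic: given what came back, is the monitor up?
function evaluateResponse(
  monitor: Monitor,
  status: number,
  body: string,
  elapsedMs: number,
): CheckResult {
  if (status !== monitor.expectedStatus) {
    return { up: false, reason: `unexpected status ${status}`, elapsedMs };
  }
  if (elapsedMs > monitor.maxResponseMs) {
    return { up: false, reason: "too slow", elapsedMs };
  }
  if (monitor.keyword !== undefined && body.indexOf(monitor.keyword) === -1) {
    return { up: false, reason: "keyword missing", elapsedMs };
  }
  return { up: true, reason: "ok", elapsedMs };
}

// Performs one check. Network errors are reported as "down" rather than thrown.
async function runHttpCheck(fetcher: Fetcher, monitor: Monitor): Promise<CheckResult> {
  const started = Date.now();
  try {
    const response = await fetcher(monitor.url);
    const body = await response.text();
    return evaluateResponse(monitor, response.status, body, Date.now() - started);
  } catch (err) {
    return {
      up: false,
      reason: `request failed: ${String(err)}`,
      elapsedMs: Date.now() - started,
    };
  }
}
```

In a Worker, runHttpCheck would presumably be called from the scheduled handler that a cron trigger invokes, with the runtime's own fetch wrapped as the Fetcher. Keeping evaluateResponse pure makes the up/down rules testable without any network access.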

Option 2: Fly.io (Recommended for Full Control)

Best for: Need TCP/UDP/ICMP, want containers, more control

┌─────────────────────────────────────────────────────────────────────┐
│                       Fly.io Architecture                            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   ┌─────────────────────────────────────────────────────────────┐   │
│   │                  Fly.io Global Network                       │   │
│   │                                                              │   │
│   │   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │   │
│   │   │   iad       │ │   ams       │ │   nrt       │           │   │
│   │   │ (Virginia)  │ │ (Amsterdam) │ │  (Tokyo)    │           │   │
│   │   │             │ │             │ │             │           │   │
│   │   │ 2x checker  │ │ 2x checker  │ │ 2x checker  │           │   │
│   │   │ containers  │ │ containers  │ │ containers  │           │   │
│   │   └─────────────┘ └─────────────┘ └─────────────┘           │   │
│   │                                                              │   │
│   │   ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │   │
│   │   │   syd       │ │   gru       │ │   lhr       │           │   │
│   │   │  (Sydney)   │ │ (Sao Paulo) │ │  (London)   │           │   │
│   │   └─────────────┘ └─────────────┘ └─────────────┘           │   │
│   │                                                              │   │
│   └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Pros:

Full Docker containers (any language, any protocol)
Can do ICMP ping, TCP port checks natively
30+ regions available
Easy horizontal scaling
Persistent volumes for local state
Built-in private networking

Cons:

More ops overhead than serverless
Containers run 24/7 (vs pay-per-invocation)
Need to manage scaling yourself

Cost Estimate:

Component	Monthly Cost
6 regions x 2 shared-cpu-1x	$30-60
Fly Postgres	$15
Total	$45-75

Option 3: Hybrid (Best of Both Worlds)

Best for: Production-ready, handles all monitor types

┌─────────────────────────────────────────────────────────────────────┐
│                     Hybrid Architecture                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│   HTTP/HTTPS/SSL Checks          TCP/ICMP/Port Checks               │
│   ┌───────────────────┐          ┌───────────────────┐              │
│   │ Cloudflare Workers│          │     Fly.io        │              │
│   │                   │          │                   │              │
│   │ • Fast & cheap    │          │ • Full protocol   │              │
│   │ • 300+ locations  │          │   support         │              │
│   │ • HTTP only       │          │ • 6-10 regions    │              │
│   └─────────┬─────────┘          └─────────┬─────────┘              │
│             │                              │                        │
│             └──────────────┬───────────────┘                        │
│                            │                                        │
│                            ▼                                        │
│             ┌──────────────────────────────┐                        │
│             │     Central API (Vercel)     │                        │
│             │     + Neon + Upstash         │                        │
│             └──────────────────────────────┘                        │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
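
The split in the diagram above comes down to a routing decision per monitor type. A minimal sketch, assuming hypothetical type names (MonitorType, Backend) and backend labels that are not defined anywhere in this document:

```typescript
// Hypothetical monitor types and backend labels, for illustration only.
type MonitorType = "http" | "keyword" | "ssl" | "dns" | "port" | "ping";
type Backend = "cloudflare-workers" | "fly-io";

interface Monitor {
  id: string;
  type: MonitorType;
}

// HTTP-family checks go to the edge; checks that need raw sockets or ICMP
// go to the container-based checkers.
function backendFor(type: MonitorType): Backend {
  switch (type) {
    case "http":
    case "keyword":
    case "ssl":
    case "dns": // assumed to run over DNS-over-HTTPS, so it stays on Workers
      return "cloudflare-workers";
    case "port":
    case "ping":
      return "fly-io";
  }
}

// Groups a batch of monitors by the backend that should run them.
function partition(monitors: Monitor[]): Record<Backend, Monitor[]> {
  const out: Record<Backend, Monitor[]> = { "cloudflare-workers": [], "fly-io": [] };
  for (const m of monitors) {
    out[backendFor(m.type)].push(m);
  }
  return out;
}
```

The central API would call something like partition before dispatching work, so adding a new monitor type only means adding one case to backendFor.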

Detailed Comparison

Factor	Cloudflare Workers	Fly.io	AWS/DO/Hetzner (UptimeRobot style)
Setup Time	Hours	Days	Weeks
Ops Burden	Minimal	Low	High
Regions	300+	30+	DIY (3-10 typically)
HTTP Checks	✅ Excellent	✅ Good	✅ Good
TCP/Port	⚠️ Workarounds	✅ Native	✅ Native
ICMP Ping	❌ No	✅ Native	✅ Native
Cost (MVP)	$15-30/mo	$50-100/mo	$100-300/mo
Cost (Scale)	$100-500/mo	$200-500/mo	$500-2000/mo
Scaling	Auto	Manual	Manual
Vendor Lock-in	High	Medium	Low

My Recommendation

Phase 1 (MVP): Cloudflare Workers Only

Start with Cloudflare Workers for everything. Accept limitations:

HTTP/HTTPS monitoring: ✅ Perfect
Keyword monitoring: ✅ Perfect
SSL monitoring: ⚠️ An invalid cert makes the HTTPS request fail, but expiry dates are not available from fetch()
DNS monitoring: ⚠️ Use DNS-over-HTTPS APIs
Ping monitoring: ❌ Skip for MVP
Port monitoring: ❌ Skip for MVP
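
For the DNS item above, a check over DNS-over-HTTPS could be sketched as follows. It assumes Cloudflare's public JSON resolver endpoint (https://cloudflare-dns.com/dns-query, requested with the header accept: application/dns-json); the function names dohUrl and dnsCheckPasses are hypothetical, and only the response fields the check reads are typed.

```typescript
// Subset of the JSON returned by DNS-over-HTTPS JSON resolvers.
interface DohAnswer {
  name: string;
  type: number; // numeric record type, e.g. 1 for A
  TTL: number;
  data: string; // record value, e.g. an IP address
}
interface DohResponse {
  Status: number; // DNS response code; 0 means NOERROR
  Answer?: DohAnswer[];
}

// Builds the query URL for a record lookup.
function dohUrl(hostname: string, recordType: string): string {
  return (
    "https://cloudflare-dns.com/dns-query" +
    `?name=${encodeURIComponent(hostname)}&type=${encodeURIComponent(recordType)}`
  );
}

// Hypothetical pass rule: the lookup succeeded, returned at least one record,
// and (if an expected value is given) one record matches it.
function dnsCheckPasses(response: DohResponse, expectedData?: string): boolean {
  if (response.Status !== 0) return false;
  const answers = response.Answer ?? [];
  if (answers.length === 0) return false;
  if (expectedData === undefined) return true;
  return answers.some((a) => a.data === expectedData);
}
```

Inside a Worker, the URL from dohUrl("example.com", "A") would be fetched with the accept header set, and the parsed JSON body handed to dnsCheckPasses.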

Why?

Ship faster
Lowest cost
Most users only need HTTP monitoring anyway (roughly 90%, an estimate)
Add Fly.io later for ping/port

Phase 2 (Growth): Add Fly.io for Advanced Checks

When users request ping/port monitoring:

Deploy Fly.io containers in 6 key regions
Route ping/port checks to Fly.io
Keep HTTP checks on Cloudflare Workers

Phase 3 (Scale): Evaluate Multi-Cloud

At scale (100k+ monitors), consider:

Adding Hetzner for cost optimization in EU
Adding more regions based on customer demand
Potentially moving to Kubernetes for flexibility

Infrastructure Stack Summary

Recommended Production Stack

Component	Service	Why
App Hosting	Vercel	Easy deploys, great DX, auto-scaling
Database	Neon (Postgres)	Serverless, scales to zero, branching
Cache/Queue	Upstash Redis	Serverless, per-request pricing
HTTP Monitors	Cloudflare Workers	300+ locations, dirt cheap
TCP/Ping Monitors	Fly.io	Full protocol support
Time-Series	Tinybird or ClickHouse Cloud	Fast analytics at scale
Email	Resend	Modern, great API
SMS	Twilio	Reliable, global
Secrets	Vercel/Infisical	Secure env management
Monitoring	Axiom + Sentry	Logs + errors

Alternative: Self-Hosted Stack

If you prefer more control or lower cost at scale:

Component	Service	Why
App Hosting	Fly.io or Railway	Full control, predictable pricing
Database	Fly Postgres or Supabase	Managed, good DX
Cache/Queue	Fly Redis or Dragonfly	Self-managed but cheap
Monitors	Fly.io (all regions)	Single platform
Time-Series	Self-hosted ClickHouse	Cheapest at scale

Cost Projections

Serverless Stack (Recommended)

Scale	Monitors	Checks/day	Monthly Cost
MVP	1,000	300k	$50-100
Growth	10,000	3M	$150-300
Scale	100,000	30M	$500-1,500
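
The Checks/day column is consistent with each monitor being checked about every 5 minutes (288 checks per monitor per day, rounded to 300). The interval is an assumption, as is the 30-day month; the table states neither. A small sketch of the arithmetic:

```typescript
// Assumed, not stated in the table: a 5-minute interval and a 30-day month.
const ASSUMED_INTERVAL_MINUTES = 5;
const DAYS_PER_MONTH = 30;

// Checks per day for a given monitor count and check interval.
function checksPerDay(monitors: number, intervalMinutes: number): number {
  return monitors * ((24 * 60) / intervalMinutes);
}

// Checks per month, which is what request-based pricing is billed on.
function checksPerMonth(monitors: number, intervalMinutes: number): number {
  return checksPerDay(monitors, intervalMinutes) * DAYS_PER_MONTH;
}

// The three tiers from the table above.
const tiers = [1000, 10000, 100000].map((monitors) => ({
  monitors,
  perDay: checksPerDay(monitors, ASSUMED_INTERVAL_MINUTES),
  perMonth: checksPerMonth(monitors, ASSUMED_INTERVAL_MINUTES),
}));
// tiers[0] is { monitors: 1000, perDay: 288000, perMonth: 8640000 }
```

Under these assumptions the MVP tier comes to about 8.6M requests per month, inside the 10M requests that the $5 Workers plan covers, while the Growth tier (about 86M per month) is well past it. Checking each monitor from several locations would multiply these figures accordingly.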

Self-Hosted Stack

Scale	Monitors	Monthly Cost
MVP	1,000	$100-150
Growth	10,000	$200-400
Scale	100,000	$800-1,500

Key Differences from UptimeRobot

Aspect	UptimeRobot	IsUp (Recommended)
Edge Locations	~4 regions	300+ (Cloudflare)
Architecture	Traditional VMs	Serverless edge
Check Latency	Higher (fewer nodes)	Lower (edge)
Scaling	Manual	Automatic
Ops Burden	High	Minimal
Protocol Support	Full	HTTP-first (add TCP later)
Cost Efficiency	Medium	High

Next Steps

Set up Cloudflare Workers project for monitoring
Deploy Vercel app with Neon database
Implement HTTP monitoring first
Add Fly.io when ping/port is needed
Scale regions based on customer demand