Migrating from Caddy to Traefik
This guide walks through migrating from the current Caddy setup to Traefik for dynamic multi-tenant proxy management.
Why Migrate?
Current Caddy Issues
- Configuration Conflicts: Mixing Caddyfile and API causes srv1/srv2 port conflicts
- Complex Sync Logic: Requires workarounds with PATCH/PUT operations
- Limited Scalability: Not designed for thousands of dynamic routes
- Poor Observability: Limited metrics and debugging capabilities
Traefik Benefits
- True API-First: Built for dynamic configuration
- No Conflicts: Single configuration source via HTTP provider
- Highly Scalable: Efficiently handles thousands of routes
- Observable: Built-in metrics, tracing, and dashboard
- Zero-Downtime: Updates without restarts
Architecture Comparison
Current (Caddy)
Convex → Sync API → Caddy JSON API ←→ Caddyfile
                         ↓
                     CONFLICTS!
New (Traefik)
Convex → API Endpoint → Traefik HTTP Provider
                              ↓
                      Clean JSON Config
Migration Steps
Phase 1: Setup (Day 1)
- Deploy Traefik alongside Caddy:
  cd apps/projects/local/traefik
  ./setup.sh
- Configure Environment:
  # Edit .env with your Cloudflare token
  vim .env
- Start Traefik:
  docker-compose up -d
- Verify Dashboard:
  - Visit: http://traefik.local.dev
  - Login: admin/changeme
Phase 2: Testing (Days 2-3)
- Test Configuration Endpoint:
  # Check if API returns valid config
  curl http://localhost:3010/api/traefik/config | jq
- Add Test Domain:
  - Create a test route in Convex pointing to Traefik
  - Update DNS for test domain to Traefik IP
  - Verify SSL certificate generation
- Monitor Performance:
  # Watch Traefik logs
  docker logs -f traefik-proxy
  # Check metrics
  curl http://localhost:8080/metrics
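Beyond checking that the endpoint returns parseable JSON, it can help to confirm that the payload is internally consistent before pointing DNS at Traefik. The TypeScript sketch below is one way to do that. `findConfigProblems` and the `DynamicConfigLike` type are hypothetical names, and the checks cover only the fields used in this guide (router rule, router service, service servers), not the full Traefik schema.

```typescript
// Loose view of the dynamic configuration; only the fields this guide uses.
interface DynamicConfigLike {
  http?: {
    routers?: Record<string, { rule?: string; service?: string }>;
    services?: Record<string, { loadBalancer?: { servers?: { url: string }[] } }>;
  };
}

// Returns human-readable problems; an empty array means the basic checks passed.
function findConfigProblems(config: DynamicConfigLike): string[] {
  const problems: string[] = [];
  const routers = config.http?.routers ?? {};
  const services = config.http?.services ?? {};

  for (const routerName of Object.keys(routers)) {
    const router = routers[routerName];
    if (!router) continue;
    if (!router.rule) problems.push(`${routerName}: missing rule`);
    if (!router.service) {
      problems.push(`${routerName}: missing service`);
    } else if (!(router.service in services)) {
      problems.push(`${routerName}: unknown service "${router.service}"`);
    }
  }

  for (const serviceName of Object.keys(services)) {
    const servers = services[serviceName]?.loadBalancer?.servers ?? [];
    if (servers.length === 0) problems.push(`${serviceName}: no servers defined`);
  }
  return problems;
}

// Example: a router that points at a service the payload does not define.
const sampleProblems = findConfigProblems({
  http: {
    routers: { "router-do-dev": { rule: "Host(`do.dev`)", service: "service-do-dev" } },
    services: {},
  },
});
console.log(sampleProblems);
```

In practice the argument would be the parsed response from `/api/traefik/config`; a non-empty result is a reason to fix the endpoint before moving any DNS records.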
Phase 3: Migration (Days 4-7)
- Batch Migration Strategy:
  // Suggested batch order
  const migrationBatches = [
    // Batch 1: Low traffic domains
    ['docs.dev', 'isup.dev'],
    // Batch 2: Medium traffic
    ['biturl.dev', 'contacts.dev', 'homepage.dev'],
    // Batch 3: High traffic
    ['do.dev', 'local.dev', 'customers.dev'],
    // Batch 4: Critical services
    ['dns.local.dev', 'talk.dev', 'sell.dev']
  ]
- For Each Batch:
  - Update DNS to point to Traefik IP
  - Monitor for 2-4 hours
  - Check error rates and performance
  - Proceed to next batch if stable
- Rollback Plan:
  - Keep Caddy running throughout migration
  - DNS changes can be reverted quickly
  - Document any issues for each domain
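The "proceed to next batch if stable" rule can be made explicit so that the decision is the same every time. The sketch below is one possible gate, not a prescribed procedure: the `BatchObservation` shape and `decideNextStep` are hypothetical, and the threshold is an assumption borrowed from the 0.1% error-rate target in the success criteria.

```typescript
// Traffic observed for one batch during its 2-4 hour monitoring window.
// Hypothetical shape; fill it from whatever monitoring is in place.
interface BatchObservation {
  totalRequests: number;
  serverErrors: number; // responses with a 5xx status
}

type BatchDecision =
  | { action: "proceed"; nextBatch: string[] }
  | { action: "hold"; reason: string }
  | { action: "complete" };

// Assumed threshold: the 0.1% error-rate target from the success criteria.
const MAX_ERROR_RATE = 0.001;

function decideNextStep(
  batches: string[][],
  currentIndex: number,
  observation: BatchObservation,
): BatchDecision {
  if (observation.totalRequests === 0) {
    return { action: "hold", reason: "no traffic observed yet" };
  }
  const rate = observation.serverErrors / observation.totalRequests;
  if (rate >= MAX_ERROR_RATE) {
    return { action: "hold", reason: `error rate ${(rate * 100).toFixed(2)}% is too high` };
  }
  const nextBatch = batches[currentIndex + 1];
  return nextBatch ? { action: "proceed", nextBatch } : { action: "complete" };
}

// Example with the first two batches from the suggested order.
const batchPlan = [
  ["docs.dev", "isup.dev"],
  ["biturl.dev", "contacts.dev", "homepage.dev"],
];
console.log(decideNextStep(batchPlan, 0, { totalRequests: 20000, serverErrors: 4 }));
```

A "hold" result leaves the current batch on Traefik for further investigation; whether to revert its DNS is a separate call covered by the rollback plan.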
Phase 4: Cutover (Day 8)
- Final Validation:
  # Test all domains
  for domain in do.dev contacts.dev biturl.dev; do
    echo "Testing $domain..."
    curl -I "https://$domain"
  done
- Stop Caddy:
  docker stop caddy-reverse-proxy
  docker rm caddy-reverse-proxy
- Update Infrastructure:
  - Remove Caddy configuration files
  - Update documentation
  - Update monitoring alerts
Configuration Mapping
Caddy Route → Traefik Route
Caddy (Caddyfile):
do.dev {
  reverse_proxy 10.1.0.33:3005 10.3.0.33:3005 {
    health_uri /
    health_interval 30s
    lb_policy first
  }
}
Traefik (JSON):
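{
  "http": {
    "routers": {
      "router-do-dev": {
        "rule": "Host(`do.dev`)",
        "service": "service-do-dev",
        "tls": { "certResolver": "cloudflare" }
      }
    },
    "services": {
      "service-do-dev": {
        "loadBalancer": {
          "servers": [
            { "url": "http://10.1.0.33:3005" },
            { "url": "http://10.3.0.33:3005" }
          ],
          "healthCheck": {
            "path": "/",
            "interval": "30s"
          }
        }
      }
    }
  }
}
Note: Caddy's lb_policy first sends all traffic to the first healthy upstream, while a Traefik loadBalancer spreads requests across every healthy server, so this mapping changes how traffic is distributed between the two upstreams.
UI Updates Required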
1. Update Sync Function
Replace Caddy sync with Traefik config regeneration:
// Old (Caddy)
await caddyClient.loadConfiguration(routes)
// New (Traefik)
// Just trigger config regeneration
await fetch('/api/traefik/config', { method: 'POST' })
2. Update Server Status Page
Change health check endpoint:
// Old
const health = await fetch('http://10.3.3.3:2019/health')
// New
const health = await fetch('http://traefik:8080/ping')
3. Update Route Management
No changes needed - routes still stored in Convex!
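Because routes remain in Convex, the `/api/traefik/config` endpoint only has to translate stored route records into the JSON shape shown under Configuration Mapping. Here is a minimal sketch of that translation. It assumes a hypothetical `RouteRecord` shape (the actual Convex schema is not shown in this guide) and reuses the `router-…`/`service-…` naming and the `cloudflare` resolver from the example above; `buildTraefikConfig` and `domainSlug` are illustrative names.

```typescript
// Hypothetical route record; the real Convex schema may differ.
interface RouteRecord {
  domain: string;          // e.g. "do.dev"
  upstreams: string[];     // host:port pairs, e.g. ["10.1.0.33:3005"]
  healthPath?: string;     // assumed default "/"
  healthInterval?: string; // assumed default "30s"
}

interface TraefikDynamicConfig {
  http: {
    routers: Record<string, { rule: string; service: string; tls: { certResolver: string } }>;
    services: Record<string, {
      loadBalancer: {
        servers: { url: string }[];
        healthCheck: { path: string; interval: string };
      };
    }>;
  };
}

// "do.dev" -> "do-dev", matching names such as router-do-dev.
function domainSlug(domain: string): string {
  return domain.toLowerCase().replace(/[^a-z0-9]+/g, "-");
}

function buildTraefikConfig(routes: RouteRecord[], certResolver = "cloudflare"): TraefikDynamicConfig {
  const config: TraefikDynamicConfig = { http: { routers: {}, services: {} } };
  for (const route of routes) {
    const slug = domainSlug(route.domain);
    const serviceName = "service-" + slug;
    config.http.routers["router-" + slug] = {
      rule: "Host(`" + route.domain + "`)",
      service: serviceName,
      tls: { certResolver },
    };
    config.http.services[serviceName] = {
      loadBalancer: {
        servers: route.upstreams.map((hostPort) => ({ url: "http://" + hostPort })),
        healthCheck: {
          path: route.healthPath ?? "/",
          interval: route.healthInterval ?? "30s",
        },
      },
    };
  }
  return config;
}

// Example: the do.dev route from the Configuration Mapping section.
const builtConfig = buildTraefikConfig([
  { domain: "do.dev", upstreams: ["10.1.0.33:3005", "10.3.0.33:3005"] },
]);
console.log(JSON.stringify(builtConfig, null, 2));
```

Keeping the translation a pure function of the stored routes means the endpoint can be called repeatedly by Traefik's polling without side effects, and the output can be unit-tested without a running proxy.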
Monitoring & Debugging
Useful Commands
# View current routes
curl http://localhost:8080/api/http/routers | jq
# View service health
curl http://localhost:8080/api/http/services | jq
# Check specific route (the API expects the full name, including the @http provider suffix)
curl http://localhost:8080/api/http/routers/router-do-dev@http | jq
# View real-time access logs
docker logs -f traefik-proxy 2>&1 | grep AccessLog | jq
Metrics to Monitor
- Response Times: traefik_service_request_duration_seconds
- Error Rates: traefik_service_requests_total{code=~"5.."} (the code label holds the actual status code, so match it with a regex)
- Active Connections: traefik_service_open_connections
- Certificate Status: Check dashboard or /api/http/routers
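To turn the request counter into an error-rate figure, the 5xx samples have to be summed and divided by the total. The sketch below does that for a scrape of the `/metrics` endpoint in Prometheus text format. `serverErrorRatio` is a hypothetical helper, it assumes each sample line carries a numeric `code` label and no trailing timestamp, and the sample values are invented for illustration.

```typescript
// Computes the share of 5xx responses across all traefik_service_requests_total samples.
// Returns null when the scrape contains no matching samples.
function serverErrorRatio(metricsText: string): number | null {
  let total = 0;
  let errors = 0;
  for (const rawLine of metricsText.split("\n")) {
    const line = rawLine.trim();
    // Skip comments (# HELP / # TYPE) and unrelated metrics.
    if (line.indexOf("traefik_service_requests_total{") !== 0) continue;
    const code = /code="(\d{3})"/.exec(line)?.[1];
    // The sample value is the last space-separated field on the line.
    const value = Number(line.slice(line.lastIndexOf(" ") + 1));
    if (code === undefined || !isFinite(value)) continue;
    total += value;
    if (code.charAt(0) === "5") errors += value;
  }
  return total === 0 ? null : errors / total;
}

// Illustrative scrape; the numbers are invented for the example.
const sampleScrape = [
  "# TYPE traefik_service_requests_total counter",
  'traefik_service_requests_total{code="200",method="GET",protocol="http",service="service-do-dev@http"} 990',
  'traefik_service_requests_total{code="500",method="GET",protocol="http",service="service-do-dev@http"} 6',
  'traefik_service_requests_total{code="502",method="GET",protocol="http",service="service-do-dev@http"} 4',
].join("\n");

console.log(serverErrorRatio(sampleScrape)); // 10 errors out of 1000 requests
```

The counters are cumulative since Traefik started, so this ratio covers the whole uptime. For the error rate over a monitoring window, take two scrapes and apply the same calculation to the differences.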
Common Issues & Solutions
Issue: Routes Not Appearing
Solution: Check API endpoint is returning valid JSON:
curl http://localhost:3010/api/traefik/config | jq '.http.routers'
Issue: SSL Certificate Errors
Solution:
- Check Cloudflare token is valid
- Ensure domain points to Traefik IP
- Check ACME logs:
docker logs traefik-proxy | grep acme
Issue: Health Checks Failing
Solution: Verify upstream services are accessible:
curl http://10.1.0.33:3005/ # From Traefik container network
Issue: Configuration Not Updating
Solution: Check polling is working:
docker logs traefik-proxy | grep "Configuration loaded from"
Performance Tuning
For High Traffic
Add to docker-compose.yml:
services:
  traefik:
    ulimits:
      nofile:
        soft: 65536
        hard: 65536
    deploy:
      resources:
        limits:
          cpus: '4.0'
          memory: 4G
Connection Pooling
In route configuration:
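{
  loadBalancer: {
    servers: [...],
    // Flush buffered response data to the client every 100ms
    responseForwarding: {
      flushInterval: "100ms"
    }
  }
}
Traefik v2 and later does not accept maxConn under loadBalancer. To cap concurrent requests, define an inFlightReq middleware and attach it to the router:
{
  middlewares: {
    // Middleware name is arbitrary
    "limit-in-flight": {
      inFlightReq: { amount: 100 }
    }
  }
}
Success Criteria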
Migration is complete when:
- All domains resolved via Traefik
- Zero Caddy containers running
- SSL certificates valid for all domains
- Health checks passing for all upstreams
- Response times ≤ previous Caddy setup
- Error rates < 0.1%
- Monitoring dashboards updated
Rollback Procedure
If issues arise:
- Quick Rollback (< 5 minutes):
  # Start Caddy
  cd /root/local/caddy
  docker-compose up -d
  # Update DNS back to Caddy IP
- Investigate Issues:
  - Check Traefik logs
  - Review configuration
  - Test individual routes
- Fix and Retry:
  - Address specific issues
  - Test with single domain
  - Proceed with migration
Post-Migration Tasks
- Documentation:
  - Update README files
  - Remove Caddy documentation
  - Update runbooks
- Cleanup:
  - Remove Caddy containers
  - Delete Caddy configuration files
  - Remove unused API endpoints
- Optimization:
  - Enable caching where appropriate
  - Fine-tune rate limits
  - Configure advanced middleware
Support & Resources
- Traefik Docs: https://doc.traefik.io/traefik/
- Dashboard: http://traefik.local.dev
- Metrics: http://localhost:8080/metrics
- Community: https://community.traefik.io/
Remember: Take it slow, test thoroughly, and keep Caddy as a fallback until you're 100% confident in Traefik!