Uptime SLAs: What Your Hosting Provider Promises and What They Actually Deliver

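When a Nigerian company signs a hosting agreement and sees "99.9% uptime SLA," the natural interpretation is: our application will be available 99.9% of the time. This interpretation is wrong on several counts: the SLA covers the provider's infrastructure rather than your application, it carries exclusions that narrow what counts as downtime, and its remedy is a fee credit rather than compensation for lost business. Understanding the gap is critical to designing infrastructure that meets business availability requirements.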

This article dissects SLA mathematics, explains what SLAs commit to, and describes the engineering and architectural investment required to achieve the availability numbers that matter for your business.

The Mathematics of Uptime

Uptime percentages sound impressive until converted to downtime hours:

SLA	Annual Downtime	Monthly Downtime	Weekly Downtime
99%	87.6 hours	7.3 hours	1.68 hours
99.5%	43.8 hours	3.65 hours	50 minutes
99.9%	8.76 hours	43.8 minutes	10 minutes
99.95%	4.38 hours	21.9 minutes	5 minutes
99.99%	52.6 minutes	4.38 minutes	1 minute
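
The figures in the table follow from one multiplication. As a minimal sketch (the function name is illustrative, and a year is assumed to be 365 days with a month taken as one twelfth of it), the downtime budget can be computed like this:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def allowed_downtime_minutes(sla_percent: float, period_hours: float) -> float:
    """Minutes of downtime an SLA permits over a period of the given length."""
    return (1 - sla_percent / 100) * period_hours * 60


for sla in (99.0, 99.5, 99.9, 99.95, 99.99):
    yearly = allowed_downtime_minutes(sla, HOURS_PER_YEAR)
    monthly = allowed_downtime_minutes(sla, HOURS_PER_YEAR / 12)
    weekly = allowed_downtime_minutes(sla, 7 * 24)
    print(f"{sla}%: {yearly / 60:.2f} h/year, "
          f"{monthly:.1f} min/month, {weekly:.1f} min/week")
```

The same function works for any measurement window, which is useful because many providers assess uptime per month rather than per year.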

A 99.9% SLA permits 8.76 hours of downtime per year. For a Nigerian e-commerce business processing ₦5M per day (roughly ₦208,000 per hour), using up the full 8.76 hours means about ₦1.8M in lost revenue that the SLA permits, and for which the provider credits only a fraction of the hosting fee. The economics are clearly unfavourable to the customer.

The 99.99% ("four nines") threshold is where availability becomes operationally acceptable for critical business systems: 52.6 minutes of permitted downtime per year is an amount most businesses can absorb.

What the SLA Actually Covers

The fine print in cloud and hosting provider SLAs typically contains several exclusions that reduce the practical scope of the commitment significantly:

What is covered: The infrastructure layer, meaning compute instances running, storage accessible, and the network carrying traffic. The SLA applies to the cloud provider's servers and network, and to nothing above them.

What is not covered: Your application running on that infrastructure. Your application may be unhealthy (error pages, degraded performance, wrong behaviour) while the underlying compute instance is running perfectly and the provider's SLA is being met. The provider has no visibility into, and no SLA commitment over, your application layer.

Exclusions typically include:

Scheduled maintenance windows (often not counted toward downtime at all)
Events caused by customer actions (deploying a broken application)
Third-party software failures
Events that the provider considers beyond their reasonable control (force majeure, which cloud providers interpret broadly)
Performance degradation that is not complete unavailability (your application is slow but not down)

The credit mechanism: When a provider misses their SLA in a month, the credit is typically a percentage of that month's hosting fee for the affected service. At ₦50,000/month hosting, a 10% credit is ₦5,000 — against any business impact that may have been much larger. SLA credits are not compensation for business loss; they are discounts on future hosting. Most providers explicitly disclaim liability for business losses resulting from downtime.
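
To make the mismatch concrete, here is a small sketch that compares revenue lost during an outage with the credit the provider returns. The ₦5M daily revenue, ₦50,000 monthly fee, and 10% credit come from the figures used in this article; the 4-hour outage and the assumption that sales are spread evenly across the day are illustrative only.

```python
def revenue_loss(daily_revenue: float, outage_hours: float) -> float:
    """Revenue lost during an outage, assuming sales are spread evenly over 24 hours."""
    return daily_revenue / 24 * outage_hours


def sla_credit(monthly_fee: float, credit_percent: float) -> float:
    """Credit returned by the provider for a month in which the SLA was missed."""
    return monthly_fee * credit_percent / 100


# Hypothetical 4-hour outage in a single month.
loss = revenue_loss(daily_revenue=5_000_000, outage_hours=4)
credit = sla_credit(monthly_fee=50_000, credit_percent=10)
print(f"Revenue lost: ₦{loss:,.0f}; SLA credit: ₦{credit:,.0f}")
```

Under these assumptions the outage costs roughly ₦833,000 in revenue while the credit is ₦5,000. Real credit schedules are tiered and differ between providers, so check the actual agreement before relying on any specific percentage.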

The Layers That Determine Business Availability

Business availability — the percentage of time your application is actually usable by your customers — is determined by the combination of all layers in the delivery stack:

1. Your application code: Bugs, memory leaks, unhandled errors, deployment issues
2. Application infrastructure: Hosting platform, load balancers, autoscaling
3. Database availability: Database uptime, query performance, connection pool management
4. Third-party dependencies: APIs your application calls (payment gateways, SMS providers, verification services)
5. CDN/Edge layer: Content delivery network, edge caching
6. DNS: Domain resolution (often overlooked)
7. Client-side factors: User's network, device performance

The cloud provider's SLA covers layer 2 (partially) and layers 5 and 6. Every other layer is your responsibility. In practice, application availability is lower than infrastructure availability because application-layer failures (a bad deployment, an unhandled database connection error, a third-party API being unavailable) are more common than infrastructure failures.
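
Because a request has to pass through every layer, the availabilities multiply. The sketch below assumes the layers fail independently, which is a simplification, and the per-layer figures are invented for illustration rather than measured.

```python
from math import prod

# Illustrative per-layer availabilities. These are assumptions, not measurements.
layers = {
    "application code": 0.999,
    "application infrastructure": 0.9995,
    "database": 0.9995,
    "third-party dependencies": 0.999,
    "CDN/edge": 0.9999,
    "DNS": 0.9999,
}


def composite_availability(availabilities) -> float:
    """Availability of a chain in which every layer must be up (independent failures)."""
    return prod(availabilities)


total = composite_availability(layers.values())
print(f"Composite availability: {total:.3%}")
```

With these sample numbers the stack as a whole lands near 99.7%, even though no single layer is below 99.9%. The composite figure is always lower than the weakest layer.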

For Nigerian businesses, the practical availability limiters are frequently:

Third-party payment gateway availability (Paystack and Flutterwave both have had notable incidents)
Database performance degradation under load
Deployment issues (bad code reaching production because automated testing was insufficient)
Network issues on the user's side (outside anyone's SLA)

What "High Availability" Architecture Requires

To actually achieve 99.9%+ application availability (not just infrastructure availability), the architecture must address each failure mode:

Redundancy: Multiple instances of each stateless application component behind a load balancer. Single-server deployments cannot be highly available — if the server fails, the application is down until it is restarted. Multiple instances mean the load balancer routes around a failing instance automatically.
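
The effect of redundancy can be estimated with the standard parallel-availability formula. This is a sketch under optimistic assumptions: instances fail independently, the load balancer itself never fails, and any single healthy instance can carry the full load.

```python
def redundant_availability(instance_availability: float, instances: int) -> float:
    """Probability that at least one of several independent instances is up."""
    return 1 - (1 - instance_availability) ** instances


for n in (1, 2, 3):
    print(f"{n} instance(s) at 99% each: {redundant_availability(0.99, n):.4%}")
```

Two instances that are each available 99% of the time give about 99.99% in this model. Real deployments fall short of that because failures are often correlated (a bad deployment reaches every instance at once), so treat the result as an upper bound.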

Database availability: Database replication (primary + replica) with automated failover. Most managed database services (AWS RDS, Cloud SQL, PlanetScale) provide this. Automatic promotion of a replica typically takes somewhere between 30 seconds and a couple of minutes, depending on the service and configuration. That interval is the downtime window for database failure scenarios, so check your provider's documented figure and test it.

Multi-region considerations: True 99.99% availability for Nigerian businesses with global users requires multi-region deployment, where a failure in one cloud region does not take down the application. This is significant engineering investment. Most Nigerian businesses at current scale are better served by single-region with redundancy than by the complexity of multi-region.

Graceful degradation: Design the application to continue functioning in a degraded state when dependencies fail. If the SMS provider is unavailable, the application should still allow transactions and queue the SMS for later delivery — rather than failing the transaction because the confirmation SMS cannot be sent. If analytics collection fails, the core transaction should still succeed.
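
A minimal sketch of the SMS case follows. The function and variable names are hypothetical, the `send_sms` callable stands in for whatever SMS client you use, and the in-memory queue is a placeholder for a durable queue or database table.

```python
from collections import deque
from typing import Callable

# In-memory stand-in for a durable retry queue (an assumption for this sketch).
pending_sms: deque = deque()


def complete_transaction(order_id: str, phone: str,
                         send_sms: Callable[[str, str], None]) -> dict:
    """Complete the transaction even if the confirmation SMS cannot be sent."""
    # The core business step (charging, recording the order) is assumed to
    # have succeeded before this point.
    message = f"Order {order_id} confirmed"
    try:
        send_sms(phone, message)
        sms_status = "sent"
    except Exception:
        # The SMS provider is down: keep the message for later delivery
        # instead of failing the whole transaction.
        pending_sms.append((phone, message))
        sms_status = "queued"
    return {"order_id": order_id, "status": "completed", "sms": sms_status}
```

A background job would later drain `pending_sms` once the provider recovers. The key design choice is that the non-critical dependency sits inside the `try` block and the critical path does not depend on its outcome.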

Circuit breakers: For third-party API calls, implement circuit breakers that detect repeated failures and stop making calls to a failing service, returning a fast failure rather than waiting for timeouts. This prevents a slow or failing dependency from cascading into your application's availability.
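
The pattern can be sketched in a few dozen lines. This is a simplified, single-threaded illustration with hypothetical names and default thresholds chosen arbitrarily; production systems usually rely on a maintained library and add per-call timeouts.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a dependency known to be failing.
                raise CircuitOpenError("dependency marked unavailable")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In use, each third-party client gets its own breaker, for example `payments_breaker.call(charge_card, order)`, where both names are placeholders. Callers catch `CircuitOpenError` and fall back to a degraded path.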

Monitoring and Incident Response Time

Availability arithmetic includes not just the frequency and duration of outages but the mean time to detection (MTTD) and mean time to recovery (MTTR). An outage that happens at 2 AM and is discovered at 8 AM when customers start calling is a 6-hour outage regardless of how quickly it is then resolved.
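
The arithmetic is simple enough to sketch. The 6-hour detection delay comes from the example above; the 20-minute fix, the 5-minute detection time, and the 30-day month are assumptions chosen for illustration.

```python
MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200


def outage_minutes(detect_minutes: float, recover_minutes: float) -> float:
    """Customer-facing outage length: time to notice plus time to fix."""
    return detect_minutes + recover_minutes


def monthly_availability(downtime_minutes: float) -> float:
    """Availability over a 30-day month for the given total downtime."""
    return 1 - downtime_minutes / MINUTES_PER_30_DAYS


# Same 20-minute fix, detected by customers at 8 AM versus by monitoring in 5 minutes.
late = outage_minutes(detect_minutes=6 * 60, recover_minutes=20)
early = outage_minutes(detect_minutes=5, recover_minutes=20)
print(f"Detected by customers:  {monthly_availability(late):.3%}")
print(f"Detected by monitoring: {monthly_availability(early):.3%}")
```

With these numbers the single late-detected incident pulls the month down to about 99.1%, while the same incident caught within five minutes leaves it above 99.9%.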

Synthetic monitoring: Automated external checks that periodically (every 1–5 minutes) call your application's critical endpoints and alert if they fail or respond slowly. This catches outages almost immediately rather than waiting for customer reports. Simple hosted options include UptimeRobot, Better Stack, and AWS CloudWatch Synthetics.
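
For teams that want to see what such a check does before adopting a hosted tool, here is a minimal probe using only the Python standard library. The function name, thresholds, and the injectable `opener` parameter are choices made for this sketch; the health endpoint URL in the usage note is a placeholder.

```python
import time
import urllib.request
from typing import Callable


def check_endpoint(url: str, timeout: float = 5.0, slow_after: float = 2.0,
                   opener: Callable = urllib.request.urlopen) -> str:
    """Probe the URL once and return 'ok', 'slow', or 'down'."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except OSError:
        # Covers URLError, HTTPError (4xx/5xx responses), and socket timeouts.
        return "down"
    elapsed = time.monotonic() - started
    if status != 200:
        return "down"
    return "slow" if elapsed > slow_after else "ok"
```

A scheduler (cron, a systemd timer, or a small loop) would call `check_endpoint("https://example.com/health")` every few minutes and raise an alert on anything other than `"ok"`. The probe must run from outside your own infrastructure, otherwise it goes down together with the thing it is watching.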

Alerting to on-call: Monitoring without alerting does nothing to speed up resolution. Alerts must reach an engineer capable of responding, on the timeline the business's availability requirements demand. For Nigerian businesses with office-hours operations and a 4-hour acceptable recovery time, business-hours on-call may be sufficient. For 24/7 operations with 15-minute recovery requirements, a formal on-call rotation with defined escalation is required.

Your Business Availability SLA to Your Customers

If you provide SLA commitments to your customers (enterprise contracts, service agreements), those commitments must be consistent with what you can actually deliver. The most common engineering gap is committing to 99.9% availability in a customer contract without the redundant infrastructure architecture that makes 99.9% achievable.

Build the architecture first. Then commit to the SLA. Paying service credits costs you less than the outage costs your customer, but both damage the relationship that underpins the contract.

For Nigerian businesses being asked to provide SLA commitments to enterprise clients for the first time: a well-designed architecture on a reputable cloud provider (AWS, GCP, Cloudflare) can reliably achieve 99.9% application availability with investment in the redundancy and monitoring practices described above. 99.95% is achievable with additional investment. Four nines (99.99%) requires significant engineering investment and is typically only warranted for mission-critical systems where downtime has immediate, large financial consequences.

Know what you can deliver. Commit to that. Build toward better.
