How Seriously Does Your Cloud Hosting Provider Take Redundancy?

  • October 7, 2014

When choosing a cloud hosting provider, the decision usually comes down to price and what you actually get for it. The latter can be a little open-ended, but it is usually a combination of performance (or the amount of resources) and features. Something we have noticed from our side of the table, building these services, is that redundancy and failover mechanisms are generally poorly understood.

While building redundancy into sophisticated systems is not easy and the level of redundancy offered by a provider is often unclear, its importance is quite intuitive: downtime or unavailability is bad, even unacceptable. However, the reality is that sometimes systems break – some more often than others. The key is how those situations are handled.

SLA and Status Reporting

One way to quantify and compare redundancy is to look at the Service Level Agreements (SLA) of different providers – after all, this reflects the provider’s confidence in their service. The following table summarises UpCloud’s and some of our competitors’ SLAs.

IaaS SLA Comparison.

| Company | Service Guarantee | Eligibility for compensation | Compensation | Notes |
|---|---|---|---|---|
| UpCloud | 100% | >5min | 50x | 50x the cost of resources unavailable |
| AWS | 99.95% or 99% | See notes | 10% or 30% of monthly fees | All instances must be down on 2 or more Availability Zones within a region |
| Azure | 99.95% or 99% | See notes | 10% or 25% of monthly fees | When monthly total falls below threshold |
| CloudSigma | 100% | >15min | 50x | 50x the cost of resources unavailable |
| DigitalOcean | 99.99% | See notes | 1x | 1x the cost of resources unavailable, eligibility not mentioned |
| ElasticHosts | 100% | >15min | 100x | 100x the cost of resources unavailable |
| ProfitBricks | 99.99% | >30min | 5% of monthly fees | 5% of monthly fee per 30min-1h depending on resource |
| Rackspace | 100% | >30min | 5% of monthly fees | 5% of monthly fee per 30min |

Data gathered 09/2014. See appendix for sources.
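To put the guarantee percentages into perspective, it can help to convert them into minutes of downtime per month. The following is a minimal sketch, assuming a 30-day month and a guarantee measured over the whole month; individual providers may measure availability differently, and the function name is our own.

```python
def allowed_downtime_minutes(guarantee_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month that still satisfy the guarantee.

    Assumes the guarantee is measured over the whole month (an assumption,
    not taken from any provider's SLA text).
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - guarantee_percent) / 100.0


# Guarantee levels that appear in the comparison table.
for level in (100.0, 99.99, 99.95, 99.0):
    print(f"{level}% -> {allowed_downtime_minutes(level):.1f} min/month")
```

Under these assumptions, a 99.95% guarantee leaves room for roughly 21.6 minutes of downtime in a month, while a 100% guarantee leaves none.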

The norm in the industry is to set a service guarantee level and offer compensation if that level is not met. There are usually some special conditions, such as a minimum duration of unavailability, that must also be fulfilled before a customer is eligible for compensation. UpCloud offers an SLA of 100% availability, and any period of unavailability lasting longer than 5 minutes is eligible for compensation amounting to 50x the cost of the resources that were unavailable. As can be seen from the table above, this is superior to most of our competitors in this comparison. The takeaway here is that we take redundancy very seriously.
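As a rough illustration of how a multiplier-based SLA of this kind works out, here is a minimal sketch. It assumes that once the threshold is exceeded the whole outage counts towards compensation, and it uses a hypothetical hourly price; the function and parameter names are ours, and the actual SLA terms are authoritative.

```python
def sla_credit(hourly_cost: float, outage_minutes: float,
               threshold_minutes: float = 5.0, multiplier: float = 50.0) -> float:
    """Estimate the compensation for one outage under a multiplier-based SLA.

    Assumption: an outage longer than the threshold is compensated for its
    full duration; shorter outages are not compensated at all.
    """
    if outage_minutes <= threshold_minutes:
        return 0.0
    # What the unavailable resources would have cost during the outage.
    cost_of_unavailable = hourly_cost * outage_minutes / 60.0
    return multiplier * cost_of_unavailable


# Hypothetical server costing 0.03 per hour, unavailable for 20 minutes.
print(round(sla_credit(0.03, 20), 2))
```

The defaults mirror the 5-minute threshold and 50x multiplier from the table; passing `threshold_minutes=15, multiplier=100` would model a 100x SLA with a 15-minute threshold in the same way.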

Another way to look at redundancy is the provider’s status page and its history. We were pleased to find a proper status page for every company in the comparison except ProfitBricks. A caveat here is that providers do not all follow the same philosophy in updating their status pages, which makes direct comparison difficult. At UpCloud we believe that a hosting provider should report all hiccups that affect customers and are the provider’s responsibility, a belief we follow firmly at status.upcloud.com.

Long on Redundancy

Our SLA and open reporting are in good shape, but they are certainly not a sales gimmick. We have built redundancy into our service right from the beginning and believe it to be a key issue in running a successful cloud hosting company.

We have designed our technology stack and software to automate failovers, because we expect failures to happen. We have prepared for a breakdown at every level of our service stack by following the so-called N+1 philosophy: for every component the service needs, at least one spare is available to take over. This means that we have multiple transit connection providers, multiple sources of electricity, multiple networking devices and so on, all the way to your storage backend. Technology does occasionally break down and we prepare for that on all levels.
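The effect of one spare unit can be estimated with a little arithmetic. The sketch below assumes that units fail independently and share the same availability, which is a simplification, and the 0.99 figure is purely illustrative rather than a measured value from any provider.

```python
def availability_n_plus_1(unit_availability: float, units_required: int) -> float:
    """Probability that at least `units_required` of N+1 installed units are up.

    Assumes independent failures and identical availability per unit.
    """
    p = unit_availability
    n = units_required
    all_up = p ** (n + 1)
    exactly_one_down = (n + 1) * p ** n * (1 - p)
    return all_up + exactly_one_down


# One unit required, illustrative per-unit availability of 0.99.
without_spare = 0.99
with_spare = availability_n_plus_1(0.99, 1)
print(f"without spare: {without_spare:.4f}, with one spare: {with_spare:.4f}")
```

With a single required unit, adding one spare means the component is only unavailable when both units are down at the same time.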

With regard to storage, all customer data is located on two separate RAID-secured storage backends at all times. This means that disks can fail and the servers hosting those disks can malfunction, and our customers still won’t have to endure outages.

Needless to say, redundancy is an attribute of UpCloud that we are not willing to sacrifice for lower pricing or higher margins. Preserving customer data is crucial to us, and earning our customers’ trust while delivering a consistently high overall service level is key to running our business successfully.

Do you know how your current provider treats redundancy? If you’d like to know more about UpCloud’s redundancy principles, please get in touch with us at [email protected]



Appendix

IaaS SLA Comparison Sources.

| Company | SLA | Status Page |
|---|---|---|
| UpCloud | UpCloud SLA | UpCloud Status Page |
| AWS | AWS SLA | AWS Status Page |
| Azure | Azure SLA | Azure Status Page |
| CloudSigma | CloudSigma SLA | CloudSigma Status Page |
| DigitalOcean | DigitalOcean SLA | DigitalOcean Status Page |
| ElasticHosts | ElasticHosts SLA | ElasticHosts Status Page |
| ProfitBricks | ProfitBricks SLA | Status page not found |
| Rackspace | Rackspace SLA | Rackspace Status Page |

Accessed on 09/2014.