Computers, Science, Technology, Xen Virtualization, Hosting, Photography, The Internet, Geekdom And More

Reliable, Available Or Highly Available?

Posted on September 29, 2007

I’ll probably get a bunch of hate mail over this one, but that’s nothing new :) Many people seem confused when I use the word reliable when talking about a cluster of computers; they are used to the word available and, in particular, to the phrase highly available. Aren’t clusters confusing enough without jargon and buzzwords?

All three terms adequately describe different kinds of cluster configurations. Let’s go through them.

Available – This means a computer is on-line, running some kind of service that permits interaction from remote users (usually via the Internet). Not much interesting there. Some engineers use the word available to indicate that something is not reliable: whatever service (or group of services) is running is prone to breaking under stress.

Highly Available – As described above, whatever it is that you hope to run breaks easily under stress. A highly available cluster balances incoming requests across a small ‘farm’ of computers running services that are considered merely available. If one, two or sometimes even several of these computers fail, users can still reach the remaining ones. Highly available clusters often use shared storage and some kind of program that watches over the members of the cluster (aka nodes), rebooting those that fail to respond. Not a very elegant approach, but it’s cheap and it works.
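To make the highly available idea concrete, here is a minimal sketch in Python of the two moving parts described above: a balancer that spreads requests across nodes and skips the ones that are down, and a watchdog that reboots nodes that fail to respond. The names Node, Balancer, route and watchdog are made up for illustration, and the healthy flag stands in for a real health check; an actual cluster would probe services over the network and restart real machines.

```python
from itertools import cycle


class Node:
    """A hypothetical cluster member; `healthy` stands in for a real health check."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.reboots = 0

    def reboot(self):
        # A real watchdog would restart the machine or service here.
        self.reboots += 1
        self.healthy = True


class Balancer:
    """Round-robin balancer over a fixed list of nodes (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._ring = cycle(nodes)

    def route(self):
        """Return the next healthy node, or None if every node is down."""
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node.healthy:
                return node
        return None

    def watchdog(self):
        """Reboot every node that fails to respond; return their names."""
        failed = [n for n in self.nodes if not n.healthy]
        for n in failed:
            n.reboot()
        return [n.name for n in failed]
```

With three nodes, marking one as unhealthy simply causes route() to hand out the other two until watchdog() brings the failed one back. That is the duct tape: nothing gets more robust, the failure just gets routed around.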

Reliable – Take something that is available and make it (inherently) far less prone to failure. Often, a programmer does this by reworking free software to make it smaller, more efficient and task-specific. You can then use the same methods as in a highly available configuration, but you usually end up needing fewer computers. Typically, a reliable cluster will have custom, task-specific kernels, special security features, and self-healing and auto-scaling capabilities, among other things. Imagine highly available as being glued together with some duct tape for extra strength; a reliable cluster is pretty seamless.
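The "fewer computers" point can be illustrated with some back-of-the-envelope arithmetic. The sketch below rests on a simplifying assumption of mine, not a measurement: each node fails independently with a fixed probability, and the cluster counts as up as long as at least one node is up. The function name nodes_needed and the example probabilities are hypothetical.

```python
def nodes_needed(node_failure_prob, target_availability):
    """Smallest node count for which the chance of every node being down
    at once stays within 1 - target_availability.

    Assumes independent failures, which real clusters only approximate.
    """
    if not 0.0 < node_failure_prob < 1.0:
        raise ValueError("node_failure_prob must be between 0 and 1")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be between 0 and 1")

    allowed = 1.0 - target_availability
    n = 1
    # With independent failures, all n nodes are down with probability p ** n.
    while node_failure_prob ** n > allowed:
        n += 1
    return n


# Merely "available" nodes that are down 20% of the time, versus
# "reliable" nodes that are down 2% of the time, same 99.9% target.
fragile = nodes_needed(0.2, 0.999)
solid = nodes_needed(0.02, 0.999)
```

Under these assumed numbers the fragile nodes need a noticeably bigger farm than the solid ones to hit the same target, which is the trade you make when you tweak a highly available setup toward reliability.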

Which one do you need? That depends on many things, from your budget all the way down to your expected growth and the services that you hope to offer. Most people get going with a highly available configuration and then ‘tweak’ it into reliability, which seems to be the best way to go. You spend less money initially (which also means you have wasted less if your venture doesn’t go so well). You also get to try/replace/try-again with many different free software components.

There are many clustering solutions on the market today. Some of them are good; not all of them are free. The best advice that I can give to anyone is to examine the costs two ways:

  1. How much does this cost if I just buy it commercially?
  2. How much would this cost to piece together with free software?

I can assure you, it’s well worth your time to ask those questions before making a purchase. I have yet to meet a “one size fits all” solution for getting a cluster going; they are so task- and need-specific. Most consultants charge very little to help you make an informed decision, and I highly recommend hiring one. A couple hundred bucks initially could save you tens, even hundreds of thousands down the road.

I strongly caution anyone to ensure that they get full source code with any kind of commercial solution. You must be able to modify your own stuff in-house. This isn’t always easy to get, so be sure to list it as one of your primary needs when contacting any kind of company. Without the source, their support rates could change, which would leave you in a very interesting predicament a few years down the road. You would also have to wait on them to fix any kind of usability issue, even if it’s a simple change to make the user interface easier. Be careful :)

I hope that this clears it up a bit; I was getting some really blank stares when using rather standard terms. Media does odd things to buzzwords. I find it very odd that this technology still confuses people, given the amount of time that it’s been within reach on cheap hardware. But hey, gotta love job security!

