The Virtual Buzz

Virtualization is all the buzz these days, especially for server farms. As my own collection of server hardware heads towards 20 boxes and is still hard pressed to handle all the tasks I need it to do, I'm finding the lure of the herd hard to ignore.

I already use virtualization on my laptop. I set up virtual images to try out configurations and new versions of software that I run on my servers. Being able to run different operating systems, and also to have systems that I can mangle in various ways without trashing my laptop, is handy. So why run it on my servers?

The concept of virtualizing a server farm is that each server that is fulfilling some role, for example a web server, application server, or database server, can be converted into a virtual image, and more than one can be run on a single server. Things don't look any different to the software now running on a virtual server, nor do they look different to the users of the software. But a single physical server can run multiple virtual images, which can be doing different things, and can even be running different operating systems.

So what are the benefits of doing this?

As I plan our own potential adoption of server virtualization, I'm weighing the shiny wonders of this fabulous technology against a number of pitfalls. The vendors of this software tout the benefits loudly, giving the impression that you can't help but save money, even faced with their often exorbitant licensing. But reading between the lines, and perusing the word on the Web, turns up some things to think about.

I'll go a little more into the possible benefits, and more into the pitfalls, in separate articles.

The hidden pitfalls of server virtualization

Where there's buzz, there's bullshit. Slashing hardware costs and the time I spend herding servers makes virtualization sound like a silver bullet, but I have no doubt that pulling it off isn't as easy as the salespeople tell me. I've spent some time researching and thinking it through, and have come up with a few things that I need to keep in mind when planning to get into virtualization properly.

Will I really spend less on hardware?

So maybe I can replace 4 or more servers with 1, but none of the boxes I currently have can handle that, so I either need to beef some up or buy newer, bigger boxes. Either way there is some cash outlay. And since I've already paid for the boxes I'd be cutting out, I may not actually save money, even if I end up using fewer boxes.

Do I know how much the hardware can handle?

There's no guarantee I really can fit 4 server images onto a single box and get the same performance I had on 4 different boxes.

Let's look at my Java application servers, which are the hungriest animals in my farm. They absolutely must have 2 gigs of RAM each, or they're not gonna work. They actually can't make use of more than this, due to the nature of the 32-bit JVM, so in theory they are an excellent candidate for virtualization.

So I could pimp a few of my current single-CPU 2 gig servers with 8 gigs of RAM and 2 dual-core CPUs each. Or I could satisfy my techno-lust and get servers with 2 quad-core processors each, and 16 gigs of RAM.

In theory my pimped server will have the same power as 4 of my old servers, and the new quad-cored beast could run 8 servers. Now that would be consolidation!
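The back-of-the-envelope arithmetic behind those numbers can be sketched in a few lines of Python. This is only a rough sketch: the function name is made up, and it assumes each virtual server needs 2 gigs of RAM and one core (matching the old single-CPU boxes) while ignoring whatever the virtualization layer itself consumes.

```python
def vms_that_fit(host_ram_gb, host_cores, vm_ram_gb=2, vm_cores=1):
    """Return how many identical VMs fit on a host, counting only RAM and cores.

    Assumption: hypervisor overhead, disk I/O, and network are all ignored.
    """
    by_ram = host_ram_gb // vm_ram_gb   # limit imposed by memory
    by_cpu = host_cores // vm_cores     # limit imposed by CPU cores
    return min(by_ram, by_cpu)


# Upgraded existing server: 8 GB RAM, 2 dual-core CPUs = 4 cores
print(vms_that_fit(8, 4))    # 4
# New server: 16 GB RAM, 2 quad-core CPUs = 8 cores
print(vms_that_fit(16, 8))   # 8
```

Whichever resource runs out first sets the limit, which is why the function takes the minimum of the two.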

But there's more to hardware than CPU and RAM, so when I load up one of these beasts I could find that network or disk I/O is a bottleneck, keeping all of my virtual images crawling.

The point is, I don't know how many of my VMs a server can handle until I try it. It's even more difficult in the wonderful world of virtualization, where a box may be running a mixture of web, app, and database servers along with infrastructure services like DNS and email. I might actually get better performance with a mix of servers than with a monoculture where all of the VMs are trying to do the same type of thing and competing for the same hardware resources.

So rule number 1, before I spec out and price up my new virtualized server farm and get it signed off by the bosses, is to do plenty of testing to get an idea of what hardware I'll really need. I ought to test the pimped version of my existing servers, as well as higher specced boxes, maybe with different characteristics. What happens if I use hopped up caching RAID controllers, or high end network cards, or fibre channel rather than iSCSI for my shared storage?

Hardware restrictions

One interesting tidbit I ran across when googling for virtualization limitations is that some of the groovier features can be picky about the hardware. In particular, the capability of shifting virtual machines between physical servers on the fly may require that the physical hardware be very similar, down to the CPU family and chipset. I've read this about VMware's VMotion technology (although I've lost the original reference, sorry).

This means I can't mix and match hardware, and could get trapped by legacy hardware that goes out of production. I can easily see ending up with several pools of hardware, and having to resort to old-fashioned manual methods to migrate servers between them. I'd then have to choose between giving up the productivity gains of easy automatic migration and chucking out slightly older but still perfectly good hardware to buy a raft of newer kit.

What's interesting about virtualization is that it is actually the opposite of the concept - epitomized by Google - of building a utility-style computing infrastructure on lots of cheap, commodity boxes. Instead, I will end up buying a few highly and carefully specified servers.

Don't forget spare capacity for failover!

So let's say I work out that I can trim my current farm to a quarter of its former size, 4 servers per box. My old farm had failover servers, or load balanced servers scoped to be able to cope with one of them going down. I may have virtual servers in my new pool which do the same thing, so if one virtual image crashes others will pick up the slack.

But what happens if one of my physical boxes goes down?

I now have to find homes for all of the virtual servers on that box, but if I've been over-efficient in sizing my hardware capacity I won't have any place to put them. In practice, I will probably have virtual machines I can take offline in a crisis, like my staging servers, and of course the failover images.

But the point is, I need to think about this up front. I'll work out what my tolerance for failure should be - is it enough to be able to cope with 1 server croaking, or should I have 2, or some percentage of my total?
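To put a number on that tolerance, here is a minimal Python sketch. It assumes all the boxes are identical and each can hold the same number of virtual servers, which real farms rarely manage; the function name and the example figures are mine, for illustration only.

```python
import math


def hosts_needed(vm_count, vms_per_host, tolerated_failures=1):
    """Return how many identical hosts to buy so that every VM still has a
    home after `tolerated_failures` hosts have died.

    Assumption: any VM can run on any host, and all hosts are the same size.
    """
    # Hosts required just to carry the load with nothing broken
    base = math.ceil(vm_count / vms_per_host)
    # Each failure I want to survive costs one extra host of spare capacity
    return base + tolerated_failures


# 20 virtual servers at 4 per box
print(hosts_needed(20, 4, tolerated_failures=0))  # 5
print(hosts_needed(20, 4, tolerated_failures=1))  # 6
print(hosts_needed(20, 4, tolerated_failures=2))  # 7
```

If some virtual machines can be switched off in a crisis, like staging servers, they can be left out of `vm_count` to see how far the spare capacity stretches.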

The d'oh! of licensing costs

So if I work out that I will run 20 virtual servers on 4 boxes in a new server farm, then I only need to spend 20% of what I would for 20 separate boxes, right? Oops, no. If I'm running a commercial OS like Windows, Red Hat, or SUSE, I've still got to buy 20 licenses. Common sense, but easy to overlook when costing initially, and it would be embarrassing to have to go back and ask for the extra budget.
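A quick Python sketch makes the trap obvious. The prices here are placeholders I made up purely to show the shape of the sum, not real quotes, and the function name is hypothetical.

```python
def farm_cost(boxes, box_price, vms, os_license_price):
    """Total cost of a farm: hardware is paid per physical box,
    but OS licenses are paid per running server image."""
    hardware = boxes * box_price
    licenses = vms * os_license_price
    return hardware + licenses


BOX_PRICE = 3000        # placeholder price per physical box
OS_LICENSE_PRICE = 500  # placeholder price per OS license

# What I might naively budget: 4 boxes and only 4 licenses
naive = farm_cost(4, BOX_PRICE, 4, OS_LICENSE_PRICE)
# What I actually owe: 4 boxes, but 20 licensed server images
actual = farm_cost(4, BOX_PRICE, 20, OS_LICENSE_PRICE)

print(naive)            # 14000
print(actual)           # 22000
print(actual - naive)   # 8000
```

The gap between the two totals is the extra budget I'd otherwise have to go back and ask for.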

The real hidden cost ...

OK, maybe I still have to pay for all those licenses (or I can just use Debian), but at least I only have 4 boxes to manage going forward, rather than 20. Phew!

Nope. That's still 20 servers that need to be monitored, backed up, and patched, with disk space to manage, user accounts to keep up to date, configurations to change, etc.

Even worse, once I go to virtualization, I expect to expand my usage of virtual images to where I'll have some that are kept offline until needed. So I can do certain staging, testing, and other exercises by bringing up images I need for a short while, then putting them back into cold storage. So even with a finely honed, well-automated infrastructure management system, I need to work out how these get updated. Do I cycle them into memory periodically to run updates, or have the update process (potentially a long one) run when they are brought online as needed?

Conclusion

There is obviously plenty I need to think about when planning to go to virtualization. I'm sure there's more I'm missing, and will learn the hard way. I do still think the payoff can outweigh the difficulties, but we'll see as I go along!

What's cool about virtualization?

How would you like to cut your annual server farm budget to a fraction of its current cost, reaping the glory and gold due to a corporate IT champion? Lop your data center to a third of its current size, no, a fifth, maybe less! At the same time, you can improve resiliency, flexibility, and potency!

Consolidation is the main selling point of virtualization. You may be able to use a single box to do the same work currently being done by some number (wave hands here) of boxes. Although licensing costs of virtualization software may range up to $5,000 or more, and the boxes will probably need to have more muscle and so cost more, if you can get the ratio of virtual machines to physical machines high enough, you probably can cut your costs.
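How high is "high enough"? A rough Python sketch of the break-even point follows. It assumes the virtualization license is paid once per physical box (the pricing model varies by vendor, so treat that as a guess), and the hardware prices are placeholders; only the $5,000 license figure comes from above.

```python
def break_even_ratio(small_box_price, big_box_price, virt_license_price):
    """Return the smallest whole number of small boxes that one big,
    virtualized box must replace before it works out strictly cheaper.

    Assumption: one virtualization license per physical box; power, space,
    and admin time are ignored.
    """
    virtual_cost = big_box_price + virt_license_price
    # Need ratio * small_box_price > virtual_cost, with ratio a whole number
    return virtual_cost // small_box_price + 1


# Placeholder hardware prices, with the $5,000 license mentioned above
print(break_even_ratio(small_box_price=2000,
                       big_box_price=6000,
                       virt_license_price=5000))  # 6
```

With these made-up prices, a box that can only carry 4 or 5 virtual servers would cost more than the small boxes it replaces.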

There are a few keys to this, and some spinoff benefits.

Virtualization isn't likely to help you consolidate servers that are groaning under their workload. But in most data centers, the majority of servers aren't working all that hard. You have them because they're doing some essential service, and probably it's a service that for some reason or other shouldn't be crowded onto the same system with others. Different versions of software, libraries, OS, or whatever mean it needs its own machine.

You try to minimize the hardware you devote to it, put it on a single CPU 1U server, but that's still a fair amount of overhead. Then there are failover servers, staging servers, and various other machines that are usually idle, but need to be right on hand.

So you can have these as virtual servers, sharing a physical box but not using much of its resources unless called to duty.

How much you can save through server consolidation will largely depend on how many of your servers are underutilized.

In my farm, I've got plenty of these. I've got staging servers configured with different sets of OS and application software, to replicate client environments. I've got a server running LDAP, email, and DNS, which almost never returns a number higher than 0.02 when I run uptime. My Apache servers also don't work very hard, but I need two machines devoted to these.

So I'm sure I could compress my running services onto a smaller set of hardware.
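One rough way to shortlist candidates is to collect the load average that uptime reports for each box and flag the quiet ones. The Python sketch below does that. The host names, the 0.5 threshold, and all the sample figures except the 0.02 are invented for illustration, and load average is only a crude proxy since it says nothing about memory or disk.

```python
def consolidation_candidates(load_by_host, threshold=0.5):
    """Return the hosts whose load average is below the threshold, sorted by name.

    Assumption: a low load average means the box is a good candidate to
    become a virtual server; the threshold is arbitrary.
    """
    return sorted(host for host, load in load_by_host.items() if load < threshold)


# Made-up sample readings; only the 0.02 figure reflects a real box of mine
sample_loads = {
    "infra01": 0.02,  # LDAP, email, and DNS
    "web01": 0.10,
    "web02": 0.15,
    "app01": 1.80,
}

print(consolidation_candidates(sample_loads))  # ['infra01', 'web01', 'web02']
```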

Once I've done that, I ought to be able to balance those images between physical boxes to make the most efficient use of my resources. This might be aided by splitting out some things currently on the same system. For instance, my email, LDAP, and DNS servers might each get their own virtual image, so they can be more finely balanced, and also to reduce their interdependence. I can upgrade or replace my DNS server without worrying about breaking a library needed by my LDAP server.

This could be a lot of work, but the better (and more expensive) virtualization tools can make it easier, and even automate it.

This said, there are a number of hidden traps which might make it harder to get these benefits. I'll follow up with my thoughts on what these are.