Virtual infrastructure has come a long way from its desktop origins in 1998. Back then, VMware’s Chief Scientist, Mendel Rosenblum, freed virtualization from the IBM mainframe to run it on the commodity Intel platform. People thought it was a crazy idea, but those who understood Moore’s Law realized those little commodity Intel servers were becoming more powerful than one workload needed, just like the mainframes of old.
Nowadays VMware is set to turn twenty, and virtual infrastructure is firmly in the mainstream for IT service management (ITSM) pros. It can still be complex, though, and that complexity creates provisioning challenges for enterprises. Provisioning is easier than ever, but that ease is itself a problem because it leads to virtual machine (VM) sprawl. New “thin provisioning” technologies can lull administrators into a false sense of security that can lead to catastrophic failures. And, in the cybersecurity age, increasing East-West traffic flows create a new security threat inside the perimeter. These three challenges are expanded upon below.
1. The Free and Easy VM Explosion
“Virtual machines don’t cost anything, they’re essentially free. Right?” Wrong! “How hard can it be to launch a new virtual machine?” goes the rhetorical question, often delivered with an exasperated sigh that means “Get on with it!” VMs of all shapes, sizes, and life spans are constantly created, and when the administrator surveys vCenter, what do they see? VM sprawl.
The poor administrator can’t tell who created which VMs or why, or which are production, test, or something else. All of a sudden that important project is delayed because all the capacity has been taken up by VMs that need to be analyzed and switched off. This cleanup takes time and expensive administrator hours. Worse still, those “free” VMs sit idle, soaking up compute resources and starving production VMs. Add up the resources wasted by enough idle VMs and it amounts to throwing money out the window.
The fix for this isn’t technical, it’s a practice. Nobody should be able to launch a VM without going through a process. The process doesn’t have to be designed to slow people down, but it should at a minimum collect metadata about the VM that helps the administrators understand who created it, when they created it, and why they created it. The next step is introducing showback (reporting to users what their VMs cost) or chargeback (actually billing them for it) to create back-pressure in the system and change user perceptions of the “free and easy VM.”
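
As a rough illustration of that practice, here is a minimal Python sketch of a request record that captures the who, when, and why before a VM is launched, plus a simple showback figure. The field names, the list of environments, and the monthly rates are all assumptions made up for this example; they don't come from any particular product.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VMRequest:
    """Metadata captured before a VM is launched (field names are illustrative)."""
    owner: str         # who created it
    purpose: str       # why they created it
    environment: str   # production, test, or something else
    created_on: date   # when they created it
    vcpus: int
    ram_gb: int


# Assumed set of environments; adjust to whatever the organization uses.
ALLOWED_ENVIRONMENTS = {"production", "test", "development"}


def validate(request: VMRequest) -> list[str]:
    """Return a list of problems; an empty list means the request may proceed."""
    problems = []
    if not request.owner.strip():
        problems.append("owner is required")
    if not request.purpose.strip():
        problems.append("purpose is required")
    if request.environment not in ALLOWED_ENVIRONMENTS:
        problems.append("environment must be one of " + ", ".join(sorted(ALLOWED_ENVIRONMENTS)))
    return problems


# Hypothetical monthly rates, chosen only to make the arithmetic visible.
RATE_PER_VCPU = 10.0
RATE_PER_GB_RAM = 2.0


def monthly_showback(request: VMRequest) -> float:
    """Cost reported back to the owner for one month of this VM."""
    return request.vcpus * RATE_PER_VCPU + request.ram_gb * RATE_PER_GB_RAM


request = VMRequest("jane.doe", "load test for billing", "test", date(2018, 1, 15), 4, 16)
print(validate(request))          # []
print(monthly_showback(request))  # 72.0
```

The point is less the code than the gate: a request with no owner or purpose never reaches the hypervisor, and every VM that does get created carries a price tag its owner can see.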
2. The Virtual Infrastructure Problems Lurking Inside Over Commitment and Thin-Provisioning
Provisioning hardware in a data center is seemingly simple yet surprisingly non-trivial. Let’s agree that it takes time and specialist skills, and is something you mostly want to avoid. The slowness of provisioning capacity is misaligned with the unpredictable nature of business demand, and so techniques such as over-commitment and deduplication are used to “get more bang for your buck” out of the same IT equipment. In other words, they help you cope with fluctuating demand without having to touch the data center.
Virtual infrastructure over-commitment means allocating more virtual CPU and RAM to VMs than the server physically has. The hypervisor can do this because VMs are almost always over-provisioned and don’t always need the CPU or RAM they’re given. This means administrators can be lazy and “let the hypervisor work it out” but, much like the Free and Easy VM sprawl problem, allowing over-commitment leads to lazy provisioning and idle VMs that consume resources without delivering any value. It’s just wasteful.
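
To put a number on over-commitment, a simple ratio of what has been handed out to VMs versus what the host physically has is a reasonable starting point. The Python sketch below is only that: the VM names, sizes, and host figures are invented for illustration, and a real inventory would come from your management tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    vcpus: int
    ram_gb: int


def overcommit_ratios(vms, host_cores, host_ram_gb):
    """Return (cpu_ratio, ram_ratio): resources allocated to VMs divided by
    what the host physically has. A value above 1.0 means over-committed."""
    cpu_ratio = sum(vm.vcpus for vm in vms) / host_cores
    ram_ratio = sum(vm.ram_gb for vm in vms) / host_ram_gb
    return cpu_ratio, ram_ratio


# Hypothetical host with 16 cores and 64 GB of RAM.
vms = [
    VM("web-01", 4, 16),
    VM("web-02", 4, 16),
    VM("db-01", 8, 64),
    VM("test-07", 8, 32),
]
cpu_ratio, ram_ratio = overcommit_ratios(vms, host_cores=16, host_ram_gb=64)
print(f"CPU {cpu_ratio:.2f}:1, RAM {ram_ratio:.2f}:1")  # CPU 1.50:1, RAM 2.00:1
```

What counts as an acceptable ratio depends on the workloads, so treat the output as a prompt to look at which VMs are actually using what they were given, not as a pass or fail.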
Thin-provisioning, and its partner in crime, deduplication, are ways to drastically reduce the writes to storage and therefore the capacity consumed. Thin-provisioning allocates physical storage only as a VM actually writes data, rather than reserving the full disk size up front; related techniques such as linked clones let many VMs share the same base disk and write only their individual changes to separate files. Deduplication avoids writing a block that is identical to one already stored and records a reference to the existing copy instead. These approaches have two main problems. First, administrators can take their eye off the ball and let storage get so overused that it becomes unusable, un-expandable, and unfixable. Second, any underlying disk error has a wide blast radius because so many VMs depend on the same disk. Administrators can also come to feel that storage is “infinite” and relax their monitoring stance. The practice to put in place is a mix of tooling that provides visibility and warnings, and an administrator who stays wary of the blast-radius and capacity risks and assigns workloads and storage pools to mitigate them.
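
The visibility and warnings mentioned above can be as simple as comparing three numbers per storage pool: physical capacity, space actually used, and space promised to VMs. The Python sketch below assumes you can already obtain those figures from your storage tooling; the function name and the 75% and 90% thresholds are illustrative assumptions, not recommendations from any vendor.

```python
def datastore_status(capacity_gb, used_gb, provisioned_gb,
                     warn_used=0.75, critical_used=0.90):
    """Classify a thin-provisioned datastore.

    Returns (level, used_fraction, overcommit), where overcommit is the space
    promised to VMs divided by the physical capacity behind it.
    """
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    used_fraction = used_gb / capacity_gb
    overcommit = provisioned_gb / capacity_gb
    if used_fraction >= critical_used:
        level = "critical"
    elif used_fraction >= warn_used:
        level = "warning"
    else:
        level = "ok"
    return level, used_fraction, overcommit


# Hypothetical 10 TB pool: 8 TB written so far, 25 TB promised to VMs.
level, used_fraction, overcommit = datastore_status(10000, 8000, 25000)
print(level, used_fraction, overcommit)  # warning 0.8 2.5
```

In this example the pool looks only 80% full, yet the VMs on it have been promised two and a half times what physically exists, which is exactly the gap that makes storage feel “infinite” until it isn’t.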
3. The East-to-West Security Problem of Virtual Infrastructure
Provisioning virtual networks has changed significantly in the past decade. In the beginning, a VM connected through the hypervisor to a physical port on the server and up a trunk port into a switch. This led to hundreds of VMs sitting on the same switch and able to speak with each other directly (known as East-West traffic) without ever traversing the router or firewall that polices traffic entering and leaving the data center (known as North-South traffic). If a hacker gets into a VM behind the firewall, they can exploit this lack of security in East-West traffic and cause havoc inside the perimeter.
Virtual and distributed networks are now more advanced and can apply policies to the virtual ports that a VM uses to identify itself on the network. Traditional network configurations could already stop VMs from doing naughty hacker things like DHCP spoofing (pretending to be the server that gives out addresses), but they didn’t prevent contact between VMs behind the firewall.
The answer to this is using more advanced virtual network security products that can do what’s called “micro-segmentation” on a software-defined network. With a combination of advanced security products deployed behind the perimeter, VMs can no longer speak to each other unless explicitly allowed.
So those are the top three virtual infrastructure challenges and, to help, Alemba has recently launched vxStore, which is designed to help companies deliver the end-to-end functionality needed to create a true private cloud. It provides the vital missing layer from existing private/hybrid cloud software stacks to truly connect businesses to the cloud. For more information, please visit the Alemba website.
John Murnane
John Murnane has worked in the IT industry for more than 20 years, with 12 of those focused entirely on IT service management. He has worked on all sides of the fence: as a consultant, a salesman, a manager, and a customer. In recent years, John spent time at EMC and VMware concentrating specifically on the design, management, and operation of cloud platforms. In particular, he has formed very detailed views about the continued need for ITSM and the absolute need for ITSM to adapt successfully to this new mode of operations. John is now the COO of Alemba.