In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 5
September 28, 2009
In Part 4 of this series we created two vSphere virtual machines – one running ESX and one running ESXi – from a set of master images we can use for rapid deployment in case we want to expand the number of ESX servers in our lab. We showed you how to use NexentaStor to create snapshots of NFS and iSCSI volumes and create ZFS clone images from them. We then showed you how to stage the startup of the VSA and ESX hosts to “auto-start” the lab on boot-up.
In this segment, Part 5, we will create a VMware Virtual Center (vCenter) virtual machine and place the ESX and ESXi machines under management. Using this vCenter instance, we will complete the configuration of ESX and ESXi using some of the new features available in vCenter.
Part 5, Managing our ESX Cluster-in-a-Box
With our VSA and ESX servers purring along in the virtual lab, the only thing stopping us from moving forward with vMotion is the absence of a working vCenter to control the process. Once we have vCenter installed, we have 60 days to evaluate and test vSphere before the trial license expires.
Prepping for vCenter Server for vSphere
We are going to install Microsoft Windows Server 2003 STD for the vCenter Server operating system. We chose Server 2003 STD since we have limited CPU and memory resources to commit to the management of the lab and because our vCenter has no need of 64-bit resources in this use case.
Since one of our goals is to have a fully functional vMotion lab with reasonable performance, we want to create a vCenter virtual machine with at least the minimum requirements satisfied. In our 24GB lab server, we have committed 20GB to ESX, ESXi and the VSA (8GB, 8GB and 4GB, respectively). Our base ESXi instance consumes 2GB, leaving only 2GB for vCenter – or does it?
Memory Use in ESXi
VMware ESX (and ESXi) does a good job of conserving resources by limiting commitments for memory and CPU. This is not unlike any virtual-memory-capable system that puts a premium on “real” memory by moving less frequently used pages to disk. With a lot of idle virtual machines, this ability alone can create significant over-subscription possibilities for VMware; this is why it is possible for 32GB worth of VM’s to run on a 16-24GB host.
Do we really want this memory paging to take place? The answer – for the consolidation use cases – is usually “yes.” This is because consolidation is born out of the need to aggregate underutilized systems in a more resource efficient way. Put another way, administrators tend to provision systems based on worst case versus average use, leaving 70-80% of those resources idle in off-peak times. Under ESX’s control those underutilized resources can be re-tasked to another VM without impacting the performance of either one.
On the other hand, our ESX and VSA virtual machines are not the typical use case. We intend to fully utilize their resources and let them determine how to share them in turn. Imagine a good number of virtual machines running on our virtualized ESX hosts: will they perform well with the added hardship of memory paging? Also, when we begin to use vMotion, those CPU and memory resources will appear on BOTH virtualized ESX servers at the same time.
It is pretty clear that if all of our lab storage is committed to the VSA, we do not want to page its memory. Remember that any additional memory not in use by the SAN OS in our VSA is employed as ARC cache for ZFS to increase read performance. Paging memory that is assumed to be “high performance” by NexentaStor would result in poor storage throughput. The key to “recursive computing” is knowing how to anticipate resource bottlenecks and deploy around them.
This raises the question: how much memory is left after reserving 4GB for the VSA? To figure that out, let’s look at what NexentaStor uses at idle with 4GB provisioned:
As you can see, we have specified a 4GB reservation which appears as “4233 MB” of Host Memory consumed (4096MB+137MB). Looking at the “Active” memory we see that – at idle – NexentaStor is using about 2GB of host RAM for its OS and to support the couple of file systems mounted on the host ESXi server (recursively).
Additionally, we need to remember that each VM carries a memory overhead that increases with the vCPU count. For the four-vCPU ESX/ESXi servers, the overhead is about 220MB each; the NexentaStor VSA consumes an additional 140MB with its two vCPU’s. Totaling up the memory plus overhead identifies a commitment of at least 21,828MB of memory to run the VSA and both ESX guests – that leaves a little under 1.5GB for vCenter if we used a 100% reservation model.
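To make that budget concrete, here is a minimal sketch of the arithmetic in Python, assuming round figures for the per-VM overheads (your exact values will vary with the ESX build and VM configuration):

```python
# Minimal sketch of the lab's memory budget (figures are approximations taken
# from this series; actual per-VM overhead varies with ESX version and VM config).
host_ram   = 24 * 1024   # physical RAM in the lab server (MB)
hypervisor = 2 * 1024    # consumed by the host ESXi instance itself (MB)

guests = {
    # name:            (reservation MB, approx. overhead MB)
    "ESX  (4 vCPU)":   (8 * 1024, 220),
    "ESXi (4 vCPU)":   (8 * 1024, 220),
    "VSA  (2 vCPU)":   (4 * 1024, 140),
}

committed = hypervisor + sum(res + ovh for res, ovh in guests.values())
print("Committed: %d MB, left for vCenter: %d MB" % (committed, host_ram - committed))
# -> roughly 1.5GB left for the vCenter VM under a 100% reservation model
```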
Memory Over Commitment
The same concerns about memory hold true for our ESX and ESXi hosts – albeit in a less obvious way. We obviously want to “reserve” the memory required by the VMM – about 2.8GB and 2GB for ESX and ESXi, respectively. Additionally, we want to avoid over-subscription of memory on the host ESXi instance – if at all possible – since it will already be busy running our virtual ESX and ESXi machines.
Could we provision our ESX and ESXi hosts with 16GB each even though we only have 24GB total memory in our system? Sure, but we will run into a problem with our ESX instance as over commitment gets shifted to storage. Here’s why:
- when we created the ESX volume, it was based on an iSCSI target of 24GB in size;
- we chose a 4GB/thin and 12GB thick VMDK for the ESX instance (about 13GB in use);
- any memory NOT reserved for the VM will show up as a swap file on the iSCSI target, limiting our swap file to no more than 5.6GB (70% of the 8GB free, derived from 24GB total – 4GB OS – 12GB console).
For these reasons, the maximum memory we can apply to the ESX instance(s) is about 8GB (12GB if we reserve 4GB for each instance). If we want to add memory to ESX, we can do one of three things:
- up the reservation – limiting the number of ESX servers we can run based on the size of the reservation;
- up the size of the ESX volume – requiring a new snapshot, etc.
- move the destination of the ESX swap file to a volume with lots of (temporary) space.
Why does this not affect the ESXi instance(s)? Because they are provisioned on NFS file systems! Each file system draws freely from the volume pool because no quota has been set on it. Had there been a quota, we could simply relax it to accommodate the additional space the VM needs for its swap file and re-provision the VM with more memory without extending its reservation.
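The swap-file arithmetic above can be sketched in a few lines of Python (illustrative figures from this series; ESX sizes the per-VM swap file as provisioned memory minus the memory reservation):

```python
# Rough sketch of the swap-file math on the 24GB iSCSI-backed ESX volume.
volume_gb  = 24.0
os_vmdk_gb = 4.0            # thin-provisioned OS disk
console_gb = 12.0           # thick console disk
free_gb    = volume_gb - os_vmdk_gb - console_gb   # about 8GB left for swap
safe_gb    = 0.7 * free_gb                         # 70% "low on storage" boundary, ~5.6GB

def swap_check(provisioned_gb, reservation_gb):
    # ESX creates a swap file equal to provisioned memory minus the reservation
    swap = provisioned_gb - reservation_gb
    return swap, swap <= safe_gb

print(swap_check(8, 2.8))    # (5.2, True)  - fits within the safety boundary
print(swap_check(12, 4.0))   # (8.0, False) - consumes every last GB of free space
```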
Beware the Swap File
Here’s where ESXi begins to use hidden calculus: the swap file. Notice something in the screen capture below? We have increased the memory in our ESX VM to 12GB, requiring an 8GB swap file – yet the reported available disk space does NOT include the disk consumed by the swap file!
While this “helpful” feature may keep ESXi from complaining that the swap file has left the file system “low on storage” by encroaching on the 30% safety boundary, it can also mask the fact that the thinly provisioned disk may be dangerously close to running out of space! An abended (or, best case, paused) VM due to an “out of disk” condition is no fun in practice. Perhaps VMware needs to rethink this…
Choosing a Reservation Model
If you’ve been paying attention, you have probably guessed that we have settled on a reservation of 4GB for each of the ESX/ESXi VMs, which commits about 50% of our memory resources to reservations. This leaves plenty of room for our vCenter virtual machine running Windows Server 2003 Standard Edition.
Why not go with the 100% reservation model? Simply put: we trust VMware to manage memory better than we can, and our use case is for lab/testing not performance. With only a 4GB reservation for each ESX/ESXi, we can run 4-5 ESX servers in our single machine lab. We’re stuck with only two in the 100% reservation model…
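A quick back-of-the-envelope calculation shows why (a rough sketch that ignores per-VM overhead; the 0.5GB vCenter figure approximates its 448MB reservation):

```python
# Back-of-the-envelope count of virtual ESX/ESXi hosts per reservation model.
host_ram_gb  = 24
host_esxi_gb = 2      # the physical ESXi instance
vsa_gb       = 4      # NexentaStor VSA reservation
vcenter_gb   = 0.5    # ~448MB reservation for the vCenter VM

available = host_ram_gb - host_esxi_gb - vsa_gb - vcenter_gb   # ~17.5GB

print(int(available // 4))   # 4GB reservation model  -> 4 (4-5 with overcommit headroom)
print(int(available // 8))   # 100% (8GB) reservation -> 2
```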
Creating the vCenter Virtual Machine
According to the ESXi Installable and vCenter Server Setup Guide, our vCenter server virtual machine should meet the following (minimum) requirements:
- Microsoft Server 2003 Standard Edition (Recommended) Requirements
  - 550 MHz CPU
  - 256MB RAM
  - 1.5GB Disk
- vCenter Server Minimum Requirements
  - 2 x 2,000 MHz CPU
  - 3GB RAM
  - 3.5GB Disk for vCenter
  - 20GB Disk for Update Manager (separate disk)
- Microsoft SQL Server 2005 Express (minimum) Requirements
  - 500 MHz CPU
  - 192MB RAM
  - 600MB Disk (plus 3.5GB for vCenter’s needs)
- Total System Minimums
  - 2 x 2.5 GHz CPU
  - 3.5GB RAM
  - 5.5GB Base disk
  - 20GB Update Manager disk
Therefore, our vCenter VM will look like this:
- 2 vCPU’s
- 3.5GB RAM (448MB reserved)
- 1 – OS drive (vmdk), 8GB (thin)
- 1 – SQL/vCenter/Application drive (vmdk), 12GB (thin)
- 1 – Update Manager drive (vmdk), 24GB (thin)
We will house our VM on a newly provisioned NFS file system which we will ultimately share with both the host ESXi instance and any subsequent virtual ESX/ESXi instance(s). We will provision the storage similarly to the NFS storage we created in Part 3 of this series. To activate this file system, we need to go back to the NexentaStor web GUI and select “Data Management/Shares/Folders/Create” and name the new file system “volume0/default/nfs/virtual-machines” with a block size of 16K, then click “Create.”
Before we’re done with this NFS file system, we need to make sure that the file system share is properly masked to allow read/write and “root” access to the ESXi server. Since this file system will ultimately be shared between the root ESXi host and the virtual ESX servers, we could extend the mask at this time to include the virtual servers, but we’ll wait and do that once they are properly configured. At this point, we’ll just add the fully qualified domain name (FQDN) of the host with its host IP address (separated by a colon) in the “Read-Write” and “Root” text entry fields of the NFS share properties for the file system.
Why NFS for vCenter?
At this point you might wonder why the vCenter installation is being placed on an NFS volume. There is no “technical” reason why vCenter could not exist on an iSCSI volume – in fact, that would be a typical deployment. However, for the purpose of this lab, the storage is essentially boot-strapped via the VSA, and that does not work so well for iSCSI storage. If we tie the root ESXi server to an iSCSI volume that does not exist at boot time, the boot time increases as ESXi waits for the iSCSI storage to become available or time out.
While the same time-out happens for NFS, it results in less delay at boot time. Also, the NFS file system auto-mounts some time after boot, whereas iSCSI requires a manual rescan of the “iSCSI Software Adapter” before the volume is available. If we want to create an automatic sequence whereby our VSA (local storage), ESXi (NFS) and vCenter (NFS) servers are booted as the ESXi server boots, we can do it without administrative intervention.
Next page: Creating an NFS file system for vCenter…