
In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 5

September 28, 2009

In Part 4 of this series we created two vSphere virtual machines – one running ESX and one running ESXi – from a set of master images we can use for rapid deployment in case we want to expand the number of ESX servers in our lab. We showed you how to use NexentaStor to create snapshots of NFS and iSCSI volumes and create ZFS clone images from them. We then showed you how to stage the startup of the VSA and ESX hosts to “auto-start” the lab on boot-up.

In this segment, Part 5, we will create a VMware Virtual Center (vCenter) virtual machine and place the ESX and ESXi machines under management. Using this vCenter instance, we will complete the configuration of ESX and ESXi using some of the new features available in vCenter.

Part 5, Managing our ESX Cluster-in-a-Box

With our VSA and ESX servers purring along in the virtual lab, the only thing stopping us from moving forward with vMotion is the absence of a working vCenter to control the process. Once we have vCenter installed, we have 60 days to evaluate and test vSphere before the trial license expires.

Prepping vCenter Server for vSphere

We are going to install Microsoft Windows Server 2003 STD for the vCenter Server operating system. We chose Server 2003 STD since we have limited CPU and memory resources to commit to the management of the lab and because our vCenter has no need of 64-bit resources in this use case.

Since one of our goals is to have a fully functional vMotion lab with reasonable performance, we want to create a vCenter virtual machine with at least the minimum requirements satisfied. In our 24GB lab server, we have committed 20GB to ESX, ESXi and the VSA (8GB, 8GB and 4GB, respectively). Our base ESXi instance consumes 2GB, leaving only 2GB for vCenter – or does it?

Memory Use in ESXi

VMware ESX (and ESXi) does a good job of conserving resources by limiting commitments for memory and CPU. This is not unlike any virtual-memory-capable system that puts a premium on “real” memory by moving less frequently used pages to disk. With a lot of idle virtual machines, this ability alone can create significant over-subscription possibilities for VMware; this is why it is possible to run 32GB worth of VMs on a 16-24GB host.

Do we really want this memory paging to take place? The answer – for the consolidation use cases – is usually “yes.” This is because consolidation is born out of the need to aggregate underutilized systems in a more resource efficient way. Put another way, administrators tend to provision systems based on worst case versus average use, leaving 70-80% of those resources idle in off-peak times. Under ESX’s control those underutilized resources can be re-tasked to another VM without impacting the performance of either one.

On the other hand, our ESX and VSA virtual machines are not the typical use case. We intend to fully utilize their resources and let them determine how to share them in turn. Imagine a good number of virtual machines running on our virtualized ESX hosts: will they perform well with the added hardship of memory paging? Also, once we begin to use vMotion, those CPU and memory resources will appear on BOTH virtualized ESX servers at the same time.

It is pretty clear that if all of our lab storage is committed to the VSA, we do not want to page its memory. Remember that any additional memory not in use by the SAN OS in our VSA is employed as ARC cache for ZFS to increase read performance. Paging memory that is assumed to be “high performance” by NexentaStor would result in poor storage throughput. The key to “recursive computing” is knowing how to anticipate resource bottlenecks and deploy around them.

This raises the question: how much memory is left after reserving 4GB for the VSA? To figure that out, let’s look at what NexentaStor uses at idle with 4GB provisioned:

NexentaStor's RAM footprint with 4GB provisioned, at idle.

As you can see, we have specified a 4GB reservation, which appears as “4233 MB” of Host Memory consumed (4096MB + 137MB of overhead). Looking at the “Active” memory we see that – at idle – NexentaStor is using about 2GB of host RAM for the OS and to support the couple of file systems mounted on the host ESXi server (recursively).

Additionally, we need to remember that each VM carries a memory overhead that increases with its vCPU count. For the four-vCPU ESX/ESXi servers, the overhead is about 220MB each; the NexentaStor VSA consumes an additional 140MB with its two vCPUs. Totaling up the memory plus overhead identifies a commitment of at least 21,828MB of memory to run the VSA and both ESX guests – that leaves a little under 1.5GB for vCenter if we used a 100% reservation model.
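For readers who want to play with the numbers, here is a back-of-the-envelope sketch (in Python, purely illustrative) using the figures quoted above. The overhead values are the approximations cited in this section, so the total will land in the same ballpark as the figure above rather than match the vSphere client to the megabyte.

```python
# Back-of-the-envelope memory budget for the 24GB lab host, using the figures
# quoted in this section. Per-VM overhead is approximate; the vSphere client
# reports the exact numbers, so expect the totals to differ slightly.

host_ram_mb  = 24 * 1024   # physical RAM in the lab server
host_esxi_mb = 2 * 1024    # consumed by the host ESXi instance itself

guests = {
    # name: (configured memory in MB, estimated virtualization overhead in MB)
    "ESX":  (8 * 1024, 220),
    "ESXi": (8 * 1024, 220),
    "VSA":  (4 * 1024, 140),
}

committed_mb = sum(mem + ovh for mem, ovh in guests.values())
left_for_vcenter_mb = host_ram_mb - host_esxi_mb - committed_mb

print(f"committed to ESX/ESXi/VSA guests: {committed_mb} MB")
print(f"left for vCenter at 100% reservation: {left_for_vcenter_mb} MB")
```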

Memory Over Commitment

The same concerns about memory hold true for our ESX and ESXi hosts – albeit in a less obvious way. We obviously want to “reserve” the memory required by the VMM – about 2.8GB and 2GB for ESX and ESXi, respectively. Additionally, we want to avoid over-subscription of memory on the host ESXi instance – if at all possible – since it will already be busy running our virtual ESX and ESXi machines.

Could we provision our ESX and ESXi hosts with 16GB each even though we only have 24GB total memory in our system? Sure, but we will run into a problem with our ESX instance as over commitment gets shifted to storage. Here’s why:

  • when we created the ESX volume, it was based on an iSCSI target of 24GB in size;
  • we chose a 4GB/thin and 12GB thick VMDK for the ESX instance (about 13GB in use);
  • any memory NOT committed to the VM will show up as a page file on the iSCSI target, limiting our page file to no more than 5.6GB (70% of 8GB, where 8GB is the 24GB total less the 4GB OS and 12GB console disks).

For these reasons, the maximum memory we can apply to the ESX instance(s) is about 8GB (12GB if we reserve 4GB for each instance). If we want to add memory to ESX, we can do one of the following:

  • up the reservation – limiting the number of ESX servers we can run based on the size of the reservation;
  • up the size of the ESX volume – requiring a new snapshot, etc.; or
  • move the destination of the ESX swap file to a volume with lots of (temporary) space.

Why does this not affect the ESXi instance(s)? Because they are provisioned as NFS file systems! Each file system draws from the volume’s pool because no quota has been provisioned on it. Had there been a quota, we could simply relax that quota to accommodate the additional space the VM needs for its swap file and re-provision the VM for more memory without extending its reservation.
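For illustration only: underneath NexentaStor’s web GUI, relaxing a folder quota is a one-line ZFS operation. A minimal Python sketch of that idea, assuming shell access to the appliance and using a hypothetical folder name, might look like this:

```python
# Illustration only: relax (or clear) the ZFS quota on the folder backing an
# NFS datastore so a re-provisioned VM's swap file has room to grow.
# NexentaStor normally drives this from its web GUI; the folder name below is
# a placeholder.
import subprocess

folder = "volume0/default/nfs/esxi-1"   # hypothetical folder name

def zfs_quota(dataset: str) -> str:
    """Return the current quota value reported by ZFS for a dataset."""
    out = subprocess.run(["zfs", "get", "-H", "-o", "value", "quota", dataset],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

print("quota before:", zfs_quota(folder))
subprocess.run(["zfs", "set", "quota=none", folder], check=True)  # draw from the pool
print("quota after: ", zfs_quota(folder))
```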

Beware the Swap File

Here’s where ESXi begins to use hidden calculus: the swap file. Notice something in the screen capture below? We have increased the memory in our ESX VM to 12GB, requiring 8GB of swap space – but the reported available disk space does NOT include the disk consumed by the swap file!

VMware excludes the size of swap in the available disk report. There are actually only 3GB of space left for the "thin" provisioned disk to use - not the 11.13GB reported.

While this “helpful” feature may prevent ESXi from complaining that the swap file has caused the file system to be “low on storage” because it encroaches on the 30% safety boundary, it can also mask the fact that the thinly provisioned disk may be dangerously close to running out of space! An abended (or, best case, paused) VM due to an “out of disk” condition is no fun in practice. Perhaps VMware needs to rethink this…
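To make the arithmetic concrete, here is a small, hypothetical Python helper (not part of any VMware tool) that accounts for what the summary screen hides: the .vswp file is sized to the configured memory minus the memory reservation, and that figure needs to come off the reported free space before trusting a thin disk to grow.

```python
# Hypothetical helper: estimate how much room a thin-provisioned disk really
# has once the VM swap file (.vswp) is accounted for. The swap file is sized
# to the VM's configured memory minus its memory reservation.

def vswp_size_gb(configured_mem_gb: float, reservation_gb: float) -> float:
    """Size of the .vswp file ESX(i) creates when the VM powers on."""
    return max(configured_mem_gb - reservation_gb, 0.0)

def true_free_gb(reported_free_gb: float, configured_mem_gb: float,
                 reservation_gb: float) -> float:
    """Free space left for thin-disk growth after subtracting the swap file."""
    return reported_free_gb - vswp_size_gb(configured_mem_gb, reservation_gb)

# The case from the screen capture: a 12GB VM with a 4GB reservation needs an
# 8GB swap file, so the "11.13GB free" the client reports is really about 3GB.
print(round(true_free_gb(reported_free_gb=11.13,
                         configured_mem_gb=12, reservation_gb=4), 2))
```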

Choosing a Reservation Model

If you’ve been paying attention, you have probably guessed that we have settled on a reservation of 4GB for each of the ESX/ESXi VMs, which results in about 50% of our memory resources committed to reservations. This leaves plenty of room for our vCenter virtual machine and Windows Server 2003 Standard Edition.

Why not go with the 100% reservation model? Simply put: we trust VMware to manage memory better than we can, and our use case is for lab/testing not performance. With only a 4GB reservation for each ESX/ESXi, we can run 4-5 ESX servers in our single machine lab. We’re stuck with only two in the 100% reservation model…
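As a sanity check on that claim, a crude Python sketch using the round figures from this series (24GB host, roughly 2GB for the host ESXi, 4GB VSA and about 0.5GB reserved for vCenter) gives similar counts; it ignores per-VM overhead, so treat the results as ballpark.

```python
# Crude check on the reservation model: how many virtual ESX/ESXi hosts fit in
# the lab box at a given per-host memory reservation? Figures are the round
# numbers used in this series; per-VM overhead is ignored for simplicity.

def esx_hosts_that_fit(reservation_gb: float,
                       host_ram_gb: float = 24,
                       host_esxi_gb: float = 2,
                       vsa_gb: float = 4,
                       vcenter_gb: float = 0.5) -> int:
    spare = host_ram_gb - host_esxi_gb - vsa_gb - vcenter_gb
    return int(spare // reservation_gb)

print(esx_hosts_that_fit(4))   # 4GB reservation model  -> about 4 hosts
print(esx_hosts_that_fit(8))   # 100% (8GB) reservation -> only 2 hosts
```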

Creating the vCenter Virtual Machine

According to the ESXi Installable and vCenter Server Setup Guide, our vCenter Server virtual machine must meet a set of minimum CPU, memory and disk requirements. Sized against those minimums, our vCenter VM will look like this:

  • 2 vCPUs
  • 3.5GB RAM (448MB reserved)
  • 1 – OS drive (vmdk), 8GB (thin)
  • 1 – SQL/vCenter/Application drive (vmdk), 12GB (thin)
  • 1 – Update Manager drive (vmdk), 24GB (thin)

We will house our VM on a newly provisioned NFS file system which we will ultimately share with both the host ESXi instance and any subsequent virtual ESX/ESXi instance(s). We will provision the storage similarly to the NFS storage we created in Part 3 of this series. To activate this file system, we need to go back to the NexentaStor web GUI and select “Data Management/Shares/Folders/Create” and name the new file system “volume0/default/nfs/virtual-machines” with a block size of 16K, then click “Create.”

Before we’re done with this NFS file system, we need to make sure that the file system share is properly masked to allow read/write and “root” access to the ESXi server. Since this file system will ultimately be shared between the root ESXi host and the virtual ESX servers, we could extend the mask at this time to include the virtual servers, but we’ll wait and do that once they are properly configured. At this point, we’ll just add the fully qualified domain name (FQDN) of the host with its host IP address (separated by a colon) in the “Read-Write” and “Root” text entry fields of the NFS share properties for the file system.
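NexentaStor drives all of this from its web GUI; purely to show what is happening underneath, here is a minimal Python sketch of the equivalent ZFS operations. The 16K record size and the colon-separated read-write/root access list mirror the GUI fields above; the host name and address are placeholders.

```python
# Illustration only: the ZFS operations underneath the NexentaStor GUI steps.
# Creates the folder with a 16K record size and opens the NFS share to the
# host ESXi server (read-write plus root access). Host name and IP below are
# placeholders.
import subprocess

FOLDER = "volume0/default/nfs/virtual-machines"
ACCESS = "esxi-host.example.local:172.16.1.10"   # FQDN:IP, colon-separated list

def zfs(*args: str) -> None:
    """Run a zfs subcommand on the appliance and fail loudly on error."""
    subprocess.run(["zfs", *args], check=True)

zfs("create", "-o", "recordsize=16K", FOLDER)
zfs("set", f"sharenfs=rw={ACCESS},root={ACCESS}", FOLDER)
```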

Why NFS for vCenter?

At this point you might wonder why the vCenter installation is being placed on an NFS volume. There is no “technical” reason why vCenter could not exist on an iSCSI volume – in fact, that would be a typical deployment. However, for the purposes of this lab, the storage is essentially boot-strapped via the VSA, and that does not work so well for iSCSI storage. If we tie the root ESXi server to an iSCSI volume that does not exist at boot time, the boot time increases as ESXi waits for the iSCSI storage to become available or time out.

While the same time-out happens for NFS, it results in less delay at boot time. Also, the NFS datastore auto-mounts some time after boot, whereas iSCSI requires a manual rescan of the “iSCSI Software Adapter” before the volume is available. If we want to create an automatic sequence whereby our VSA (local storage), ESXi (NFS) and vCenter (NFS) servers are booted as the host ESXi server boots, we can do it without administrative intervention.
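Had we tied the lab to iSCSI anyway, that manual rescan could itself be scripted. Purely as an illustration, and using the pyVmomi Python bindings (which post-date this article) with a hypothetical host name and credentials, a storage rescan through the vSphere API looks roughly like this:

```python
# Rough illustration (not part of the original lab): scripting the iSCSI rescan
# that would otherwise require a manual "Rescan" of the iSCSI Software Adapter.
# Uses pyVmomi with hypothetical host credentials; lab-only SSL handling.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab only: skip cert validation
si = SmartConnect(host="esxi-lab.local", user="root",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        storage = host.configManager.storageSystem
        storage.RescanAllHba()                  # pick up new iSCSI targets
        storage.RescanVmfs()                    # then look for VMFS volumes on them
        print(f"rescanned storage on {host.name}")
finally:
    Disconnect(si)
```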

Next page, Creating an NFS file system for vCenter…


6 comments

  1. Good job!!
    and I am still waiting XD

    And BTW, may I translate this series of articles into Chinese and post them on my blog?

    • Eric:

      Thanks, and yes – feel free to reproduce this series in the Chinese language provided proper attribution to all sources referenced in the original is maintained and links back to the original SOLORI post(s) are included (see our copyright standard). A “reproduction” notice at the bottom of each page with links back to the source would be appropriate.

      Once your Chinese version is complete, let us know and we’ll link back to it from SOLORI’s site to facilitate SEO. We’ll be releasing Part 6 soon – getting into the “recursive” vMotion capability of this lab series.

      Regards,
      Collin C. MacMillan
      Solution Oriented LLC

  2. Very nice blog; I learned a lot from these 5 parts so far. I want to use the storage appliance you are recommending in this, or even just Solaris 10 on hardware or as a VM, but I did a lot of research and I do not see any write-ups on ZFS and MS VSS backups, like Active Directory or Exchange. Suppose I have an ESX host with the Solaris VM running ZFS and presenting the datastore to ESX, and then I have an Active Directory VM and an Exchange 2007 or 2010 server as a VM, both using the ZFS datastore; is there a way to integrate the MS VSS writer with the ZFS snapshots? That way I will know the backups are consistent with the application, meaning if I do a restore, the Exchange database or Exchange logs will recover without any problems. This is the only problem or grey area I am dealing with right now. Once again, thanks for a very nice write-up 🙂

    Farid

    • Farid:

      Thanks for the comment. The ZFS-based NexentaStor solution (licensed, with VM Data Center plug-in) integrates well into VMware, Hyper-V and Xen environments where VSS agents provide Windows file system quiescence (VSS) through VM Data Center API calls to the hypervisor. This allows for clean, NexentaStor Appliance-driven snapshots on a per-VM or per datastore basis (assuming your VSS writers are in place).

      Restoring from a snapshot – even application consistent ones – can have application-specific requirements. For instance, AD virtual machines may require one restoration process and Exchange 2007 yet another – even though VSS is a commonality in both snapshot-driven back-up solutions. If your virtual environment is more complex (i.e. coordinated data sources) syncing the snapshots between sources will be required to result in consistent systemic recovery instead of a bunch of unsynchronized, individual recovery points.

      Fortunately, the open aspect of NexentaStor allows for complex management scripts (from a central management agent, say a Windows server like vCenter or a Linux server like vMA) to coordinate and orchestrate many sequential events to ensure the needs of your environment are met. These scripts could be simple – occupying 20-30 lines of script code – or very complex depending on your needs, environment and the number of VMs and/or physical machines it is necessary to coordinate (synchronous backup).

      Without testing in your specific environment, it is impossible to promise “recovery without any problems,” and the amount of tuning you will need to do to your process will depend on the complexity of the application. Likewise, it may be that VMware Data Recovery or a third-party backup solution like Veeam Backup and Replication would offer more/additional value than VM Data Center alone, given their ability to offer file-level recovery. If image-level recovery fits your needs, you should trial NexentaStor and VM Data Center for yourself for 45 days and see how it fits your application.

      We’ll be running some in-depth looks at VM Data Center and NexentaStor towards the end of March, 2010.

      • What About something like this…

        VMWare vSphere Essentials Plus for 3 ESX hosts

        ESX Host1:
        Hardware-
        HP DL360 with array controller and 6x SAS drives set up in a RAID5 with hot-spare
        4-6 Gig ethernet ports

        Software:
        ESXi hypervisor
        NexentaStor Enterprise Silver Edition – 4TB (VM) controlling the RAID5 logical volume

        ESX Host2:
        Hardware-
        HP DL360 with array controller and 6x SAS drives set up in a RAID5 with hot-spare
        4-6 Gig ethernet ports

        Software:
        ESXi hypervisor
        NexentaStor Enterprise Silver Edition – 4TB (VM) controlling the RAID5 logical volume

        Now, both NexentaStor VMs are controlling local DAS (the array’s logical drive). I want to set up 2 VIPs on the Nexenta VMs on the storage side.

        so VIP1=
        Nexenta1 – Active
        Nexenta2 – Passive

        so VIP2=
        Nexenta2 – Active
        Nexenta1 – Passive

        Now, on ESX1 I can set up a VM, let’s say Active Directory 1, and point its storage to VIP1; on ESX2 I can set up another VM, let’s say another AD server, and point its storage to VIP2. So AD1 on ESX1 is accessing storage on Nexenta1 and AD2 on ESX2 is accessing storage on Nexenta2. Now, I want the two Nexentas on the back end to be set up as, let’s say, a synchronous mirror or software RAID-1 across the two Nexentas, so when AD1 writes or deletes anything from its storage on Nexenta1 that is replicated to Nexenta2 in real time, and the same with AD2. Now let’s say the ESX1 server goes offline for any reason; it will bring down AD1 and Nexenta1 along with it. Now VMware HA will kick in and move the AD1 VM to ESX2, and at the same time I want the Nexenta2 on ESX2 to take over Nexenta1’s VIP as well, so now Nexenta2 responds to both VIPs, VIP1 and VIP2. Now when the AD1 machine is on ESX2 it will access its data on VIP1, which is now on Nexenta2. Sorry for the long text, but this is what I have in mind. Wondering if this is doable in Nexenta and ESX.

        The last piece is I want to set up a 3rd ESXi host in DR somewhere; it has its own Nexenta3 VM and the same kind of hardware, and I want to do async replication to the 3rd ESX in DR using delta changes only, to save bandwidth. So in case of the primary site going down, I will fail over to DR and bring the two AD servers online on ESX3 using the data on Nexenta3.

        As far as backup goes, can I use Nexenta Delorean 2.0 and install the Windows client in the AD1 and AD2 VMs and do backups that way? Is Nexenta Delorean 2.0 considered a snapshot backup or is it like disk-to-disk backup? If it’s snapshot-based, that would be nice; if it’s disk-to-disk backup, then in that case I can use any backup software out there that supports ESX, like the new Backup Exec 2010 with deduplication and offsite replication built right into their disk-to-disk backup.

        thanks,
        Farid
