New patches are available today for ESX/ESXi 3.5, 4.0, 4.1 and 5.0 to resolve a few known security vulnerabilities. Here’s the rundown if you’re running the ESXi 5.0 standard image:
VMware Host Checkpoint File Memory Corruption
Certain input data is not properly validated when loading checkpoint files. This might allow an attacker with the ability to load a specially crafted checkpoint file to execute arbitrary code on the host. The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the name CVE-2012-3288 to this issue.
The following workarounds and mitigating controls might be available to remove the potential for exploiting the issue and to reduce the exposure that the issue poses.
Workaround: None identified.
Mitigation: Do not import virtual machines from untrusted sources.
VMware Virtual Machine Remote Device Denial of Service
A device (for example CD-ROM or keyboard) that is available to a virtual machine while physically connected to a system that does not run the virtual machine is referred to as a remote device. Traffic coming from remote virtual devices is incorrectly handled. This might allow an attacker who is capable of manipulating the traffic from a remote virtual device to crash the virtual machine.
The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the name CVE-2012-3289 to this issue.
The following workarounds and mitigating controls might be available to remove the potential for exploiting the issue and to reduce the exposure that the issue poses.
Workaround: None identified.
Mitigation: Users need administrative privileges on the virtual machine in order to attach remote devices. Do not attach untrusted remote devices to a virtual machine.
None beyond the required patch bundles and reboot information listed in the table above.
Today, VMware announces the release of vSphere 4.0 Update 3 (U3). Many, many fixes and enhancements – some rolled-in from (or influenced by) vSphere 4.1. Updates to ESX, ESXi, vCenter and vCenter Update Manager are available now (see below for links).
Don't forget to click the "View History" link to expose the vCenter and ESX updates available for older versions...
Guest Operating System Customization Improvements: vCenter Server adds support for customization of the following guest operating systems:
RHEL 6.0 (32-bit and 64-bit)
SLES 11 SP1 (32-bit and 64-bit)
Windows 7 SP1 (32-bit and 64-bit)
Windows Server 2008 R2 SP1
Additional vCenter Server Database Support: vCenter Server now supports the following databases:
Microsoft SQL Server 2008 R2 (32-bit and 64-bit)
Oracle 11g R2 (32-bit and 64-bit)
IBM DB2 9.7.2 (32-bit and 64-bit)
For more information about using IBM DB2 – 9.7.2 database with vCenter Server 4.0 Update 3, see KB 1037354.
Additional vCenter Server Operating System Support: You can install vCenter Server on Windows Server 2008 R2.
Resolved Issues: In addition, this release delivers a number of bug fixes that have been documented in the Resolved Issues section.
VMware’s current version of its vSphere Management Assistant – also known as vMA (pronounced “vee mah”) – will crash when run on an ESX host using AMD Magny Cours processors. This behavior was discovered recently when installing the vMA on an AMD Opteron 6100 system (aka. Magny Cours) causing a “kernel panic” on boot after deploying the OVF template. Something of note is the crash also results in 100% vCPU utilization until the VM is either powered-off or reset:
vMA Kernel Panic on Import
As it turns out, no manner of tweaks to the virtual machine’s virtualization settings nor OS boot/grub settings (i.e. noapic, etc.) seem to cure the ills for vMA. However, we did discover that the OVF deployed appliance was configured as a VMware Virtual Machine Hardware Version 4 machine:
vMA 4.1 defaults to Virtual Machine Hardware Version 4
Since our lab vMA deployments have all been upgraded to Virtual Machine Hardware Version 7 for some time (and for functional benefits as well), we updated the vMA to Version 7 and tried again:
Upgrade vMA Virtual Machine Version...
This time, with Virtual Hardware Version 7 (and no other changes to the VM), the vMA boots as it should:
vMA Booting after Upgrade to Virtual Hardware Version 7
Since the Magny Cours CPU is essentially a pair of tweaked 6-core Opteron CPUs in a single package, we took the vMA into the lab and deployed it to an ESX server running on AMD 2435 6-core CPUs: the vMA booted as expected, even with Virtual Hardware Version 4. A quick check of the community and support boards shows a few issues with older RedHat/CentOS kernels (like vMA’s) but no reports of kernel panic with Magny Cours. Perhaps there are just not that many AMD Opteron 6100 deployments out there with vMA yet…
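Before redeploying, it can help to confirm which virtual hardware version an appliance is actually using. The following is a minimal sketch that reads the `virtualHW.version` key from the text of a `.vmx` file; the helper name `read_virtual_hw_version` and the sample file contents are illustrative assumptions, not taken from the vMA itself.

```python
import re

def read_virtual_hw_version(vmx_text):
    """Return the integer virtualHW.version found in .vmx text, or None if absent."""
    match = re.search(
        r'^\s*virtualHW\.version\s*=\s*"(\d+)"',
        vmx_text,
        re.MULTILINE | re.IGNORECASE,
    )
    return int(match.group(1)) if match else None

# Hypothetical .vmx excerpt for an appliance deployed at hardware version 4.
sample_vmx = '''
config.version = "8"
virtualHW.version = "4"
displayName = "vMA"
'''

print(read_virtual_hw_version(sample_vmx))  # 4
```

A result of 4 on a Magny Cours host would flag the appliance as a candidate for the upgrade described above.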
Here’s a maintenance note for SMB environments attempting 100% virtualization and relying on SAN-based file shares to simplify backup and storage management: beware the chicken-and-egg scenario on restart before going home to capture much needed Zzz’s. If your domain controller is virtualized and its VMDK file lives on SAN/NAS, you’ll need to restart SMB services on the NexentaStor appliance before leaving the building.
Here’s the scenario:
An after-hours SAN upgrade in a non-HA environment (maybe Auto-CDP for BC/DR, but no active fail-over);
Shutdown of SAN requires shutdown of all dependent VM’s, including domain controllers (AD);
End-user and/or maintenance plans are dependent on CIFS shares from SAN;
Here’s the typical maintenance plan (detail omitted):
Ordered shutdown of non-critical VM’s (including UpdateManager, vMA, etc.);
Ordered shutdown of application VM’s;
Ordered shutdown of resource VM’s;
Ordered shutdown of AD server VM’s (minus one, see step 7);
Migrate/vMotion remaining AD server and vCenter to a single ESX host;
Ordered shutdown of ESX hosts (minus one, see step 8);
vSphere Client: Log-out of vCenter;
vSphere Client: Log-in to remaining ESX host;
Ordered shutdown of vCenter;
Ordered shutdown of remaining AD server;
Ordered shutdown of remaining ESX host;
Reboot SAN to update checkpoint;
Test SAN update – restoring previous checkpoint if necessary;
Power-on ESX host containing vCenter and AD server (see step 8);
vSphere Client: Log-in to remaining ESX host;
Power-on AD server (through to VMware Tools OK);
Restart SMB service on NexentaStor;
vSphere Client: Log-in to vCenter;
vSphere Client: Log-out of ESX host;
Power-on remaining ESX hosts;
Ordered power-on of remaining VM’s;
A couple of things to note in an AD environment:
NexentaStor requires the use of AD-based DNS for AD integration;
AD-based DNS will not be available at SAN re-boot if all DNS servers are virtual and only one SAN is involved;
Lack of DNS resolution on re-boot will cause a failure for DNS name based NTP service synchronization;
NexentaStor SMB service will fail to properly initialize AD credentials;
VMware 4.1 now pushes AD authentication all the way to ESX hosts, enabling better credential management and security but creating a potential AD dependency as well;
Using auto-startup order on the remaining ESX host for AD and vCenter could automate the process (steps 17 & 19); however, I prefer the “manual” approach after a SAN upgrade in case an upgrade failure is detected only after the ESX host is restarted (i.e. storage service interaction in NFS/iSCSI after upgrade).
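The DNS dependency noted above suggests a simple guard: do not restart the SMB service until the AD-based DNS answers again. The sketch below only polls for name resolution; the function name `wait_for_dns`, the retry settings and the example hostname are hypothetical, and the actual restart of the SMB service on NexentaStor is deliberately left out.

```python
import socket
import time

def wait_for_dns(hostname, resolver=socket.gethostbyname,
                 attempts=30, delay=10.0, sleep=time.sleep):
    """Poll until hostname resolves.

    Returns True as soon as the lookup succeeds, or False once
    the given number of attempts has been used up.
    """
    for attempt in range(attempts):
        try:
            resolver(hostname)
            return True
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt < attempts - 1:
                sleep(delay)
    return False

# Example call (hostname is a placeholder for a virtualized domain controller):
# if wait_for_dns("dc1.example.local"):
#     ...restart the SMB service using whatever method your appliance provides...
```

The resolver and sleep functions are passed in as parameters so the check can be exercised without a live network.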
SOLORI’s Take: This is a great opportunity to re-think storage resources in the SMB as the linchpin to 100% virtualization. Since most SMB’s will have a tier-2 or backup NAS/SAN (auto-sync or auto-CDP) for off-rack backup, leveraging a shared LUN/volume from that SAN/NAS for a backup domain controller is a smart move. Since tier-2 SAN’s may not have the IOPs to run ALL mission critical applications during the maintenance interval, the presence of at least one valid AD server will promote a quicker RTO, post-maintenance, than coming up cold. [This even works with DAS on the ESX host]. Solution – add the following and you can ignore step 15:
3a. Migrate always-on AD server to LUN/volume on tier-2 SAN/NAS;
24. Migrate always-on AD server from LUN/volume on tier-2 SAN/NAS back to tier-1;
Since even vSphere Essentials Plus has vMotion now (a much requested and timely addition), collapsing all remaining VM’s to a single ESX host is a no-brainer. However, migrating the storage is another issue which cannot be resolved without either a shutdown of the VM (off-line storage migration) or Enterprise/Enterprise Plus version of vSphere. That is why the migration of the AD server from tier-2 is reserved for last (step 17) – it will likely need to be shut down to migrate the storage between SAN/NAS appliances.
That’s right, I said upgrade from ESX to ESXi – not ESX 3.x to vSphere, but ESX (any version) to vSphere’s ESXi! Ever since VMware PartnerExchange 2009 (April), SOLORI has been advising clients and prospects to focus on ESXi and move away from ESX-based hosts – you know, the one with the “Linux” service console. Personally, I’m glad VMware has strengthened their message about the virtues of ESXi and the direction of their flagship product.
ESXi has a superior architecture and we encourage customers to deploy ESXi as part of any new vSphere deployment. Our future posts will compare ESX 4 and ESXi 4 in detail on topics like hardware compatibility list, performance, and management to demonstrate that ESXi is either on par with or superior than ESX. But for now, here are some key points you should know about ESXi vs. ESX:
The functionality and performance of VMware ESX and ESXi are the same; the difference between the two hypervisors resides in their packaging architecture and operational management. VMware ESXi is the latest hypervisor architecture from VMware. It has an ultra thin footprint with no reliance on a general-purpose OS, setting a new bar for security and reliability (learn more).
In the future, ESXi’s superior architecture will be the exclusive focus of VMware’s development efforts.
New and existing customers are highly encouraged to deploy ESXi. Many Fortune 100 companies have already standardized on the ESXi platform.
Not unfamiliar with the VI3 version of ESXi, its ease of installation, configuration and management and smaller footprint, I was one of about 10 participants in an “ESXi BoF Breakout Session” with Charu Chaubal of VMware. While discussing vSphere’s ESXi with Charu, I never heard the words “ESXi is a superior architecture,” but I did get a clear message that ESXi was the way of the future. From that point on, it seemed as though any effort concentrated on (net new) ESX deployments was going to be “time wasted.”
However, it was clear by the whispered tone about ESXi’s virtues that the timing was not right for the real message to be spoken aloud. Remember, this was the launch of vSphere and “ESXi” was strongly associated with “ESXi Free” – not a clear marketing message when license sales and adoption curves are on the line. Perhaps that’s why the “talking points” at PEX2009 always suggested that ESX was the “flagship” hypervisor and ESXi was “targeted for embedded and OEM” deployments.
In practical terms, migration from ESX 3.x to vSphere/ESXi didn’t make a lot of sense for many large or institutional customers at the time due to the lack of third-party driver parity between ESX and ESXi in vSphere. However, for net new installations where third-party drivers and service console agents were not a concern, the message about ESXi’s superiority was getting lost. Thomas Paine once said “what we obtain too cheap, we esteem too lightly” and I’d attribute the slow uptake on ESXi to the perception that it was somehow inferior to ESX – a misconception owed to it being offered in a “free” version.
I’ve attended a great many WebEx sessions on VMware products over the last year and I’m still hearing hushed tones and uncertainty about the role of ESXi versus ESX. To be clear, ESXi is being talked about as “enterprise ready,” but much of the focus is still on ESX. These overtones were still present in a “vSphere: Install, Configure and Manage” course I recently attended to qualify for VCP410. While our instructors were very knowledgeable and experienced, there seemed to be much less confidence when discussing ESXi versus ESX. In fact, the lab guide clearly states:
If you are new to VMware vSphere and you do not have any special needs for more advanced features, use ESXi.
– Page 599, Module 13, Installing VMware ESX and ESXi
The manual – and VMware’s choice of message here – seems to indicate that ESX has “more advanced features” than ESXi. While the “advanced features” VMware is talking about are service console related, it leaves many regarding ESXi as the inferior product in sharp contrast to today’s message. If the statement “ESXi’s superior architecture will be the exclusive focus of VMware’s development efforts” isn’t writing on the wall for the rest of you, here’s a list of VMware’s new talking points on ESXi:
Improved Reliability and Security
Less Disk Space
Streamlined Deployment and Configuration
Reduced Management Overhead
Next Generation Hypervisor
Drastically Reduced Hypervisor Footprint
Smaller Code Base, Smaller Attack Surface
Certified on over 1,000 Server Systems – including USB keys
New, Operating System Independent
In contrast, the ESX platform is being re-imaged. Here are some new talking points about ESX:
The Older Architecture
Relies on Linux OS for Resource Management
20x Larger on-disk Footprint
More Complex to Configure
Console OS Administration for Configuration and Diagnostics
Prone to Arbitrary Code Execution (Console OS)
For many of us familiar with both ESXi and ESX, nothing here is really new. The only real change is the message: build your eco-system around ESXi…
SOLORI’s Take: It’s clear to me that VMware took inventory of its customers and chose to lead with ESX when vSphere was released. I suspect this was a practical decision due to the overwhelming numbers of ESX hosts already installed. However, the change in marketing and positioning we’re witnessing signals that we’re moving toward a time when ESX will be openly considered a dead-end variant.
When will ESX be phased-out? That’s up to market forces and VMware, but the cloud loves efficiency and ESXi is certainly more resource efficient and compartmentalized than its brother ESX. Furthermore, VMware has to maintain two development and support chains with ESX and ESXi and Darwin likes ESXi. If I had to bet, I wouldn’t put my money on seeing an ESX version of the next major release. In any case, when ESX is gone VMware can stop having to make excuses for the “linux console” and the implications that VMware is somehow “based on Linux.”
Fujitsu’s RX300 S5 rack server takes the top spot in VMware’s VMmark for 8-core systems today with a score of 25.16@17 tiles. Loaded with two of Intel’s top-bin 3.33GHz, 130W Nehalem-EP processors (W5590, turbo to 3.6GHz per core) and 96GB of DDR3-1333 R-ECC memory, the RX300 bested the former champ – the HP ProLiant BL490c G6 blade – by only about 2.5%.
With 17 tiles and 102 virtual machines on a single 2U box, the RX300 S5 demonstrates precisely how well vSphere scales on today’s x86 commodity platforms. It also appears to demonstrate both the value and the limits of Intel’s “turbo mode” in its top-bin Nehalem-EP processors – especially in the virtualization use case – we’ll get to that later. In any case, the resulting equation is:
More * (Threads + Memory + I/O) = Dense Virtualization
We could have added “higher execution rates” to that equation; however, virtualization is a scale-out application where threads, memory pool and I/O capabilities dominate the capacity equation – not clock speed. Adding 50% more clock provides less virtualization gain than adding 50% more cores, and reducing memory and context latency likewise provides better gains than simply upping the clock speed. That’s why a dual quad-core Nehalem 2.6GHz processor will crush a quad dual-core 3.5GHz (ill-fated) Tulsa.
Speaking of Tulsa, unlike Tulsa’s rather anaemic first-generation hyper-threading, Intel’s improved SMT in Nehalem “virtually” adds more core “power” to the Xeon by contributing up to 100% more thread capacity. This is demonstrated by Nehalem-EP’s 2 tiles per core contributions to VMmark where AMD’s six-core Istanbul provides only 1 tile per core. But exactly what is a VMmark tile and how does core versus thread play into the result?
The Illustrated VMmark "Tile" Load
As you can see, a “VMmark Tile” – or just “tile” for short – is composed of 6 virtual machines, half running Windows, half running SUSE Linux. Likewise, half of the virtual machines run in 64-bit mode while the other half run in 32-bit mode. As a whole, the tile is composed of 10 virtual CPUs, 5GB of RAM and 62GB of storage. Looking at how the parts contribute to the whole, the tile is relatively balanced:
Operating System / Mode
Windows Server 2003 R2
SUSE Linux Enterprise Server 10 SP2
If we stop here and accept that today’s best x86 processors from AMD and Intel are capable of providing 1 tile for each thread, we can look at the thread count and calculate the number of tiles and resulting memory requirement. While that sounds like a good “rule of thumb” approach, it ignores specific use cases where synthetic threads (like HT and SMT) do not scale linearly the way core threads do: in those cases SMT accounts for only about a 12% gain over a single-threaded core, clock-for-clock. For this reason, processors from AMD and Intel in 2010 will feature more cores – 12 for AMD’s Magny-Cours and 8 for Intel’s Nehalem-EX (aka “Beckton”).
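As a quick illustration of that rule of thumb, the per-tile figures quoted above (6 virtual machines, 10 virtual CPUs, 5GB of RAM and 62GB of storage) can be scaled by tile count. This is only a sketch; the function name `tile_footprint` is ours, and it ignores hypervisor and per-VM overhead.

```python
# Per-tile resources as described in the text.
VMS_PER_TILE = 6
VCPUS_PER_TILE = 10
RAM_GB_PER_TILE = 5
STORAGE_GB_PER_TILE = 62

def tile_footprint(tiles):
    """Scale the per-tile resources by the number of tiles."""
    return {
        "vms": tiles * VMS_PER_TILE,
        "vcpus": tiles * VCPUS_PER_TILE,
        "ram_gb": tiles * RAM_GB_PER_TILE,
        "storage_gb": tiles * STORAGE_GB_PER_TILE,
    }

# The 17-tile Fujitsu result works out to 102 VMs and 85GB of guest RAM.
print(tile_footprint(17))
```

The 85GB of guest memory for 17 tiles sits comfortably under the 96GB installed in the RX300 S5.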
Learning from the Master
If we want to gather some information about a specific field, we consult an expert, right? Judging from the results, Fujitsu’s latest dual-processor entry has definitely earned the title “Master of VMmark” in 2P systems – at least for now. So instead of the usual VMmark $/VM analysis (which is well established for recent VMmark entries), let’s look at the solution profile and try to glean some nuggets to take back to our data centers.
It’s Not About Raw Speed
First, we’ve noted that the processor used is not Intel’s standard “rack server” fare, but the more workstation oriented W-series Nehalem at 130W TDP. With “turbo mode” active, this CPU is capable of driving the 3.33GHz core – on a per-core basis – up to 3.6GHz. Since we’re seeing only a 2.5% improvement in overall score versus the ProLiant blade at 2.93GHz, we can extrapolate that the 2.93GHz X5570 Xeon is spending a lot of time at 3.33GHz – its “turbo” speed – while the power-hungry W5590 spends little time at 3.6GHz. How can we say this? By looking at the ratio of score to tile count as a function of clock speed.
We know that the X5570 can run up to 3.33GHz, per core, according to thermal conditions on the chip. With proper cooling, this could mean up to 100% of the time (sorry, Google). Assuming for a moment that this is the case in the HP test environment (and there is sufficient cause to think so) then the ratio of the tile score to tile count and CPU frequency is 0.433 (24.54/17/3.33). If we examine the same ratio for the W5590, assuming the clock speed of 3.33GHz, we get 0.444 – a difference of 2.5%, or the contribution of “turbo” in the W5590. Likewise, if you back-figure the “apparent speed” of the X5570 using the ratio of the clock-locked W5590, you arrive at 3.25GHz for the X5570 (an 11% gain over base clock). In either case, it is clear that “turbo” is a better value at the low-end of the Nehalem spectrum as there isn’t enough thermal headroom for it to work well for the W-series.
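The arithmetic behind those ratios is easy to reproduce. The sketch below uses only the scores and clock speeds quoted in the text and rests on the same assumptions made there (the X5570 at its full 3.33GHz turbo clock and the W5590 locked at 3.33GHz); the helper name `score_per_tile_ghz` is ours.

```python
def score_per_tile_ghz(score, tiles, ghz):
    """VMmark score normalized by tile count and clock speed."""
    return score / tiles / ghz

# HP BL490c G6 (X5570), assumed to run at its 3.33GHz turbo clock.
x5570_ratio = score_per_tile_ghz(24.54, 17, 3.33)
# Fujitsu RX300 S5 (W5590), assumed clock-locked at 3.33GHz.
w5590_ratio = score_per_tile_ghz(25.16, 17, 3.33)

# Relative difference, attributed in the text to "turbo" on the W5590.
turbo_contribution = w5590_ratio / x5570_ratio - 1

# Back-figured "apparent speed" of the X5570 using the W5590 ratio.
apparent_x5570_ghz = 24.54 / 17 / w5590_ratio
gain_over_base = apparent_x5570_ghz / 2.93 - 1

print(round(x5570_ratio, 3), round(w5590_ratio, 3))  # 0.433 0.444
print(round(turbo_contribution * 100, 1))            # 2.5
print(round(apparent_x5570_ghz, 2))                  # 3.25
print(round(gain_over_base * 100))                   # 11
```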
VMmark Equals Meager Network Use
Second, we’re not seeing “fancy” networking tricks out of VMmark submissions. In the past, we’ve commented on the use of “consumer grade” switches in VMmark tests. For this reason, we can consider VMmark’s I/O dependency as related almost exclusively to storage. With respect to networking, the Fujitsu team simply interfaced three 1Gbps network adapter ports to the internal switch of the blade enclosure used to run the client-side load suite and ran with the test. Here’s what that looks like:
Networking Simplified: The "leaders" simple virtual networking topology.
Note that the network interfaces used for the VMmark trial are not from the on-board i82575EB network controller but from the PCI-Express quad-port adapter using its older cousin – the i82571EB. What is key here is that VMmark is not tied to network performance issues, and additional network ports are more likely to increase IRQ sharing and reduce performance than to “optimize” network flows.
Keeping Storage “Simple”
Third, Fujitsu’s approach to storage is elegantly simple: several “inexpensive” arrays with intelligent LUN allocation. For this, Fujitsu employed eight of its ETERNUS DX80 Disk Storage Systems with 7 additional storage shelves for a total of 172 working disks and 23 LUNs. For simplicity, Fujitsu used a pair of 8Gbps FC ports to feed ESX and at least one port per DX80 – all connected through a Brocade 5100 fabric switch. The result looked something like this:
And yes, the ESX server is configured to boot from SAN, using no locally attached storage. Note that the virtual machine configuration files, VM swap and ESX boot/swap are contained in a separate DX80 system. This “non-default” approach allows the working VMDKs of the virtual machines to be isolated – from a storage perspective – from the swap file overhead, about 5GB per tile. Again, this is a benchmark scenario, not an enterprise deployment, so trade-offs are in favour of performance, not CAPEX or OPEX.
Even if the DX80 solution falls into the $1K/TB range, to say that this approach to storage is “economic” requires a deeper look. At 33 rack units for the solution – including the FC switch but not including the blade chassis – this configuration has a hefty datacenter footprint. In contrast to the old-school server/blade approach, 1 rack at 3 servers per U is a huge savings over the 2 racks of blades or 3 racks of 1U rack servers. Had each of those servers or blades had a mirror pair, we’d be talking about 200+ disks spinning in those racks versus the 172 disks in the ETERNUS arrays, so that still represents a savings of 15.7% in storage-related power/space.
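The 15.7% figure can be checked with a short calculation. This sketch assumes one physical server per virtual machine (102 in total) with a two-disk mirror pair each, which is our reading of the “200+ disks” above; the function name `spindle_savings` is hypothetical.

```python
def spindle_savings(standalone_servers, disks_per_server, shared_array_disks):
    """Return (direct-attached disk count, fractional savings from the shared array)."""
    das_disks = standalone_servers * disks_per_server
    savings = (das_disks - shared_array_disks) / das_disks
    return das_disks, savings

# 102 servers with mirror pairs versus the 172 disks in the ETERNUS arrays.
das_disks, savings = spindle_savings(102, 2, 172)
print(das_disks)                # 204
print(round(savings * 100, 1))  # 15.7
```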
When will storage catch up?
Compared to a 98% reduction in network ports, a 30-80% reduction in server/storage CAPEX (based on $1K/TB SAN), a 50-75% reduction in overall datacenter footprint, why is a 15% reduction in datacenter storage footprint acceptable? After all, storage – in the Fujitsu VMmark case – now represents 94% of the datacenter footprint. Even if the load were less aggressively spread across five ESX servers (a conservative 20:1 loading), the amount of space taken by storage only falls to 75%.
How can storage catch up to virtualization densities? First, with 2.5″ SAS drives, a bank of 172 disks can be made to occupy only 16U with very strong performance. This drops storage to only 60% of the datacenter footprint – 10U for hypervisor, 16U for storage, 26U total for this example. Moving from 3.5″ drives to 2.5″ drives takes care of the physical scaling issue with acceptable returns, but results in only minimal gains in terms of power savings.
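The footprint percentages follow from rack-unit counts. The sketch below assumes 2U per ESX server (the RX300 S5 is described as a 2U box, and five servers are counted as 10U above); the function name `storage_share` is ours. The computed values land close to the rounded figures quoted in the text.

```python
def storage_share(storage_u, compute_u):
    """Fraction of total rack units occupied by storage."""
    return storage_u / (storage_u + compute_u)

# One 2U server against 33U of storage: the benchmark configuration.
print(round(storage_share(33, 2) * 100))   # 94
# Five 2U servers (10U) against the same 33U of storage.
print(round(storage_share(33, 10) * 100))  # 77
# Five 2U servers against 16U of 2.5-inch SAS storage.
print(round(storage_share(16, 10) * 100))  # 62
```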
Saving power in storage platforms is not going to be achieved by simply shrinking disk drives – shrinking the NUMBER of disks required per “effective” LUN is what’s necessary to overcome the power demands of modern, high-performance storage. This is where non-traditional technology like FLASH/SSD is being applied to improve performance while utilizing fewer disks and proportionately less power. For example, instead of dedicating disks on a per LUN basis, carving LUNs out of disk pools accelerated by FLASH (a hybrid storage pool) can result in a 30-40% reduction in disk count – when applied properly – and that means 30-40% reduction in datacenter space and power utilization.
Here are our “take aways” from the Fujitsu VMmark case:
1) Top-bin performance is at the losing end of diminishing returns. Unless your budget can accommodate this fact, purchasing decisions about virtualization compute platforms need to be aligned with $/VM within an acceptable performance envelope. When shopping CPU, make sure the top-bin’s “little brother” has the same architecture and feature set and go with the unit priced for the mainstream. (Don’t forget to factor memory density into the equation…) Regardless, try to stick within a $190-280/VM equipment budget for your hypervisor hardware and shoot for a 20-to-1 consolidation ratio (that’s at least $3,800-5,600 per server/blade).
2) While networking is not important to VMmark, this is likely not the case for most enterprise applications. Therefore, VMmark is not a good comparison case for your network-heavy applications. Also, adding more network ports increases capacity and redundancy but does so at the risk of IRQ-sharing (ESX, not ESXi) problems, not to mention the additional cost/number of network switching ports. This is where we think 10GE will significantly change the equation in 2010. Remember to add up the total number of in-use ports – including out-of-band management – when factoring in switch density. For net new installations, look for a switch that provides 10GE/SR or 10GE/CX4 options and go with 10GE/SR if power savings are driving your solution.
3) Storage should be simple, easy to manage, cheap (relatively speaking), dense and low-power. To meet these goals, look for storage technologies that utilize FLASH memory, tiered spindle types, smart block caching and other approaches to limit spindle count without sacrificing performance. Remember to factor in at least the cost of DAS when approximating your storage budget – about $150/VM in simple consolidation cases and $750/VM for more mission critical applications (that’s a range of $9,000-45,000 for a 3-server virtualization stack). The economies in managed storage come chiefly from the administration of the storage, but try to identify storage solutions that reduce datacenter footprint including both rack space and power consumption. Here’s where offerings from Sun and NexentaStor are showing real gains.
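The budget ranges in the take-aways above come from simple multiplication of $/VM by consolidation ratio and server count. A minimal sketch, using only the figures quoted in the text (the function names are ours):

```python
def hypervisor_budget(cost_per_vm, vms_per_server):
    """Equipment budget for one hypervisor server or blade."""
    return cost_per_vm * vms_per_server

def storage_budget(cost_per_vm, vms_per_server, servers):
    """Storage budget for a stack of identical servers."""
    return cost_per_vm * vms_per_server * servers

# $190-280/VM at a 20-to-1 consolidation ratio.
print(hypervisor_budget(190, 20), hypervisor_budget(280, 20))  # 3800 5600
# $150-750/VM of storage across a 3-server stack at 20 VMs each.
print(storage_budget(150, 20, 3), storage_budget(750, 20, 3))  # 9000 45000
```

Changing the consolidation ratio or server count shows how quickly the storage line item overtakes the hypervisor hardware.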
We’d like to see VMware update VMmark to include system power specifications so we can better gauge – from the sidelines – what solution stack(s) perform according to our needs. VMmark served its purpose by giving the community a standard from which different platforms could be compared in terms of the resultant performance. With the world’s eyes on power consumption and the ecological impact of datacenter choices, adding a “power utilization component” to the “server-side” of the VMmark test would not be that significant of a “tweak.” Here’s how we think it can be done:
Require power consumption of the server/VMmark related components be recorded, including:
the ESX platform (rack server, blade & blade chassis, etc.)
the storage platform providing ESX and test LUN(s) (all heads, shelves, switches, etc.)
the switching fabric (i.e. Ethernet, 10GE, FC, etc.)
Power delivered to the test harness platforms, client load machines, etc. can be ignored;
Power measurements should be recorded at the following times:
All equipment off (validation check);
Single tile load;
100% tile capacity;
75% tile capacity;
50% tile capacity;
Power measurements should be recorded using a time-power data-logger with readings recorded as 5-minute averages;
Notations should be made concerning “cache warm-up” intervals, if applicable, where “cache optimized” storage is used.
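To make the proposed 5-minute averaging concrete, here is a rough sketch of how data-logger samples might be reduced. The sample format (seconds since the start of the run, watts) and the function name `five_minute_averages` are assumptions for illustration, not part of any VMmark specification.

```python
from collections import defaultdict

def five_minute_averages(samples, window_s=300):
    """Average (seconds_since_start, watts) samples into fixed windows.

    Returns a list of (window_start_seconds, average_watts) sorted by time.
    """
    buckets = defaultdict(list)
    for seconds, watts in samples:
        window_start = int(seconds // window_s) * window_s
        buckets[window_start].append(watts)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

# Hypothetical readings spanning two 5-minute windows.
samples = [(0, 400.0), (60, 410.0), (299, 420.0), (300, 500.0), (450, 520.0)]
print(five_minute_averages(samples))  # [(0, 410.0), (300, 510.0)]
```

Running the same reduction at each load point (single tile, 50%, 75% and 100% tile capacity) would yield directly comparable power figures.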
Why is this important? In the wake of the VCE announcement, solution stacks like VCE need to be measured against each other in an easy to “consume” way. Is VCE the best platform versus a component solution provided by your local VMware integrator? Given that the differentiated VCE components are chiefly UCS, Cisco switching and EMC storage, it will be helpful to have a testing platform that can better differentiate “packaged solutions” instead of uncorrelated vendor “propaganda.”
Let us know what your thoughts are on the subject, either on Twitter or on our blog…
In this segment, Part 5, we will create a VMware Virtual Center (vCenter) virtual machine and place the ESX and ESXi machines under management. Using this vCenter instance, we will complete the configuration of ESX and ESXi using some of the new features available in vCenter.
Part 5, Managing our ESX Cluster-in-a-Box
With our VSA and ESX servers purring along in the virtual lab, the only thing stopping us from moving forward with vMotion is the absence of a working vCenter to control the process. Once we have vCenter installed, we have 60-days to evaluate and test vSphere before the trial license expires.
Prepping for vCenter Server for vSphere
We are going to install Microsoft Windows Server 2003 STD for the vCenter Server operating system. We chose Server 2003 STD since we have limited CPU and memory resources to commit to the management of the lab and because our vCenter has no need of 64-bit resources in this use case.
Since one of our goals is to have a fully functional vMotion lab with reasonable performance, we want to create a vCenter virtual machine with at least the minimum requirements satisfied. In our 24GB lab server, we have committed 20GB to ESX, ESXi and the VSA (8GB, 8GB and 4GB, respectively). Our base ESXi instance consumes 2GB, leaving only 2GB for vCenter – or does it?
Memory Use in ESXi
VMware ESX (and ESXi) does a good job of conserving resources by limiting commitments for memory and CPU. This is not unlike any virtual memory capable system that puts a premium on “real” memory by moving less frequently used pages to disk. With a lot of idle virtual machines, this ability alone can create significant over-subscription possibilities for VMware; this is why it could be possible to run 32GB worth of VM’s on a 16-24GB host.
Do we really want this memory paging to take place? The answer – for the consolidation use cases – is usually “yes.” This is because consolidation is born out of the need to aggregate underutilized systems in a more resource efficient way. Put another way, administrators tend to provision systems based on worst case versus average use, leaving 70-80% of those resources idle in off-peak times. Under ESX’s control those underutilized resources can be re-tasked to another VM without impacting the performance of either one.
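The over-subscription example above (32GB worth of VM’s on a 16-24GB host) can be expressed as a simple ratio of provisioned to physical memory. A minimal sketch; the function name `overcommit_ratio` is ours:

```python
def overcommit_ratio(provisioned_gb, physical_gb):
    """Ratio of memory promised to VMs versus memory physically installed."""
    return provisioned_gb / physical_gb

# 32GB of provisioned VM memory on a 16GB host and on a 24GB host.
print(overcommit_ratio(32, 16))            # 2.0
print(round(overcommit_ratio(32, 24), 2))  # 1.33
```

A ratio above 1.0 only works when enough of the provisioned memory is idle, which is the consolidation case described above.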
On the other hand, our ESX and VSA virtual machines are not the typical use case. We intend to fully utilize their resources and let them determine how to share them in turn. Imagine a good number of virtual machines running on our virtualized ESX hosts: will they perform well with the added hardship of memory paging? Also, when we begin to use vMotion those CPU and memory resources will appear on BOTH virtualized ESX servers at the same time.
It is pretty clear that if all of our lab storage is committed to the VSA, we do not want to page its memory. Remember that any additional memory not in use by the SAN OS in our VSA is employed as ARC cache for ZFS to increase read performance. Paging memory that is assumed to be “high performance” by NexentaStor would result in poor storage throughput. The key to “recursive computing” is knowing how to anticipate resource bottlenecks and deploy around them.
This raises the question: how much memory is left after reserving 4GB for the VSA? To figure that out, let’s look at what NexentaStor uses at idle with 4GB provisioned:
NexentaStor's RAM footprint with 4GB provisioned, at idle.
As you can see, we have specified a 4GB reservation which appears as “4233 MB” of Host Memory consumed (4096MB+137MB). Looking at the “Active” memory we see that – at idle – the NexentaStor is using about 2GB of host RAM for OS and to support the couple of file systems mounted on the host ESXi server (recursively).
Additionally, we need to remember that each VM has a memory overhead to consider that increases with the vCPU count. For the four vCPU ESX/ESXi servers, the overhead is about 220MB each; the NexentaStor VSA consumes an additional 140MB with its two vCPU’s. Totaling-up the memory plus overhead identifies a commitment of at least 21,828MB of memory to run the VSA and both ESX guests – that leaves a little under 1.5GB for vCenter if we used a 100% reservation model.
Memory Over Commitment
The same concerns about memory hold true for our ESX and ESXi hosts – albeit in a less obvious way. We obviously want to “reserve” the memory required by the VMM – about 2.8GB and 2GB for ESX and ESXi respectively. Additionally, we want to avoid over subscription of memory on the host ESXi instance – if at all possible – since it will already be busy running our virtual ESX and ESXi machines.