Archive for the ‘Quick Take’ Category


Quick-Take: NexentaStor 4.0.1GA

April 14, 2014

Our open storage partner, Nexenta Systems Inc., hit a milestone this month by releasing NexentaStor 4.0.1 for general availability. The release is significant mainly because it is the first commercial release of NexentaStor based on the open source Illumos kernel rather than Oracle’s OpenSolaris (now closed source). With this move, NexentaStor makes good on the company’s promise of “open source technology” that enables hardware independence and targeted flexibility.

Some highlights in 4.0.1:

  • Faster Install times
  • Better HA Cluster failover times and “easier” cluster manageability
  • Support for large memory host configurations – up to 512GB of DRAM per head/controller
  • Improved handling of intermittently faulty devices (disks with irregular I/O responses under load)
  • New (read: “not backward compatible”) Auto-Sync replication (user configurable zfs+ssh still available for backward compatibility) with support for replication of HA to/from non-HA clusters
    • Includes LZ4 compression (fast) option
    • Better Control of “Force Flags” from NMV
    • Better Control of Buffering and Connections
  • L2ARC Compression now supported
    • Potentially doubles the effective coverage of L2ARC (for compressible data sets)
    • Supports LZ4 compression (fast)
    • Automatically applied if dataset is likewise compressed
  • Server Message Block v2.1 support for Windows (some caveats for IDMAP users)
  • iSCSI support for Microsoft Server 2012 Cluster and Cluster Shared Volume (CSV)
  • Guided storage pool configuration wizards – Performance, Balanced and Capacity modes
  • Enhanced Support Data and Log Gathering
  • High Availability Cluster plug-in (RSF-1) binaries are now part of the installation image
  • VMware: Much better VMXNET3 support
    • no more log spew
    • MTU settings work from NMV
  • VMware: Install to PVSCSI (boot disk) from ISO no longer requires tricks
  • Upgrade from 3.x is currently “disruptive” – promised “non-disruptive” in next maintenance update
  • Improved DTrace capabilities from NMC shell for general I/O
  • Snappier, more stable NMV/GUI
    • Service port changes from 2000 to 8457
    • Multi-NMS default
    • Fast refresh for ZFS containers
    • RSF-1 defaults in “Server” settings
    • Improved iSCSI

See Nexenta’s 4.0.1 Release Notes for additional changes and details.

Note, the 18TB Community Edition EULA is still hampered by the “non-commercial” language, restricting its use to home, education and academic (i.e. training, testing, lab, etc.) targets. However, the “total amount of Storage Space” license for Community is a deviation from the Enterprise licensing (typically a “raw” storage entitlement).

2.2 If You have acquired a Community Edition license, the total amount of Storage Space is limited as specified on the Site and is subject to change without notice. The Community Edition may ONLY be used for educational, academic and other non-commercial purposes expressly excluding any commercial usage. The Trial Edition licenses may ONLY be used for the sole purposes of evaluating the suitability of the Product for licensing of the Enterprise Edition for a fee. If You have obtained the Product under discounted educational pricing, You are only permitted to use the Product for educational and academic purposes only and such license expressly excludes any commercial purposes.

– NexentaStor EULA, Version 4.0; Last updated: March 18, 2014

For those who operate under the Community license, this means your total physical storage is UNLIMITED, provided your space “IN USE” falls short of 18TB (18,432 GB) at all times. Where this matters is in constructing useful arrays with “currently available” disks (SATA, SAS, etc.). Let’s say you needed 16TB of AVAILABLE space using “modern” 3TB disks. Because your spinning disks are individually larger than 600GB, array rebuild times are long enough that a second failure could occur PRIOR to the completion of the rebuild (resulting in data loss), so mirror or raidz2/raidz3 would be your best bet for array configuration.

SOLORI Note: Richard Elling made this concept exceedingly clear back in 2010, and his “ZFS data protection comparison” of 2, 3 and 4-way mirrors to raidz, raidz2 and raidz3 is still a great reference on the topic.

Elling’s MTTDL Comparison by RAID Type


Given 16TB in 3-way mirror or raidz2 (roughly equivalent MTTDL predictors), your 3TB disk count would follow as:

3-way Mirror Disks := RoundUp( 16 * (1024 / 1000)^3 / 70% / ( 3 * (1000 / 1024)^3 )  ) * 3 = 27 disks, or

6-disk Raidz2 Disks := RoundUp( 16 * (1024 / 1000)^3 / 70% / ( 4 * 3 * (1000 / 1024)^3 )  ) * 6 = 18 disks
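The disk-count arithmetic above can be checked directly (a quick sketch of the post’s formulas in Python; “RoundUp” is math.ceil, and 70% is the free-space target explained below):

```python
import math

needed_tb = 16                       # target AVAILABLE space, in TB
util = 0.70                          # 70% utilization (30% kept free)
tib_to_tb = (1000 / 1024) ** 3       # ~0.931, TiB -> TB conversion factor

# Demand side: the 16TB target, unit-converted and inflated for 30% free space
demand = needed_tb * (1024 / 1000) ** 3 / util

# 3-way mirror: each 3-disk vdev contributes one disk of data capacity
mirror_disks = math.ceil(demand / (3 * tib_to_tb)) * 3

# 6-disk raidz2: each vdev contributes four disks of data capacity
raidz2_disks = math.ceil(demand / (4 * 3 * tib_to_tb)) * 6

print(mirror_disks, raidz2_disks)    # -> 27 18
```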

By “raw” licensing standards, the 3-way mirror would require a 76TB license while the raidz2 volume would require a 51TB license – a difference of 25TB in licensing (around $5,300 retail). However, under the Community License, the “cost” is exactly the same, allowing for a considerable amount of flexibility in array loadout and configuration.
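For the curious, the 76TB and 51TB license figures appear to come from converting the raw disk totals (27 × 3TB and 18 × 3TB) into binary units and rounding up – an inference on my part, since the derivation isn’t shown:

```python
import math

tb_to_tib = (1000 / 1024) ** 3       # ~0.931, TB -> TiB conversion factor

mirror_raw = 27 * 3 * tb_to_tib      # 81TB raw -> ~75.4 in binary units
raidz2_raw = 18 * 3 * tb_to_tib      # 54TB raw -> ~50.3 in binary units

print(math.ceil(mirror_raw), math.ceil(raidz2_raw))   # -> 76 51
```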

Why do I need 54TB in disk to make 16TB of “AVAILABLE” storage in raidz2?

The RAID grouping we’ve chosen is 6-disk raidz2 – akin to 4 data and 2 parity disks in RAID6 (without the fixed stripe requirement or the “write hole” penalty). This means, on average, one third of the space consumed on-disk will be parity information; therefore, right off the top, we’re losing 33% of the disk capacity. Likewise, disk manufacturers make TB, not TiB, disks, so we lose about 7% of “capacity” in the conversion from TB to TiB. Additionally, we like to have a healthy amount of space reserved for new block allocation and recommend 30% unused space as a target. All combined, a 6-disk raidz2 array is, at best, 43% efficient in terms of capacity (by contrast, a 3-way mirror is only 22% space efficient). For an array based on 3TB disks, we therefore get only 1.3TB of usable storage – per disk – with 6-disk raidz2 (by contrast, 10-disk raidz nets only 160GB additional “usable” space per disk.)
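The efficiency factors in that paragraph multiply out as follows (a sketch; “efficiency” here means usable TB per TB of raw disk):

```python
tib_to_tb = (1000 / 1024) ** 3       # ~7% lost in the unit conversion
free_target = 0.70                   # keep 30% unused for new block allocation

raidz2_eff = (4 / 6) * tib_to_tb * free_target   # 4 data disks of 6 -> ~43%
mirror_eff = (1 / 3) * tib_to_tb * free_target   # 1 data disk of 3  -> ~22%

usable_per_disk_tb = 3 * raidz2_eff              # ~1.3TB usable per 3TB disk
print(round(raidz2_eff, 2), round(mirror_eff, 2), round(usable_per_disk_tb, 1))
# -> 0.43 0.22 1.3
```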

SOLORI’s Take: If you’re running 3.x in production, 4.0.1 is not suitable for in-place upgrades (yet), so testing and waiting for the “non-disruptive” maintenance release is your best option. For new installations – especially inside a VM or hypervisor environment as a Virtual Storage Appliance (VSA) – version 4.0.1 presents a better option over its 3.x siblings. If you’re familiar with 3.x, there’s not much new on the NMV side beyond better tunables and snappier response.


Quick-Take: Removable Media and Update Manager Host Remediation

January 31, 2013

Thanks to a spate of upgrades to vSphere 5.1, I recently (re)discovered the following inconvenient result when applying an update to a DRS cluster from Update Manager (using vCenter Server Appliance 5.1.0 build 947673):

Remediate entity ‘vm11.solori.labs’  Host has VMs ‘View-PSG’ , vUM5 with connected removable media devices. This prevents putting the host into maintenance mode. Disconnect the removable devices and try again.

Immediately I thought: “Great! I left a host-only ISO connected to these VMs.” However, that assumption was as flawed as Update Manager’s assumption that the workloads could not be vMotion’d without disconnecting the removable media. In fact, the removable media in question was connected to a shared ISO repository available to all hosts in the cluster. The blame was mine, not Update Manager’s: I had forgotten that Update Manager’s default response to removable media is to abort the process. Since cluster remediation is a powerful feature made possible by Distributed Resource Scheduler (DRS) in Enterprise (and above) vSphere editions – one that may be new to many (especially uplifted “Advanced AK” users) – it seemed like something worth reviewing and blogging about.

Why is this a big deal?

More to the point, why does this seem to run contrary to a “common sense” response?

First, the manual process for remediating a host in a DRS cluster would include:

  1. Applying “Maintenance Mode” to the host,
  2. Selecting the appropriate action for “powered-off and suspended” workloads, and
  3. Allowing DRS to choose placement and finally vMotion those workloads to an alternate host.

In the case of VMs with removable media attached, this set of actions will result in the workloads being vMotion’d (without warning or hesitation) so long as the other hosts in the cluster have access to the removable media source (i.e. shared storage, not “Host Device”). However, in the case of Update Manager remediation, the following are documented roadblocks to a successful remediation (without administrative override):

  1. A CD/DVD drive is attached (any method),
  2. A floppy drive is attached (any method),
  3. HA admission control prevents migration of the virtual machine,
  4. DPM is enabled on the cluster,
  5. EVC is disabled on the cluster,
  6. DRS is disabled on the cluster (preventing migration),
  7. Fault Tolerance (FT) is enabled for a VM on a host in the cluster.

Therefore it is “by design” that a scheduled remediation would have failed – even if the removable media would have been eligible for vMotion. To assist in the evaluation of “obstacles to successful deferred remediation,” a cluster remediation report is available (see below).

Generating a remediation report prior to scheduling a Update Manager remediation.


In fact, the report will list all possible roadblocks to remediation whether or not matching overrides are selected (potentially misleading, and certainly not useful for predicting the outcome of the remediation attempt). While this too is counterintuitive, it serves as a reminder of the show-stoppers to successful remediation. For the offending “removable media” override, the appropriate check-box can be found on the options page just prior to the remediation report:

Disabling removable media during Update Manager driven remediation.


The inclusion of this override allows Update Manager to slog through the remediation without respect to the attached status of removable media. Likewise, the other remediation overrides will enable successful completion of the remediation process; these overrides are:

  1. Maintenance Mode Settings:
    1. VM Power State prior to remediation:  Do not change, Power off, Suspend
    2. Temporarily disable any removable media devices;
    3. Retry maintenance mode in case of failure (delay and attempts);
  2. Cluster Settings:
    1. Temporarily Disable Distributed Power Management (forces “sleeping” hosts to power-on prior to next steps in remediation);
    2. Temporarily Disable High Availability Admission Control (allows for host remediation to violate host-resource reservation margins);
    3. Temporarily Disable Fault Tolerance (FT) (you are admonished to remediate all cluster hosts in the same update cycle to maintain FT compatibility);
    4. Enable parallel remediation for hosts in cluster (will not violate DRS anti-affinity constraints);
      1. Automatically determine the maximum number of concurrently remediated hosts, or
      2. Limit the number of concurrent hosts (1-32);
    5. Migrate powered off and suspended virtual machines to other hosts in the cluster (helpful when a remediation leaves a host in an unserviceable condition);
  3.  PXE Booted ESXi Host Settings:
    1. Allow installation of additional software on PXE booted ESXi 5.x hosts (requires the use of an updated PXE boot image – Update Manager will NOT reboot the PXE booted ESXi host.)

These settings are available at the time of remediation scheduling and as host/cluster defaults (Update Manager Admin View.)

SOLORI’s Take: So while the remediation process is NOT as similar to the manual process as one might think, it can still be made to function accordingly (almost). There IS a big difference between disabling removable media and making vMotion-aware decisions about hosts. Perhaps VMware could take a few cycles to determine whether or not a host is bound to a removable media device (either through Host Device or a local storage resource) and make a more intelligent decision about removable media.

vSphere already has the ability to identify point-resource dependencies; it would be nice to see this information more intelligently correlated where cluster management is concerned. Currently, instead of “asking” DRS for a dependency list, it seems to just ask the hosts “do you have removable media plugged into any VMs?” – and if the answer is “yes,” it stops right there… Still, not very intuitive for a feature (DRS) that’s been around since Virtual Infrastructure 3 and vCenter 2.


Quick-Take: vCenter Server 5.0 Update 1b, Appliance Replaced DB2 with Postgres

August 17, 2012

VMware announced the availability of vCenter Server 5.0 Update 1b today along with some really good news for the fans of openness:

vCenter Server Appliance Database Support: The DB2 express embedded database provided with the vCenter Server Appliance has been replaced with VMware vPostgres database. This decreases the appliance footprint and reduces the time to deploy vCenter Server further.

vCenter 5.0U1b Release Notes

Ironically, and despite its reference in the release notes, the VMware Product Interoperability Matrix has yet to be updated to include 5.0U1b, so the official impact of an upgrade is as-yet unknown.

VMware Product Interoperability Matrix not updated at time of vCenter 5U1b release.

Also, a couple of new test questions are going to be tricky moving forward, as support for Oracle has been expanded:

  • vCenter Server 5.0 Update 1b introduces support for the following vCenter Databases
    • Oracle 11g Enterprise Edition, Standard Edition, Standard ONE Edition Release 2 [] – 64 bit
    • Oracle 11g Enterprise Edition, Standard Edition, Standard ONE Edition Release 2 [] – 32 bit

Besides still not supporting IPv6 and continuing the limitation of 5 hosts and 50 VMs, there is some additional legwork needed to upgrade the vCenter Server Appliance from 5.0U1a to U1b, as specified in KB2017801:

  1. Create a new virtual disk with size 20GB and attach it to the vCenter Server Appliance.
  2. Log in to the vCenter Server Appliance’s console and format the new disk as follows:
    1. At the command line, type: echo "- - -" > /sys/class/scsi_host/host0/scan
    2. Type: parted -s /dev/sdc mklabel msdos
    3. Type: parted -s /dev/sdc mkpartfs primary ext2 0 22G
  3. Mount the new partition under /storage/db/export:
    1. Type: mkdir -p /storage/db/export
    2. Type: mount /dev/sdc1 /storage/db/export
  4. Repeat the update process.
  5. You can remove the new disk after the update process finishes successfully and the vCenter Server Appliance is shut down.

SOLORI’s Take: Until the interop matrix is updated, it’s hard to know what you’re getting into with the update (Update: as you can see from Joshua Andrews’ post on SOS tech), but the inclusion of vPostgres – VMware’s vFabric deployment of PostgreSQL 9.1.x – makes taking a look at the “crippled” appliance version a bit more tantalizing. Hopefully, the next release will “unshackle” the vCenter Appliance beyond the 5/50 limitations – certainly vPostgres is up to the task of managing many, many more hosts and VMs (vCD anyone?) Cheers, VMware!


Quick-Take: How Virtual Backup Can Invite Disaster

August 1, 2012

There have always been things about virtualizing the enterprise that have concerned me. Most boil down to Uncle Ben’s admonishment to his nephew, Peter Parker, in Stan Lee’s Spider-Man, “with great power comes great responsibility.” Nothing could be more applicable to the state of modern virtualization today.

Back in “the day” when all this VMware stuff was scary and “complicated,” it carried enough “voodoo mystique” that (often defacto) VMware admins either knew everything there was to know about their infrastructure, or they just left it to the experts. Today, virtualization has reached such high levels of accessibility that I think even my 102 year old Nana could clone a live VM; now that is scary.

Enter Veeam Backup, et al

Case in point is Veeam Backup and Recovery 6 (VBR6). Once an infrastructure exceeds the limits of VMware Data Recovery (VDR), it just doesn’t get much easier to back up your cadre of virtual machines than with VBR6. Unlike VDR, VBR6 has three modes of access to virtual machine disks:

  1. Direct SAN Access – the VBR6 backup server/proxy has direct access to the VMFS LUNs containing virtual machine disks – very fast, very low overhead;
  2. Virtual Appliance – the VBR6 backup server/proxy, running as a virtual machine, leverages its relationship to the ESXi host to access virtual machine disks using the ESXi host as a go-between – fast, moderate overhead;
  3. Network – the VBR6 backup server/proxy accesses virtual machine disks from ESXi hosts in a manner similar to the way the vSphere Client grants access to virtual machine disks across the LAN – slower, with more overhead;

For block-based storage, option (1) appears to be the best way to go: it’s fast with very little overhead in the data channel. For those of us with grey hair, think VMware Consolidated Backup proxy server and you’re on the right track; for everyone else, think shared disk environment. And that, boys and girls, is where we come to the point of today’s lesson…

Enter Windows Server, Updates

For all of its warts, my favorite aspect of VMware Data Recovery is the fact that it is a virtual appliance based on a stripped-down Linux distribution. Those two aspects say “do not tamper” better than anything these days, so admins – especially Windows admins – tend to just install and use as directed. At the very least, the appliance factor offers an opportunity for “special case” handling of updates (read: very controlled and tightly scripted).

The other “advantage” of VMDR is that it uses a relatively safe method for accessing virtual machine disks: something akin to VBR6’s “virtual appliance” mode of operation. By allowing the ESXi host(s) to “proxy” access to the datastore(s), a couple of things are accomplished:

  1. Access to VMDKs is protocol agnostic – direct attach, iSCSI, AoE, SAS, Fiber Channel and/or NFS all work the same;
  2. Unlike “Direct SAN Access” mode, no additional initiators need to be added to the target(s)’ ACL;
  3. If the host can access the VMDK, it stands a good chance of being backed-up fairly efficiently.

However, VBR6 installs onto Windows Server, and Windows Server has no knowledge of what VMFS looks like nor how to handle VMFS disks. This means Windows disk management needs to be “tweaked” to ignore VMFS targets by disabling “automount” on VBR6 servers and VCB proxies. For most, it also means keeping up with patch management and Windows Update (or the appropriate derivative). For active backup servers with a (pre-approved, tested) critical update, this might go something like:

  1. Schedule the update with change management;
  2. Stage the update to the server;
  3. Put server into maintenance mode (services and applications disabled);
  4. Apply patch, reboot;
  5. Mitigate patch issues;
  6. Test application interaction;
  7. Rinse, repeat;
  8. Release server back to production;
  9. Update change management.

See the problem? If Windows Server 2008 R2 SP1 is involved you just might have one right around step 5…

And the Wheels Came Off…

Service Pack 1 for Windows Server 2008 R2 requires a BCD update, so existing installations of VCB or VBR5/6 will fail to update. In an environment with no VCB or VBR5/6 testing platform, this could result in a resume-writing event for the patching guy or the backup administrator if they follow Microsoft’s advice and “fix” SP1. Why?

Fixing the SP1 installation problem is quite simple:

Quick steps to do this in case you forgot are:

1.  diskpart

2.  automount enable

3.  Restart

4.  Install SP1
Technet Blogs, Windows Servicing Guy, SP1 Fails with 0x800f0a12

Done, right? Possibly in more ways than one. By GLOBALLY enabling automount, rebooting Windows Server and installing SP1, you’ve opened-up the potential for Windows to write a signature to the VMFS volumes holding your critical infrastructure. Fortunately, it doesn’t have to end that way.

Avoiding the Avoidable

Veeam’s been around long enough to have some great forum participants from across the administrative spectrum. Fortunately, a member posted a solution method that keeps us well away from VMFS corruption and still solves the SP1 issue in a targeted way: temporarily mounting the “hidden” system partition instead of enabling the global automount feature. Here’s my take on the process (GUI mode):

  1. Inside Server Manager, open Disk Management (or run diskmgt.msc from admin cmd prompt);
  2. Right-click on the partition labeled “System Reserved” and select “Change Drive Letter and Paths…”
  3. On the pop-up, click the “Add…” button and accept the default drive letter offered, click “OK”;
  4. Now “try again” the installation of Service Pack 1 and reboot;
  5. Once SP1 is installed, re-run Disk Management;
  6. Right-click on the “System Reserved” partition and select “Change Drive Letter and Paths..”
  7. Click the “Remove” button to unmap the drive letter;
  8. Click “Yes” at the “Are you sure…” prompt;
  9. Click “Yes” at the “Do you want to continue?” prompt;
  10. Reboot (for good measure).

This process assumes that there are no non-standard deployments of the Server 2008 R2 boot volume. Of course, if there is no separate system reserved partition, you wouldn’t encounter the SP1 failure to install issue…

SOLORI’s Take: The takeaway here is “consider your environment” (and the people tasked with maintaining it) before deploying Direct SAN Access mode into a VMware cluster. While it may represent “optimal” backup performance, it is not without its potential pitfalls (as demonstrated herein). Native access to SAN LUNs must come with a heavy dose of respect, caution and understanding of the underlying architecture: otherwise, I recommend Virtual Appliance mode (similar to Data Recovery’s take.)

While no VMFS volumes were harmed in the making of this blog post, the thought of what could have happened in a production environment chilled me into writing this post. Direct access to the SAN layer unlocks tremendous power for modern backup: just be safe and don’t forget to heed Uncle Ben’s advice! If the idea of VMFS corruption scares you beyond your risk tolerance, appliance mode will deliver acceptable results with minimal risk or complexity.


Quick-Take: NexentaStor 3.1.3 New AD Group Feature, Can Break AD Shares

June 12, 2012

The latest update of NexentaStor may not go too smoothly if you are using Windows Server 2008 AD servers and delegating shares via NexentaStor. While the update includes a long-sought-after fix in AD capabilities (see pull quote below), it may require a tweak to the CIFS server settings to get things back on track.

Domain Group Support

It is now possible to allow Domain groups as members of local groups. When a Windows client authenticates with NexentaStor using a domain account, NexentaStor consults the domain controller for information about that user’s membership in domain groups. NexentaStor also computes group memberships based on its _local_ groups database, adding both local and domain groups based on local group memberships, which are allowed to be indirect. NexentaStor’s computation of group memberships previously did not correctly handle domain groups as members of local groups.

NexentaStor 3.1.3 Release Notes

In the past, some of NexentaStor’s in-place upgrades have reset the “lmauth_level” of the associated SMB share server from its user-configured value back to a “default” of four (4). This did not work very well in an AD environment where the servers were Windows Server 2008 running in native authentication mode. The fix was to change the “lmauth_level” to two (2) via the NMV or NMC (“sharectl set -p lmauth_level=2 smb”) and restart the service. If you have this issue, the giveaway kernel log entries are as follows:

smbd[7501]: [ID 702911 daemon.notice] smbd_dc_update: myad.local: locate failed
smbd[7501]: [ID 702911 daemon.notice] smbd_dc_monitor: domain service not responding

However, the rules have changed in some applications; Nexenta’s new guidance is:

Summary Description CIFS Issue

A recent patch release by Microsoft has necessitated a changed to the CIFS authorization setting. Without changing this setting, customers will see CIFS disconnects or the appliance being unable to join the Active Directory domain. If you experience CIFS disconnects or problems joining your Active Directory domain, please modify the ‘lmauth_level’ setting.

# sharectl set -p lmauth_level=4 smb

– NexentaStor 3.1.3 Release Notes

While this may work for others out there, it does not universally work for any of my tested Windows Server 2008 R2, native-AD-mode servers. Worse, it appears to work with some shares but not all; this can lead to some confusion about the actual cause (or resolution) of the problem based on the Nexenta release notes. Fortunately (or not, depending on your perspective), the genesis of NexentaStor is clearly heading toward an intersection with Illumos, although the current kernel is still based on OpenSolaris (134f), and a post from OpenIndiana points users to the right solution.

(Jonathan Leafty) I always thought it was weird that lmauth_level had to be set to 2 so I
bumped it back to the default of 3 and restarted smb and it worked...
(Gordon Ross) Glad you found that.  I probably should have sent a "heads-up" when the
"extended security outbound" enhancement went in.  People who have
adjusted down lmauth_level should put it back the the default.

– CIFS in Domain Mode (AD 2008), OpenIndiana Discussion Group

Following the advice for OpenIndiana re-enabled all previously configured shares. This mode is also the default for Solaris, although NexentaStor continues to use a different one. According to the man pages for smb on Nexenta (‘man smb(4)’) the difference between ‘lmauth_level=3’ and ‘lmauth_level=4’ is as follows:


Specifies the LAN Manager (LM) authentication level. The LM compatibility level controls the type of user authentication to use in workgroup mode or domain mode. The default value is 3.

The following describes the behavior at each level.

2 – In Windows workgroup mode, the Solaris CIFS server accepts LM, NTLM, LMv2, and NTLMv2 requests. In domain mode, the SMB redirector on the Solaris CIFS server sends NTLM requests.

3 – In Windows workgroup mode, the Solaris CIFS server accepts LM, NTLM, LMv2, and NTLMv2 requests. In domain mode, the SMB redirector on the Solaris CIFS server sends LMv2 and NTLMv2 requests.

4 – In Windows workgroup mode, the Solaris CIFS server accepts NTLM, LMv2, and NTLMv2 requests. In domain mode, the SMB redirector on the Solaris CIFS server sends LMv2 and NTLMv2 requests.

5 – In Windows workgroup mode, the Solaris CIFS server accepts LMv2 and NTLMv2 requests. In domain mode, the SMB redirector on the Solaris CIFS server sends LMv2 and NTLMv2 requests.

Manpage for SMB(4)

This illustrates either a continued dependency on LAN Manager (absent in ‘lmauth_level=4’) or a bug, as indicated in the OpenIndiana thread. Either way, more testing is needed to determine whether this issue is unique to my particular 2008 AD environment or a general issue with the current smb/server facility in NexentaStor…

SOLORI’s Take: So while NexentaStor defaults back to ‘lmauth_level=4’ and ‘lmauth_level=2’ is now broken (for my environment), the “default” for OpenIndiana and Solaris (‘lmauth_level=3’) is a winner; as to why – that’s a follow-up question… Meanwhile, proceed with caution when upgrading to NexentaStor 3.1.3 if your appliance is integrated into AD – testing with the latest virtual appliance for the win.


Quick-Take: vCenter 5.0 dies within 48-hours of Installation, Error 1000

May 1, 2012

After upgrading a View installation for a client this weekend from View 4.0 to View 5.0 all seemed well. The upgrade process took them from vSphere 4.0U2 to vSphere 5.0U1 in the bargain – about 15-20 hours of work including backups and staging. Testing and the first 24 hours of production went swimmingly with no negative reports or hiccups. (The upgrade process and spectres of dead pilots-turned-production is an issue for another blog post.)

I got a call about vCenter 5.0 dying (and then magically working again before the local admin could get to it – a couple of minutes or so). Two mysteries: one easy, one VERY frustrating…

Mystery One – vCenter Dies and Comes Back to Life

This was the easy one: the VMware VirtualCenter Server service is set to a “300000 millisecond” recovery delay upon failure by default. The local site admin didn’t have his prayer answered; the system just recovered as planned. (Note to upgraders – set your recovery delay to whatever hold-down time your site needs – probably no less than 120000 milliseconds.)

The VMware VirtualCenter Server service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 300000 milliseconds: Restart the service.

– Service Control Manager

Why would five minutes (yep, 300000 milliseconds) be a good amount of recovery time? The Socratic answer is this: how long will it take for all of the vCenter log and dump files to be written in your environment? In the case of this issue, the dump file was about 500MB in size, with about another 150MB in various other logs. At a “leisurely pace” of 5 MB/sec (let’s assume the worst), that would require about two minutes of “hold time” before restart.
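As a back-of-the-envelope check of that hold time (using the incident’s numbers; the 5 MB/sec rate is the stated worst-case assumption):

```python
dump_mb = 500      # crash dump observed in this incident
logs_mb = 150      # assorted other log files
rate_mb_s = 5      # assumed worst-case sustained write rate

hold_seconds = (dump_mb + logs_mb) / rate_mb_s
print(hold_seconds)   # -> 130.0, i.e. "about two minutes"
```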

Mystery Two – vCenter Died. Why?

Here’s the problem: vCenter needs to be bullet-proof. vCenter’s installer asks for your environment’s size during the installation and sets parameters to accommodate its basic needs. Also, during the SQL upgrade process from vCenter 4.0 to 5.0, the SQL database recovery model is changed from SIMPLE (the recommended setting for vCenter) to BULK-LOGGED, but just for the duration of the upgrade. After the upgrade, it is reset back to SIMPLE.

Fast forward 48 hours. vCenter is running with a couple of hundred virtual machines in a View environment and is tracking all of that lovely host and performance data we appreciate when dealing with complex enterprise systems. It’s happily responding to View Connection Server’s request for power-ons and power-offs when all of a sudden the worst happens: it crashes!

Suddenly, tens of thousands of dollars worth of infrastructure is waiting out a 5-minute recovery interval, and View logins requiring VM power-ons won’t happen until it passes. All is not right in your virtual world now, buckaroo! Let’s see if Windows Event Viewer can elicit a solution:

The description for Event ID 1000 from source VMware VirtualCenter Server cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Log directory: C:\ProgramData\VMware\VMware VirtualCenter\Logs.

the message resource is present but the message is not found in the string/message table

– Event Viewer, Application Log

Okay, Event ID 1000 – there’s got to be a KB on that one; but seriously, ID 1000 sounds too generic for me to have a ton of hope. Sure enough, the VMware Knowledge Base immediately coughs up KB article 1015101, applicable to vCenter 5.0. Unfortunately, vCenter Server is not installed on an IIS platform, so this is just an empty rabbit hole…

Next, let’s have a look at the vCenter Server logs (thoughtfully pointed to in the Event Log entry above) at or around the time of failure. Sure enough, there is a gzipped log with the restart timestamp available. A quick glance at the end of the log shows the following “impending doom” quality message:

--> Panic: TerminateHandler called
--> Backtrace:
--> backtrace[00] rip 000000018013deba (no symbol)
--> backtrace[01] rip 0000000180101518 (no symbol)
--> backtrace[60] rip 00000000708f2fdf (no symbol)
--> backtrace[61] rip 00000000708f3080 (no symbol)

– vCenter vpxd-X.log file

But a sobering look above the doomsday report gives us a better idea as to the real culprit: SQL execution failed. What? Did I hear you whisper “kill your DBA?” Before walking down to the DBA and calling him out for leaving you in the lurch, let’s visit the SQL logs to find out (perhaps you will have to talk to the DBA after all if your vCenter admins don’t have access to SQL logs in the environment.) Here’s what my SQL log for the vCenter database said:

05/01/2012 08:05:21,spid62,Unknown,The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
05/01/2012 08:05:21,spid62,Unknown,Error: 9002, Severity: 17, State: 4.
05/01/2012 08:00:04,spid75,Unknown,The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
05/01/2012 08:00:04,spid75,Unknown,Error: 9002, Severity: 17, State: 4.

– Microsoft SQL Server Log for VIM_VCDB (vCenter)
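Error 9002 even tells you where to look next. A quick check (a T-SQL sketch; VIM_VCDB is the default vCenter database name – substitute yours) asks SQL Server why log space cannot be reused:

```sql
-- Ask SQL Server why the vCenter database's log space cannot be reused.
-- A result such as LOG_BACKUP or ACTIVE_TRANSACTION names the blocker;
-- NOTHING means the log is currently healthy.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'VIM_VCDB';
```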

Note that something to this effect also shows up as a diagnostic message inside the vCenter log – reducing the number of times you need to traipse down to the DBA’s cubby for a chat. Okay, that cinches it: the DBA’s been meddling in my vCenter database again – probably with some unscheduled and undocumented maintenance. We’re definitely going to have that talk now, right? Nope.


Remember that upgrade we did 48 hours ago? As part of the upgrade, the database is converted from vCenter 4.0’s format to the more information-rich vCenter 5.0 format. Along the way, the upgrade process changes the SQL database’s recovery model from the preferred SIMPLE mode to BULK-LOGGED so that a failed upgrade can be more easily rolled back.


BULK-LOGGED mode can create a HUGE transaction log during the vCenter upgrade process. There are MANY posts about the TLOG filling up during these upgrades, with a consensus that the TLOG needs to be allowed to grow to at least 4x the size of your vCenter database or the process will not complete.

You’ve been warned.

In the case of this upgrade, I happen to know that the TLOG was set to at least 4x the size of the vCenter database PRIOR to the upgrade process. In fact, during the upgrade’s final stage it grew to 1.5x the vCenter database size. What was unknown to me – until now – is that the TLOG’s maximum allowed growth was reset to 500MB when the database was returned to SIMPLE mode. During a period of high activity (perhaps processing the last 24 hours of data) the TLOG needed to exceed that limit, couldn’t, and vCenter crashed accordingly. The simple fix is to increase the TLOG limit back to the original setting that works well for the environment.
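The fix can be sketched in T-SQL as well. The logical log file name below (VIM_VCDB_log) and the 20GB cap are assumptions – look up the real file name with sp_helpfile and size the cap to whatever worked for your environment before the upgrade:

```sql
-- List the database's files to confirm the logical log file name first:
EXEC VIM_VCDB..sp_helpfile;

-- Raise the transaction log's growth cap back from the 500MB the upgrade
-- left behind (the file name and 20GB size are illustrative):
ALTER DATABASE VIM_VCDB
MODIFY FILE (NAME = VIM_VCDB_log, MAXSIZE = 20GB);
```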

SOLORI’s Take:

Ouch! Someone feels set up for failure. I never want to hear a customer say, “Gosh, everything was great until I logged into vCenter [with the vSphere Client], and then all of a sudden things went sideways” – especially when the cause is that the SQL server has been silently modified with a setting known to make it choke, subsequently bringing vCenter to a crashing halt.

VMware: if you’re modifying my database parameters POST-INSTALL, you need to WARN ME or note it in the install or upgrade docs. I’ve combed them and can’t find it… let’s get the upgrade process modified so that the database settings are restored when the database is returned to SIMPLE mode, okay?

Updated 05/02/2012: Corrected intro grammar. Link to TLOG upgrade issue added.


Quick Take: Syslog Stops Working after Upgrade to ESXi 5.0 Update 1

March 24, 2012

If you’ve recently upgraded your ESXi hosts from 5.0 build 456551 and were logging to syslog, it’s possible that your events are no longer being received by your syslog server. It seems there was a “feature” in ESXi 5.0 build 456551 that allowed syslog to escape the ESXi firewall regardless of the firewall setting. This could be especially problematic if you upgraded from ESXi 4.x, where no firewall configuration was needed for syslog traffic.

VMware notes that syslog traffic was not affected by the ESXi firewall in v5 build 456551. See KB2003322 for details.

However, in ESXi 5.0 Update 1 the firewall rules definitely apply, and if you were “grandfathered in” during the upgrade to build 456551, check the syslog for your ESXi 5 servers. If you’re no longer getting syslog entries, either set the policy in the host’s Configuration->Security Profile->Properties… control panel:

Enabling syslog traffic in the ESXi firewall within the vSphere Client interface.


Or use ESXCLI to do the work (especially with multiple hosts):

esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true

esxcli network firewall refresh
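For a fleet of hosts, the same two commands can be pushed over SSH. A minimal sketch – the host names in ESX_HOSTS and root SSH access to each host are both assumptions, and the loop is a harmless no-op if the variable is unset:

```shell
#!/bin/sh
# Emit the two esxcli commands that re-enable syslog through the firewall.
enable_syslog_cmds() {
  echo "esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true"
  echo "esxcli network firewall refresh"
}

# Push the commands to each host over SSH. ESX_HOSTS is a space-separated
# list of hypothetical host names, e.g. ESX_HOSTS="esx01 esx02 esx03".
for host in ${ESX_HOSTS:-}; do
  enable_syslog_cmds | ssh "root@${host}" sh
done
```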

That will take care of the “absent” syslog entries.

SOLORI’s Take: Gotcha! As ESXi becomes more like ESX in terms of provisioning, old-school ESXiers (like me) need to make sure they’re up-to-speed on the latest changes in ESXi. Ashamed to admit it, but this exact scenario got me in my home lab… Until I stumbled onto KB2003322 I didn’t think to go back and check the ESXi firewall settings – after all, it was previously working 😉


Quick Take: VMware ESXi 5.0, Patch ESXi50-Update01

March 16, 2012

VMware releases ESXi 5.0 Complete Update 1 for vSphere 5. An important change for this release is the inclusion of general and security-only image profiles:

Starting with ESXi 5.0 Update 1, VMware patch and update releases contain general and security-only image profiles. Security-only image profiles are applicable to new security fixes only. No new bug fixes are included, but bug fixes from earlier patch/update releases are included.

The general release image profile supersedes the security-only profile. Application of the general release image profile applies to new security and bug fixes.

The security-only image profiles are identified with the additional “s” identifier in the image profile name.
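That naming convention is easy to check mechanically. Here’s a throwaway sketch – the profile names are illustrative, patterned on the release’s ESXi-5.0.0-&lt;build&gt;[s]-standard scheme – that classifies a profile by the trailing “s”:

```shell
#!/bin/sh
# Return "security-only" if an image profile name carries the extra "s"
# identifier before its -standard/-no-tools suffix, otherwise "general".
profile_type() {
  case "$1" in
    *s-standard|*s-no-tools) echo "security-only" ;;
    *) echo "general" ;;
  esac
}

profile_type "ESXi-5.0.0-20120302001s-standard"   # security-only
profile_type "ESXi-5.0.0-20120302001-standard"    # general
```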

Just a few of the more interesting bugs fixed in this release:

PR 712342: Cannot assign VMware vSphere Hypervisor license key to an ESXi host with pRAM greater than 32GB

PR 719895: Unable to add a USB device to a virtual machine (KB 1039359).

PR 721191: Modifying snapshots using the commands vim-cmd vmsvc/snapshot.remove or vim-cmd vmsvc/snapshot.revert will fail when applied against certain snapshot tree structures.

This issue is resolved in this release. Now a unique identifier, snapshotId, is created for every snapshot associated to a virtual machine. You can get the snapshotId by running the command vim-cmd vmsvc/snapshot.get <vmid>. You can use the following new syntax when working with the same commands:

Revert to snapshot: vim-cmd vmsvc/snapshot.revert <vmid> <snapshotId> [suppressPowerOff/suppressPowerOn]
Remove a snapshot: vim-cmd vmsvc/snapshot.remove <vmid> <snapshotId>

PR 724376: Data corruption might occur if you copy large amounts of data (more than 1GB) from a 64-bit Windows virtual machine to a USB storage device.

PR 725429: Applying a host profile to an in-compliance host causes non-compliance (KB 2003472).

PR 728257: On a pair of HA storage controllers configured for redundancy, if you take over one controller, the datastores that reside on LUNs on the taken over controller might show inactive and remain inactive until you perform a rescan manually.

PR 734366: Purple diagnostic screen with vShield or third-party vSphere integrated firewall products (KB 2004893)

PR 734707: Virtual machines on a vNetwork Distributed Switch (vDS) configured with VLANs might lose network connectivity upon boot if you configure Private VLANs on the vDS. However, disconnecting and reconnecting the uplink solves the problem. This issue has been observed on be2net NICs and ixgbe vNICs.

PR 742242: XCOPY commands that VAAI sends to the source storage device might fail. By default, XCOPY commands should be sent to the destination storage device in accordance with VAAI specification.

PR 750460: Adding and removing a physical NIC might cause an ESXi host to fail with a purple screen. The purple diagnostic screen displays an error message similar to the following:

NDiscVlanCheck (data=0x2d16, timestamp=<value optimized out>) at bora/vmkernel/public/list.h:386

PR 751803: When disks larger than 256GB are protected using vSphere Replication (VR), any operation that causes an internal restart of the virtual disk device causes the disk to complete a full sync. Internal restarts are caused by a number of conditions including any time:

  • A virtual machine is restarted
  • A virtual machine is vMotioned
  • A virtual machine is reconfigured
  • A snapshot is taken of the virtual machine
  • Replication is paused and resumed

PR 754047: When you upgrade VMware Tools, the upgrade might fail because some Linux distributions periodically delete old files and folders in /tmp. The VMware Tools upgrade requires this directory in /tmp for auto-upgrades.

PR 766179: ESXi host installed on a server with more than 8 NUMA nodes fails and displays a purple screen.

PR 769677: If you perform a VMotion operation to an ESXi host on which the boot-time option “pageSharing” is disabled, the ESXi host might fail with a purple screen.

Disabling pageSharing severely affects performance of the ESXi host. Because pageSharing should never be disabled, starting with this release, the “pageSharing” configuration option is removed.

PR 773187: On an ESXi host, if you configure the Network I/O Control (NetIOC) to set the Host Limit for Virtual Machine Traffic to a value higher than 2000Mbps, the bandwidth limit is not enforced.

PR 773769: An ESXi host halts and displays a purple diagnostic screen when using Network I/O Control with a Network Adapter that does not support VLAN Offload (KB 2011474).

PR 788962: When an ESXi host encounters a corrupt VMFS volume, VMFS driver might leak memory causing VMFS heap exhaustion. This stops all VMFS operations causing orphaned virtual machines and missing datastores. vMotion operations might not work and attempts to start new virtual machines might fail with errors about missing files and memory exhaustion. This issue might affect all ESXi hosts that share the corrupt LUN and have running virtual machines on that LUN.

PR 789483: After you upgrade to ESXi 5.0 from ESXi 4.x, Windows 2000 Terminal Servers might perform poorly. The consoles of these virtual machines might stop responding and their CPU usage show a constant 100%.

PR 789789: ESXi host might fail with a purple screen when a virtual machine connected to VMXNET 2 vNIC is powered on. The purple diagnostic screen displays an error message similar to the following:

0x412261b07ef8:[0x41803b730cf4]Vmxnet2VMKDevTxCoalesceTimeout@vmkernel#nover+0x2b stack: 0x412261b0
0x412261b07f48:[0x41803b76669f]Net_HaltCheck@vmkernel#nover+0xf6 stack: 0x412261b07f98

You might also observe an error message similar to the following written to VMkernel.log:

WARNING: Vmxnet2: 5720: failed to enable port 0x2000069 on vSwitch1: Limit exceeded^[[0m

SOLORI’s Take: Lions, tigers and bears – oh my! In all, I count seven (7) unique purple-screen (PSOD) bugs (listed in the full KB) along with some rather head-scratching gotchas. Lots of reasons to keep your vSphere hosts current with this release, to be sure… Use Update Manager or start your update journey here…


VMware vCenter5: Revenge of Y2K, aka Worst Host Import Fail Ever!

January 6, 2012

I was recently involved in migrating an enterprise client from vSphere 4 to vSphere 5 – leapfrogging from vSphere 4.0 straight to vSphere 5.0. Their platform is an AMD server farm with modern, socket G34 CPU blades and 10G Ethernet connectivity – all moving parts on VMware’s Hardware Compatibility List for every version of vSphere involved in the process.

Supermicro AS-2022TG Platform Compatibility

Intel 10G Ethernet, i82599EB Chipset based NIC

Although VMware lists the 2022TG-HIBQRF as ESXi 5.0 compatible and not the 2022TG-HTRF, it is worth noting that the only difference between the two is the presence of an on-board Mellanox ConnectX-2 QDR InfiniBand controller: the motherboards and BIOS are exactly the same; the Mellanox SMT components are simply missing on the HTRF version.

It is also key to note that VMware distinguishes the ESXi-compatible platform by supported BIOS version: 2.0a (Supermicro’s current version) versus 1.0b for the HTRF version. The current BIOS version is also required for AMD Opteron 6200-series CPUs, which are not a factor in this upgrade (only 6100-series CPUs are in use). For this client, the hardware support level of the existing BIOS (1.0c) was sufficient.

Safe Assumptions

So is it safe to assume that a BIOS update is not necessary when migrating to a newer version of vSphere? In the past, updates have been feature-driven. For instance, proper use of new hardware features like Intel EPT, AMD RVI or VMDirectPath (PCI pass-through) has required BIOS updates. All of these features were supported by the “legacy” version of vSphere and the existing BIOS – so it sounds safe to assume a direct import into vCenter 5 will work, and then we can let vCenter manage the ESXi update, right?

Well, not entirely: when importing the host into vCenter5, the process gets all the way through inventory import and then fails abruptly with the terse message “A general system error occurred: internal error.” Looking at the error details in vCenter5 is of no real help.

Import of an ESXi 4 host fails in vCenter5 for an unknown reason.

A search of the term in VMware Communities is of no help either (returns non-relevant issues). However, digging down to the vCenter5 VPXD log (typically found in the hidden directory structure “C:\ProgramData\VMware\VMware VirtualCenter\Logs\”) does return a nugget that is both helpful and obscure.

Reviewing the vCenter VPXD log for evidence of the import problem.

If you’ve read through these logs before, you’ll note that the SSL certificate check has been disabled. This was defeated in vCenter Server Settings to rule out potentially stale SSL certificates on the “legacy” ESXi nodes – it was not helpful in mitigating the error. The highlighted section was, however, helpful in uncovering a relevant VMware Knowledge Base article – the key language, “Alert:false@ D:/build/ob/bora-455964/bora/vim/lib/vdb/vdb.cpp:3253”, turns up only one KB article – and it’s a winner.

Knowledge Base article search for cryptic VPXD error code.

It is important – if not helpful – to note that searching the KB for “import fail internal error” does return nine different (and unrelated) articles, but it does NOT return this KB (we’ve asked VMware to make this KB easier to find with a simpler search). VMware’s KB2008366 illuminates the real reason the host import fails: a non-Y2K-compliant BIOS date is rejected as NULL data by vCenter5.

Y2K Date Requirement, Really?

Yes, the spectre of Y2K strikes 12 years later and stands as the sole roadblock to importing your perfectly functioning ESXi 4 host into vCenter5. According to the KB article, you can tell if you’re on the hook for a BIOS update by checking the “Hardware/Processors” information pane in the “Host Configuration” tab inside vCenter4.

ESXi 4.x host BIOS version/date exposed in vCenter4

According to vCenter’s date policy, this platform was minted in 1910. The KB makes it clear that any two-digit year will be imported as 19XX, where XX is the two-digit year. Seeing as how not even a precursor of ESX existed in 1999, this choice is just dead stupid. What’s more, the x86 PC wasn’t even invented until 1978, so a simple date-check inequality (i.e. if “two_digit_year” < 78 then “four_digit_year” = 2000 + “two_digit_year”) would have resolved the problem for the next 65 years.
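For what it’s worth, that inequality takes only a few lines to implement. A sketch – the pivot at 78 is my reading of the argument above, not anything VMware ships:

```shell
#!/bin/sh
# Expand a two-digit BIOS year: nothing below 78 can be a 19xx x86 machine
# (the 8086 debuted in 1978), so window those years into the 2000s;
# 78 through 99 stay in the 1900s.
four_digit_year() {
  yy=$1
  if [ "$yy" -lt 78 ]; then
    echo $((2000 + yy))
  else
    echo $((1900 + yy))
  fi
}

four_digit_year 10   # 2010 -- not vCenter5's 1910
four_digit_year 99   # 1999
```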

Instead, VMware will have you go through the process of upgrading and testing a new (and, as 6200 Opterons are just now available to the upgrade market, a likely unnecessary) BIOS version on your otherwise “trusty” platform.

Non-Y2K compliant BIOS date

Y2K-compliant BIOS date, post upgrade

Just to add insult to injury with this upgrade process, the BIOS upgrade for this platform comes with an added frustration: the IPMI/BMC firmware must also be updated to accommodate the new hardware monitoring capabilities of the new BIOS. Without the BMC update, vCenter will complain of Northbridge chipset overheat warnings from the platform until the BMC firmware is updated.

So, after the BIOS update, BMC update and painstaking hours (to days) of “new” product testing, we arrive at the following benefit: vCenter gets the BIOS version date correctly.

vCenter5 only wants Y2K compliant BIOS release dates for imported hosts

Bar Unnecessarily High

VMware actually says, “if the BIOS release date of the host is in the MM/DD/YY format, contact the hardware vendor to obtain the current MM/DD/YYYY format.” Really? So my platform is not vCenter5 worthy unless the BIOS date is four-digit year formatted? Put another way, VMware’s coders can create the premier cloud platform but they can’t handle a simple Y2K date inequality. #FAIL

Forget “the vRAM tax” – this obstacle is just dead stupid and unnecessary, and it will stand in the way of many more vSphere 5 upgrades. Relying on a BIOS update for a platform that was previously supported (remember the 1.0b BIOS above?) just to fix the BIOS date format is arbitrary at best, and it does not pose a compelling argument to your vendor’s support wing when dealing with an otherwise flawless BIOS.

SOLORI’s Take:

We’ve submitted a vCenter feature request to remove this exclusion for hundreds of vSphere 4.x hosts, maybe you should too…


Quick-Take: VMworld 2011, Thoughts on the Airplane

August 28, 2011

On the way to VMworld this morning I started out by listening to @Scott_lowe, @mike_laverick and @duncanyp talk about stretched clusters and some esoteric storage considerations. Then I was off reading @sakacc blogging about his take on stretched clusters and the black hole of node failure when I stumbled on a retweet – @bgracely via @andreliebovici – about the spectre of change in our industry. Suddenly these things seemed very well related within the context of my destination: VMworld 2011.

Back about a month ago when vSphere 5 was announced the buzz about the “upgrade” was consumed by discussions about licensing and vRAM. Naturally, this was not the focus VMware was hoping for, especially considering how much of a step forward vSphere 5 is over VS4. Rather, VMware – by all deserved rights – wanted to hear “excited” conversations about how VS5 was closing the gap on vCloud architecture problems and pain-points.

Personally, I managed to keep the vRAM licensing issue out of SOLORI’s blog for two reasons: 1) the initial vRAM targets were so off that VMware had to make a change, and 2) significant avenues for the discussion were available elsewhere. That does not mean I wasn’t outspoken about my thoughts on vRAM – made obvious by contributions to some community discussions on the topic – or VMware’s reasoning for moving to vRAM. Suffice to say VMware did “the right thing” – as I had confidence they would – and the current vRAM targets capture 100% of my clients without additional licenses.

I hinted that VS5 answers a lot of the hanging questions from VS4 in terms of facilitating how cloud confederations are architected, but the question is: in the distraction, did VS5’s “goodness” get lost in the scuffle? If so, can they get back the mind share they may have lost to Chicken Little reactionaries?

First, if VMware’s lost ground to anyone, it’s VMware. The vast majority of cool-headed admins I talked to were either not affected by vRAM or were willing to take a wait-and-see outlook on vSphere 5 with continued use of vSphere 4.1. Some did evaluate Hyper-V’s “readiness” but most didn’t blink. By comparison, vSphere 4.1 still had more to offer private cloud than anything else.

Secondly, vSphere 5 “goodness” did get lost in the scuffle, and that’s okay! It may be somewhat counter intuitive but I believe VMware will actually come out well ahead of their “would be” position in the market, and it is precisely because of these things, not just in spite of them. Here’s my reasoning:

1) In the way the vSphere 5 launch announcement and vRAM licensing debacle unfolded, a lot of the “hot air” about vRAM was vented along the way. Subsequently, VMware gained some service cred by actually listening to their client base and making a significant change to their platform pricing model. VMware got more bang for their buck out of that move; the effect on stock price may never be known, given the timing of the S&P ratings splash, but I would have expected to see a slight hit. Fortunately, 20-30% sector slides trump vRAM, and only Microsoft is talking about vRAM now (that is, until they adopt something similar).

On that topic, anytime you can get your competitor talking about your product instead of theirs, it usually turns out to be a good thing. Even in this case, where the topic has nothing to do with the needs of most businesses, negative marketing against vRAM will ultimately do more to establish VMware as an innovator than an “already too expensive alternative to XYZ.”

2) SOLORI’s law of conservation of marketing momentum: goodness preserved, not destroyed. VMworld 2011 turns out to be perfectly timed to generate excitement in all of the “goodness” that vSphere 5 has to offer. More importantly, it can now do so with increased vigor and without a lot of energy siphoned-off discussing vRAM, utilization models and what have you: been there done that, on to the meat and away with the garnish.

3) Again, it’s odd timing, but the market slide has more folks looking at cloud than ever before. Confidence in cloud offerings has been a deterrent for private cloud users, partly because of the “no clear choices” scenario and partly because of concerns about data migration in and around the public cloud. Instability and weak growth in the world economy have people reevaluating CAPEX-heavy initiatives as well as priorities. The bar for cloud offerings has never been lower.

In vSphere 5, VMware hints at the ability for more cloud providers to be transparent to the subscriber: if they adopt vSphere. Ultimately, this will facilitate vendor agnosticism much like the early days of the Internet. Back then, operators discovered that common protocols allowed for dial-up vendors to share resources in a reciprocal and transparent manner. This allowed the resources of provider A to be utilized by a subscriber of provider B: the end user was completely unaware of the difference. For those that don’t have strict requirements on where their data “lives” and/or are more interested in adherence to availability and SLA requirements, this can actually induce a broader market instead of a narrower one.

If you’ve looked past vRAM, you may have noticed for yourself that vSphere has more to deliver cloud offerings than ever before. VMware will try to convince you that whether cloud bursting, migrating to cloud or expanding hybrid cloud options, having a common underlying architecture promotes better flexibility and reduces overall cost and complexity. They want you to conclude that vSphere 5 is the basis for that architecture. Many will come away from Las Vegas – having seen it – believing it too.

So, as I – and an estimated 20K+ other virtualization junkies – head off to Las Vegas for a week of geek overload, parties and social networking, my thoughts turn to @duncanyp‘s 140+ improvements, enhancements and advances waiting back home in my vSphere 5 lab. Last week he challenged his “followers” to be the first to post examples of all of them; with the myriad of hands-on labs and expert sessions just over the horizon, I hope to do it one better and actually experience them first hand.

These things all add up to a win-win for VMware and a strong showing for VMworld. It’s going to be an exciting and – tip of the hat to @bgracely – industry changing week! Now off to the fray…


See Mike Laverick’s chinwag podcasts

See Chad’s Sakacc’s VirtualGeek blog on stretched cluster issues to overcome

(excuse typos today, wordpress iPad…)