Archive for August, 2009

Quick Take: Arista Networks Announces vEOS for vNDS

August 30, 2009

Arista Networks – the Menlo Park, CA “cloud networking solutions” company – has announced the first competitor to Cisco’s Nexus 1000V virtual distributed switch for VMware’s vSphere. Called simply vEOS (aka Virtualized Extensible Operating System), the virtual machine application integrates into the VMware vNetwork Distributed Switch (vNDS) framework to accomplish its task.

The current advantages offered by the vNDS architecture are preservation of a VM’s network state across vMotion events and port-configuration simplicity. Enhancements from the vEOS virtual machine will include QoS (with TX/RX limiting), ACL enforcement (when used with Arista’s 7000 series switch family), CLI configuration and management, distributed port profiles, port profile inheritance, VMware port mirroring, SNMP v3 RW, syslog exports, an active-standby control plane, SSH/telnet access to the CLI, “hitless” control plane upgrades, non-disruptive installation and an integrated vSwitch upgrade workflow.

Another advantage of vEOS for adopters of Arista’s 7000-series 10Gbase-T switches is its derivation from the switches’ own EOS code base. The EOS image is monolithic with respect to feature set, so there are no “trains to catch” for compatibility and the like. Michael Morris at NetworkWorld has an article about the vEOS announcement along with some additional information about Arista’s plans and how to get your hands on a beta version of vEOS… See Arista’s official press release about vEOS.

SOLORI’s Take: While CNAs are great, 10Gbase-T provides less “process disruptive” access to greater network bandwidth. Arista’s move with vEOS challenges both Cisco’s presence in the VMware market and its vision for network convergence. Just as Ethernet won out over “superior” technologies on the strength of its simplicity, we expect 10Gbase-T’s drop-in simplicity (CAT5e support, 100/1G/10G auto-negotiation, etc.) to drive market share for related products. With the 2010 battle lines drawn in the virtualization platform market, the all-important network segment will be as much a scalability factor as a budgetary one.

Given the coming “massive” virtualization potential of 2P/4P hardware being released in 2010 (see our Quick Take on 48-core virtualization), the obvious conduit for network traffic into 180+ VM consolidations is 10GE. The question: will 10Gbase-T deliver the order-of-magnitude economies of scale needed to displace the technical advantages of DCE/CNA-based networking?

In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 4

August 26, 2009

In Part 3 of this series we showed how to install and configure a basic NexentaStor VSA using iSCSI and NFS storage. We also created a CIFS bridge for managing ISO images that are available to our ESX servers using NFS. We now have a fully functional VSA with working iSCSI target (unmounted as of yet) and read-only NFS export mounted to the hardware host.

In this segment, Part 4, we will create an ESXi instance on NFS along with an ESX instance on iSCSI, and, using writable snapshots, turn both of these installations into quick-deploy templates. We’ll then mount our large iSCSI target (created in Part 3) and NFS-based ISO images to all ESX/ESXi hosts (physical and virtual), and get ready to install our vCenter virtual machine.

Part 4, Making an ESX Cluster-in-a-Box

With a lot of things behind us in Parts 1 through 3, we are going to pick up the pace a bit. Although ZFS snapshots are immediately available in a hidden “.zfs” folder for each snapshotted file system, we are going to use cloning and mount the cloned file systems instead.

Cloning allows us to re-use a file system as a template for a copy-on-write variant of the source. By using the clone instead of the original, we conserve storage because only the differences between the two file systems (the clone and the source) are stored to disk. This process saves time as well, leveraging “clean installations” as starting points (templates) along with their associated storage (much like VMware’s linked-clone technology for VDI). While VMware’s “template” capability also saves time by using a VM as a “starting point,” it does so by copying storage, not cloning it, and therefore conserves no storage.
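Under the hood, this template-and-clone workflow maps to just two ZFS primitives. Here’s a minimal sketch in raw ZFS terms – the pool and file-system names are hypothetical, and the NexentaStor web GUI drives the same operations for you:

```shell
# Freeze a "clean install" file system as a read-only snapshot (names are hypothetical)
zfs snapshot tank/esxi-template@clean

# Create a writable, copy-on-write clone from it -- near-zero space at creation
zfs clone tank/esxi-template@clean tank/esxi-host1

# The clone's "used" column grows only as it diverges from the source
zfs list -o name,used,refer tank/esxi-host1
```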

Using clones in NexentaStor to conserve storage and aid rapid deployment and testing. Only the differences between the source and the clone require additional storage on the NexentaStor appliance.

While the ESX and ESXi use cases might not seem the “perfect candidates” for cloning in a “production” environment, in the lab it allows for an abundance of possibilities in regression and isolation testing. In production you might find that NFS and iSCSI boot capabilities could make cloned hosts just as effective for deployment and backup as they are in the lab (but that’s another blog).

Here’s the process we will continue with for this part in the lab series:

  1. Create NFS folder in NexentaStor for the ESXi template and share via NFS;
  2. Modify the NFS folder properties in NexentaStor to:
    1. limit access to the hardware ESXi host only;
    2. grant the hardware ESXi host “root” access;
  3. Create a folder in NexentaStor for the ESX template and create a Zvol;
  4. From VI Client’s “Add Storage…” function, we’ll add the new NFS and iSCSI volumes to the Datastore;
  5. Create ESX and ESXi clean installations in these “template” volumes as a cloning source;
  6. Unmount the “template” volumes using the VI Client and unshare them in NexentaStor;
  7. Clone the “template” Zvol and NFS file systems using NexentaStor;
  8. Mount the clones with VI Client and complete the ESX and ESXi installations;
  9. Mount the main Zvol and ISO storage to ESX and ESXi as primary shared storage.
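For reference, steps 1 through 3 above have direct analogs in the raw ZFS commands that the NexentaStor appliance wraps. A sketch, assuming a hypothetical pool named “tank” and a hardware ESXi host at 10.0.0.10:

```shell
# Step 1: create the template folder (a ZFS file system) and share it via NFS
zfs create tank/esxi-tmpl

# Step 2: limit the share to the hardware ESXi host and grant it root access
# (Solaris sharenfs syntax; the host address here is hypothetical)
zfs set sharenfs='rw=10.0.0.10,root=10.0.0.10' tank/esxi-tmpl

# Step 3: create a folder for the ESX template and a Zvol to back its iSCSI target
zfs create tank/esx-tmpl
zfs create -V 24G tank/esx-tmpl/zvol0
```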

Basic storage architecture for the ESX-on-ESX lab.

Quick Take: HP Sets Another 48-core VMmark Milestone

August 26, 2009

Not satisfied with a landmark VMmark score that crossed the 30-tile mark for the first time, HP’s performance team went back to the benches two weeks later and took another swing at the performance crown. The effort paid off: HP significantly outpaced its two-week-old record with a score of 53.73@35 tiles in the heavyweight, 48-core category.

Using the same 8-processor HP ProLiant DL785 G6 platform as in the previous run – complete with 2.8GHz AMD Opteron 8439 SE six-core chips and 256GB of DDR2/667 – the new score comes with significant performance bumps in the javaserver, mailserver and database results, achieved on the same ESX 4.0 build (164009). So what changed to add 5 tiles to the team’s run? It would appear that someone was unsatisfied with the storage configuration on the mailserver run.

Given that the tile ratio of the previous run ran about 6% higher than its 24-core counterpart, there may have been a small indication that untapped capacity was available. According to the run notes, the only reported change to the test configuration – aside from the addition of the 5 LUNs and 5 clients needed to support the 5 additional tiles – was a notation indicating that the “data drive and backup drive for all mailserver VMs” were repartitioned using AutoPart v1.6.

The change in performance numbers effectively reduces the virtualization cost of the system by 15% to about $257/VM – closing in on its 24-core sibling to within $10/VM and stretching out its lead over “Dunnington” rivals to about $85/VM. While virtualization is not the primary application for 8P systems, this demonstrates that 48-core virtualization is definitely viable.
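The cost figure follows from VMmark’s tile definition – each tile is 6 VMs – and the roughly $54,000 system estimate from our earlier DL785 G6 Quick Take. A quick sanity check of the arithmetic:

```shell
# 35 tiles x 6 VMs/tile, against an estimated ~$54,000 system cost
tiles=35
vms=$(( tiles * 6 ))    # 210 VMs
cost=54000
awk -v c="$cost" -v v="$vms" 'BEGIN { printf "~$%.0f/VM across %d VMs\n", c/v, v }'
# prints: ~$257/VM across 210 VMs
```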

SOLORI’s Take: HP’s performance team has done a great job tuning its flagship AMD platform, demonstrating that platform performance is not just a function of clock speed or core count but of balanced tuning all around. This improvement in system tuning demonstrates an 18% increase in incremental scalability – approaching within 3% of the 12-core to 24-core scaling factor – making it a viable consideration in the virtualization use case.

In recent discussions with AMD about SR5690 chipset applications for Socket-F, AMD reiterated that the mainstream focus for SR5690 has been Magny-Cours and the Q1/2010 launch. Given the close relationship between Istanbul and Magny-Cours – detailed nicely by Charlie Demerjian at Semi-Accurate – the bar is clearly fixed for 2P and 4P virtualization systems designed around these chips. Extrapolating from the similarities and improvements to I/O and memory bandwidth, we expect to see 2P VMmarks besting 32@23 and 4P scores over 54@39 from HP, AMD and Magny-Cours.

SOLORI’s 2nd Take: Intel has been plugging away with its Nehalem-EX for 8-way systems which – delivering 128 threads – promises some insane VMmarks. Assuming Intel’s EX scales as efficiently as AMD’s new Opterons have, extrapolations indicate performance for the 4P, 64-thread Nehalem-EX should fall between 41@29 and 44@31 given the current crop of speed and performance bins. Using the same methods, our calculus predicts an 8P, 128-thread EX system should deliver scores between 64@45 and 74@52.

With EX expected to clock at 2.66GHz with a 140W TDP and AMD’s MCM-based Magny-Cours doing well to hit 130W ACP in the same speed bins, CIOs balancing power and performance considerations will need to break out the spreadsheets to determine the winners here. With both systems running 4-channel DDR3, there will be no power or price advantage on either side from memory differences: relative price-performance and power consumption of the CPUs will be the major factors. Assuming our extrapolations are correct, we’re looking at a slight edge to AMD in performance-per-watt in the 2P segment, and a significant advantage in the 4P segment.

In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 3

August 21, 2009

In Part 2 of this series we introduced the storage architecture that we would use for the foundation of our “shared storage” necessary to allow vMotion to do its magic. As we have chosen NexentaStor for our VSA storage platform, we have the choice of either NFS or iSCSI as the storage backing.

In Part 3 of this series we will install NexentaStor, make some file systems and discuss the advantages and disadvantages of NFS and iSCSI as the storage backing. By the end of this segment, we will have everything in place for the ESX and ESXi virtual machines we’ll build in the next segment.

Part 3, Building the VSA

For memory, our lab system has 24GB of RAM, which we will apportion as follows: 2GB of overhead to the host, 4GB to NexentaStor, 8GB to ESXi and 8GB to ESX. This leaves 2GB that can be used to support a vCenter installation at the host level.

Our lab mule was configured with 2x250GB SATA II drives, which provide roughly 230GB each of VMFS-partitioned storage. Subtracting 10% for overhead, the sum of our virtual disks will be limited to about 415GB. Because of our relative size restrictions, we will try to maximize available storage while limiting our liability in case of disk failure. Therefore, we’ll plan to put the ESXi server on drive “A” and the ESX server on drive “B,” with the virtual disks of the VSA split across both “A” and “B” disks.
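The disk and memory budgets above are straightforward arithmetic; a quick sketch (figures rounded as in the text):

```shell
# Two 250GB drives yield ~230GB of usable VMFS each; hold back 10% for overhead
vmfs_total=$(( 230 * 2 ))                  # 460GB raw VMFS
vdisk_budget=$(( vmfs_total * 90 / 100 ))  # ~414GB (about 415GB) for virtual disks

# RAM: 24GB total, minus host overhead, VSA, ESXi and ESX allocations
ram_left=$(( 24 - 2 - 4 - 8 - 8 ))         # 2GB left for vCenter

echo "vdisk budget: ${vdisk_budget}GB; spare RAM: ${ram_left}GB"
# prints: vdisk budget: 414GB; spare RAM: 2GB
```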

Our VSA Virtual Hardware

For lab use, a VSA with 4GB RAM and 1 vCPU will suffice. Additional vCPUs would only limit CPU scheduling for our virtual ESX/ESXi servers, so we’ll leave it at the minimum. Since we’re splitting storage roughly equally across the disks, we note that an additional 4GB was taken up on disk “A” during the installation of ESXi; therefore, we’ll place the VSA’s definition and “boot” disk on disk “B” – otherwise, we’ll interleave disk slices equally across both disks.

  • Datastore – vLocalStor02B, 8GB vdisk size, thin provisioned, SCSI 0:0
  • Guest Operating System – Solaris, Sun Solaris 10 (64-bit)
  • Resource Allocation
    • CPU Shares – Normal, no reservation
    • Memory Shares – Normal, 4096MB reservation
  • No floppy disk
  • CD-ROM disk – mapped to ISO image of NexentaStor 2.1 EVAL, connect at power on enabled
  • Network Adapters – Three total 
    • One to “VLAN1 Mgt NAT” and
    • Two to “VLAN2000 vSAN”
  • Additional Hard Disks – 6 total
    • vLocalStor02A, 80GB vdisk, thick, SCSI 1:0, independent, persistent
    • vLocalStor02B, 80GB vdisk, thick, SCSI 2:0, independent, persistent
    • vLocalStor02A, 65GB vdisk, thick, SCSI 1:1, independent, persistent
    • vLocalStor02B, 65GB vdisk, thick, SCSI 2:1, independent, persistent
    • vLocalStor02A, 65GB vdisk, thick, SCSI 1:2, independent, persistent
    • vLocalStor02B, 65GB vdisk, thick, SCSI 2:2, independent, persistent

NOTE: It is important to realize that the virtual disks above could have been provided by VMDKs on the same disk, VMDKs spread across multiple disks, or by RDMs mapped to raw SCSI drives. If your lab chassis has multiple hot-swap bays or even just generous internal storage, you might want to try providing NexentaStor with RDMs or one-VMDK-per-disk storage for performance testing or “near” production use. CPU, memory and storage are the basic elements of virtualization, and there is no reason that storage must be the bottleneck. For instance, this environment is GREAT for testing SSD applications on a resource-limited budget.

In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 2

August 19, 2009

In Part 1 of this series we introduced the basic Lab-in-a-Box platform and outlined how it would be used to provide the three major components of a vMotion lab: (1) shared storage, (2) high speed network and (3) multiple ESX hosts. If you have followed along in your lab, you should now have an operating VMware ESXi 4 system with at least two drives and a properly configured network stack.

In Part 2 of this series we’re going to deploy a Virtual Storage Appliance (VSA) based on an open storage platform which uses Sun’s Zettabyte File System (ZFS) as its underpinnings. We’ve been working with Nexenta’s NexentaStor SAN operating system for some time now and will use it – with its web-based volume management – instead of deploying OpenSolaris and creating storage manually.

Part 2, Choosing a Virtual Storage Architecture

To get started on the VSA, we want to identify some key features and concepts that caused us to choose NexentaStor over a myriad of other options. These are:

  • NexentaStor is based on open storage concepts and licensing;
  • NexentaStor comes in a “free” developer’s version with 2TB of managed storage;
  • NexentaStor developer’s version includes snapshots, replication, CIFS, NFS and performance monitoring facilities;
  • NexentaStor is available in a fully supported, commercially licensed variant with very affordable $/TB licensing costs;
  • NexentaStor has proven extremely reliable and forgiving in the lab and in the field;
  • Nexenta is a VMware Technology Alliance Partner with VMware-specific plug-ins (commercial product) that facilitate the production use of NexentaStor with little administrative input;
  • Sun’s ZFS (and hence NexentaStor) was designed for commodity hardware and makes good use of additional RAM for cache as well as SSDs for read and write caching;
  • Sun’s ZFS is designed to maximize end-to-end data integrity – a key point when ALL system components live in the storage domain (i.e. virtualized);
  • Sun’s ZFS employs several “simple but advanced” architectural concepts that maximize performance on commodity hardware, increasing IOPS and reducing latency;

While the performance features of NexentaStor/ZFS are well outside the capabilities of an inexpensive “all-in-one-box” lab, the concepts behind them are important enough to touch on briefly. Once understood, the concepts behind ZFS make it a compelling architecture to use with virtualized workloads. Eric Sproul has a short slide deck on ZFS that’s worth reviewing.

ZFS and Cache – DRAM, Disks and SSD’s

Legacy SAN architectures are typically split into two elements: cache and disks. While not always monolithic, caches in legacy storage are typically single-purpose pools set aside to hold frequently accessed blocks of storage – allowing that information to be read from and written to RAM instead of disk. Such caches are generally very expensive to expand (when possible) and may only accommodate one specific cache function (i.e. read or write, not both). Storage vendors employ many strategies to “predict” what information should stay in cache and how to manage it to effectively improve overall storage throughput.

New cache model used by ZFS allows main memory and fast SSDs to be used as read cache and write cache, reducing the need for large DRAM cache facilities.
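In practice, ZFS’s hybrid cache is configured with ordinary pool commands. A sketch with hypothetical pool and device names – L2ARC devices extend the read cache, while a dedicated intent-log (ZIL) device absorbs synchronous writes:

```shell
# Add an SSD as an L2ARC read-cache device (pool/device names are hypothetical)
zpool add tank cache c2t0d0

# Add an SSD as a dedicated ZFS intent log (write acceleration for synchronous I/O)
zpool add tank log c2t1d0

# The pool status now lists separate "cache" and "logs" sections
zpool status tank
```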

In-the-Lab: Full ESX/vMotion Test Lab in a Box, Part 1

August 17, 2009

There are many features in vSphere worth exploring, but doing so requires committing time, effort, testing, training and hardware resources. In this feature, we’ll investigate a way – using your existing VMware facilities – to reduce the time, effort and hardware needed to test and train up on vSphere’s ESXi, ESX and vCenter components. We’ll start with a single hardware server running VMware ESXi free as the “lab mule” and install everything we need on top of that system.

Part 1, Getting Started

To get started, here are the major hardware and software items you will need to follow along:

Recommended Lab Hardware Components

  • One 2P, 6-core AMD “Istanbul” Opteron system
  • Two 500-1,500GB Hard Drives
  • 24GB DDR2/800 Memory
  • Four 1Gbps Ethernet Ports (4×1, 2×2 or 1×4)
  • One 4GB SanDisk “Cruzer” USB Flash Drive
  • Either of the following:
    • One CD-ROM with VMware-VMvisor-Installer-4.0.0-164009.x86_64.iso burned to it
    • An IP/KVM management card to export ISO images to the lab system from the network

Recommended Lab Software Components

  • One ISO image of NexentaStor 2.x (for the Virtual Storage Appliance, VSA, component)
  • One ISO image of ESX 4.0
  • One ISO image of ESXi 4.0
  • One ISO image of vCenter Server 4
  • One ISO image of Windows Server 2003 STD (for vCenter installation and testing)

For the hardware items to work, you’ll need to check your system components against the VMware HCL and community supported hardware lists. For best results, always disable (in BIOS) or physically remove all unsupported or unused hardware – this includes communication ports, USB, software RAID, etc. Doing so will reduce potential hardware conflicts from unsupported devices.

The Lab Setup

We’re first going to install VMware ESXi 4.0 on the “test mule” and configure the local storage for maximum use. Next, we’ll create three (3) machines to create our “virtual testing lab” – deploying ESX, ESXi and NexentaStor running directly on top of our ESXi “test mule.” All subsequent test VMs will run in either of the virtualized ESX platforms from shared storage provided by the NexentaStor VSA.

ESX, ESXi and VSA running atop ESXi

Next up, quick-and-easy install of ESXi to USB Flash…
Quick Take: HP Plants the Flag with 48-core VMmark Milestones

August 12, 2009

Last month we predicted that HP could easily claim the VMmark summit with its DL785 G6 using AMD’s Istanbul processors:

If AMD’s Istanbul scales to 8-socket at least as efficiently as Dunnington, we should be seeing some 48-core results in the 43.8@30 tile range in the next month or so from HP’s 785 G6 with 8-AMD 8439 SE processors. You might ask: what virtualization applications scale to 48-cores when $/VM is doubled at the same time? We don’t have that answer, and judging by Intel and AMD’s scale-by-hub designs coming in 2010, that market will need to be created at the OEM level.

Well, HP didn’t make us wait too long. Today, the PC maker cleared two significant VMmark milestones: crossing the 30-tile barrier in a single system (180 VMs) and exceeding the 40 mark on VMmark score. With a score of 47.77@30 tiles, the HP DL785 G6 – powered by 8 AMD Istanbul 8439 SE processors and 256GB of DDR2/667 memory – set the bar well beyond the competition, and did so with better performance than we expected – most likely due to AMD’s “HT Assist” technology increasing its scalability.

Not available until September 14, 2009, the HP DL785 G6 is a pricey competitor. We estimate – based on today’s processor and memory prices – that a system as well appointed as the VMmark-configured version (additional NICs, HBA, etc.) will run at least $54,000, or around $300/VM (about $60/VM higher than the 24-core contender and about $35/VM lower than HP’s Dunnington “equivalent”).

SOLORI’s Take: While the September timing of the release might imply a G6 with AMD’s SR5690 and IOMMU, we’re doubtful that the timing is anything but a coincidence – even though such a pairing would enable PCIe 2.0 and highly effective 10Gbps solutions. The modular design of the DL785 series – with its ability to scale from 4P to 8P in the same system – mitigates the economic realities of the dwindling 8P segment, and HP has delivered the pinnacle of performance for this technology.

We are also impressed with HP’s performance team and its ability to scale from Shanghai to Istanbul with relative efficiency. Moving from the quad-core DL785 G5 to the six-core DL785 G6 yielded an almost perfectly linear increase in capacity (95% of the theoretical increase from 32 cores to 48 cores) while performance-per-tile increased by 6%. This further demonstrates the “home run” AMD has hit with Istanbul and underscores the excellent value proposition of Socket-F systems over the last several years.

Unfortunately, while they demonstrate a 91% scaling efficiency from 12-core to 24-core, HP and Istanbul achieved only a 75% incremental scaling efficiency from 24 cores to 48 cores. When looking at tile-per-core scaling using the 8-core, 2P system as a baseline (a 1:1 tile-to-core ratio), 2P, 4P and 8P Istanbul deliver 91%, 83% and 62.5% efficiencies overall, respectively. However, compared to the 58% and 50% tile-to-core efficiencies of Dunnington 4P and 8P, respectively, Istanbul clearly dominates the 4P and 8P performance and price-performance landscape in 2009.
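These efficiency figures can be reproduced from the published tile counts (roughly 11 tiles at 12 cores, 20 at 24 and 30 at 48). A sketch of the arithmetic:

```shell
# Overall tile-to-core ratios (1 tile per core = 100%) and incremental scaling
awk 'BEGIN {
  printf "2P overall:         %.1f%%\n", 11 / 12 * 100        # ~91% in round figures
  printf "4P overall:         %.1f%%\n", 20 / 24 * 100        # ~83%
  printf "8P overall:         %.1f%%\n", 30 / 48 * 100        # 62.5%
  printf "12- to 24-core:     %.1f%%\n", 20 / (11 * 2) * 100  # ~91% incremental
  printf "24- to 48-core:     %.1f%%\n", 30 / (20 * 2) * 100  # 75% incremental
}'
```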

In today’s age of virtualization-driven scale-out, SOLORI’s calculus indicates that multi-socket solutions that deliver a tile-to-core ratio of less than 75% will not succeed (economically) in the virtualization use case in 2010, regardless of socket count. That said – even at a 2:3 tile-to-core ratio – the 8P, 48-core Istanbul will likely reign supreme as the VMmark heavyweight champion of 2009.

SOLORI’s 2nd Take: HP and AMD’s achievements with this Istanbul system should be recognized before we usher in the next wave of technology like Magny-Cours and Socket G34. While the DL785 G6 is not a game changer, its footnote in computing history may well be as a preview of what we can expect to see out of Magny-Cours in 2H/2010. If 12-core 4P system prices shrink with the socket count, we could be looking at a $150/VM price point for a 4P system: now that would be a serious game changer.