Archive for April, 2009


Shanghai Economics 101

April 30, 2009

Before the release of the Istanbul 6-core processor, we wanted to preview the CAPEX comparisons we’ve been working on between today’s Opteron (Shanghai) and today’s Nehalem-EP. The results are pretty startling, and are mostly due to the Nehalem-EP’s limited memory addressing capability. Here are the raw numbers for systems of comparable performance (i.e., high-end):

| Nehalem-EP Configuration | Street $ | Shanghai HT3 Configuration | Street $ | Savings $ | Savings % |
|---|---|---|---|---|---|
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 24GB DDR3/1333 | $7,017.69 | 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 32GB DDR2/800 | $5,892.12 | $1,125.57 | 16.04% |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 48GB DDR3/1066 | $7,755.99 | 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 48GB DDR2/800 | $6,352.12 | $1,403.87 | 18.10% |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 96GB DDR3/1066 | $21,969.99 | 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 96GB DDR2/667 | $11,968.72 | $10,001.27 | 45.52% |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 144GB DDR3/800 | $30,029.19 | 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 128GB DDR2/533 | $14,300.92 | $15,728.27 | 52.38% |

| Nehalem-EP Configuration | Street $ | Shanghai HT3 Configuration | Street $ | Savings $ | Savings % |
|---|---|---|---|---|---|
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 96GB DDR3/1066 | $21,969.99 | 4P/16C Shanghai, 8393 SE, 3.1GHz, 4.4GT HT3 with 96GB DDR2/800 | $17,512.87 | $4,457.12 | 20.29% |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 144GB DDR3/800 | $30,029.19 | 4P/16C Shanghai, 8393 SE, 3.1GHz, 4.4GT HT3 with 192GB DDR2/667 | $28,746.07 | $1,283.12 | 4.27% |
| 2 x 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 144GB (288GB total) DDR3/800 | $60,058.38 | 1 x 4P/16C Shanghai, 8393 SE, 3.1GHz, 4.4GT HT3 with 256GB DDR2/533 | $33,410.47 | $26,647.92 | 44.37% |

Even the 4-socket Shanghai 8393 SE averages 23% lower implementation cost than Nehalem-EP and delivers 16 “real” cores versus 8 “real” cores in the process. Even at 50% theoretical efficiency using Nehalem’s SMT, the 4P Shanghai represents a solid choice in the performance segment. An Istanbul drop-in upgrade spreads the gulf in capabilities even wider.
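For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the savings for the three 4P comparisons and their unweighted average. The street prices are copied from the comparison rows; the function and variable names are our own, chosen for illustration only.

```python
# Street prices copied from the 4P Shanghai comparison rows.
# Each tuple: (Nehalem-EP street price, 4P Shanghai street price)
FOUR_SOCKET_ROWS = [
    (21969.99, 17512.87),
    (30029.19, 28746.07),
    (60058.38, 33410.47),
]


def savings_percent(nehalem_price, shanghai_price):
    """Savings expressed as a percentage of the Nehalem-EP street price."""
    return (nehalem_price - shanghai_price) / nehalem_price * 100.0


def average_savings_percent(rows):
    """Unweighted mean of the per-row savings percentages."""
    return sum(savings_percent(n, s) for n, s in rows) / len(rows)


if __name__ == "__main__":
    for nehalem, shanghai in FOUR_SOCKET_ROWS:
        pct = savings_percent(nehalem, shanghai)
        print(f"${nehalem - shanghai:,.2f} saved ({pct:.2f}%)")
    print(f"average: {average_savings_percent(FOUR_SOCKET_ROWS):.1f}%")
```

The average is a simple mean of the three percentages, so the small 4.27% row counts as much as the 44.37% row. Weighting by system price would give a different figure.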

Based on today’s economics and the history of seamless vMotion between Opteron processors, 4P/24C Istanbul will be a no-brainer investment. With 2P/24C and 4P/48C Magny-Cours on the way to handle the “really big” tasks, a Shanghai-Istanbul Eco-System looks like an economic stimulus all its own.


VMware ESXi Update: Build 158874

April 30, 2009

VMware has released a series of critical patches for ESXi and VMware Tools, discussed in knowledge base articles 1010135 and 1010136. These are the highlights:

Patch for ESXi:

  • Fixes an issue in the VMkernel TCP/IP stack where adding a system to an HA (High Availability) cluster results in a timeout error.
  • Fixes an issue where a virtual machine might fail if a reserved register is accessed within the guest operating system.
  • Fixes an issue where a virtual machine might stop responding or the progress bar in the graphical user interface might appear to be stuck at 95% when you consolidate a snapshot of a powered on virtual machine.

Magny-Cours Spotted

April 29, 2009
Magny-Cours, 12-core Processor


AMD’s next generation “G34” socket Magny-Cours processor was spotted recently by XbitLabs running in AMD’s 4-way test mule platform. We’ve talked about Magny-Cours and socket-G34 before, but had no picture until now. The multi-chip module (MCM) heritage is obvious given its rectangular shape.

Critical for AMD will be HT3+DCA2 efficiency and memory bandwidth to counter the apparent success of Nehalem-EP’s SMT technology. Although AMD does not consider hyperthreading to be a viable technology for them, it appears to be working for Intel in benchmark cases.

While it seems logical that more “physical” cores should scale better than the “logical” cores provided by SMT, Intel is gaining some ground on legacy “physical core” systems, demonstrating what appears to be linear scaling in VMmark. However, Intel has a fine reputation for chasing – and mastering – benchmark performance only to show marginal gains in real-world applications.

Meanwhile, the pressure mounts on Istanbul’s successful launch in June, with white box vendors making ready for the next wave of “product release buzz” to stimulate sinking sales. Decision makers will have a lot of spreadsheet work to do to determine where the real price performance lies. Based on the high cost of dense DDR3 and DDR2, the 16-DIMM/CPU advantage is weighing heavily on AMD’s side from a CAPEX and OPEX perspective (DDR2 is already a well-entrenched component of all socket-F platforms).

Up to now, Intel’s big benchmark winners have been the W5580 and X5570 with $1,700 and $1,500 unit prices, respectively. Compounded with high-cost DDR3 dual-rank memory, or reduction in memory bandwidth (which eliminates a significant advantage), the high-end Nehalem-EP is temporarily caught in an economic bind, severely limiting its price-performance suitability.


Clarification: Nehalem-EP and DDR3

April 29, 2009

I have seen a lot of contrasting comments about Nehalem-EP and memory speed on the community groups – especially in the area of supported speed ratings: often in the context of comparison to Opteron’s need to reduce supported DIMM speed ratings based on slot population. While it is true Nehalem’s 3-channel design allows for a mixture of performance (800/1066/1333) and capacity, it does not allow for both.

Here are the rules (from Intel’s “Intel Xeon Processor 5500 Series Datasheet, Volume 2”) based on DIMM per Channel (DPC):

  • 1-DPC = Support DDR3-1333 (if DIMM supports DDR3-1333)
    • KVR1333D3D4R9S/4G – $169/ea
    • 12GB/CPU max. @ $507/CPU (24GB/system max.)
  • 2-DPC = Support DDR3-1066 (if all DIMMs are rated DDR3-1066 or higher)
    • KVR1066D3D4R7S/4G – $138/ea
    • 24GB/CPU max. @ $828/CPU (48GB/system max.)
    • KVR1066D3Q4R7S/8G – $1,168/ea
    • 48GB/CPU max. @ $7,008/CPU (96GB/system max.)
    • “96GB Memory (12x8GB), 1066MHz Dual Ranked RDIMMs for 2 Processors,Optimized [add $15,400]” – Dell
  • 3-DPC = Support DDR3-800 only (if all DIMMs are rated DDR3-800 or higher)
    • KVR1066D3D4R7S/4G – $138/ea
    • 36GB/CPU max. @ $1,242/CPU (72GB/system max.)
    • “144GB Memory (18x8GB), 800MHz Dual Ranked RDIMMs for 2 Processors,Optimized [add $22,900]” – Dell
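The per-CPU capacity and cost figures in this list all follow from Nehalem-EP’s three memory channels per socket. The Python sketch below reproduces them under that assumption; the speed table is transcribed from the rules above, and the function name is hypothetical.

```python
CHANNELS_PER_CPU = 3  # Nehalem-EP: three memory channels per socket

# Maximum supported DDR3 speed by DIMMs per channel (DPC), per the rules above.
MAX_SPEED_BY_DPC = {1: 1333, 2: 1066, 3: 800}


def memory_per_cpu(dpc, dimm_gb, dimm_price):
    """Return (GB per CPU, cost per CPU, max DDR3 speed) for a socket with
    every channel populated at the given DIMMs-per-channel count."""
    if dpc not in MAX_SPEED_BY_DPC:
        raise ValueError("DPC must be 1, 2 or 3")
    dimms = CHANNELS_PER_CPU * dpc
    return dimms * dimm_gb, dimms * dimm_price, MAX_SPEED_BY_DPC[dpc]


if __name__ == "__main__":
    # (DPC, DIMM size in GB, price per DIMM in $) taken from the list above
    for dpc, size, price in [(1, 4, 169), (2, 4, 138), (2, 8, 1168), (3, 4, 138)]:
        gb, cost, speed = memory_per_cpu(dpc, size, price)
        print(f"{dpc}-DPC, {size}GB DIMMs: {gb}GB/CPU @ ${cost:,}/CPU, DDR3-{speed}")
```

Doubling the per-CPU results gives the per-system maximums for a 2P server.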

When the IMC detects the presence of 1, 2 or 3 DIMMs in a channel, these speed limits are imposed regardless of the capabilities of the DIMMs. A few other notable restrictions exist:

  • When one 4-rank DIMM is used, it must be populated in DIMM slot0 of a given channel (farthest from CPU);
  • Mixing 4-rank DIMMs in one channel with three DIMMs in another channel (3-DPC) on the same CPU socket is not allowed – forcing the BIOS to disable the 4-rank channel;
  • RDIMM
    • Single-rank DIMM: 1-DPC, 2-DPC or 3-DPC
    • Dual-rank DIMM: 1-DPC, 2-DPC or 3-DPC
    • Quad-rank DIMM: 1-DPC or 2-DPC
  • UDIMM
    • Single-rank DIMM: 1-DPC or 2-DPC
    • Dual-rank DIMM: 1-DPC or 2-DPC
    • Quad-rank DIMM: n/a
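The RDIMM/UDIMM population limits above can be captured in a small lookup table. This is only a sketch in Python with names of our own choosing. It encodes just the DIMM type, rank and DPC combinations listed, not the slot-ordering or channel-mixing restrictions.

```python
# Allowed DIMMs-per-channel (DPC) counts by DIMM type and rank count,
# transcribed from the list above. Quad-rank UDIMM is "n/a", so it has no entry.
ALLOWED_DPC = {
    ("RDIMM", 1): {1, 2, 3},
    ("RDIMM", 2): {1, 2, 3},
    ("RDIMM", 4): {1, 2},
    ("UDIMM", 1): {1, 2},
    ("UDIMM", 2): {1, 2},
}


def is_supported(dimm_type, ranks, dpc):
    """True if the DIMM type/rank combination may be populated at this DPC."""
    return dpc in ALLOWED_DPC.get((dimm_type.upper(), ranks), set())


if __name__ == "__main__":
    print(is_supported("RDIMM", 4, 3))  # quad-rank RDIMM at 3-DPC
    print(is_supported("UDIMM", 2, 2))  # dual-rank UDIMM at 2-DPC
```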

Speed freaks be warned!


Discover IOV and VMware NetQueue on a Budget

April 28, 2009

While researching advancements in I/O virtualization (VMware) we uncovered a “low cost” way to explore the advantages of IOV without investing in 10GbE equipment: the Intel 82576 Gigabit Network Controller which supports 8-receive queues per port. This little gem comes in a 2-port by 1Gbps PCI-express package (E1G142ET) for around $170/ea on-line. It also comes in a 4-port by 1Gbps package (full or half-height, E1G144ET) for around $450/ea on-line.
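As a rough way to compare the two cards, the Python sketch below totals the receive queues per adapter and divides the approximate on-line price by that total. The port counts, prices and the 8-queues-per-port figure come from the paragraph above; the "price per queue" metric and all names are our own illustration, not an Intel or VMware figure.

```python
RX_QUEUES_PER_PORT = 8  # 82576 controller, as stated above

# Port counts and approximate on-line prices quoted above.
ADAPTERS = {
    "E1G142ET": {"ports": 2, "price": 170.0},
    "E1G144ET": {"ports": 4, "price": 450.0},
}


def total_rx_queues(ports, queues_per_port=RX_QUEUES_PER_PORT):
    """Receive queues exposed by one adapter across all of its ports."""
    return ports * queues_per_port


def price_per_queue(ports, price):
    """Adapter price divided by its total receive queues (illustrative metric)."""
    return price / total_rx_queues(ports)


if __name__ == "__main__":
    for name, spec in ADAPTERS.items():
        queues = total_rx_queues(spec["ports"])
        per_queue = price_per_queue(spec["ports"], spec["price"])
        print(f"{name}: {queues} receive queues, about ${per_queue:.4f} per queue")
```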

Enabling VMDq/NetQueue is straightforward:

  1. Enable NetQueue in VMkernel using VMware Infrastructure 3 Client:
    1. Choose Configuration > Advanced Settings > VMkernel.
    2. Select VMkernel.Boot.netNetqueueEnabled.
  2. Enable the igb module in the service console of the ESX Server host:
     # esxcfg-module -e igb
  3. Set the required load option for igb to turn on VMDq:
    The option IntMode=3 must exist to indicate loading in VMDq mode. A value of 3 for the IntMode parameter specifies using MSI-X and automatically sets the number of receive queues to the maximum supported (devices based on the 82575 Controller enable 4 receive queues per port; devices based on the 82576 Controller enable 8 receive queues per port). The number of receive queues used by the igb driver in VMDq mode cannot be changed.

Very Cool “Hologram”

April 28, 2009
Demo: Augmented Reality


If you have not seen this, it is a very cool demonstration of “augmented reality.” It takes a web camera, microphone, Internet access and black-and-white printer:

  1. Print the special target;
  2. Turn on the webcam, sound and microphone;
  3. Select the desired “reality”;
  4. Point the target at the camera;
  5. Play…

The “augmented reality” demonstration has a video (if you don’t want to print & interact) that walks through the experience. This is fun stuff, kids! Try it on your big screen today.


Tyan Announces Support for Enhanced Opteron

April 28, 2009

Remember our “reveal” of the Tyan S2935-SI back in January as a potential HT3-capable replacement for the GT28 dual-node systems? Well, it’s still not ready, but Tyan has announced 18 motherboard and system updates that “support the enhanced Opteron with HT-3 technology” that are shipping now.

“TYAN has launched 9 new motherboards that support the AMD HyperTransport 3.0 technology that targets various appliances. For scalable and flexible 2-way motherboard solutions, TYAN’s S2912-E, S2915-E, S2927-E, S2932-SI, S2937 and S3992-E are perfect platforms to meet current and future IT server and workstation requirements. TYAN’s S4985-SI, S4989-SI and S4992 motherboards are 4-way solutions that are exceptionally proficient in high density and high performance IT infrastructures.”

We’ve included the table of motherboard and barebones systems affected by the update. Those in blue italics are also part of Tyan’s VMware Ready Certified platform. While these platforms have been on the HCL for some time, the elevation to “Certified” status is recognition of the reliability and performance these systems have rendered over the years. It is good news indeed to see their value extended with motherboard and barebones refreshes.

Motherboard

  • 4 sockets: S4985-SI, S4989-SI, S4992
  • 2 sockets: S2912-E, S2915-E, S2927-E, S2937, S3992-E

Barebone

  • 8 sockets: VX50-B4985-SI-8P
  • 4 sockets: FT48-B4985-SI, TN68-B4989-SI, TN68-B4989-SI-LE
  • 2 sockets: TA26-B3992-E, TA26-B2932-SI, GT24-B3992-E, GT24-B2932-SI, GT24-B2912-E

These systems, which SOLORI has been recommending for low-cost VMware Eco-systems for some time, are part of Tyan’s aggressive VMware strategy:

“As a member of the VMware Technology Alliance Partner (TAP) program, TYAN is aggressively utilizing VMware virtualization software technology in TYAN hardware platforms. The nine servers that have recently passed VMware Ready certification for VMware ESX 3.5 and VMware ESX 3.5i include TA26-B2932-E, TA26-B3992-E, TA26-B5397, TX46-B4985-E, VX50-B4985-8P-E, S4985-E, S3992-E, S5397 and S2932-E. VMware System Builder program members can claim equivalency for these systems via the VMware System Builder site at http://www.vmware.com/partners/vip/system-builders/”

Most of these systems offer configurations with 16 or more DIMM slots (8-DIMM/CPU), enabling up to 64GB/CPU with DDR2/533 support. In order to run DDR2/800 memory, only half of the available slots can be filled (4-DIMM/CPU), allowing for 16GB/CPU of DDR2/800 (4x4GB REG ECC DDR2/800, about 2GB/second increase in bandwidth over DDR2/533 according to benchmarks).
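To make the capacity-versus-speed trade-off concrete, here is a small Python sketch. The capacity figures use the DIMM counts and sizes quoted above (8GB DIMMs are assumed for the 64GB/CPU case). The bandwidth function is only the theoretical per-channel peak for a 64-bit DDR2 bus at the nominal transfer rate, so it is not a substitute for the benchmark figure quoted above. All names are hypothetical.

```python
BUS_WIDTH_BYTES = 8  # DDR2 moves 64 bits (8 bytes) per transfer per channel


def capacity_per_cpu_gb(dimms_per_cpu, dimm_gb):
    """Total memory per CPU for a given DIMM count and DIMM size."""
    return dimms_per_cpu * dimm_gb


def peak_channel_bandwidth_mb_s(transfer_rate_mt_s):
    """Theoretical peak per channel: nominal transfers/s times bytes/transfer."""
    return transfer_rate_mt_s * BUS_WIDTH_BYTES


if __name__ == "__main__":
    # All slots filled at DDR2/533 (8GB DIMMs assumed) vs. half filled at DDR2/800
    print(capacity_per_cpu_gb(8, 8), "GB/CPU at DDR2/533")
    print(capacity_per_cpu_gb(4, 4), "GB/CPU at DDR2/800")
    delta = peak_channel_bandwidth_mb_s(800) - peak_channel_bandwidth_mb_s(533)
    print("theoretical per-channel gain:", delta, "MB/s")
```

Measured bandwidth depends on the memory controller, channel count and access pattern, so treat the theoretical number as an upper bound per channel rather than a prediction.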