Posts Tagged ‘top score’


Quick Take: HP Blade Tops 8-core VMmark w/OC’d Memory

September 25, 2009

HP’s ProLiant BL490c G6 server blade now tops the VMware VMmark table for 8-core systems – just squeaking past rack servers from Lenovo and Dell with a score of 24.54@17 tiles: a new 8-core record. The half-height blade was equipped with two quad-core Intel Xeon X5570 processors (Nehalem-EP, 95W TDP) and 96GB of ECC Registered DDR3-1333 memory (12x 8GB, 2 DIMMs per channel).

In our follow-up, we found that HP’s on-line configuration tool does not offer DDR3-1333 memory, so we went to the street for a comparison. For starters, we priced the blade on-line from HP with DDR3-1066 memory, the added QLogic QMH2462 Fibre Channel adapter ($750) and an additional NC360m dual-port Gigabit Ethernet controller ($320), which came to a grand total of $28,280 for the blade (about $277/VM, not including blade chassis or SAN storage).

Stripping memory from the build-out leaves a $7,970 floor for the hardware. Going to the street to find 8GB sticks with DDR3-1333 ratings and HP support yielded the Kingston KTH-PL313K3/24G kit (3x 8GB DIMMs), of which we would need three to complete the build-out. At $4,773 per kit, the completed system comes to $22,289 (about $218/VM, not including chassis or storage), which may do more to demonstrate Kingston’s value in the marketplace than HP’s penchant for “over-priced” memory.
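For readers who want to check the per-VM arithmetic, here is a minimal sketch that uses only the totals quoted above. The six-VMs-per-tile figure follows from 17 tiles equaling 102 VMs; the function and variable names are our own and purely illustrative.

```python
# Per-VM cost of the BL490c G6 build-outs, using only totals quoted in the text.
VMS_PER_TILE = 6  # 17 tiles correspond to 102 VMs in these results
TILES = 17


def price_per_vm(system_price, tiles=TILES):
    """Return the system price divided by the number of benchmark VMs."""
    return system_price / (tiles * VMS_PER_TILE)


hardware_floor = 7_970    # blade priced without memory
hp_online_total = 28_280  # HP on-line price with DDR3-1066
street_total = 22_289     # hardware floor plus street-priced DDR3-1333

for label, total in [("HP on-line", hp_online_total), ("street", street_total)]:
    memory_cost = total - hardware_floor
    print(f"{label}: ${price_per_vm(total):.2f}/VM, "
          f"memory is {memory_cost / total:.0%} of the price")
```

In both cases memory accounts for well over half of the blade price, which is why the choice of DIMM supplier moves the per-VM number so much.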

Now, the interesting disclosure from HP’s testing team is this:

Notes from HP's VMmark submission.


While this appears to boost memory performance significantly for HP’s latest run (compared to the 24.24@17 tiles score back in May 2009), it does so at the risk of running the Nehalem-EP memory controller out of specification – essentially, driving the controller beyond its rated load. It is hard for us to imagine that this specific configuration would be vendor-supported if used in a problematic customer installation.

SOLORI’s Take: Those of you following closely may be asking yourselves: “Why did HP choose to over-clock the memory controller in this run by pushing a 1066MHz, 2DPC limit to 1333MHz?” It would appear the answer is self-evident: the extra 6% was needed to put them over the Lenovo machine. This issue raises a new question about the VMmark validation process: “Should out-of-specification configurations be allowed in the general benchmark corpus?” It is our opinion that VMmark should represent off-the-shelf, fully-supported configurations only – not esoteric configuration tweaks and questionable over-clocking practices.

Should there be an “unlimited” category in the VMmark arena? Who knows? How many enterprises knowingly commit their mission-critical data and processes to systems running over-clocked processors and over-driven memory controllers? No hands? That’s what we thought… Congratulations anyway to HP for clawing their way to the top of the VMmark 8-core heap…


Quick Take: HP Plants the Flag with 48-core VMmark Milestones

August 12, 2009

Last month, we predicted that HP could easily claim the VMmark summit with its DL785 G6 using AMD’s Istanbul processors:

If AMD’s Istanbul scales to 8-socket at least as efficiently as Dunnington, we should be seeing some 48-core results in the 43.8@30 tile range in the next month or so from HP’s 785 G6 with 8-AMD 8439 SE processors. You might ask: what virtualization applications scale to 48-cores when $/VM is doubled at the same time? We don’t have that answer, and judging by Intel and AMD’s scale-by-hub designs coming in 2010, that market will need to be created at the OEM level.

Well, HP didn’t make us wait too long. Today, the PC maker cleared two significant VMmark milestones: crossing the 30-tile barrier in a single system (180 VMs) and exceeding 40 on the VMmark score. With a score of 47.77@30 tiles, the HP DL785 G6 – powered by 8 AMD Istanbul 8439 SE processors and 256GB of DDR2/667 memory – sets the bar well beyond the competition and does so with better performance than we expected, most likely due to AMD’s “HT assist” technology increasing its scalability.

Not available until September 14, 2009, the HP DL785 G6 is a pricey competitor. We estimate – based on today’s processor and memory prices – that a system as well appointed as the VMmark-configured version (additional NICs, HBA, etc.) will run at least $54,000 or around $300/VM (about $60/VM higher than the 24-core contender and about $35/VM lower than HP’s Dunnington “equivalent”).

SOLORI’s Take: While the September timing of the release might imply a G6 with AMD’s SR5690 and IOMMU, we doubt that the timing is anything but a coincidence, even though such a pairing would enable PCIe 2.0 and highly effective 10Gbps solutions. The modular design of the DL785 series – with its ability to scale from 4P to 8P in the same system – mitigates the economic realities of the dwindling 8P segment, and HP has delivered the pinnacle of performance for this technology.

We are also impressed with HP’s performance team and their ability to scale Shanghai to Istanbul with relative efficiency. Moving from DL785 G5 quad-core to DL785 G6 six-core was an almost perfect linear increase in capacity (95% of theoretical increase from 32-core to 48-core) while performance-per-tile increased by 6%. This further demonstrates the “home run” AMD has hit with Istanbul and underscores the excellent value proposition of Socket-F systems over the last several years.
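The two percentages can be reproduced from published scores. The sketch below takes the 47.77@30 tiles of the DL785 G6 and the 31.56@21 tiles reported for the 32-core DL785 G5; the dictionary layout and function name are ours, chosen only for illustration.

```python
def scaling(old, new):
    """Compare two VMmark results given as dicts with cores, score and tiles."""
    core_growth = new["cores"] / old["cores"]
    ideal_tiles = old["tiles"] * core_growth  # tiles if capacity tracked cores
    capacity_efficiency = new["tiles"] / ideal_tiles
    per_tile_gain = (new["score"] / new["tiles"]) / (old["score"] / old["tiles"]) - 1
    return capacity_efficiency, per_tile_gain


dl785_g5 = {"cores": 32, "score": 31.56, "tiles": 21}  # quad-core Shanghai
dl785_g6 = {"cores": 48, "score": 47.77, "tiles": 30}  # six-core Istanbul

capacity_efficiency, per_tile_gain = scaling(dl785_g5, dl785_g6)
print(f"capacity efficiency: {capacity_efficiency:.1%}")
print(f"per-tile gain: {per_tile_gain:.1%}")
```

Here "theoretical increase" is read as tile count growing in step with core count (21 tiles at 32 cores would become 31.5 tiles at 48 cores), which is our interpretation of the figure.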

Unfortunately, while they demonstrate a 91% scaling efficiency from 12-core to 24-core, HP and Istanbul have only achieved a 75% incremental scaling efficiency from 24-core to 48-core. When looking at tile-per-core scaling using the 8-core, 2P system as a baseline (1:1 tile-to-core ratio), 2P, 4P and 8P Istanbul deliver 91%, 83% and 62.5% efficiencies overall, respectively. However, compared to the 58% and 50% tile-to-core efficiencies of Dunnington 4P and 8P, respectively, Istanbul clearly dominates the 4P and 8P performance and price-performance landscape in 2009.

In today’s age of virtualization-driven scale-out, SOLORI’s calculus indicates that multi-socket solutions that deliver a tile-to-core ratio of less than 75% will not succeed (economically) in the virtualization use case in 2010, regardless of socket count. That said – even at a 5:8 tile-to-core ratio – the 8P, 48-core Istanbul will likely reign supreme as the VMmark heavy-weight champion of 2009.
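As a rough illustration of that 75% yardstick, the sketch below computes tile-to-core ratios for the three 8P results whose tile and core counts appear in these posts. The threshold constant and helper name are ours; the tile and core counts are taken from the published scores.

```python
THRESHOLD = 0.75  # SOLORI's suggested minimum tile-to-core ratio

# (tiles, cores) for each 8P result discussed in these posts
systems = {
    "Istanbul 8P (DL785 G6)": (30, 48),
    "Shanghai 8P (DL785 G5)": (21, 32),
    "Dunnington 8P (x3950 M2)": (24, 48),
}


def tile_to_core(tiles, cores):
    """Return tiles delivered per physical core."""
    return tiles / cores


ratios = {name: tile_to_core(*counts) for name, counts in systems.items()}

for name, ratio in ratios.items():
    verdict = "meets" if ratio >= THRESHOLD else "misses"
    print(f"{name}: {ratio:.1%} ({verdict} the {THRESHOLD:.0%} mark)")
```

None of the 8P systems clears the bar, which is consistent with the view that the 8P segment is an economic outlier for virtualization.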

SOLORI’s 2nd Take: HP and AMD’s achievements with this Istanbul system should be recognized before we usher in the next wave of technology like Magny-Cours and Socket G34. While the DL785 G6 is not a game changer, its footnote in computing history may well be as a preview of what we can expect to see out of Magny-Cours in 2H/2010. If 12-core, 4P system price shrinks with the socket count, we could be looking at a $150/VM price-point for a 4P system: now that would be a serious game changer.


Lenovo Claims Top VMmark Spot: 2P, 8C

July 1, 2009

The new top spot for VMmark in the “8 core” category is now held by Lenovo’s R525 G2 rack server with a score of 24.35@17 tiles (tile ratio of 1.43 over 102 VMs). As this server appears to be available only in overseas (China) markets, we can only estimate the price of the system used in the benchmark: based on the reported build-out, it comes to around $20,330 per server (street):

  • Base Lenovo R525 G2 ($4,900 – 30,000 yuan)
  • 2 x Intel Xeon X5570 Processors ($1,500/ea)
  • 96GB ECC DDR3/1066 (12x8GB) ($900/DIMM from Kingston)
  • 1 x Intel 82575EB dual-port GigabitEthernet (on-board)
  • 2 x Intel 82571EB dual-port GigabitEthernet (2x PCIe slot, $150/ea)
  • 1 x QLogic QLE2462 FC HBA (1x PCIe slot, $1,300)
  • 1 x LSI1078 SAS Controller (on-board)
  • 2 x SAS OS drive ($300 est.)

An EMC CX3-40f was used as the storage backing for the test. The storage system included 4GB cache, 4 enclosures and 55 146GB 15K FC disks (10, 15, 15, 15), and 17 LUNs at 100GB each. Interestingly, a Cisco Linksys SR2024 Gigabit Ethernet switch was used for the network interconnection (about $299 each at NewEgg), which implies that the test results are not sensitive to network performance or latency. Given the use of a 2-port FC HBA for storage, iSCSI network performance is not a factor.

At about $1,094/tile ($182/VM) the new “top dog” delivers its best at a 5% price-per-VM premium over Istanbul’s only VMmark result (1.41 tile ratio) and an 80% system price premium (assuming memory sourced by third parties). Since we had to go to the street to configure the Lenovo system, the Istanbul system saves about $1,570 [in mark-up] under similar (non-vendor pricing) circumstances:

  • Base HP DL385 G6  ($5,100)
  • 2 x AMD 2435 Istanbul Processors (included)
  • 64GB ECC DDR2/800 (8x8GB) ($370/DIMM)
  • 2 x Broadcom 5709 dual-port GigabitEthernet (on-board)
  • 1 x Intel 82571EB dual-port GigabitEthernet (1x PCIe slot, $150/ea)
  • 1 x QLogic QLE2462 FC HBA (1x PCIe slot, $1,300)
  • 1 x HP SAS Controller (on-board)
  • 2 x SAS OS drive (included)
  • $9,810/system total (versus $11,378 complete from HP)

Street pricing changes Istanbul’s numbers to $892/tile ($149/VM), signifying a 22% per-VM savings and a 52% savings in system price. Given that virtualization systems are generally sold in pairs, this comparison shows that a redundant Istanbul system can be had for less than the cost of a non-redundant Nehalem. For SMBs getting started in virtualization, Istanbul continues to offer a compelling system value proposition over Nehalem.
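The system-level comparison can be checked directly from the three totals quoted in this post. This is only a sketch of the arithmetic; the variable names are ours.

```python
# System totals quoted in this post (USD)
lenovo_street = 20_330     # estimated Lenovo R525 G2 build-out
istanbul_street = 9_810    # HP DL385 G6 with street-priced parts
istanbul_from_hp = 11_378  # same DL385 G6 configured complete by HP

markup_saved = istanbul_from_hp - istanbul_street
system_savings = 1 - istanbul_street / lenovo_street
redundant_pair = 2 * istanbul_street

print(f"mark-up avoided by street pricing: ${markup_saved:,}")
print(f"system price savings vs. Lenovo: {system_savings:.0%}")
print(f"two Istanbul systems: ${redundant_pair:,} vs. one Lenovo: ${lenovo_street:,}")
```

The last line is the "sold in pairs" argument in numbers: two street-priced Istanbul servers still come in under the single Nehalem estimate.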


First 48-core VMmark Appears

June 18, 2009

Following in the footsteps of the first 12-core VMmark comes the current champion at 33.85@24 tiles using 48 cores – and, despite the timing, it is not an Istanbul server. In fact, today’s leader is the IBM System x3950 M2 running eight 6-core Intel Xeon MP “Dunnington” X7460 processors with 256GB DDR2/667 RAM (5.3GB/core).

This score edges out the previous champion – the HP ProLiant DL785 G5 with eight 4-core Opteron 8393SE processors – which reigned at 31.56@21 tiles. Compared to the 4-socket, 24-core IBM System x3850 M2 Xeon leading the 24-core category, this doubling of socket/core count resulted in only a 50% increase in capacity. This scaling inefficiency is less typical in the 2P-to-4P transition but seems to plague the 4P-to-8P segment.
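To put "edges out" in perspective, the sketch below compares the new 48-core result with the previous 32-core champion using only the figures quoted above. The helper name is ours and the comparison is illustrative, not part of either submission.

```python
def pct_increase(new, old):
    """Return the fractional increase of new over old."""
    return new / old - 1


previous = {"cores": 32, "score": 31.56, "tiles": 21}  # HP DL785 G5
current = {"cores": 48, "score": 33.85, "tiles": 24, "ram_gb": 256}  # IBM x3950 M2

core_gain = pct_increase(current["cores"], previous["cores"])
score_gain = pct_increase(current["score"], previous["score"])
tile_gain = pct_increase(current["tiles"], previous["tiles"])
ram_per_core = current["ram_gb"] / current["cores"]

print(f"cores: +{core_gain:.0%}, tiles: +{tile_gain:.1%}, score: +{score_gain:.1%}")
print(f"memory per core: {ram_per_core:.1f}GB")
```

Half again as many cores buys a much smaller gain in tiles and score, which is the brute-force character of this result in a nutshell.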

“The x3950 M2 is based on the fourth generation of IBM Enterprise X-Architecture®, and is designed to deliver innovation with enhanced reliability and availability features that enable optimal performance for databases, enterprise applications and virtualized environments.”

IBM News Blurb

“I’m really looking forward to even more virtualization benchmarks which are coming very soon.”

– Elisabeth Stahl, IBM Benchmarking and Systems Performance Blog

Looking at the virtualization notes, we discover what it takes to keep 48 cores fed to achieve such a benchmark:

  • 4 QLogic QLE2462 HBAs (dual-port, 4Gbps FC)
  • 1-IBM DS4800 with 4GB cache
    • 19 EXP 810 storage expansion units for
    • 1.8TB in 49 LUNs
      • 280 15K disks total
  • 21 IBM x336 clients
    • DP 3.2GHz Xeon
    • 3GB RAM
    • Server 2003 R2
  • 2 IBM x335 clients
    • DP Xeon 3.06GHz
    • 2.5GB RAM
    • Server 2003 R2
  • Eight vSwitches
    • 120 ports total
  • 4 Intel PRO 1000PT Dual-port 1Gb Ethernet controllers
    • one per vSwitch

While Dunnington tops the list by sheer brute force, it’s safe to assume that – given the 32-core Opteron is nipping at its heels – the 48-core Istanbul results will displace it soon (possibly alluded to in Elisabeth Stahl’s “Benchmarking and Performance Blog” reference above). More interestingly, will AMD’s much touted “HT Assist” allow the 8P Istanbul to break the 4P-to-8P “curse” of scaling inefficiency? If not, it would show that much work is needed before the relatively “massive” core counts of 2010 are upon us.