Archive for the ‘Ethics and Technology’ Category

Opteron vs. Nehalem-EP at AnandTech

May 22, 2009

AnandTech’s Johan De Gelas has an interesting article on what he calls “real world virtualization,” using a benchmark process his team calls “vApus Mk I” running on ESX 3.5 Update 4. Essentially, it is a suite of Web 2.0 flavored apps running entirely on Windows in a mixed 32/64-bit structure. We’re cautiously encouraged by this effort, as it opens the field wide to potential reviewers.

Additionally, he finally comes to the same conclusion we’ve presented (in an economic impact context) about Shanghai’s virtualization value proposition. While his results are consistent with what we have been describing – that Shanghai has a good price-performance position against Nehalem-EP – there are some elements about his process that need further refinement.

Our biggest issue is with his handling of 32-bit virtual machines (VMs) and his disclosure of using AMD’s Rapid Virtualization Indexing (RVI) with 32-bit VMs. In his post, De Gelas points out some well-known “table thrashing” consequences of TLB misses:

“However, the web portal (MCS eFMS) will give the hypervisor a lot of work if Hardware Assisted Paging (RVI, NPT, EPT) is not available. If EPT or RVI is available, the TLBs (Translation Lookaside Buffer) of the CPUs will be stressed quite a bit, and TLB misses will be costly.”

However, the MCS eFMS web portal (2 VMs) is running in a 32-bit OS. What makes this problematic is that VMware’s default handling of page tables in 32-bit VMs is “shadow page tables” using VMware’s binary translation engine (BT). In other words, RVI is not enabled by default for 32-bit VMs in ESX 3.5.x:

“By default, ESX automatically runs 32bit VMs (Mail, File, and Standby) with BT, and runs 64bit VMS (Database, Web, and Java) with AMD-V + RVI.”

– VROOM! Blog, 3/2009

Quick Take: AMD Istanbul Update

May 21, 2009

AMD was gracious enough to invite us to their Reviewer’s Day on May 20th to have a final look at “Istanbul” and discuss their plans for the product’s upcoming release. While much of the information we received is embargoed until the June 2009 release date, we can tell you that we’ve received a couple of AMD’s new 6-core “Istanbul” Opterons for testing and review. We look forward to seeing “Istanbul” in action inside our lab over the next couple of weeks. Our verdict will be available at launch.

Instead of typical benchmarks, we’ll be focusing on Istanbul’s implications for vSphere before the new Opteron hits the streets (remember, 6-core is the limit for the “free” and “reduced capability” vSphere licenses). If what we saw from AMD’s internal testing at Reviewer’s Day is accurate, then our AMD/VMware Eco-System partners are going to be very happy with the results. What we can confirm today is that AGESA 3.5.0.0+ is required to run Istanbul, so start looking for BIOS updates from your vendors as the launch date approaches. The systems we reported on from Tyan back in April will be good-to-go at launch (our GT28 test systems require a beta BIOS).

SOLORI’s take: We made a somewhat bold prediction on April 30, 2009 that “Shanghai-Istanbul Eco-System looks like an economic stimulus all its own” when comparing the AMD upgrade path to Intel’s (rip and replace) where VMware infrastructures are concerned. That article, Shanghai Economics 101, was one of our most popular AMD-related postings yet, and – judging from what we’ve seen already – it looks like we may have been correct!

While we’re impressed with the ability to flawlessly vMotion from socket 940 to socket-F, we were more impressed with the ability to insert an Istanbul into a Barcelona or Shanghai system and immediately realize the benefits. We’re going to look at our review samples and revisit our price-performance data and Watt/VM calculations before making any sweeping recommendations. However, we expect to find Istanbul to be a very good match for on-premise cloud/virtualization initiatives.

SOLORI’s 2nd take: VDI and database consolidation systems running on 4P AMD boxes are about to take a giant leap forward. We can’t wait to see 24-core and 48-core VMmark scores updated over the next two months. Start asking your system vendor for an updated BIOS supporting AGESA 3.5.0.0+ (Tyan, are you listening? Supermicro’s AS2041M is already there), get your 4P test mule updated, and prepare to be amazed…

Intel’s €1.1B Slap On the Wrist, Must Sell 2.3M Chips

May 13, 2009

May 13th, 2009 – besides being my birthday – marks the day that the European Commission levied a €1.1B fine (about $1.4B US) on Intel, which went “to great lengths to cover up its anti-competitive actions” and in the process “harmed millions of European consumers,” according to EU Competition Commissioner Neelie Kroes in an address in Brussels today. The fine could have been as large as €4B, and will go to the EU’s annual budget – not to consumers.

Commissioner Kroes was seen holding up an Intel PII/PIII processor card (SECC2) during the news conference, giving some scope to what has been a very long and drawn-out process, one going back to 2000. At the heart of the matter have been Intel’s “illegal anticompetitive practices to exclude competitors from the market for computer chips called x86 central processing units (CPUs)” – namely AMD. These were apparently manifested in behind-the-scenes rebates and discounts in exchange for a reduction or termination of AMD-based products.

In a press release from Intel’s President and CEO, Paul Otellini, the fined chip maker offered this defense:

Intel takes strong exception to this decision. We believe the decision is wrong and ignores the reality of a highly competitive microprocessor marketplace – characterized by constant innovation, improved product performance and lower prices. There has been absolutely zero harm to consumers. Intel will appeal.

Intel must cover its fine immediately with a bank guarantee, which will stay sequestered until its appeal is either exhausted or the decision is reversed. Based on the EU’s appetite for this type of commercial justice, the money could be tied up for many years. But the question remains: does Intel have a history of anti-competitive behavior beyond the test of rigorous competition?

Intel’s history tells a compelling story: the EU joins Japan (2004) and South Korea (2008) in finding that Intel engaged in anti-competitive behavior. How will the EU’s decision play in the US courts as AMD’s ongoing antitrust suit (2005) against Intel continues to unfold? With that case delayed until 2010 due to the lengthy list of depositions scheduled, the EU’s decision will likely do more to tarnish Intel’s new “Promoting Innovation” campaign than settle the dispute.

So what does Intel need to do to weather the EU’s wrath? In product terms, Intel needs to move 2,262,752 of its Nehalem-EP (5500-series) chips to cover the loss. Based on a predicted 40M unit replacement market in the US, that’s under 6% of the market, and under 3% if they are 2P systems. However, Intel has promised a 9:1 consolidation value for the replacement, with some estimating that the number moves to 18:1 given good results for SMT (depending on the workload).

What does this mean from an Intel 5500-series sales perspective? Here’s our estimate, using Intel’s 9:1 and 18:1 math (not forgetting the 4.5:1 for the dual-core):

| Nehalem | Units Needed | Retail Value | 9:1 | 18:1 |
|---|---|---|---|---|
| W5580 | 12,545 | $20,072,000.00 | | 0.56% |
| X5570 | 121,713 | $168,694,218.00 | | 5.48% |
| X5560 | 168,227 | $197,162,044.00 | | 7.57% |
| X5550 | 174,450 | $167,123,100.00 | | 7.85% |
| E5540 | 531,715 | $395,595,960.00 | | 23.93% |
| E5530 | 419,636 | $222,407,080.00 | | 18.88% |
| E5520 | 183,533 | $68,457,809.00 | | 8.26% |
| E5506 | 262,704 | $69,879,264.00 | 5.91% | |
| E5504 | 250,051 | $56,011,424.00 | 5.63% | |
| E5502 | 106,312 | $19,986,656.00 | 1.20% | |
| L5520 | 10,516 | $5,573,480.00 | 0.24% | |
| L5506 | 21,350 | $9,031,050.00 | | 0.96% |
| Total | 2,262,752 | $1,399,994,085.00 | 12.97% | 73.49% |

By these estimates, Intel will need to close 86.5% of the total replacement market to be able to cover the EU fines. All this assumes, of course, that they don’t offer discounts off of their “published” per-1000 chip prices. Good luck, Intel, on an exciting marketing campaign!
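As a rough check on the table, the percentages appear to be each model's unit count multiplied by its consolidation ratio and divided by the 40M-unit replacement market. The sketch below assumes that reading; the function name and the per-model ratios are our inference from matching the published percentages, not something stated outright in the post.

```python
# Assumption: share = units * consolidation_ratio / replacement_market.
# Ratios per model are inferred by matching the table's percentages.
REPLACEMENT_MARKET = 40_000_000  # predicted US replacement market (from the post)
TOTAL_UNITS = 2_262_752          # chips needed to cover the fine (from the post)


def market_share(units: int, ratio: float, market: int = REPLACEMENT_MARKET) -> float:
    """Percent of the replacement market retired if each chip replaces `ratio` servers."""
    return 100.0 * units * ratio / market


# A few rows from the table: (units needed, assumed consolidation ratio)
chips = {
    "W5580": (12_545, 18),    # SMT part, 18:1
    "E5540": (531_715, 18),   # SMT part, 18:1
    "E5506": (262_704, 9),    # no SMT, 9:1
    "E5502": (106_312, 4.5),  # dual-core, 4.5:1
}

for model, (units, ratio) in chips.items():
    print(f"{model}: {market_share(units, ratio):.2f}% of the replacement market")

# Raw unit share with no consolidation at all (1:1)
print(f"Raw unit share: {market_share(TOTAL_UNITS, 1):.2f}%")
```

Under this reading, the 86.5% figure is simply the sum of the 9:1 and 18:1 column totals.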

The Cost of Benchmarks

May 8, 2009

We’ve been challenged to back up our comparison of Nehalem-EP systems to Opteron Shanghai in price-performance based on prevailing VMmark scores available on VMware’s site. In earlier posts, our analysis predicted “comparable” price-performance results between Shanghai and Nehalem-EP systems based on the economics of today’s memory and processor availability.

So what we’ve done here is taken the on-line configurations of some of the benchmark competitors. To make things very simple, we’ve just configured memory and CPU as tested – no HBA or 10GE cards to skew the results. The only exception – as pointed out by our challenger – is that we’ve taken the option of using “street price” memory where “street price” is better than the server manufacturer’s memory price.

Here’s our line-up:

| System | Processor | Qty. | Speed (GHz) | Speed (GHz, Opt) | Memory Configuration | Street Price |
|---|---|---|---|---|---|---|
| Inspur NF5280 | X5570 | 2 | 2.93 | 3.2 | 96GB (12x8GB) DDR3 1066 | $18,668.58 |
| Dell PowerEdge R710 | X5570 | 2 | 2.93 | 3.2 | 96GB (12x8GB) DDR3 1066 | $16,893.00 |
| IBM System x 3650M2 | X5570 | 2 | 2.93 | 3.2 | 96GB (12x8GB) DDR3 1066 | $21,546.00 |
| Dell PowerEdge M610 | X5570 | 2 | 2.93 | 3.2 | 96GB (12x8GB) DDR3 1066 | $21,561.00 |
| HP ProLiant DL370 G6 | W5580 | 2 | 3.2 | 3.2 | 96GB (12x8GB) DDR3 1066 | $18,636.00 |
| Dell PowerEdge R710 | X5570 | 2 | 2.93 | 3.2 | 96GB (12x8GB) DDR3 1066 | $16,893.00 |
| Dell PowerEdge R805 | 2384 | 2 | 2.7 | 2.7 | 64GB (8x8GB) DDR2 533 | $6,955.00 |
| Dell PowerEdge R905 | 8384 | 4 | 2.7 | 2.7 | 128GB (16x8GB) DDR2 667 | $11,385.00 |

Here we see Dell offering very aggressive DDR3/1066 pricing [for the R710], allowing us to go with on-line configurations, and HP offering overly expensive DDR2/667 memory prices (by a factor of 2), forcing us to go with 3rd-party memory. In fact, IBM’s on-line configuration tool did not allow us to configure memory as tested [with the 3650M2], and neither did Dell’s [with the M610], so we had to apply street memory prices. So here’s how they rank with respect to VMmark:

| System | VMware Version | VMmark Score | VMmark Tiles | Score/Tile | Cost/Tile |
|---|---|---|---|---|---|
| Inspur NF5280 | ESX Server 4.0 build 148592 | 23.45 | 17 | 1.38 | $1,098.15 |
| Dell PowerEdge R710 | ESX Server 4.0 build 150817 | 23.55 | 16 | 1.47 | $1,055.81 |
| IBM System x 3650M2 | ESX Server 4.0 build 148592 | 23.89 | 17 | 1.41 | $1,267.41 |
| Dell PowerEdge M610 | ESX Server 4.0 | 23.9 | 17 | 1.41 | $1,273.59 |
| HP ProLiant DL370 G6 | ESX Server 4.0 build 148783 | 23.96 | 16 | 1.50 | $1,164.75 |
| Dell PowerEdge R710 | ESX Server 4.0 | 24 | 17 | 1.41 | $993.71 |
| Dell PowerEdge R805 | ESX Server 3.5 U4 build 120079 | 11.22 | 8 | 1.40 | $869.38 |
| Dell PowerEdge R905 | ESX Server 3.5 U3 build 120079 | 20.35 | 14 | 1.45 | $813.21 |

As you can easily see, the cost-per-tile (analogous to $/VM) favors the Shanghai systems. In fact, the one system that we’ve taken criticism for including in our previous comparisons – the Supermicro 6026T-NTR+ with 72GB of DDR3/1066 (running at DDR3/800) – actually leads the pack in Nehalem-EP $/tile, but we’ve excluded it from our tables since it has been argued to be a “sub-optimal” configuration and an outlier. Again, the sweet spot for price-performance for Nehalem, Shanghai and Istanbul is in the 48GB to 80GB range with inexpensive memory: simple economics.

Please note that not one of the 2P VMmark scores listed on VMware’s official VMmark results tally carries the Opteron 2393 SE version of the processor (3.1GHz) or HT3-enabled motherboards. It is likely that we’ll see neither HT3-enabled scores nor 2P ESX 4.0 scores until Istanbul’s release in the coming month. If Shanghai’s $/tile is competitive with Nehalem’s today (again, in the 48GB to 80GB configurations), Istanbul – with the same memory and system costs – will be even more so.
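The two derived columns in the ranking are simple ratios: VMmark score divided by tiles, and street price divided by tiles. A minimal sketch of that arithmetic follows, using four rows from the tables; the `Result` class and its field names are illustrative, not part of any VMmark tooling.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One benchmark entry; field names are illustrative."""
    system: str
    street_price: float  # CPU + memory as configured, in USD
    vmmark: float        # published VMmark score
    tiles: int           # published VMmark tile count

    @property
    def score_per_tile(self) -> float:
        return self.vmmark / self.tiles

    @property
    def cost_per_tile(self) -> float:
        return self.street_price / self.tiles


# A subset of rows taken from the tables in this post
results = [
    Result("Dell PowerEdge R710 (17 tiles)", 16_893.00, 24.00, 17),
    Result("HP ProLiant DL370 G6", 18_636.00, 23.96, 16),
    Result("Dell PowerEdge R805", 6_955.00, 11.22, 8),
    Result("Dell PowerEdge R905", 11_385.00, 20.35, 14),
]

# Rank by cost per tile, cheapest first
ranked = sorted(results, key=lambda r: r.cost_per_tile)
for r in ranked:
    print(f"{r.system}: {r.score_per_tile:.2f} score/tile, ${r.cost_per_tile:,.2f}/tile")
```

Sorting on cost per tile rather than raw score is the point of the comparison: the Nehalem-EP systems post higher totals, but the ranking changes once price is in the denominator.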

Update: AMD’s Margaret Lewis has a similar take with comparison prices for AMD using DDR2/533 configurations. Her numbers – like our previous posts – resolve to $/VM, however she provides some good “street prices” for more “mainstream” configurations of Intel Nehalem-EP and AMD Shanghai systems. See her results and conclusions on AMD’s blog.

Quick Take: Nutty Intel VT Story

May 6, 2009

ZDNet has an interesting story that’s getting some traction about Windows 7’s XP mode and how you may not be able to run it on your Intel platform. Since the technology relies on Intel VT or AMD-V to work, if your chip doesn’t have it, you’re cooked. Unlike AMD’s all-or-nothing approach, which creates uniformity across server and workstation platforms by delivering all features to all but the “Sempron” versions of the AMD64, Intel likes to market “reduced feature” versions to keep price points meaningful.

Intel’s approach also makes it a nightmare for consumer end-users to determine what they get for their money, as described very well in ZDNet’s blog:

Here’s a real-world example. Dell’s Vostro 420 is a well-built, no-frills desktop PC designed for the small and medium business market. The screen [graph] below shows the current lineup of CPUs that you can choose from when you build this system to order at Dell’s website. Four of the six options support Intel VT; I’ve circled the two CPUs that don’t support VT.

(See ZDNet’s blog entry for the graphic and story.)

Shanghai Economics 101 – Conclusion

May 6, 2009

In the past entries, we’ve looked only at the high-end processors as applied to system prices, and we’ll continue to use those as references through the end of this one. We’ll take a look at other price/performance tiers in a later blog, but we want to finish up on the same footing as we began; again, with an eye to how these systems play in a virtualization environment.

We decided to finish this series with an analysis of real-world application instead of just theory. We keep seeing 8-to-1, 16-to-1 and 20-to-1 consolidation ratios (VM-to-host) being offered as “real world” in today’s environment, so we wanted to analyze what that meant from an economic side.

The Fallacy of Consolidation Ratios

First, consolidation ratios that speak in terms of VM-to-host are not very informative. For instance, a 16-to-1 consolidation ratio sounds good until you realize it was achieved on a $16,000 4Px4C platform. This ratio results in a $1,000-per-VM cost to the consolidator.

In contrast, let’s take the same 16-to-1 ratio on a $6,000 2Px4C platform: it results in a $375-per-VM cost to the consolidator, a savings of more than 60%. The key to the savings is in the vCPU-to-Core consolidation ratio (provided sufficient memory exists to support it). In the first example that ratio was 1:1, but in the last example the ratio is 2:1. Can we find 16:1 vCPU-to-Core ratios out there? Sure, in test labs, but in the enterprise we think the valid range of vCPU-to-Core consolidation ratios is much more conservative, ranging from 1:1 to 8:1 with the average (or sweet spot) falling somewhere between 3:1 and 4:1.
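The two examples above reduce to a pair of divisions. Here is a small sketch of the arithmetic, assuming one vCPU per VM (which is what the 1:1 and 2:1 figures imply); the function names are ours.

```python
def cost_per_vm(platform_cost: float, vms: int) -> float:
    """Platform cost spread evenly across the VMs it hosts."""
    return platform_cost / vms


def vcpu_to_core(vms: int, sockets: int, cores_per_socket: int, vcpus_per_vm: int = 1) -> float:
    """vCPU-to-core consolidation ratio (assumes 1 vCPU per VM unless told otherwise)."""
    return vms * vcpus_per_vm / (sockets * cores_per_socket)


# 16 VMs on a $16,000 4Px4C platform
big_cost = cost_per_vm(16_000, 16)
big_ratio = vcpu_to_core(16, sockets=4, cores_per_socket=4)

# The same 16 VMs on a $6,000 2Px4C platform
small_cost = cost_per_vm(6_000, 16)
small_ratio = vcpu_to_core(16, sockets=2, cores_per_socket=4)

savings = 1 - small_cost / big_cost

print(f"4Px4C: ${big_cost:,.0f}/VM at {big_ratio:.0f}:1 vCPU-to-core")
print(f"2Px4C: ${small_cost:,.0f}/VM at {small_ratio:.0f}:1 vCPU-to-core")
print(f"Savings: {savings:.1%}")
```

The same VM-to-host ratio yields very different economics, which is why the vCPU-to-core ratio is the more useful number to quote.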

Second, we must note that memory is a growing aspect of the virtualization equation. Modern operating systems no longer “sip” memory and 512MB for a Windows or Linux VM is becoming more an exception than a rule. That puts pressure on both CPU and memory capacity as driving forces for consolidation costs. As operating system “bloat” increases, administrative pressure to satisfy their needs will mount, pushing the “provisioned” amount of memory per VM ever higher.

Until “hot add” memory is part of DRS planning and the requisite operating systems support it, system admins will be forced to either overcommit memory, purchase memory based on peak needs, or purchase memory based on average needs and trust DRS systems to handle the balancing act. In any case, memory is a growing factor in systems consolidation and virtualization.

Modeling the Future

Using data from the University of Chicago as a baseline and extrapolating forward through 2010, we’ve developed a simple model to predict vMEM and vCPU allocation trends. This approach establishes three key metrics (already used in previous entries) that determine/predict system capacity: Average Memory/VM (vMVa), Average vCPU/VM (vCVa) and Average vCPU/Core (vCCa).

Average Memory per VM (vMVa)

Average memory per VM is determined by taking the allocated memory of all VMs in a virtualized system – across all hosts – and dividing that by the total number of VMs in the system (not including non-active templates). This number is assumed to grow as virtualization moves from consolidation to standardized deployment.
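To make the three metrics concrete, here is a minimal sketch that computes them from a VM inventory. The inventory is invented for illustration, the class and function names are ours, and we assume templates are left out of both the totals and the VM count.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """One inventory entry; names and sizes below are hypothetical."""
    name: str
    memory_mb: int
    vcpus: int
    is_template: bool = False


def capacity_metrics(vms: list, total_cores: int) -> dict:
    """Return vMVa, vCVa and vCCa for the active (non-template) VMs."""
    active = [vm for vm in vms if not vm.is_template]
    if not active or total_cores <= 0:
        raise ValueError("need at least one active VM and one physical core")
    total_mem = sum(vm.memory_mb for vm in active)
    total_vcpus = sum(vm.vcpus for vm in active)
    return {
        "vMVa": total_mem / len(active),    # average memory per VM (MB)
        "vCVa": total_vcpus / len(active),  # average vCPUs per VM
        "vCCa": total_vcpus / total_cores,  # average vCPUs per physical core
    }


# Hypothetical inventory spanning all hosts; the template is excluded
inventory = [
    VM("web01", 1024, 1),
    VM("db01", 2048, 2),
    VM("app01", 1536, 1),
    VM("gold-image", 4096, 2, is_template=True),
]

metrics = capacity_metrics(inventory, total_cores=8)
print(metrics)
```

In practice the inventory would come from whatever management tooling is in use; the averaging itself is the same.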

Shanghai Economics 101 – Continued

May 4, 2009

Let’s look at some more real-world applications of what we’ve learned from the VMmark results for Nehalem and what it means in a practical comparison. We’ll award Nehalem-EP’s SMT a 25% bonus in our comparisons when vCPU/core count is taken into the measurement. In a 6:1 consolidation, this means 60 vCPUs for 2P Nehalem and 48 vCPUs for Shanghai. Using this bias, the following cost characteristics are revealed for VMs with average memory footprints of 1.5GB, for the Nehalem-EP 3.2GHz system:

| Nehalem-EP Configuration | Street $ | 1536MB VMs, 1 vCPU | Max vCPUs (6/core) | Cost/VM |
|---|---|---|---|---|
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 24GB DDR3/1333 | $7,017.69 | 13 | 60 | $539.82 |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 48GB DDR3/1066 | $7,755.99 | 28 | 60 | $277.00 |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 72GB DDR3/800 | $8,708.19 | 42 | 60 | $207.34 |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 96GB DDR3/1066 | $21,969.99 | 57 | 60 | $385.44 |
| 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 144GB DDR3/800 | $30,029.19 | 60 | 60 | $500.49 |
| 2 x 2P/8C, Nehalem-EP, W5580 3.2GHz, 6.4GT QPI with 144GB DDR3/800 | $60,058.38 | 120 | 120 | $500.49 |

We’ll compare this to a Shanghai 2P system at 3.1GHz:

| Shanghai 2P/HT3 Configuration | Street $ | 1536MB VMs, 1 vCPU | Max vCPUs (6/core) | Cost/VM | Savings per VM | Savings % |
|---|---|---|---|---|---|---|
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 32GB DDR2/800 | $5,892.12 | 18 | 48 | $327.34 | $212.48 | 39.36% |
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 48GB DDR2/800 | $6,352.12 | 28 | 48 | $226.86 | $50.14 | 18.10% |
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 64GB DDR2/533 | $6,462.52 | 37 | 48 | $174.66 | $32.68 | 15.76% |
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 80GB DDR2/667 | $8,422.12 | 47 | 48 | $179.19 | $28.14 | 13.57% |
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 96GB DDR2/667 | $11,968.72 | 48 | 48 | $249.35 | $136.09 | 35.31% |
| 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 128GB DDR2/533 | $14,300.92 | 48 | 48 | $297.94 | $202.55 | 40.47% |
| 2 x 2P/8C Shanghai, 2393 SE, 3.1GHz, 4.4GT HT3 with 128GB DDR2/533 | $28,601.83 | 96 | 96 | $297.94 | $202.55 | 40.47% |
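The vCPU ceilings and the savings columns follow from a few lines of arithmetic. The sketch below reproduces the 48GB row pairing, assuming the savings are measured against the Nehalem-EP row in the same position; function names are ours, and the VM counts are taken from the tables rather than derived.

```python
def max_vcpus(cores: int, vcpus_per_core: int, smt_bonus: float = 0.0) -> int:
    """vCPU ceiling for a host, with an optional SMT bonus (0.25 = 25%)."""
    return round(cores * vcpus_per_core * (1 + smt_bonus))


def cost_per_vm(street_price: float, vms: int) -> float:
    return street_price / vms


# 2P/8C hosts at 6 vCPUs per core; Nehalem-EP gets the 25% SMT bonus
nehalem_cap = max_vcpus(8, 6, smt_bonus=0.25)
shanghai_cap = max_vcpus(8, 6)

# 48GB configurations from the two tables (28 VMs each)
nehalem_cost = cost_per_vm(7_755.99, 28)
shanghai_cost = cost_per_vm(6_352.12, 28)

savings = nehalem_cost - shanghai_cost
savings_pct = 100 * savings / nehalem_cost

print(f"Max vCPUs: Nehalem-EP {nehalem_cap}, Shanghai {shanghai_cap}")
print(f"Cost/VM: Nehalem-EP ${nehalem_cost:.2f}, Shanghai ${shanghai_cost:.2f}")
print(f"Savings per VM: ${savings:.2f} ({savings_pct:.2f}%)")
```

Note that the VM count in each row is memory-bound until it reaches the vCPU ceiling, which is why the larger memory configurations stop adding VMs at 60 and 48 respectively.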