Monday, February 15, 2010

The Evolution of Data Center Efficiency Metrics

In April 2006, Christian Belady (now Director of the Green Grid and Principal Infrastructure Architect at Microsoft Global Foundation Services) gave one of the most important symposium presentations in Data Center history at the Uptime Institute Forum. It was the first time the concept of PUE (Power Usage Effectiveness) had been showcased; from that, the Green Grid was formed and the PUE metric became a reality.

This was an important milestone: for the first time, there was an independent Data Center power and performance metric that was relatively easy to understand, implement and measure.

So what exactly is PUE? In simple terms, PUE is the total facility power usage divided by the IT equipment power load.
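To make the ratio concrete, here is a minimal sketch in Python. The function name and the kilowatt figures are made up for illustration; they are not measurements from any real facility.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return PUE: total facility power divided by IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment load must be greater than zero")
    return total_facility_kw / it_equipment_kw


# Hypothetical facility: 1,800 kW drawn in total, 1,000 kW of it by IT equipment.
example_pue = pue(total_facility_kw=1800.0, it_equipment_kw=1000.0)
print(f"PUE = {example_pue:.2f}")
```

A lower number is better: a PUE of 1.0 would mean every watt entering the facility reaches the IT equipment.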

The PUE performance metric is a step in the right direction. However, it also leaves a lot to be desired. There are many whitepapers on the various aspects, but the biggest question I want to focus on is: what in the Data Center falls into the “IT equipment power load” component of the equation? The Green Grid has left this open to interpretation and has not released a set of requirements or guidelines for what is included in the PUE.

I personally argue that all equipment in a Data Center should be included (after all, what we are trying to get is an indicator of the power efficiency of the whole facility). In practice, however, this is not the case. It is not commonly known or advertised that large Data Centers remove equipment that is not classified as "IT equipment load" purely to improve the performance number. To make matters worse, when you see a PUE number advertised, you have no way of knowing what was included and what was not.

This APC white paper discusses the issue at length, and it points out that the following items in the Data Center are not typically included in a facility’s PUE:

  • Power Distribution Units – According to the APC study, PDUs lose as much as 10% of the power that they consume, which would have a dramatic impact on the PUE if they were included

  • Air conditioners and chillers in a mixed-use facility – These might not be included in the PUE because they also support other parts of the operation, even though the IT infrastructure is typically the largest user of the cooling

  • A separate support facility used to run and maintain the Data Center, for example a Network Operations Centre (NOC) – For a large-scale business with multiple Data Centers, the NOC support infrastructure would be very large and would have a huge impact on the PUE if included
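The effect of leaving items like these out can be sketched with a few lines of Python. Every load figure and category name below is hypothetical and chosen only to show the arithmetic; none comes from the APC paper.

```python
# Hypothetical loads for a single facility, in kW (illustrative numbers only).
loads_kw = {
    "it_equipment": 1000.0,
    "cooling": 500.0,
    "pdu_losses": 100.0,
    "noc": 150.0,
    "lighting_other": 50.0,
}


def reported_pue(loads, excluded=()):
    """PUE computed after silently dropping the named loads from the total."""
    if "it_equipment" in excluded:
        raise ValueError("the IT equipment load cannot be excluded")
    total = sum(kw for name, kw in loads.items() if name not in excluded)
    return total / loads["it_equipment"]


full = reported_pue(loads_kw)                                      # everything counted
trimmed = reported_pue(loads_kw, excluded=("pdu_losses", "noc"))   # selective reporting
print(f"everything counted: {full:.2f}, PDU losses and NOC left out: {trimmed:.2f}")
```

With these made-up figures, the same building reports either 1.80 or 1.55 depending on what is counted, and a reader who sees only the final number cannot tell which.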

As one of my colleagues pointed out, “there’s PUE and there’s PUE”.

Don’t get me wrong: PUE is a great first step in the right direction for power performance metrics, but as I hope I’ve shown you, there is plenty of scope for improvement.

There are even some industry-specific power and performance metrics emerging. For example, in the networking world, the Energy Consumption Rating (ECR) Initiative is an emerging performance metric that seems to be gaining traction. The ECR uses a category system, which organizes network and telecom equipment into classes, with a different measuring and performance methodology for each class. The classes can then be combined, and the end result is a “performance-per-energy-unit” rating.

APC suggests that a standard set of guidelines should be drawn up to dictate what equipment must be included in the PUE. Personally, I’m not totally convinced that this will solve the ultimate problem of how to accurately measure and report power performance metrics across all Data Centers with disparate infrastructures and business models.

That leaves a big challenge going forward: how does one build a set of guidelines that can be used to measure power performance across disparate infrastructure solutions and business models?

In my opinion, the ECR Initiative should be adopted and opened up to everything that uses power in the Data Center, with equipment categorized into classes that all add up to an end figure. Sure, it will be a little more complicated, but in the end we would have a performance metric that is consistent from one facility to another and that counts all of the power used in that facility.
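As a rough illustration of what a class-based roll-up might look like, here is a short Python sketch. It is not the ECR methodology; the class names, the figures and the `facility_report` helper are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EquipmentClass:
    """One category of power-consuming equipment (hypothetical classes)."""
    name: str
    power_kw: float


def facility_report(classes):
    """Return total power and each class's share, so nothing is silently dropped."""
    total = sum(c.power_kw for c in classes)
    if total <= 0:
        raise ValueError("total facility power must be greater than zero")
    shares = {c.name: c.power_kw / total for c in classes}
    return total, shares


# Illustrative figures only, not taken from any real facility.
classes = [
    EquipmentClass("servers_and_storage", 900.0),
    EquipmentClass("network", 100.0),
    EquipmentClass("cooling", 500.0),
    EquipmentClass("power_distribution", 100.0),
    EquipmentClass("noc_and_support", 200.0),
]

total_kw, shares = facility_report(classes)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of {total_kw:.0f} kW")
```

The point of the breakdown is that the end figure is always published alongside the classes that went into it, so two facilities can be compared like for like.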

Hope you are all well - I'll be back with more Data Center and Windows Server blog posts in the not too distant future!