Dynamic Power Management: A Quantitative Approach
by Johan De Gelas on January 18, 2010 2:00 AM EST
Posted in: IT Computing
Performance per Watt rules the datacenter, right? Wrong. Yes, you could easily be led astray by the endless "Green ICT" conferences, the many power-limited datacenters, and the flood of new technologies that all carry the "Performance/Watt" stamp. But if performance per Watt were all that counted, we would all be running Atom and ARM based servers. Some people do promote Atom based servers, but outside niche markets we don't think they will be a huge success. Why not? Think about it: what is the ultimate goal of a datacenter? The answer is of course the same as for the enterprise as a whole: serve as many (internal or external) customers as possible with the lowest response time at the lowest cost.
So what really matters? Attaining a certain level of performance. At that point you want the lowest power consumption possible, but first you want to attain the level of performance where your customers are satisfied. So it is power efficiency at a certain performance level that you are after, not the best performance/Watt ratio. Twenty times lower power for five times lower performance might seem an excellent choice from the performance/Watt point of view, but if your customers get frustrated with the high response times they will quit. Case closed. And customers are easily frustrated. "Would users prefer 10 search results in 0.4 seconds or 25 results in 0.9 seconds?" That is a question Google asked [1]. They found out to their surprise that a significant number of users got bored and moved on if they had to wait 0.9 seconds. Not everyone has an application like Google, but in these virtualized times we don't waste massive amounts of performance as we used to at the beginning of this century. Extra performance and RAM space are turned into more virtual servers per physical server, or into business efficiency. So it is very important not to forget how demanding we all are as customers when we are browsing and searching.
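To make the difference between the two selection criteria concrete, here is a minimal Python sketch. The two servers and their numbers are hypothetical; they only mirror the "twenty times lower power for five times lower performance" example above, and the performance score is an abstract unit rather than a measured benchmark.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Server:
    name: str
    performance: float  # abstract performance score (hypothetical units)
    power_w: float      # power draw in Watts (hypothetical)

    @property
    def perf_per_watt(self) -> float:
        return self.performance / self.power_w


def best_perf_per_watt(servers: Iterable[Server]) -> Server:
    """Pick the server with the highest performance/Watt ratio."""
    return max(servers, key=lambda s: s.perf_per_watt)


def lowest_power_meeting_target(
    servers: Iterable[Server], min_performance: float
) -> Optional[Server]:
    """Pick the lowest-power server that still reaches the required performance."""
    candidates = [s for s in servers if s.performance >= min_performance]
    if not candidates:
        return None  # nothing is fast enough, whatever its power draw
    return min(candidates, key=lambda s: s.power_w)


# Hypothetical pair: "small" has 5x lower performance at 20x lower power.
servers = [
    Server("big", performance=1000.0, power_w=200.0),
    Server("small", performance=200.0, power_w=10.0),
]

ratio_winner = best_perf_per_watt(servers)
target_winner = lowest_power_meeting_target(servers, min_performance=800.0)
```

With these made-up numbers, the performance/Watt ranking picks the small server, while the "lowest power at a required performance level" rule picks the big one as soon as the target exceeds what the small server can deliver.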
Modern CPUs have a vast array of high-tech weapons to offer good performance at the lowest power possible: PowerNow!, SpeedStep, Cache Sizing, CoolCore, Smart Fetch, PCU, Independent Dynamic Core Technology, Deep Sleep, and even Deeper Sleep. Some of those technologies have matured and offer significant power savings with negligible performance impact. A lot of them are user configurable: you can disable or enable them in the BIOS, or they get activated if you choose a certain power plan in the operating system. Those that are configurable are so for a good reason: the performance hit is significant in some applications and the power savings are not always worth it. In addition, even if such technologies are active under the hood of the CPU package, there is no guarantee that the operating system makes good use of them.
How do we strike the right balance between performance and energy consumption? That is the goal of this new series of articles. But let's not get ahead of ourselves; before we can even talk about increasing power efficiency at a certain performance point, we have to understand how it all works. This first article dives deep into power management, to understand what works and what only works on PowerPoint slides. There is more to it than enabling SpeedStep in your server. For example, Intel has been very creative with Turbo Boost and Hyper-Threading lately. Both should increase performance in a very efficient way. But does the performance boost come with an acceptable power consumption increase? What is acceptable or not depends on your own priorities and applications, but we will try to give you a number of data points that can help you decide. Whether you enable certain power management technologies and how you configure your OS are not the only decisions you have to make as you attempt to provide more efficient servers.
Both AMD and Intel have been bringing out low power versions of their CPUs that trade clock speed for lower maximum power. Are they really worth the investment? A prime example of how the new generation forces you to make a lot of decisions is the Xeon L3426: a Xeon "Lynnfield" which runs at 1.86GHz and consumes 45W in the worst case according to Intel. What makes this CPU special is that it can boost its clock to 3.2GHz if you are running only a few active threads. This should lower response times when relatively few users are using your application, but what about power consumption? AMD's latest Opteron offers six cores at pretty low power consumption points, and it can lower its clock from 2.6GHz all the way down to 800MHz. That should result in significant power savings but the performance impact might be significant too. We have lots of questions, so let's start by understanding what happens under the hood, in good old AnandTech "nuts and bolts" tradition.
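As a quick sanity check on those clock ranges, the small Python sketch below computes the frequency ratios from the figures quoted above. The helper name is ours, and treating performance as proportional to clock speed is only a simplifying assumption: real workloads rarely scale that cleanly, so read the ratios as rough bounds, not predictions.

```python
def clock_ratio(from_ghz: float, to_ghz: float) -> float:
    """Ratio of the target clock to the starting clock."""
    return to_ghz / from_ghz


# Xeon L3426: 1.86GHz base clock, up to 3.2GHz with few active threads.
turbo_ratio = clock_ratio(1.86, 3.2)

# Six-core Opteron: 2.6GHz maximum clock, 800MHz minimum.
downclock_ratio = clock_ratio(2.6, 0.8)

print(f"L3426 maximum boost clock vs. base clock: {turbo_ratio:.2f}x")
print(f"Opteron minimum clock vs. maximum clock: {downclock_ratio:.2f}x")
```

In other words, the L3426 can run at roughly 1.7 times its base clock when boosting, while the Opteron's lowest clock is roughly 0.3 times its highest, which is why both the potential savings and the potential performance impact are large.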
Warning: This article is not suited for quick consumption. Remember, you come to AnandTech for hard hitting analysis, and that's what this article aims to provide! Please take your time… there will be a quiz at the end. ;-)
35 Comments
n0nsense - Monday, January 18, 2010
Here is what the system sees... only one core is at 2.5GHz, the other three are at 2.0GHz :)
nons ~ # cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
stepping : 7
cpu MHz : 2497.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 5009.38
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
stepping : 7
cpu MHz : 1998.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 7012.69
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
stepping : 7
cpu MHz : 1998.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 5009.08
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
stepping : 7
cpu MHz : 1998.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips : 5009.09
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
VJ - Tuesday, January 19, 2010
These are mobile CPUs, however:
With Linux on a Latitude (Intel T7200 or T7500), CPU Frequency Scaling Monitor allows one to scale the frequency of one core to its max while leaving the other core at its minimum.
With an AMD TL62, this is not possible. The induced scaling of one core causes the frequency of the other core to follow.
With an AMD ZM84 this is possible. Just like with the Latitude, one can have one core at its max with the other core at its minimum.
Maybe what's shown is not what's taking place.
Additionally;
http://www.intel.com/technology/itj/2006/volume10i...
"For example, in a Dual-Processor system, when the OS decides to reduce the frequency of a single core, the other core can still run at full speed. In the Intel Core Duo system, however, lowering the frequency to one core slows down the other core as well."
VJ - Tuesday, January 19, 2010
Additionally; AMD's ZM84 allows each core to operate at different frequencies. The lowest frequency is 575MHz while the highest is 2300MHz. I can set one core to 1150MHz with the other set at 2300MHz. This is different from the Intel (Mobile) CPUs I've come across where a difference in frequency between cores is only possible when one core is (seemingly) operating at its lowest frequency (in a dual core system).
What is also interesting from the aforementioned cpuinfo output is that only one core is running at its max frequency while all (3) other cores are (seemingly) at their minimum frequency. Considering my previous conjecture on C2 and C0 states, it would be surprising if one can show cpuinfo output where 2 cores are running at max frequency while the other 2 cores are running at any frequency other than max frequency. That shouldn't be possible at all.
valnar - Thursday, May 6, 2010
Does anyone know if this kind of power management for Lynnfield processors is available in Windows 2003?
hshen1 - Sunday, June 23, 2013
This is really a good article for power management researchers like me!!