Sep 3, 2007

Yes, It's Still Safe to Power Off and Power On That Server

After my previous post on the reliability of power supplies, I decided to see what our Cassatt experiences can tell us about server reliability. Within my department, I have engineering labs in three locations (Colorado Springs, Minneapolis and San Jose) with about 500 servers in total.

Mukund and I decided to look at the data from 123 servers located in San Jose. These servers are used by Mukund's team for System Test activities. His team has developed over 700 automated tests that are used to qualify our Cassatt product suite. As part of the test run, servers are routinely power-cycled. We physically pull power from the servers at the start of each test run: all the nodes are on managed Power Distribution Units (PDUs from APC and Baytech), and the automated tests power down the PDU outlets before running the tests. This has been in place since 2004.

For the 123 servers that were analyzed, not a single power supply or disk drive failed during the past two years.

Here are the server counts in the study:
  • 26 IBM HS-20 blades
  • 8 HP DL380 G4
  • 45 HP DL360 G4
  • 8 HP DL360 G3
  • 6 HP DL140
  • 3 HP DL385
  • 5 Sun SPARC
  • 1 IBM x345
  • 15 Dell 1850
  • 6 Dell 2650

During the past 5 months, the power supplies on these 123 servers were power-cycled 18,826 times. That's an average of about once per day per server. As part of the system testing, these servers were also power-cycled repeatedly by using their power controller. The power operations from the power controller generate stress on the server's internal components, such as the motherboard and disk drives, but the power supply remains connected to AC power. These power operations from the power controller are not counted in the 18,826 figure cited above.
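
For readers who want to check the arithmetic, here is a rough sketch in Python. The server counts and the 18,826 figure come from this post; the 30-day month is my own simplifying assumption, so treat the result as a ballpark rather than an exact rate.

```python
# Server counts as listed in the study above.
server_counts = {
    "IBM HS-20 blades": 26,
    "HP DL380 G4": 8,
    "HP DL360 G4": 45,
    "HP DL360 G3": 8,
    "HP DL140": 6,
    "HP DL385": 3,
    "Sun SPARC": 5,
    "IBM x345": 1,
    "Dell 1850": 15,
    "Dell 2650": 6,
}

total_servers = sum(server_counts.values())

# PDU-level power cycles reported for the past 5 months.
power_cycles = 18_826

# Assumption (not from the post): 5 months of roughly 30 days each.
days = 5 * 30

cycles_per_server = power_cycles / total_servers
cycles_per_server_per_day = cycles_per_server / days

print(f"Total servers: {total_servers}")
print(f"Cycles per server over the period: {cycles_per_server:.1f}")
print(f"Cycles per server per day: {cycles_per_server_per_day:.2f}")
```

Under that assumption, the numbers work out to roughly 153 PDU power cycles per server, or about one per server per day.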

In a future posting, Mukund and I will provide more details on these additional power operations. We will also provide data from the servers in our other engineering labs.

So if you're still afraid to power down that server, don't worry! Power supplies and hard drives are very reliable these days. From several different studies, we've seen that power supplies hold up quite well under (and are even designed for) power cycling.