Human error is one element that is impossible to completely remove from the downtime equation.
The risk can be significantly reduced by providing comprehensive, regular training and clear and careful operating procedures, but it will not be completely eradicated.
But as the market demands more and data centre complexity grows, the potential for human error is amplified. Working in tight spaces with multiple cords and cables, engineers face a growing risk of accidentally overloading branch circuits.
For example, when a server is removed from the rack during a crash repair, it is common for the many cables to cross over. The wrong cord can easily be unplugged by mistake, leaving another server without power. Technology can help iron out these issues: products with lockable cords protect the attached IT equipment against accidental power loss.
Communication is crucial
Companies need to start leveraging the insight provided by power monitoring tools such as intelligent PDUs, which show how power is being distributed around the data centre. Using the intelligence from a PDU, data centre managers can accurately identify trends and opportunities to manage their servers and avoid crashes. Without that intelligence, servers may be running too hot or too cold, on the brink of crashing.
Measuring and understanding power consumption in a data centre is fundamental to reducing power-related disruption. Research by McKinsey & Company found that only 6 to 12 percent of energy was used to power operating servers, meaning that idle servers left unmonitored are wasting around 90 percent of total energy consumption. Intelligent PDUs can accurately measure energy consumption and determine the efficiency of any server within the data centre.
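The arithmetic behind the "around 90 percent" figure can be sketched in a few lines. This is only an illustration that takes the 6 to 12 percent range quoted above at face value and assumes that everything not powering operating servers counts as waste; the function name is hypothetical.

```python
def wasted_share(useful_percent):
    """Share of energy (in percent) not used to power operating servers.

    Assumption: anything that is not 'useful' is treated as waste.
    """
    return 100 - useful_percent


# The two ends of the 6 to 12 percent range cited in the article.
for useful in (6, 12):
    print(f"{useful}% useful -> {wasted_share(useful)}% wasted")
```

Both ends of the range land close to the "around 90 percent" mark.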
Therefore, if managers are equipped with this knowledge they can proactively ensure power is effectively distributed throughout the data centre and make informed capacity planning decisions.
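As a rough sketch of what acting on this knowledge might look like, the snippet below flags branch circuits whose measured load is close to the breaker rating. Everything in it is an assumption made for illustration: the `BranchReading` structure, the sample readings and the 80 percent threshold are hypothetical and are not taken from any particular PDU product or from the research cited here.

```python
from dataclasses import dataclass


@dataclass
class BranchReading:
    """One per-branch measurement, as an intelligent PDU might report it (hypothetical)."""
    name: str
    breaker_amps: float   # rated capacity of the branch circuit
    measured_amps: float  # current draw reported for that branch


def branches_near_capacity(readings, threshold=0.8):
    """Return the names of branches drawing more than `threshold` of their rating.

    The 0.8 default is an illustrative planning margin, not a quoted standard.
    """
    return [
        r.name
        for r in readings
        if r.measured_amps > threshold * r.breaker_amps
    ]


# Made-up sample readings for three branch circuits.
sample = [
    BranchReading("rack-A", breaker_amps=16, measured_amps=9.5),
    BranchReading("rack-B", breaker_amps=16, measured_amps=14.2),
    BranchReading("rack-C", breaker_amps=32, measured_amps=25.0),
]

print(branches_near_capacity(sample))
```

A list like this gives managers a starting point for rebalancing load or planning extra capacity before a branch is accidentally overloaded.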
Companies continue to put their IT resources at risk because they lack knowledge about downtime and therefore manage it in the wrong way, with many data centres running at much lower temperatures than necessary. This is a prominent problem in the industry, with many data centre operators over-cooling their infrastructure because of a mistaken belief that IT equipment cannot withstand temperatures above 25 degrees Celsius. In sharp contrast, Dell research revealed that systems actually failed more often in colder data centres running below 16 degrees than in those running at 25 degrees.
The truth is that by raising the temperature of the data centre by as little as one degree, huge savings can be made on energy costs without putting operations at risk. In fact, the recent third edition of the 'Thermal Guidelines for Data Processing Environments' from ASHRAE (the American Society of Heating, Refrigerating and Air-conditioning Engineers) has suggested that data centre managers should extend the recommended ranges for IT equipment to allow for aggressive economisation. But this is only possible with technology engineered to enable this flexibility: most standard PDUs are rated to only 45 degrees. Leading industry players have been raising the temperature in their data centres, but while server manufacturers are developing products to withstand higher temperatures, many PDU vendors have not been following this important trend.