The human species has a healthy aversion to risk, a trait that we share with the rest of the animal kingdom. Such behavior is actually considered an evolutionary adaptation. Primitive humans living through the harshness of the Stone Age had to be considerably more survival-minded than we are today. Any tools or skills they possessed would have been comparably indispensable, and to risk losing any of them could have been fatal. As such, the risk-takers of the tribe had to constantly weigh the odds when deciding to gather food or explore new areas.
Sometimes, however, you can take the person out of the Stone Age, but not the Stone Age out of the person. Today, the chance of getting eaten by a predator has of course greatly diminished (unless we mean it figuratively, as in being devoured by a fierce competitor). That said, the world has become far more complex, and assessing the risk of potential losses has gotten way more complicated.
Nonetheless, assess we must. We still feel the need to evaluate the safest action we are going to take, even if we are not always very good at it. Some of us, for example, prefer to travel thousands of miles by car instead of hopping on a plane because we are scared of the latter, even though it is demonstrably safer.
It then becomes important to apply pragmatic and objective practices for a proper (not subjective) evaluation of potential unwanted outcomes. This need has forced modern risk analysis and management to evolve immensely over the last half century, to the point of employing Artificial Intelligence techniques.
It is no surprise that data center builders and managers take enormous precautions to protect themselves by implementing multiple layers of redundancy. What do they protect themselves from?
Outages of course, whether of power or of communication links. What about fire, explosions and other hazards? (Nice infographics from DataCenterDynamics here). Moreover, let’s not forget virtual hacking and physical attacks.
In that last regard, some data centers are built like fortresses with armed guards. A particular one in Portugal has an actual moat around it. Fortunately, no crocodiles roam those waters. This certainly supports the argument that in case of a Zombie Apocalypse, a Data Center may be the best place to seek shelter.
However, even though lots of brainpower has been applied to the field of counteracting Murphy’s Law, many companies spend very little time on actual Risk Management.
A report from the Organization for Economic Co-operation and Development (OECD) indicates that executive boards do not spend much time on business risk management (only 14%) and that most just accept proposed strategies.
We put forward that such a poor score may be due to somewhat lacking IT Governance, which may become the subject of another blog entry, since it demands its own level of reflection.
If, however, we stay on the topic of Risk Management for data centers, what are the strategies that can be used to assist? Certainly, we need to know what assets are in play, whether from IT or the facility, and rate their criticality to the business, i.e. their impact on the bottom line if they go down. We also need to identify the threats that could make them fail, which in turn will help pinpoint the vulnerabilities.
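To make the idea concrete, here is a minimal sketch of a risk register in Python. It assumes a simple and widely used scoring convention (criticality multiplied by likelihood, each on a 1 to 5 scale); the asset names, threats and ratings are purely hypothetical and only serve to illustrate the ranking.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    asset: str        # what could fail (IT or facility)
    threat: str       # what could make it fail
    criticality: int  # assumed scale: 1 (minor business impact) to 5 (severe)
    likelihood: int   # assumed scale: 1 (rare) to 5 (frequent)

    @property
    def score(self) -> int:
        # Assumed convention: score = criticality x likelihood.
        return self.criticality * self.likelihood


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical entries, for illustration only.
register = [
    Risk("UPS bank A", "battery failure", criticality=5, likelihood=2),
    Risk("Core switch", "link outage", criticality=4, likelihood=3),
    Risk("Chiller 2", "pump failure", criticality=3, likelihood=1),
]

for risk in rank_risks(register):
    print(f"{risk.score:>2}  {risk.asset}: {risk.threat}")
```

A real assessment would likely use finer scales and take existing mitigations into account, but even a rough ranking like this one helps decide where to look first.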
Some strategies are best elaborated at the planning and design stages of the data center. Elements such as the site location have their own set of risk mitigation points. It is usually at this stage that the level of tolerance to risks gets defined, since they can never be completely eliminated.
Once the Data Center is in operation, however, managers have to juggle the IT requirements, the risks and the efficiency of it all. Most countries have some kind of national Energy and Security directives, sometimes in the form of laws. Compliance with those is of course an extra burden.
It is clear that to make the right decisions we have to ask the right questions. Which brings up the issue of who or what can provide the needed answers.
In this blog edition, we have mentioned and discussed many of the reasons to implement a Data Center Infrastructure Management (DCIM) solution, but we can rationally state that risk management underlies them all.
Having thorough visibility and control of all the assets, and understanding their interdependence and criticality, gives operators the knowledge to make informed decisions about how much likelihood of any kind of outage they are willing to tolerate.
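As an illustration of what understanding interdependence can mean in practice, the sketch below walks a small dependency map to list everything downstream of a failed asset. The map, the asset names and the `affected_by` helper are all hypothetical, and the sketch deliberately ignores redundancy (a real facility would normally have alternate feeds), so it shows a worst case rather than how any particular DCIM product works.

```python
from collections import deque

# Hypothetical map: each asset -> the assets that depend directly on it.
DEPENDENTS = {
    "utility feed": ["UPS A"],
    "UPS A": ["PDU 1", "PDU 2"],
    "PDU 1": ["rack 12"],
    "PDU 2": ["rack 13"],
    "rack 12": ["billing database"],
    "rack 13": [],
    "billing database": [],
}


def affected_by(failed, dependents):
    """Return every asset downstream of `failed`, in breadth-first order."""
    seen = []
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


print(affected_by("UPS A", DEPENDENTS))
```

Combining such a list with the criticality of each affected asset gives a first estimate of what a single failure would cost the business.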
Today’s demands on data centers require them to be agile and adaptable and DCIM can support a framework where that becomes possible.
Cavemen certainly lived a simpler, albeit more dangerous, life. Our caves have become enormous, equipped with air conditioning, UPSs and diesel generators that we have to manage and maintain. Let’s hope we can keep our own equivalents of sabre-toothed tigers at bay.
And Zombies too…