Workers and customers, empowered by smartphones and widely available Wi-Fi services, now demand 24x7 access to email, company network resources, and Web sites. And thanks to today’s global marketplace, even small companies must support round-the-clock activities.
Unfortunately, IT system downtime remains a problem for companies of all sizes. A 2010 eWEEK article [1] reporting on an industry study noted that North American businesses suffer an average of 10 hours of IT downtime annually. The article went on to note that this downtime costs small companies about $55,000 in revenue each year, while large companies lose about $1 million per year.
To avoid the problems that can cause downtime, companies need to closely observe server room environmental conditions and be alerted when problems arise. This is an area where ITWatchDogs environmental monitoring solutions can help.
Examining the Causes of Downtime
Several data center environmental factors can contribute to or increase downtime and service disruptions. Heat can be a killer. Extreme heat buildup can fry a server, knocking it offline and perhaps damaging it permanently. Even moderate heat buildup can have an impact. According to studies done by the high performance computing researchers at Los Alamos National Laboratory, equipment failure rate doubles for every increase of 18 degrees Fahrenheit. Increased failure rate due to prolonged heating has also been noted by the Uptime Institute and others.
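The doubling rule quoted above can be written as a simple exponential. The following is a minimal sketch, assuming the rule applies smoothly between the 18-degree steps (the source states only the doubling itself). The function name and the 68-degree baseline are illustrative choices, not figures from the Los Alamos studies.

```python
def relative_failure_rate(temp_f: float, baseline_f: float = 68.0) -> float:
    """Return the failure rate relative to the rate at baseline_f.

    Assumes the rate doubles for every 18 degrees Fahrenheit of increase.
    baseline_f is a hypothetical reference temperature chosen for illustration.
    """
    return 2.0 ** ((temp_f - baseline_f) / 18.0)


if __name__ == "__main__":
    # Print the multiplier for a few example rack temperatures.
    for temp in (68, 77, 86, 104):
        print(f"{temp} F: {relative_failure_rate(temp):.2f}x the baseline rate")
```

Under this assumption, a rack running 36 degrees above the baseline would see roughly four times the baseline failure rate.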
When it comes to monitoring temperature, it is not good enough simply to nail a thermostat to the wall. Because the temperature can vary drastically around different pieces of equipment, you should consider placing separate temperature probes within individual racks or near critical devices. That way, problems with a broken fan or an air-conditioning failure will show up quickly. Similarly, you might be able to identify a server that is overheating because it is running an excessive workload.
To make this kind of fine-grained placement practical, ITWatchDogs environmental monitors are designed for today’s crowded server rooms. They are small, ranging from only 4 inches long up to the largest models, which are rack-mountable 19-inch, 1U units. The devices can run off existing electrical power outlets, and many support Power over Ethernet (PoE).
The monitors have built-in Web servers and use standard networking protocols, including TCP/IP and HTTP. This allows server-room administrators and their technical staff to monitor temperatures over an Ethernet network or remotely via the Web from anywhere. The information is presented in a manner that allows quick inspection of current temperatures, as well as historical data to help spot heating trends. Finally, all ITWatchDogs environmental monitors are capable of sending alerts via SNMP traps, email, and SMS messages. Some devices can also trigger an external phone dialer to place voice call alerts to up to nine phone numbers when pre-defined thresholds are exceeded.
Other server-room environmental conditions can cause downtime and need comparable monitoring and alerting capabilities. Humidity is another major threat. The reason: Humidity is the amount of water vapor in the air, and too much water vapor can form condensation on electronic components, leading to electrical shorts. If the humidity is too low, there is an increased chance of damage from electrostatic discharge. In either case, uncontrolled humidity can severely damage critical server components, causing the server to crash and shutting down access to applications and data. Unfortunately, humidity is one of the trickiest environmental characteristics of a server room to measure and, as such, requires very close attention.
To measure humidity, most companies have focused on relative humidity. In fact, for years the guidelines followed were based on recommendations of the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE). The group suggested that the relative humidity for computer rooms be within the 40 percent to 55 percent range. However, because relative humidity varies with temperature, ASHRAE now recommends that data centers measure absolute humidity, expressed as the dew point (it should fall within 41.9 to 59 degrees Fahrenheit).
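Because many sensors report temperature and relative humidity rather than dew point, it can help to see how the two relate. The sketch below estimates dew point with the Magnus approximation, which is an assumption on our part (the source does not say how dew point should be derived), and checks it against the 41.9 to 59 degree range cited above. The coefficient values are one common parameterization, and the function names are illustrative.

```python
import math

# Magnus approximation coefficients (one commonly used set; an assumption here).
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius

# Dew point range cited in the text, in degrees Fahrenheit.
DEW_POINT_MIN_F = 41.9
DEW_POINT_MAX_F = 59.0


def dew_point_f(temp_f: float, rh_percent: float) -> float:
    """Estimate dew point (F) from air temperature (F) and relative humidity (%)."""
    if not 0.0 < rh_percent <= 100.0:
        raise ValueError("relative humidity must be in the range (0, 100]")
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    gamma = math.log(rh_percent / 100.0) + MAGNUS_B * temp_c / (MAGNUS_C + temp_c)
    dew_c = MAGNUS_C * gamma / (MAGNUS_B - gamma)
    return dew_c * 9.0 / 5.0 + 32.0


def dew_point_in_range(temp_f: float, rh_percent: float) -> bool:
    """Return True if the estimated dew point falls inside the cited range."""
    return DEW_POINT_MIN_F <= dew_point_f(temp_f, rh_percent) <= DEW_POINT_MAX_F


if __name__ == "__main__":
    for temp, rh in ((72.0, 45.0), (72.0, 20.0), (80.0, 70.0)):
        status = "in range" if dew_point_in_range(temp, rh) else "out of range"
        print(f"{temp} F at {rh}% RH: dew point {dew_point_f(temp, rh):.1f} F ({status})")
```

The example readings illustrate why relative humidity alone can mislead: the same relative humidity corresponds to a different dew point at a different air temperature.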
As was the case with temperature measurements, humidity can vary significantly within a data center, so sensors must be placed throughout the room and inside the server racks.
Water in a server room is never good news. Whether the source is a leaking or burst pipe, or a flood, water can easily shut down an entire organization. Examples include:
• A water main break in Texas took down the computer systems in the Dallas County Records Building. According to The Dallas Morning News [2], this “[crippled] operations for almost the entire county government.”
• Rains flooded a T-Mobile data center in the Pacific Northwest, taking down servers supporting the company’s service activation portals and web sites [3].
Water is usually detected using a cable that is run under an equipment room’s raised floor. When water comes in contact with the cable, an alarm is triggered. Proactive water monitoring should make use of sensors capable of detecting the presence of water over a large area so that remedial action can be taken before the water shorts out equipment.
A less frequent cause of downtime is fire and smoke. In 2008, a fire destroyed 75 servers, routers and switches in a Green Bay data center, according to Data Center Knowledge [4]. Smaller fires and smoke from equipment or frayed wires can trigger fire-suppression systems which, while much better today at safeguarding equipment, can still cause damage to IT equipment.
Detecting fire and smoke requires more than traditional building smoke alarms. The problem is that when such an alarm sounds, there may be no one around to hear it. What’s needed is an alarm that connects to web-enabled environmental monitors. In this way, the smoke alarm works as it normally does, but its alert can now be sent via an SNMP trap, e-mail, SMS, and/or voice call to multiple IT staff members.
ITWatchDogs environmental monitors come equipped with various on-board sensors, along with digital and analog inputs for external sensors that measure temperature, humidity, water, smoke, and fire, to name a few. The environmental monitors provide a way to remotely monitor server room conditions, view historical data to spot trends, and receive alerts when conditions exceed pre-defined thresholds. The information provided by ITWatchDogs environmental monitors can help server room staff:
• Notice changing conditions and take pre-emptive action to prevent downtime
• Spot troublesome fluctuations and anomalies that might contribute to downtime
• Receive alerts when conditions warrant immediate attention
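The idea of alerting on pre-defined thresholds can be sketched in a few lines. The example below is generic and does not reflect the ITWatchDogs firmware or configuration format; the sensor names, the `Threshold` class, and the temperature limits are hypothetical, while the dew point limits reuse the range cited earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Allowed range for one sensor (hypothetical structure for illustration)."""
    sensor: str
    low: float
    high: float


def check_readings(readings: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per sensor that is missing or out of range."""
    alerts = []
    for limit in thresholds:
        value = readings.get(limit.sensor)
        if value is None:
            alerts.append(f"{limit.sensor}: no reading received")
        elif value < limit.low:
            alerts.append(f"{limit.sensor}: {value} is below the low limit {limit.low}")
        elif value > limit.high:
            alerts.append(f"{limit.sensor}: {value} is above the high limit {limit.high}")
    return alerts


if __name__ == "__main__":
    limits = [
        Threshold("rack1_temp_f", 64.0, 80.0),   # illustrative limits only
        Threshold("dew_point_f", 41.9, 59.0),    # range cited in the text
    ]
    current = {"rack1_temp_f": 95.0, "dew_point_f": 50.0}
    for message in check_readings(current, limits):
        print("ALERT:", message)
```

In practice, each message would be handed to whatever delivery channel is in use, such as an SNMP trap, email, or SMS.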
Furthermore, ITWatchDogs climate monitors can be configured to display video feeds from up to four IP network cameras. Once a user is logged in, the interface provides a quick view of remote conditions along with environmental measurements. For a manager working remotely or at home over the weekend, secure access to the interface is a convenient way to see who is in the server room and check on conditions from time to time. When an alarm is raised, a quick glance can also help determine whether a trip to the facility is required.
And finally, when it comes to server room downtime, the elephant in the room is power outages. Power outages are the leading cause of downtime. Certainly, short outages can be covered with properly configured UPS systems. However, in some cases, a UPS might further contribute to equipment failure if it leaves servers running while the A/C remains off.
Naturally, if the power outage is longer-term – for instance, a severe winter storm tears down power lines – knowledge of the extent of the power failure is essential so that backup plans can be initiated.
For power monitoring, ITWatchDogs offers the Remote Power Manager X2 (RPM X2). This adds remote power monitoring and switching capabilities to any ITWatchDogs environmental monitor that supports a digital sensor port. The add-on accessory provides real-time logging and graphing of voltage, amperage, real power, apparent power, power factor and kilowatt-hours to support trend analysis and supply power metrics for future planning. The device enables users to set alarm thresholds for these measurements, and it can remotely reboot locked systems or control system power via the secure user interface.
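The power measurements listed above are related by standard definitions. The sketch below uses the textbook single-phase AC relationships, assuming RMS voltage and current values; it illustrates the quantities only and is not a description of how the RPM X2 computes them. The function names and sample values are illustrative.

```python
def apparent_power_va(voltage_v: float, current_a: float) -> float:
    """Apparent power in volt-amperes from RMS voltage and RMS current."""
    return voltage_v * current_a


def power_factor(real_power_w: float, apparent_va: float) -> float:
    """Power factor as the ratio of real power to apparent power."""
    if apparent_va <= 0:
        raise ValueError("apparent power must be positive")
    return real_power_w / apparent_va


def energy_kwh(real_power_w: float, hours: float) -> float:
    """Energy in kilowatt-hours for a constant real power draw over a period."""
    return real_power_w * hours / 1000.0


if __name__ == "__main__":
    # Hypothetical sample readings for one circuit.
    volts, amps, watts = 120.0, 5.0, 540.0
    va = apparent_power_va(volts, amps)
    print(f"Apparent power: {va:.0f} VA")
    print(f"Power factor: {power_factor(watts, va):.2f}")
    print(f"Energy over 24 hours: {energy_kwh(watts, 24.0):.2f} kWh")
```

Tracking these values over time is what makes capacity planning possible: a circuit whose real power draw is trending upward can be rebalanced before it nears its limit.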
ITWatchDogs as Your Technology Partner
To increase IT system availability, organizations need to take a proactive approach to monitoring the environmental conditions that contribute to downtime and disruptions.
Certainly, for years IT equipment such as servers, switches, and storage devices has had temperature and fan sensors, as well as software to send alerts when temperatures rise or a fan fails. But in many cases, these systems only notify you once a problem is severe. Additionally, these built-in sensors only give you information about an individual device.
Proactively monitoring conditions in the entire server room or data center helps identify issues before they turn into a problem. This allows time to rectify matters before equipment deteriorates or fails.
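One way to act on a trend before it becomes a failure is to estimate how quickly a reading is rising and how long it would take to reach a limit. The sketch below fits a least-squares line to logged samples, assuming the trend is roughly linear over the window; the function names, sample data, and 80-degree limit are hypothetical.

```python
from __future__ import annotations


def slope_per_hour(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of (hours, value) samples, in value units per hour."""
    count = len(samples)
    if count < 2:
        raise ValueError("at least two samples are required")
    mean_t = sum(t for t, _ in samples) / count
    mean_v = sum(v for _, v in samples) / count
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        raise ValueError("samples must span more than one point in time")
    return sum((t - mean_t) * (v - mean_v) for t, v in samples) / spread


def hours_until(samples: list[tuple[float, float]], limit: float) -> float | None:
    """Estimate hours until the latest value reaches limit, or None if not rising."""
    latest = samples[-1][1]
    if latest >= limit:
        return 0.0
    rate = slope_per_hour(samples)
    if rate <= 0:
        return None
    return (limit - latest) / rate


if __name__ == "__main__":
    # Hypothetical log: (hours since start, rack temperature in F).
    log = [(0.0, 70.0), (1.0, 72.0), (2.0, 74.0)]
    print(f"Rising at {slope_per_hour(log):.1f} F per hour")
    print(f"Estimated hours until 80 F: {hours_until(log, 80.0)}")
```

A linear fit is a deliberately simple choice; it is enough to flag a steady climb, though it will not capture sudden events such as a fan failure.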
ITWatchDogs offers a wide range of environmental monitors that give server-room managers and their staffs cost-effective ways to proactively monitor their IT infrastructure and maintain system uptime. The products provide a quick and easy way to keep an eye on remote conditions from a secure web interface and to receive alert notifications when specified alarm thresholds are exceeded. The interface displays live video feeds and environmental measurements including temperature, humidity, airflow, light, sound, power, water detection, and more. The measurements are logged and graphed so that trends can be viewed. On units that support output-relay control, or with the Remote Power Manager X2, external processes or applications can be triggered automatically by an alarm or remotely through the web interface.
ITWatchDogs’ climate monitors use standard Web server software to display their measurements and camera feeds. All management and monitoring tools are accessible securely via Ethernet or the Internet; no software installation is required. The monitors have SNMP agent software to integrate with popular networking management tools, and they support SNMP v1, v2c, and v3.
Most importantly, the ITWatchDogs line of products provides the proactive monitoring needed to maintain high availability in today’s data centers and equipment rooms.
[1] “IT Outages Cause Businesses $26.5 Billion in Lost Revenue Each Year, Survey,” eWEEK, December 10, 2010, http://www.eweek.com/c/a/IT-Infrastructure/IT-Outages-Cause-Businesses-265-Billion-in-Lost-Revenue-Each-Year-Survey-280492/
[2] “Water main break cripples Dallas County computers, operations,” The Dallas Morning News, June 2, 2010, http://www.dallasnews.com/sharedcontent/dws/news/localnews/stories/DN-countyflood_02met.ART0.State.Edition2.295d6ee.html
[3] “T-Mobile Down Due to Flooding?” BGR.com, December 4, 2007, http://www.bgr.com/2007/12/04/t-mobile-down-due-to-flooding/
[4] “Fire Destroys Wisconsin Data Center,” Data Center Knowledge, March 31, 2008, http://www.datacenterknowledge.com/archives/2008/03/31/fire-destroys-wisconsin-data-center/
For More Information Contact Interworld Electronics