Ethernet switch reliability: Temperature vs. moving parts
Common wisdom says that in industrial high-temperature environments, free convection cooling (via heat sinks and other passive means) is the obvious choice for electronic equipment, such as Ethernet switches. In the words of one pundit, fanless electronic systems offer 'the inherent reliability of solid-state systems' at the box level. However, common wisdom may be wrong.
Industrial systems have typically been panel- or field-mounted in roomy enclosures that leave a reasonable amount of unrestricted air around the system. High-component-density intelligent devices, however, are often found in industrial control centers where space is at a premium, and rack-mount boxes, such as those housing Ethernet switches, are designed to be compact. When airflow is restricted, interior temperatures rise, and operating temperature can affect mean time between failures (MTBF) as much as, or more than, potential electro-mechanical failure rates.
The most important consideration in choosing a cooling technique for an industrial Ethernet switch is the application in which the switch will be used. Outdoor installations, where units receive only infrequent visual checks, may mandate sealed, free-convection-cooled equipment. Environmental factors such as dust, insect penetration, and moisture harm fan-cooled systems, even with air filters. Likewise, indoor systems in areas with a high particulate count (mines), or where fan noise is unacceptable (a movie studio set), may benefit from a sealed system with free-convection cooling.
When environmental extremes are not the dominant application issue, however, and motor noise would be a minor contributor to the ambient noise level in the facility, fan-cooled systems may offer higher reliability.
A recent MTBF calculation for passive- and fan-cooled 24-port Ethernet switches used the Bellcore Reliability Prediction Procedure (RPP), a widely used predictor of MTBF based on system component parameters, such as number of transistors, power dissipation, and environmental factors. Since calculations relate to similar components incorporated in all switches, industrial Ethernet switches across vendor lines should yield the same general profile.
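The parts-count idea behind this kind of prediction can be sketched in a few lines of Python. This is a minimal illustration, not the Bellcore procedure itself: it assumes components fail independently at constant rates, so the system failure rate is the sum of the component rates and MTBF is its reciprocal. The part names, quantities, and FIT values (failures per billion device-hours) are hypothetical placeholders, not data from the calculation described above.

```python
HOURS_PER_YEAR = 8760.0

# Hypothetical parts list: (name, quantity, failure rate per part in FIT).
# 1 FIT = 1 failure per 10**9 device-hours. All values are illustrative only.
SWITCH_PARTS = [
    ("switch ASIC", 1, 400.0),
    ("PHY transceiver", 6, 150.0),
    ("power supply", 1, 2000.0),
    ("passives and connectors", 1, 1500.0),
]

# Hypothetical fan entry, present only in the fan-cooled design.
FAN = ("cooling fan", 2, 3000.0)


def system_fit(parts):
    """Sum the failure rates of all parts (series system, constant rates)."""
    return sum(qty * fit for _name, qty, fit in parts)


def mtbf_years(parts):
    """MTBF in years: reciprocal of the total failure rate."""
    return 1e9 / system_fit(parts) / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"passive design:    {mtbf_years(SWITCH_PARTS):.1f} years")
    print(f"fan-cooled design: {mtbf_years(SWITCH_PARTS + [FAN]):.1f} years")
```

In a model like this, adding fans at an unchanged component temperature can only lower the calculated MTBF. That electro-mechanical penalty is what the "no moving parts" argument rests on; it says nothing yet about the temperature at which the components actually run.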
At an ambient temperature of 30on system.
There is a potential flaw in MTBF calculations, however: they are based on the ambient temperature in which the system is operating. In reality, the components operate inside a package, where it is hotter. And, as the trend of the data shows, temperature is a factor in component failure.
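One common way to model that temperature dependence is an Arrhenius acceleration factor. The Python sketch below is an assumption made for illustration, not the calculation behind the figures in this article: the 0.4 eV activation energy, the function names, and the example temperature rises are all hypothetical, and real prediction procedures apply separate factors for each component type.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def acceleration_factor(temp_c, ref_temp_c, activation_energy_ev=0.4):
    """Arrhenius ratio of the failure rate at temp_c to that at ref_temp_c.

    activation_energy_ev is a hypothetical, illustrative value.
    """
    t = temp_c + 273.15          # convert to kelvin
    t_ref = ref_temp_c + 273.15  # convert to kelvin
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_ref - 1.0 / t)
    return math.exp(exponent)


def mtbf_at(temp_c, mtbf_ref, ref_temp_c):
    """Scale a reference MTBF to another component temperature."""
    return mtbf_ref / acceleration_factor(temp_c, ref_temp_c)


if __name__ == "__main__":
    ambient_c = 25.0
    # Example internal temperature rises above ambient, in degrees C.
    for rise_c in (0.0, 15.0, 40.0):
        internal_c = ambient_c + rise_c
        factor = acceleration_factor(internal_c, ambient_c)
        print(f"internal {internal_c:.0f} C: failure rate x{factor:.2f}")
```

The point of the sketch is only that the factor should be evaluated at the internal component temperature. Feeding it the ambient temperature, as an ambient-based MTBF figure effectively does, leaves the factor at 1 and hides the penalty.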
Operating temperature measurements indicate that internal component temperatures for rack-mount Ethernet switches using free-convection cooling average 40rade electronic components.
Given internal temperature differentials, free-convection designs for rack-mount switches should be able to operate with normal reliability in environments with ambient temperatures up to 45-50 °C.
The two curves in the graph show calculated MTBFs for the aggregate components in rack-mount switch designs, with and without forced-convection devices, for a range of temperatures. The space between the two curves is the difference in electronic component reliability for a free-convection design (the top line) vs. a forced-convection design. Vertical bars show calculated internal temperatures for rack-mount switches operating in a comfortable room ambient of 25 °C.
For the forced-convection design (left vertical bar), the internal components experience a temperature rise of 15 °C above room ambient and therefore operate at 40 °C. The green dot on the left shows the expected reliability of a fan-cooled switch when the true internal temperature of the components is overlaid on the curve: 7.5 years. For the free-convection switch (right vertical bar), the internal components run a further 25 °C hotter, 40 °C above room ambient, and therefore operate at 65 °C. At that temperature, the calculated reliability of the free-convection design plummets from 9.5 years to 5.5 years.
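The arithmetic behind these figures is easy to check. The short Python sketch below uses only the numbers quoted above. Two things are assumptions: the 40 °C rise for the free-convection design is inferred from its 65 °C operating point and the 25 °C ambient, and the 9.5-year figure is taken to be the free-convection MTBF calculated at room ambient. The variable and function names are illustrative.

```python
ROOM_AMBIENT_C = 25.0

# Internal temperature rise above room ambient, in degrees C.
# The free-convection rise is inferred from the quoted 65 C operating point.
RISE_C = {
    "fan-cooled": 15.0,
    "free-convection": 65.0 - ROOM_AMBIENT_C,
}

# MTBF in years, as quoted, at each design's internal temperature.
MTBF_AT_INTERNAL_YEARS = {"fan-cooled": 7.5, "free-convection": 5.5}

# Assumed to be the free-convection MTBF calculated at room ambient.
FREE_CONVECTION_MTBF_AT_AMBIENT_YEARS = 9.5


def internal_temp_c(design):
    """Internal component temperature: room ambient plus the design's rise."""
    return ROOM_AMBIENT_C + RISE_C[design]


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for design in RISE_C:
        print(f"{design}: {internal_temp_c(design):.0f} C internal, "
              f"{MTBF_AT_INTERNAL_YEARS[design]} years MTBF")
    drop = percent_change(FREE_CONVECTION_MTBF_AT_AMBIENT_YEARS,
                          MTBF_AT_INTERNAL_YEARS["free-convection"])
    gain = percent_change(MTBF_AT_INTERNAL_YEARS["free-convection"],
                          MTBF_AT_INTERNAL_YEARS["fan-cooled"])
    print(f"free-convection MTBF change at internal temperature: {drop:.1f}%")
    print(f"fan-cooled MTBF relative to free-convection: {gain:+.1f}%")
```

On these numbers the free-convection figure falls by about 42 percent once internal temperature is taken into account, and the fan-cooled design comes out about 36 percent (two years) ahead.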
The challenges of cooling electronics within an enclosed box structure are likely to become even more demanding. Studies show that the heatload per product footprint (watts/ft
Industrial engineers are working hard to develop more efficient heatsinks to ameliorate the damaging effects of higher operating temperatures. Shrinking form factors, use of copper for superior thermal conductivity, and advanced heatsink designs (crimped, skived, micro-forged, or machined fin structures) can sink higher thermal loads despite restricted airflow. Nonetheless, all other things being equal, the ability to cool internal components efficiently will be a better predictor of product reliability than the calculated potential for electro-mechanical failure.
The belief that forced-convection devices introduce an unacceptable level of unreliability in industrial systems is so prevalent that many industrial Ethernet switch vendors cite 'no moving parts' as the automatically preferred solution. It is more reasonable to say that there are legitimate places for fan-cooled and passively cooled products in industrial environments.
In 'dirty' environments, sealed boxes and free convection are necessary to achieve acceptable reliability. When environmental pollution is not a driving factor, however, fan cooling may be the better option. In a growing number of industrial applications, the current 'obvious' choice—free convection—may not be so obvious after all.
Frank Madren, president of GarrettCom Inc. for more than 10 years, is an innovator in Ethernet solutions for industrial and telecommunications applications. He has more than 30 years’ experience in computing and networking and holds a BSEE from North Carolina State and an MBA from Harvard.