WLAN troubleshooting best practices
Because no system is perfect, a wireless local area network (WLAN) will eventually slow down or fail. Understanding how and why this happens is important, and there are some fundamental steps the user can take to get things back on track.
Wireless local area networks (WLANs) eventually begin to exhibit problems for a variety of reasons, including a missing or inadequate site survey, a change in environment or equipment, network expansion, or poor design. Common indicators are a general slowing of the network: downloads take much longer, connecting to the network takes an inordinate amount of time, or connecting becomes impossible.
A WLAN works at the bottom two layers of the OSI stack: layer 1, the physical layer, and layer 2, the datalink layer. Together they format and transmit data over the wireless medium: layer 2 formats the data packet into a data frame and controls access to the medium, and layer 1 transmits the frame. The payload from the upper layers (layers 3 to 7), passed down from the user application, is called the MAC Service Data Unit (MSDU). The datalink layer wraps it in a frame called the MAC Protocol Data Unit (MPDU), which is handed to the Wi-Fi radio, the physical layer. In a wired network, the physical layer would be a cable. When the radio attempts to access the network, it is the datalink layer that performs the medium arbitration to allow the connection. This layer is also responsible for authentication.
Slow operation, or reduced throughput, can result from several issues. One common issue is simply a higher client-to-access point (AP) ratio than the system was designed for. For instance, the original network may have been designed with two APs serving 20 users, and the user count has since grown well beyond that. Theoretically, an AP can handle up to 2,007 users, but this is impractical. The rule of thumb for AP loading is roughly 50 clients per AP to assure trouble-free access and adequate bandwidth. Overloading an AP vastly reduces bandwidth and throughput, and it also increases medium contention and retransmissions due to repeated attempts at association. The fix for this problem is to add APs to reduce client loading or to enable load-balancing algorithms if available.
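The loading arithmetic can be sketched in a few lines of Python. This is only an illustration of the 50-clients-per-AP rule of thumb mentioned above. The function names and the sample client counts are hypothetical, and the right limit for a given site depends on the traffic the clients generate.

```python
import math


def aps_needed(clients: int, max_clients_per_ap: int = 50) -> int:
    """Minimum number of APs that keeps each AP at or below the limit.

    The default of 50 is the article's rule of thumb, not a hard limit.
    """
    if clients < 0 or max_clients_per_ap <= 0:
        raise ValueError("clients must be >= 0 and the per-AP limit > 0")
    return math.ceil(clients / max_clients_per_ap)


def is_overloaded(clients: int, ap_count: int, max_clients_per_ap: int = 50) -> bool:
    """True when the client count exceeds what the installed APs should carry."""
    return clients > ap_count * max_clients_per_ap


if __name__ == "__main__":
    # Hypothetical example: a two-AP network that has grown to 120 clients.
    print(is_overloaded(120, ap_count=2))  # True
    print(aps_needed(120))                 # 3
```

A check like this only counts associations. An AP with far fewer than 50 clients can still be saturated if a handful of them move large amounts of data.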
A related problem is seen in WLANs whose APs have all been configured for the same channel instead of using a multi-channel architecture. This arrangement results in co-channel interference, which creates an atmosphere rife with retransmissions. Single-channel architecture is not practical in any network that is not controller-based; a controller will actively manage client traffic and avoid co-channel interference. Unless it is controller-based, which brings a different set of problems, the WLAN must be designed so that neighboring APs use different, non-overlapping channels with at least 20 MHz of separation between center frequencies. This problem straddles both bottom layers: on the physical layer, the signal on the channel is corrupted by interference, while medium contention is a layer 2 function.
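A channel plan can be checked against the 20 MHz separation guideline with a short script. The sketch below assumes the 2.4 GHz band, where channels 1 through 13 are centered at 2407 + 5 × channel MHz, and it assumes every AP in the plan is within range of every other. The AP names and the plan itself are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def center_mhz(channel: int) -> int:
    """Center frequency in MHz of a 2.4 GHz channel (1-13 only)."""
    if not 1 <= channel <= 13:
        raise ValueError("only 2.4 GHz channels 1-13 are handled here")
    return 2407 + 5 * channel


def channel_conflicts(
    plan: Dict[str, int], min_separation_mhz: int = 20
) -> List[Tuple[str, str, int]]:
    """Return (ap_a, ap_b, gap_mhz) for every AP pair spaced too closely.

    Assumes all APs in the plan can hear each other; APs far apart
    can safely reuse a channel and should not be listed together.
    """
    conflicts = []
    for (ap_a, ch_a), (ap_b, ch_b) in combinations(sorted(plan.items()), 2):
        gap = abs(center_mhz(ch_a) - center_mhz(ch_b))
        if gap < min_separation_mhz:
            conflicts.append((ap_a, ap_b, gap))
    return conflicts


if __name__ == "__main__":
    # Hypothetical plan: AP-1 and AP-2 are only 10 MHz apart.
    print(channel_conflicts({"AP-1": 1, "AP-2": 3, "AP-3": 11}))
```

With channels 1, 6, and 11, the centers are 25 MHz apart and the function returns an empty list.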
What if one or several clients are unable to connect to an AP, or to several APs? This could be due to several factors. On layer 1, the client may not be on the correct channel, the client may be using an incompatible modulation method, the AP may be jammed, or the client radio or AP may have been turned off. All of these are valid scenarios and surprisingly common. Start by identifying an AP that is definitely operating correctly and broadcasting its service set identifier (SSID). Then confirm that the client is operating properly by associating it with this AP.
If the client is configured properly and is compatible, it should be a simple matter to identify which AP is the problem. Using a spectrum analyzer makes the task much easier, particularly if a jamming signal is present or suspected. A spectrum analyzer is a frequency domain device that will display the radio frequency (RF) signals of the wireless devices in the area. While a handheld device is most efficient, there are several good PC-based applications that will identify all compatible networks in the area and provide you with the SSID, channel, and power levels of the WLAN or AP.
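Once a scanning application has produced a list of visible APs, comparing it with the list of APs that should be present can narrow down the problem unit. The following is a minimal sketch under stated assumptions: the record layout, the BSSID values, and the -75 dBm floor are all hypothetical, and real scanner output would need to be parsed into this shape first.

```python
from typing import Iterable, List, NamedTuple


class Sighting(NamedTuple):
    """One AP as reported by a scanning tool (hypothetical layout)."""
    bssid: str      # AP radio MAC address
    ssid: str
    channel: int
    rssi_dbm: int   # received power; closer to 0 is stronger


def missing_aps(expected_bssids: Iterable[str], scan: Iterable[Sighting]) -> List[str]:
    """Expected APs that did not appear in the scan at all."""
    seen = {s.bssid for s in scan}
    return sorted(b for b in set(expected_bssids) if b not in seen)


def weak_aps(scan: Iterable[Sighting], floor_dbm: int = -75) -> List[str]:
    """APs heard below the chosen power floor (the floor is an assumption)."""
    return sorted({s.bssid for s in scan if s.rssi_dbm < floor_dbm})


if __name__ == "__main__":
    # Made-up scan results for illustration only.
    scan = [
        Sighting("00:11:22:33:44:01", "plant", 1, -52),
        Sighting("00:11:22:33:44:02", "plant", 6, -81),
    ]
    expected = ["00:11:22:33:44:01", "00:11:22:33:44:02", "00:11:22:33:44:03"]
    print(missing_aps(expected, scan))
    print(weak_aps(scan))
```

An AP that is missing from the scan may be powered off or jammed, while one that is present but weak points toward placement or attenuation.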
Once the problem AP is located, try rebooting it before taking further steps. This simple step is effective in most cases. In the case of a client device, resetting the network adapter or rebooting the device almost always works if the AP is operational and configured correctly. These simple steps often save hours of fruitless troubleshooting and subsequent misconfiguration of devices. Another technique that is sometimes effective is to reset the TCP/IP stack on a Windows client device. This is accomplished with netsh, the network shell command-line utility. To reset the stack, open a command prompt with administrator rights and enter:
netsh int ip reset
This command rewrites the registry keys used by TCP/IP and DHCP, returning the stack to its default settings. Should you require a log of the reset, use the following command:
netsh int ip reset C:\resetlog.txt
Restart the computer after each reset, then run ipconfig to confirm that the reset has taken effect and that the client device has a valid IP address.
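When reading the ipconfig output, an address in the 169.254.0.0/16 range usually means the client assigned itself an automatic private address because it could not obtain a DHCP lease. A rough Python check for that case is sketched below. The function name and the sample addresses are hypothetical, and the check says nothing about whether the address is right for your particular subnet.

```python
import ipaddress


def looks_usable(address: str) -> bool:
    """Rough test that an address reported by ipconfig could be a real lease.

    Rejects malformed strings, self-assigned link-local addresses
    (169.254.x.x), the unspecified address (0.0.0.0), and loopback.
    """
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    return not (ip.is_link_local or ip.is_unspecified or ip.is_loopback)


if __name__ == "__main__":
    # Hypothetical addresses copied by hand from ipconfig output.
    for addr in ("192.168.1.23", "169.254.10.5", "0.0.0.0"):
        print(addr, looks_usable(addr))
```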
Attenuation is also a major concern that can render a WLAN useless. Metal and concrete dividers will greatly reduce signal strength and quality. Try moving the AP or client device to a different location and observe the received signal strength indicator (RSSI) after a few seconds. If the signal improves, consider a physical reconfiguration of the network devices, or add more APs to provide better coverage. This will also cut down on AP loading and probably improve throughput. Remember to assign channels properly, because assigning the same channel to neighboring APs will only exacerbate existing problems. Placing a single AP on one side of an elevator shaft will probably result in a loss of signal for most clients on the other side, while placing an AP on each side of the shaft will likely provide seamless coverage throughout the entire space.
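Because RSSI readings fluctuate from moment to moment, it helps to compare several samples taken before and after a move rather than a single reading. The sketch below assumes the readings are in dBm and treats a 3 dB gain as meaningful. The threshold, the function name, and the sample values are assumptions for illustration.

```python
from statistics import mean
from typing import Sequence


def rssi_improved(
    before_dbm: Sequence[float], after_dbm: Sequence[float], margin_db: float = 3.0
) -> bool:
    """True when the average reading rose by at least margin_db.

    Averaging dBm values directly is a simplification, but it is
    adequate for a quick before/after comparison.
    """
    if not before_dbm or not after_dbm:
        raise ValueError("need at least one sample before and after the move")
    return mean(after_dbm) - mean(before_dbm) >= margin_db


if __name__ == "__main__":
    # Hypothetical readings: behind a concrete wall, then in line of sight.
    print(rssi_improved([-78, -80, -79], [-65, -66, -64]))
```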
Interference, whether unintended or malicious, is also a cause of a loss of network connectivity. After trying everything else and seeing no improvement, it could be time to bring in the big guns to assess your RF environment. This requires the use of a spectrum analyzer, a device that will determine what frequencies are being used in your area. If there is a source of interference, then the spectrum analysis will reveal the nature of it and possibly allow you to track it down and eliminate it. In the next segment we will talk about electromagnetic interference (EMI) and its effects on RF communications.
If you aren't knowledgeable about WLANs and the arcane and convoluted methods employed to effect communication in a very crowded spectrum, it is advisable to retain an expert. The money spent on a seasoned professional will likely be offset by the frustrating time saved trying to determine why the WLAN is so slow or simply not working.
- Daniel E. Capano, owner and president, Diversified Technical Services Inc. of Stamford, Conn., is a certified wireless network administrator (CWNA); firstname.lastname@example.org. Edited by Chris Vavra, production editor, CFE Media, Control Engineering, email@example.com.
www.controleng.com/blogs has other wireless tutorials from Capano on the following topics:
Wireless intrusion detection and protection systems
Integrating a wireless LAN into an existing wired LAN
Choosing between single and multi-channel architecture
Learn more about Netstumbler, which is a free program, at www.netstumbler.com.