Troubleshooting TCP/IP
Introduction
You just received another late evening page from the help desk: "I have a problem with my network access. It just doesn't work!" What should you do next? Troubleshoot! Troubleshooting is a necessary part of supporting any network installation. Determining and repairing problems can consume a lot of time, especially if you don't know what to do or how to do it correctly and quickly. In this paper, I will explain how you might troubleshoot different problems that could exist in your network. These techniques can all be performed using common tools available in modern operating systems. The more you know about these tools, the better you can use them to fix your problems.
Methodology Required
Effective troubleshooting requires a methodology. Without one, you may still be able to troubleshoot some problems successfully, but other problems may stay hidden because you have not considered all of the possibilities. You may already have a favorite method for troubleshooting. If it works, use it. You might also wish to consider another approach. One method that has worked successfully is the use of the OSI model for problem determination. The OSI model describes the data communication process. The model consists of seven layers, and each layer has a given set of responsibilities in the communication process. The seven layers are as follows:
Layer 7 Application
Layer 6 Presentation
Layer 5 Session
Layer 4 Transport
Layer 3 Network
Layer 2 Data Link
Layer 1 Physical
Information flows from layer 7, the application layer, down through to layer 1, where it is placed on the physical network circuits. After it arrives at the next device, it flows up and down the layers as needed. Data switches typically look only at layer 2 information to make switching decisions. Routers look at information at layer 3. Clients and servers make use of all seven layers. At the destination machine, the information rises from layer 1 up to layer 7, where it interfaces with the application.
Using the OSI model as the foundation of the troubleshooting methodology allows you to examine the various layers of the model to make sure the involved devices and circuits are performing as expected. It is like trying to track relatives on a long distance trip from New York City to Los Angeles. After leaving New York, the relatives are expected to spend the night in Columbus, Ohio, a 560 mile trip. You can call the hotel in Columbus to find out if they got there. If they have, you know that part of the trip was successful. In using the OSI model, we can verify the success of the communication process at various layers or levels of the model. Did my communication attempt make it to a specific layer? Did it make it to a specific device? If not, where did it fail? Locating the point of failure helps you determine what the failure might be and what solution might be needed.
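The layer-by-layer approach described above can be sketched as a small lookup that reports the lowest layer whose check failed. This is only an illustration: the pairing of tools to layers is an assumption based on the tools covered in this paper, and the names OSI_LAYERS and first_failed_layer are hypothetical.

```python
# Hypothetical mapping of OSI layers to the tools discussed in this paper.
# The tool assignments are an illustrative assumption, not a standard.
OSI_LAYERS = [
    (1, "Physical", "link lights, netstat -e"),
    (2, "Data Link", "arp -a, netstat -e"),
    (3, "Network", "ipconfig, ping, tracert, pathping"),
    (4, "Transport", "telnet to a port, port scanner"),
    (5, "Session", "protocol analyzer"),
    (6, "Presentation", "protocol analyzer"),
    (7, "Application", "telnet to the service port, protocol analyzer"),
]


def first_failed_layer(results):
    """Return (number, name, tools) for the lowest layer whose check failed.

    `results` maps a layer number to True (verified) or False (failed).
    Layers that are missing from `results` have not been checked yet and
    are skipped. Returns None when no checked layer has failed.
    """
    for number, name, tools in OSI_LAYERS:
        if results.get(number) is False:
            return number, name, tools
    return None
```

Working from layer 1 upward means the first failure reported is the one to fix first, since a fault at a lower layer usually explains failures at the layers above it.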
Activity Indicators
For your workstation, the key is to make sure that your device is actually sending or receiving information from other devices.
In the world of Windows, the sure indicator is the blinking lights found on the task bar. In the picture above, the two small screens show the network activity. If you see the color of the screens change, blink, or turn a solid light blue, you are sending and/or receiving information to or from other devices. Your physical network is working!
ARP
Another sure way to see if you have been actively communicating with other devices is to view the contents of your ARP cache. ARP, the Address Resolution Protocol, is the process used to find the Media Access Control (MAC) address of another device on your local network from its IP address. If you have entries in your ARP cache, you have recently communicated with other devices.
You can view your ARP cache by entering the command arp -a as seen above. In the example, the device has successfully communicated with four different IP devices. The only issue with ARP is that the ARP cache is often empty because old entries are aged out on a regular schedule. It is not uncommon to find that there are no ARP entries in your ARP cache.
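If you want to count the entries rather than read them by eye, the output of arp -a can be parsed with a short script. The sketch below assumes the Windows-style layout (an IP address followed by a MAC address on the same line); other operating systems format the output differently and would need a different pattern. The sample text is illustrative, not captured from a real device, and the function names are hypothetical.

```python
import re
import subprocess

# An IPv4 address followed by a MAC address written with '-' or ':'.
ARP_ENTRY = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3})\s+((?:[0-9A-Fa-f]{2}[-:]){5}[0-9A-Fa-f]{2})"
)

# Illustrative sample in the Windows layout (assumed, not real data).
SAMPLE_ARP_OUTPUT = """
Interface: 192.168.1.10 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.1           00-11-22-33-44-55     dynamic
  192.168.1.20          00-aa-bb-cc-dd-ee     dynamic
"""


def parse_arp_cache(text):
    """Return a list of (ip, mac) pairs found in `arp -a` output."""
    return ARP_ENTRY.findall(text)


def read_arp_cache():
    """Run `arp -a` on this machine and return the parsed entries."""
    result = subprocess.run(["arp", "-a"], capture_output=True, text=True)
    return parse_arp_cache(result.stdout)
```

An empty list from read_arp_cache() does not prove a fault by itself, because entries age out; generate some traffic first and then check again.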
NETSTAT
Another program can be used to determine local area network physical connectivity. NETSTAT is a common program on most operating systems. It has a number of capabilities but in this troubleshooting role, the NETSTAT option that shows Ethernet activity is the one you want to use.
By entering netstat -e you will see a display similar to the one above. Here we can clearly see that the Ethernet interface has sent and received some traffic. In this case it is not much, but activity has occurred. However, this was in the past. What's happening now? If you wait a minute or two and then enter the same command, you should see a change in the numbers. If you do, you are communicating. If you do not see a change, you possibly have a problem with your Ethernet connectivity.
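The before-and-after comparison can also be done mechanically. The following sketch compares the byte counters from two saved copies of netstat -e output. It assumes the English Windows layout, where a row labeled Bytes carries the received and sent totals; the sample text is illustrative and the function names are hypothetical.

```python
import re

# Illustrative sample in the assumed Windows layout (not real data).
SAMPLE_BEFORE = """Interface Statistics

                           Received            Sent

Bytes                       1204518          310422
Unicast packets                2210            1987
"""


def parse_bytes_row(text):
    """Return (received, sent) from the 'Bytes' row of `netstat -e` output."""
    match = re.search(r"^\s*Bytes\s+(\d+)\s+(\d+)", text, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Bytes' row found in the output")
    return int(match.group(1)), int(match.group(2))


def traffic_changed(before, after):
    """Return True when either byte counter differs between two snapshots."""
    return parse_bytes_row(before) != parse_bytes_row(after)
```

Capture the command output twice, a minute or two apart, and pass both captures to traffic_changed(). A result of False matches the "no change" case described above.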
Protocol Analyzers
If you have access to a protocol analyzer, you might use it to diagnose a physical network problem. The protocol analyzer captures the frames you are transmitting and receiving on your workstation. Once you start capturing frames with the tool, you should see the frame counters increase and may see the frames enter the capture buffer.
In this example, you can actually see the frames that have entered the capture buffer. A limiting factor for the use of a protocol analyzer as a diagnostic tool is the fact that most companies and organizations do not allow most users to have access to a protocol analyzer due to security considerations!
Toward the middle of the display, you can also see the 5 minute input and output rates for the Ethernet 0/0 interface. The interface is averaging a mere one packet per second. This is extremely low for most devices, but it does show that the device is communicating.
Link Lights
You may also be able to determine if your device is successfully attached to the network by looking at the link lights on the device, if there are any. Some Network Interface Cards (NICs) have a light that indicates the cable is properly connected to the wall jack, and that the wall jack is connected to a switch or hub at the other end. The light shows that a physical connection exists, and that the network is available. A similar indicator light is found on the switch as well. A network technician might be able to tell you if the light for your device connection is lit on the switch. If it is, look elsewhere for the problem. There are many other techniques and tools for troubleshooting the physical and data link layers. The ones suggested above are the most common and easiest for most people to use.
IPCONFIG or IFCONFIG
The first step in making sure we have proper addresses on a Windows device is to use the ipconfig command.
IPCONFIG shows the IP address, mask, and default gateway information that is currently in use on the device. You could have looked at the device configuration, but this is much quicker. On its own, however, it does not show the whole story that a troubleshooter might need. Using the ipconfig /all command shows all of the configuration details. Here you can see the address of the DNS server, the DHCP server if in use, the DNS domain information, and much more that you will need to verify.
Make certain that the IP address, the subnet mask, the default gateway, and the DNS server address are correct for the subnet in which you are located. If any of these four entries is incorrect, you are probably not going to be able to communicate successfully on your network. In the UNIX environment, the command used to see network device configuration is ifconfig. The two are different commands that display similar information.
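One of these checks is easy to automate: the default gateway must sit in the same subnet as the device's own address. A minimal sketch using Python's standard ipaddress module follows. The function name and the addresses in the usage note are hypothetical; substitute the values reported by ipconfig.

```python
import ipaddress


def gateway_on_local_subnet(ip, mask, gateway):
    """Return True when `gateway` lies inside the subnet defined by ip/mask.

    All three arguments are dotted-decimal strings, as shown by ipconfig.
    """
    # ip_interface accepts "address/netmask" and derives the network from it.
    interface = ipaddress.ip_interface(f"{ip}/{mask}")
    return ipaddress.ip_address(gateway) in interface.network
```

For example, gateway_on_local_subnet("192.168.1.10", "255.255.255.0", "192.168.1.1") returns True, while the same call with a gateway of "192.168.2.1" returns False, which would point to a wrong mask or a wrong gateway entry.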
Ping
Once you have determined that your configuration is correct, use PING to test your connectivity in the network. To use ping, open the command window and enter ping <address|domain name>. You can ping by IP address or domain name of the device you wish to reach.
A commonly used approach is to ping other workstations on your local network. If those pings do not work, you have a local connectivity problem of some type. Next you would ping your default gateway. If you cannot ping your default gateway, you do not have connectivity to the router and will not be able to access any devices that are not on your local area network. Next you might try to ping other servers or devices administered by your company that are attached to other internal networks. If all of these pings succeed but you still cannot get to devices attached to other networks, like devices on the Internet, you might need to use other tools. But before I go on, there might be one problem with ping. Ping is seen as a hacker tool in some organizations. Your attempts to ping devices might not work, even though you really have connectivity. Some organizations will not allow ping packets to go through routers. Using ping to find network problems in those places will be impossible.
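The ping sequence above (local workstation, default gateway, then devices on other internal networks) can be sketched as a loop that stops at the first stage that does not answer. The function names and the addresses in the usage note are hypothetical. The sketch relies on the exit code of the system ping command, which is an imperfect signal: on some systems an "unreachable" reply from a router can still produce a success code, and, as noted above, blocked ping traffic looks the same as a real failure.

```python
import platform
import subprocess


def ping_once(target, timeout_s=5):
    """Send one echo request with the system ping; True if it exits with 0."""
    # Windows uses -n for the request count; most other systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", target],
            capture_output=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def first_unreachable(stages, ping=ping_once):
    """Ping each (label, target) pair in order.

    Returns the label of the first stage that fails, or None when every
    stage answers. `ping` can be replaced for testing.
    """
    for label, target in stages:
        if not ping(target):
            return label
    return None
```

A call such as first_unreachable([("local workstation", "192.168.1.20"), ("default gateway", "192.168.1.1"), ("internal server", "10.1.1.50")]) returns the label of the first stage that failed, which tells you where to look next.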
Tracert/Traceroute
No matter what you call it, the ability to trace a route through the network is essential in troubleshooting routing problems. Traceroute was originally implemented in Unix with a very long name. Microsoft shortened the name to Tracert due to its early file naming limitations. It has stuck with the program since then. If you enter tracert <address|domain name>, the program will begin displaying the addresses of the routers used to go between your device and the device you wish to trace the route to.
In the example above, there are four routers between the original device and the target device, 192.168.2.50. Three times are displayed for each router, one for each round-trip probe between the original device and that router. The route was successfully traced. If, however, the route was not complete, the process would not show the trace complete message. A series of asterisks (*) would replace the time display, indicating that the next router or device did not respond.
If you see a repeated set of lines with the *s instead of the times, the last router address displayed indicates the location of the break in the route. This information should be passed on to your network layer troubleshooters to help them discover and repair the situation.
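Finding the last router that answered can be automated by scanning saved tracert output. The sketch below assumes the Windows tracert layout, where each hop line starts with the hop number and ends with the router address or a timeout message. The sample text is illustrative, not a real capture, and the function name is hypothetical.

```python
import re

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

# Illustrative sample in the assumed Windows layout (not real data).
SAMPLE_TRACE = """
Tracing route to 192.168.2.50 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     2 ms     1 ms     2 ms  10.0.0.1
  3     *        *        *     Request timed out.
  4     *        *        *     Request timed out.
"""


def last_responding_hop(text):
    """Return (hop_number, address) of the last hop that answered, or None."""
    last = None
    for line in text.splitlines():
        hop = HOP_LINE.match(line)
        if hop is None:
            continue  # header, blank, or trailer line
        addresses = IPV4.findall(hop.group(2))
        if addresses:
            # The router address is the last address on the line.
            last = (int(hop.group(1)), addresses[-1])
    return last
```

For the sample trace, the break lies just beyond the address returned, which is the information to hand to the network layer troubleshooters.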
Pathping
Microsoft has added an additional layer of functionality to tracert and has renamed the new program as Pathping. Pathping does the same initial process as tracert but then goes on to calculate more precise network times for each hop in the route.
For troubleshooting, tracert gives you results in a hurry. If you have more time and need more detailed throughput numbers, pathping might be more useful.
If the router network is a bit unstable, it is also possible that a routing loop exists. Should that occur, you could possibly see a display like the one below. In this case, the packets that were sent to 10.1.1.50 were in an endless loop in the routed network and were discarded by router 168.193.1.1 because of that.
In both cases above, additional research by a network administrator is required to determine the actual cause of the problem. But the symptoms are well defined by the error messages received from ICMP.
Protocol Analyzer
As mentioned before, a protocol analyzer is an excellent tool to use in diagnosing networking problems. By opening the actual datagrams, you can get closer to the issues that are causing problems.
Telnet
One troubleshooting tool that is often overlooked is Telnet. Telnet is an application program that was designed for terminal emulation. The intended use is to provide command-line access to mainframes, UNIX systems, routers, and switches for applications and configuration. It can also be used to verify that an application is actually available on a server. For example, entering the following command will verify that an e-mail server is actually functioning on the named server: telnet server1 25
The 25 inserted after the server name is the port number used by SMTP e-mail servers. To use this method of verification, you must know the port numbers used by the various applications. This method only works for applications that use the Transmission Control Protocol (TCP).
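Where a Telnet client is not installed, the same TCP check can be made with a few lines of Python. This is a sketch of the equivalent test, not a replacement for Telnet: it only confirms that something accepts a TCP connection on the port, not that the application behind it is healthy. The function name is hypothetical, and server1 in the usage note is the example name from the text.

```python
import socket


def tcp_port_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Calling tcp_port_open("server1", 25) corresponds to the telnet server1 25 test: True means the SMTP port accepted the connection.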
Port Scanning
Another method for determining if a service is available on a server is to use a Port Scanner. A word of caution is necessary. Port Scanners are often considered hacker tools, and the use of a port scanner in your enterprise network might be restricted. Before you try any port scanner, determine if you have permission to do so. The port scanner sends probing messages to the server in question and sees if the server responds. In the example below, the port scanner was looking for all open User Datagram Protocol (UDP) ports. Each port represents an application that the server has available for users.
The same process can be run to determine the available TCP ports or applications. Below is an example of that use.
In either case, you now know what applications are available on the server.
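To show what a TCP port scanner does at its simplest, here is a minimal connect scan. It is a sketch only: it handles IPv4 and TCP, checks ports one at a time, and does not attempt the UDP scan described above. The same caution applies, so run it only against servers you have permission to probe. The function name is hypothetical.

```python
import socket


def scan_tcp_ports(host, ports, timeout_s=1.0):
    """Return the sorted list of ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout_s)
        try:
            # connect_ex returns 0 on success and an error code otherwise,
            # instead of raising an exception for a refused connection.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
        finally:
            sock.close()
    return sorted(open_ports)
```

A call such as scan_tcp_ports("server1", [21, 23, 25, 80]) returns only the ports that answered. Matching those numbers to applications still requires knowing which port each application uses.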
Protocol Analyzer
The use of a protocol analyzer in troubleshooting requires a substantial knowledge of the layer 4 protocols, TCP and UDP. The analyzer shows each element of the datagram and each part of the associated protocols.
The example above shows a UDP header and the beginning of the associated RIP information captured from the network using a protocol analyzer. Knowledge of UDP and RIP is required to successfully use this information for troubleshooting. Here is another example. This shows a TCP segment involving a telnet session between a client and server.
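To make the UDP example more concrete, the sketch below decodes the four fields of a UDP header the way an analyzer would. The UDP header is 8 bytes: source port, destination port, length, and checksum, each 16 bits in network byte order. RIP uses UDP port 520. The function name is hypothetical, and the bytes in the usage note are constructed for illustration, not captured from a network.

```python
import struct


def parse_udp_header(datagram):
    """Decode the 8-byte UDP header at the start of `datagram` (bytes)."""
    if len(datagram) < 8:
        raise ValueError("a UDP header is 8 bytes long")
    # "!HHHH" = network byte order, four unsigned 16-bit fields.
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,      # header plus payload, in bytes
        "checksum": checksum,
        "payload": datagram[8:],
    }
```

Passing struct.pack("!HHHH", 520, 520, 32, 0) followed by 24 payload bytes yields source and destination ports of 520 and a length of 32, the 8-byte header plus a 24-byte RIP message.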
Summary
Learning how to troubleshoot is an ongoing process. Simple troubleshooting techniques that you might use to solve one issue will not be sufficient to solve a more complex issue. During down time, or if you have some extra idle time built into your schedule, become more familiar with your tool set. Practice using the tools as frequently as possible. Compare notes with your co-workers. Fixing network problems is a rewarding part of network administration. The more practice you get, the better you will become at performing this important function.
Learn More
Learn more about how you can improve productivity, enhance efficiency, and sharpen your competitive edge. Check out the following Global Knowledge courses:
Understanding Networking Fundamentals
TCP/IP Networking
Network+ Boot Camp
For more information or to register, visit www.globalknowledge.com or call 1-800-COURSES to speak with a sales representative.
Our courses and enhanced, hands-on labs offer practical skills and tips that you can immediately put to use. Our expert instructors draw upon their experiences to help you understand key concepts and how to apply them to your specific work situation. Choose from our more than 700 courses, delivered through Classrooms, e-Learning, and On-site sessions, to meet your IT and management training needs.