CompTIA Network+ N10-008 – Module 17 – Troubleshooting Networks Part 5

  • April 9, 2023

9. 17.8 Common Network Service Issues

In this video, we want to talk about some common network service issues, and we might run into some of these as we’re doing our day-to-day troubleshooting. One thing I’ve seen a lot of is an issue with name resolution. Maybe the DNS server that you’re using isn’t doing its job. I used to work at a university, and we had a DNS server located on campus that we maintained.

And many times we got a report that the Internet was down. And it wasn’t that the Internet was down. Our connection wasn’t down. It was simply that the DNS server was not doing its job. So what I would start doing to troubleshoot that is, I would ping IP addresses that I knew to be out on the Internet. Could I get to an Internet-based host using its IP address instead of resolving its name? If so, then we could start to focus our troubleshooting on name resolution.
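As a rough sketch of that decision process, the Python below separates "can I reach a known IP address?" from "can I resolve a name?". It is an illustration under assumptions, not a definitive tool: it uses a TCP connection instead of ping (ICMP usually needs elevated privileges from a script), and the function names `can_reach_ip`, `can_resolve`, and `diagnose` are made up for this example.

```python
import socket
from typing import Callable


def can_reach_ip(ip: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds (no DNS lookup involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name: str) -> bool:
    """Return True if the system resolver can turn the name into an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def diagnose(ip: str, name: str,
             reach: Callable[[str], bool] = can_reach_ip,
             resolve: Callable[[str], bool] = can_resolve) -> str:
    """Classify the likely problem area from the two checks."""
    if not reach(ip):
        return "no connectivity to known IP: check link, gateway, or routing"
    if not resolve(name):
        return "IP reachable but name lookup failed: focus on DNS"
    return "connectivity and name resolution both look fine"
```

The `reach` and `resolve` parameters exist only so the two checks can be swapped out, for example with a real ping wrapper, without changing the decision logic.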

For example, you might want to try to ping 8.8.8.8 or 8.8.4.4. Those are a couple of Google servers that do DNS resolution for us, and those are some easy-to-remember IP addresses. So if you can ping those, you’re getting out to the Internet. But if you cannot get to cisco.com, but you can ping those addresses, then you might want to focus on name resolution troubleshooting. We might also be configured with an incorrect default gateway. Think about it. Our default gateway is the way we get off of our subnet out to the rest of the world. And if we’re pointing to an incorrect IP address for our default gateway, we’re probably not going to be going off of our local subnet. So let’s make sure we have that configured correctly, either statically or via DHCP.
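One specific kind of bad default gateway can be checked without sending any traffic: a gateway address that is not even inside the host’s own subnet. The sketch below uses Python’s standard `ipaddress` module. The function name and the sample addresses are hypothetical, and it will not catch a gateway that is on the subnet but simply points at the wrong device.

```python
import ipaddress


def gateway_is_on_link(host_ip: str, prefix_len: int, gateway_ip: str) -> bool:
    """Return True if the gateway address falls inside the host's own subnet."""
    network = ipaddress.ip_interface(f"{host_ip}/{prefix_len}").network
    return ipaddress.ip_address(gateway_ip) in network


# Hypothetical host 192.168.1.50/24 with two candidate gateways.
print(gateway_is_on_link("192.168.1.50", 24, "192.168.1.1"))
print(gateway_is_on_link("192.168.1.50", 24, "192.168.2.1"))
```

The first call reports a plausible gateway, while the second reports a gateway the host could never reach directly.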

Similarly, we want to have correct subnet mask information. The subnet mask for an IP version 4 address specifies which bits refer to the network portion of an address and which bits refer to the host portion of an address. So it’s possible that we could have a couple of IP addresses that are actually on different subnets just based on the subnet mask, although a casual observation might have us believe that they’re on the same subnet. For example, I might have 10.1.1.60 and 10.1.1.70. Are those on the same subnet? I don’t know.
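To make that concrete, here is a small sketch using Python’s standard `ipaddress` module. The two addresses and the two prefix lengths are illustrative choices, and `same_subnet` is a made-up helper name.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses land in the same network under this mask."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# Same pair of addresses, two different masks.
for prefix in (24, 26):
    print(prefix, same_subnet("10.1.1.60", "10.1.1.70", prefix))
# prints:
# 24 True
# 26 False
```

With a /24 mask both hosts sit in 10.1.1.0/24. With a /26 mask, 10.1.1.60 is in 10.1.1.0/26 and 10.1.1.70 is in 10.1.1.64/26.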

It depends on the subnet mask. Sometimes they might be on the same subnet, other times they might be on different subnets. It depends on the subnet mask. For example, let’s say that we’ve got a couple of hosts and they don’t agree on what the subnet mask is. One host might think that an IP address is local when maybe it’s not, so it doesn’t try to go to the default gateway, or a host might think that a device is remote when it’s not.

So it does try to go to the default gateway to get to somebody that’s on the local subnet. Let’s make sure we have correct subnet masks set up. And we also want to make sure that all of our IP addresses on our network are unique. We don’t want to have overlapping IP addresses because we could have an unintentional host responding to our request because it has an overlapping IP address with the device that we wanted to reach. Now, this is not to say we should never have overlapping IP addresses. For example, you’ve got two sites, each in its own location. They’re remote from one another. You might be using the same IP addressing within each site. You might be using private IP addressing, RFC 1918 addressing. For example, both sites might be using the 10.0.0.0/8 address space. That’s fine, because you’re going to be doing network address translation as you leave one site going out over the Internet and then doing that again as you come into that remote site. But within a site, we do not want to have overlapping IP addressing. And similarly, we don’t want to have overlapping MAC addresses. This could corrupt our MAC address table on a switch.

If it starts to believe that a specific MAC address is off Gigabit 1/1, and then it sees evidence that, oh no, it’s off Gigabit 1/10, and no, it’s back on 1/1, and no, it’s back on 1/10, that could cause a lot of instability in the MAC address table and prevent frames from being delivered to where they’re supposed to be delivered. Now, it’s rare that this would happen, but you can actually go in and statically configure a MAC address. And it’s possible that somebody might do that with malicious intent. And something else to keep in mind when it comes to IP addresses is we want to make sure that those IP addresses have not expired.

Remember earlier in the course, we talked about DHCP, and that was a really common way that hosts could get their IP address information. But that DHCP information was a lease. It was not a permanent assignment. The DHCP server would be saying to the host that, for a period of time, let’s say three days as an example, you have this IP address and the associated information like subnet mask and default gateway. And then about halfway through those three days, the host would try to renew its lease. But if that DHCP server has been down for a while and doesn’t hear that request to renew the lease, and it only comes back online after a period of time, that IP address can expire. We want to make sure that we have an active IP address that has not expired.
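The lease timeline can be sketched in a few lines of Python. This is a simplified model that only captures the "renew at about the halfway point, expire at the end" behavior described here. The function name and the three status labels are assumptions for illustration, and real DHCP clients have additional states.

```python
from datetime import datetime, timedelta


def lease_status(granted_at: datetime, lease: timedelta, now: datetime) -> str:
    """Describe where a DHCP lease stands at time `now` (simplified model)."""
    renew_at = granted_at + lease / 2   # host first tries to renew about halfway through
    expires_at = granted_at + lease     # address may no longer be used after this point
    if now >= expires_at:
        return "expired"
    if now >= renew_at:
        return "renewing"
    return "active"


# Hypothetical three-day lease granted on the morning of April 9.
granted = datetime(2023, 4, 9, 8, 0)
three_days = timedelta(days=3)
print(lease_status(granted, three_days, granted + timedelta(days=1)))  # active
print(lease_status(granted, three_days, granted + timedelta(days=2)))  # renewing
print(lease_status(granted, three_days, granted + timedelta(days=4)))  # expired
```

If the server is unreachable for the whole "renewing" window, the host ends up in the "expired" case.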

And also somebody could intentionally or accidentally place a rogue DHCP server on our network. I think I gave you an example earlier where one time I was setting up Internet access for one of those big tech conventions, and everything was working beautifully. The night before, we as the service provider, had our DHCP server set up. But then the next morning, the vendors are all rolling in and setting up their equipment. And some of their equipment actually was configured as DHCP servers.

So we had about four different DHCP servers on the show floor, and lots of incorrect IP addresses were being handed out. And we discussed earlier in the course that madness ensued, but I could have prevented that had I used a feature called DHCP snooping. And with that feature, we only trust DHCP offer messages coming in on specific trusted ports on a switch. Another issue that I’ve seen quite a bit is when it comes to certificates. Let’s say that we have an untrusted SSL certificate, a Secure Sockets Layer certificate. To be a trusted certificate, that certificate, which contains an entity’s public key, among other things, needs to be signed by a trusted third party.

And we call that trusted third party a CA. And that might stand for certification authority, or in some literature it’ll be called a certificate authority, but we call it a CA for short. We need that trusted third party, that CA, to sign a certificate. However, if we’re running a service on a local host that we have, I’ve seen this many times where local services will have self-signed certificates. They’re not using a trusted third party. There’ll be this certificate, it’ll have a public key in it, but it’s not signed by a CA. So we might have to say that we will make an exception and trust specific certificates that are not signed by a CA, because we know that they were locally signed. However, if a certificate was not intended to be self-signed, then perhaps we should not trust it. But sometimes a certificate will not be trusted because it’s not signed by a trusted CA, or perhaps it has expired. And one reason that I’ve seen a certificate appear to be expired is because the device reading the certificate doesn’t have the correct time. It doesn’t know what the correct date is, and it doesn’t know what the correct time is. So we might view a certificate as being expired, when really it’s not.
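The clock problem is easy to illustrate. A certificate carries a validity window (the X.509 fields are called notBefore and notAfter), and the device compares that window against its own clock. The sketch below is a simplified model with made-up dates and a made-up function name. Real validation also checks the signature chain, which is not modeled here.

```python
from datetime import datetime


def certificate_time_status(not_before: datetime, not_after: datetime,
                            device_clock: datetime) -> str:
    """Judge a certificate's validity window against the device's own clock."""
    if device_clock < not_before:
        return "not yet valid"
    if device_clock > not_after:
        return "expired"
    return "valid"


# Hypothetical certificate valid for calendar year 2023.
start, end = datetime(2023, 1, 1), datetime(2024, 1, 1)
print(certificate_time_status(start, end, datetime(2023, 4, 9)))  # valid
print(certificate_time_status(start, end, datetime(2030, 1, 1)))  # expired
print(certificate_time_status(start, end, datetime(2000, 1, 1)))  # not yet valid
```

The certificate is the same in all three calls. Only the device’s idea of the current time changes.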

Or we can have time-based access control lists in our network, and they say that during different times of the day or different days of the week, specific traffic is going to be permitted or denied. And those devices that have those rules set up, if they don’t have correct time, then we can have very unpredictable behavior in the network. And having correct time is also really important for doing troubleshooting. Earlier in the course, we talked about doing event correlation where I can look at log information. Let’s say that something really bad happened at 2:00 a.m. last night. Maybe I look at some log information and see that this server over here had an issue at 1:59 a.m., and this other device, maybe a router, had an issue at 1:58 a.m.
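A minimal sketch of that kind of correlation is shown below. It assumes every device’s clock is accurate, which is exactly why correct time matters. The device names, timestamps, five-minute window, and function name are all illustrative.

```python
from datetime import datetime, timedelta


def correlated_events(events, incident_time, window=timedelta(minutes=5)):
    """Return the (device, timestamp) pairs that fall within `window` of the incident."""
    return [
        (device, ts) for device, ts in events
        if abs(ts - incident_time) <= window
    ]


# Hypothetical log entries gathered from three devices.
events = [
    ("server", datetime(2023, 4, 9, 1, 59)),
    ("router", datetime(2023, 4, 9, 1, 58)),
    ("printer", datetime(2023, 4, 8, 14, 30)),
]
incident = datetime(2023, 4, 9, 2, 0)
for device, ts in correlated_events(events, incident):
    print(device, ts.isoformat())
```

Only the server and router entries land inside the window. If the router’s clock were an hour off, its entry would be missed.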

Maybe I can do some event correlation and conclude that those events were related. And we talked in this course about how to set correct time on your network devices. Do you remember the protocol that we used? It was NTP, the Network Time Protocol. We made the observation that someone with one watch always knows what time it is, but someone with two watches is never quite sure. Yeah, we want to have one watch. We want to have a common time source. And another DHCP issue we might have is with a pool or a scope, which is a collection of IP addresses within a subnet. And maybe we’re out of those. We have exhausted our DHCP pool, or sometimes we call that a scope.
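A rough way to reason about scope exhaustion is to compare how many addresses a subnet can offer against how many clients show up. The sketch below uses Python’s standard `ipaddress` module. The helper names and the sample networks are assumptions for illustration, and the host count is only meaningful for prefixes of /30 or shorter.

```python
import ipaddress


def usable_hosts(cidr: str) -> int:
    """Number of assignable host addresses in an IPv4 subnet (prefix /30 or shorter)."""
    network = ipaddress.ip_network(cidr)
    # Subtract the network address and the broadcast address.
    return network.num_addresses - 2


def scope_is_exhausted(scope_size: int, active_leases: int) -> bool:
    """True when every address in the scope is already leased out."""
    return active_leases >= scope_size


print(usable_hosts("192.168.1.0/24"))  # 254
print(usable_hosts("172.16.0.0/16"))   # 65534
print(scope_is_exhausted(scope_size=200, active_leases=230))  # True
```

A scope is often smaller than the subnet itself, because some addresses are held back for statically configured devices.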

This happened to me not too long ago at a church where I support their network. I set up a DHCP server, and it was handing out IP addresses in a Class C network. I think my scope might have had something like 200 IP addresses it could give out. But as people started coming in for Sunday services with their smartphones, and some people had tablets and the staff had laptops on the network, all of those devices combined exhausted the DHCP scope. We ran out of IP addresses, and as a result, people were not able to get on the Internet because so many people had brought in their mobile devices and connected to the public Wi-Fi. So what I had to do is I had to go in and change it from a Class C subnet with a 24-bit subnet mask. I had to back that off and use a Class B subnet. I used the RFC 1918 address space of 172.16.0.0/16, and that gave me thousands and thousands of valid IP addresses in the same DHCP scope. And another really common issue we might run into when it comes to network services is having specific ports blocked. Here, I’m talking about TCP ports or UDP ports.

Those ports might be blocked by a firewall. Those ports might be blocked by an access control list. We need to think about all the rules that we have set up in the network. Are we really allowing traffic through that needs to get through? For example, if I’m going out to the Internet, I might say it’s a great idea to set up an access control list that will allow traffic from the trusted inside network to go out to the big scary Internet. But I’ll have another rule that says don’t allow traffic from the Internet to come back into the private network for security reasons. But think about that.

That’s not going to work very well. If I want to have a conversation with a web server on the Internet, my packet can get to the web server. But that access control list configuration is going to drop the return traffic coming back from the web server. And we could get around an issue like that using a stateful firewall, as we discussed earlier. However, maybe we’re peering with a router at one of our sites. Maybe we’re using the EIGRP routing protocol. Well, if we’re not explicitly allowing EIGRP to flow through that firewall, in other words, if we’re blocking EIGRP traffic, that could prevent the EIGRP neighborship from forming. We need to have a clear understanding of exactly what ports and protocols need to get through our device, be it a router or a firewall. And sometimes the service is just nonresponsive. We try to go to a web service on a server and the page simply does not come up.

That’s happened to me many times, and oftentimes a fix for that is simply a reboot. Or we might just simply restart a service instead of rebooting the entire server. Oftentimes that will shake things loose enough to where the service starts running. And if it still doesn’t run after a reset, a reload or a reboot, then we might need to start digging a little bit deeper. Let’s see if that service is running on another server. Maybe we can point to that other server instead while we’re getting this issue figured out. Do I need to reinstall some software on the server? That can be the issue.
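Before rebooting or reinstalling anything, it can help to see how the service’s TCP port responds. The sketch below is an assumption-laden illustration with a made-up function name, not a definitive diagnostic. As a rule of thumb, an active refusal often means the host is reachable but nothing is listening on that port, while a timeout often means packets are being silently dropped somewhere, for example by a firewall or access control list, or that the host is down.

```python
import socket


def probe_tcp_port(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify how a TCP port responds: 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"   # something answered, but no service accepted the connection
    except socket.timeout:
        return "timeout"   # no answer at all within the timeout
    except OSError:
        return "error"     # other failures, such as an unreachable network


if __name__ == "__main__":
    # Hypothetical target: a web service expected on the local machine.
    print(probe_tcp_port("127.0.0.1", 80))
```

Those interpretations are hints only, since some firewalls actively reject connections instead of dropping them.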

And sometimes a service might not be responding because we have some sort of a hardware issue. Maybe we have an issue with the server’s memory or hard drive or a network interface card. And there could be other hardware issues on the network. I’ve seen switches have some ports that are working and some ports that are not working. There are a lot of potential hardware issues that we might run into. And this has been a look at some common network service issues that we should keep in the back of our mind as we’re doing our network troubleshooting.

10. 17.9 General Networking Issues

Over the last few videos, we’ve been considering some different troubleshooting targets we may run into. Let’s wrap up that series of videos with this video, considering some general network issues. First, we may have a misconfiguration of our device. When things are not working correctly, you may want to check the configuration, for example, of a router interface. Does it have the correct IP address and subnet mask? Is it administratively up? Are the speed and the duplex set correctly? And if you look at a router’s routing table, you may find missing routes, and you start trying to track down why those routes are not getting advertised to you.

Maybe you’ve not formed a neighborship with a router because you have some mismatched parameters in your routing protocol. Or you might have redundant links between a couple of routers and you get into a routing loop. The good thing about a routing loop as compared to a switching loop, however, is that a layer three packet has a TTL field. That’s a time to live field, and it gets decremented every time we go through a router. So if I had a TTL of three and I’m in a loop, when it goes into the next router, it’s going to be decremented to a two, then it loops back to the original router and it’s decremented to a one. Then it goes to the other router and it’s decremented to a zero.
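That countdown can be traced with a few lines of Python. This is a toy model of the walkthrough above, not a router implementation. The router names and the function name are made up.

```python
def forward_in_loop(ttl: int, routers: list[str]) -> list[str]:
    """Trace a packet bouncing around a loop of routers until its TTL reaches zero."""
    trace = []
    hop = 0
    while ttl > 0:
        router = routers[hop % len(routers)]
        ttl -= 1  # each router decrements the TTL by one
        trace.append(f"{router}: TTL now {ttl}")
        hop += 1
    trace.append("TTL is 0: packet discarded")
    return trace


# A packet with a TTL of 3 caught in a loop between two routers.
for line in forward_in_loop(3, ["R2", "R1"]):
    print(line)
# prints:
# R2: TTL now 2
# R1: TTL now 1
# R2: TTL now 0
# TTL is 0: packet discarded
```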

And when that TTL hits zero, the packet is going to be discarded. So we don’t have an endless loop situation like we might have in a switching topology. We can also check the interface status, and that may give us some insight as to what’s going on. We can look for FCS errors, for packets that are too big, which are called giants, and for packets that are too small, which are called runts. What kind of errors, if any, are we getting? And many times we’re looking at stats and we don’t know if they’re good or bad. We’re not sure what that number means because we don’t have baselines to compare it against. So it is critical during the good times that you get some measurements of baseline information, such as what is the average bandwidth on this interface?

What is the average error rate on this WAN connection? And back in the Ethernet days where we had hubs, there was the issue of collisions, where only one device could transmit at a time, and if two devices transmitted at approximately the same time, their frames could collide and they would have to be retransmitted. We’ve largely moved away from hubs in the Ethernet world, but collisions can still happen in a wireless network. We may have a couple of wireless clients that transmit at the same time, and this is pre-Wi-Fi 6, by the way, and they may collide.
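Coming back to baselines for a moment, here is one simple way a stored baseline could be used to judge a new reading. This is a sketch under assumptions: the three-standard-deviation threshold is an arbitrary illustrative choice, the sample error rates are invented, and the function name is hypothetical. It needs at least two baseline samples.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline: list[float], current: float,
                           threshold: float = 3.0) -> bool:
    """Flag `current` if it is more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) > threshold * sigma


# Invented error-rate samples (percent) collected during normal operation.
baseline_error_rates = [0.10, 0.20, 0.15, 0.10, 0.20]
print(deviates_from_baseline(baseline_error_rates, 0.18))  # False
print(deviates_from_baseline(baseline_error_rates, 2.50))  # True
```

Without the baseline list, a reading of 2.5 percent would just be a number with nothing to compare it against.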

And we talked about the routing loops and how the TTL would decrement to zero and drop a packet that was in an endless loop. We don’t have that luxury with switches. With switches, we may get into a broadcast storm situation where there’s a broadcast on the network and it just loops forever between two switches over their redundant connections. The good news is there’s a fix for that.

And the fix, as we discussed earlier in the course, is STP, the Spanning Tree Protocol. In a multicast network, if we’re using PIM dense mode, that’s where the multicast source initially assumes that every router in the entire topology must want to receive its traffic. And that multicast stream is going to be flooded throughout all the routers. But any routers that do not connect to a client that wants to receive the multicast stream will prune themselves off of the tree that just got created by that flood. But in a large network, that flooding could have an impact on network performance. And the bad news is, in some cases that flooding and pruning behavior happens every three minutes.

That’s one of the reasons that I prefer to use PIM sparse mode instead of PIM dense mode, because it does not have that flooding and pruning behavior. And if we do have a couple of paths to get from a router out to some destination, maybe on the Internet, if we’re leaving via one path and coming back via another path, that could lead to asymmetrical routing. And depending on what application we’re using, that could cause issues. We may have DNS issues where we’re not able to resolve a domain name into an IP address. Oftentimes I would have a user say, “I cannot get to the Internet,” or “The Internet is broken.” And as I investigated it, I realized that the DNS server that we had was simply not working. And what I did is I had a collection of IP addresses that were on the Internet, and I would try to ping those IP addresses.

And if I could successfully ping those IP addresses, I knew that the Internet was not broken. I just may not be resolving names and I could go check that DNS server. We might have issues with time. The network time protocol is supposed to keep our devices in sync in terms of time. But if it’s not working correctly, we may have some errors, such as a digital certificate expiring when it should not expire, because the date and time are not correct on the device. And today more and more people are starting to bring their own personal device into the corporate environment.

And that’s called BYOD, bring your own device. They may bring their own laptop, their own tablet, their own smartphone, and expect to get on the network and do everything they want to do from their device. That could cause some issues if that device has any sort of security weakness. Maybe the antivirus software is not up to date or nonexistent. Maybe it’s using an implementation of wireless networking that’s not supported in the environment.

So what many organizations do is they have BYOD policies that govern what sort of device someone can bring. And finally, we may have an issue where a service is not working because the license has expired. Or maybe it does not exist. Maybe we just had a trial version of something. So many times, and this can be true for network appliances as well as servers, we may want to check the licenses to make sure that they exist, that they cover the appropriate feature set, and that they’ve not expired.
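Those three license checks (exists, covers the feature set, not expired) can be written down as a short sketch. The `License` record and its fields are hypothetical. Real appliances and servers each expose license information in their own way.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class License:
    """Hypothetical license record, made up for this example."""
    product: str
    features: frozenset
    expires: date


def license_problems(lic: Optional[License], required: set, today: date) -> list:
    """Return a list of human-readable problems; an empty list means the license looks fine."""
    if lic is None:
        return ["no license installed"]
    problems = []
    if today > lic.expires:
        problems.append(f"license expired on {lic.expires.isoformat()}")
    missing = sorted(set(required) - lic.features)
    if missing:
        problems.append("missing features: " + ", ".join(missing))
    return problems


# Hypothetical trial license that covers routing only and ran out in January.
trial = License("firewall", frozenset({"routing"}), date(2023, 1, 31))
print(license_problems(trial, {"routing", "vpn"}, date(2023, 4, 9)))
```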
