There are some problems that tend to be more common than others. In this video, you’ll learn how to troubleshoot the most common hardware issues.
If you’re having problems with your hardware, then the underlying operating system is not going to be working very well either. So in this video, we’ll look at troubleshooting some very common hardware issues.
One type of very abrupt hardware problem is one where you’re using your computer normally and then suddenly everything goes black. Your computer might shut down and power off without any type of warning whatsoever. Very often, this problem is caused by excessive heat. Instead of having your system continue to run and damage components inside of your computer, it turns itself off completely to avoid any type of heat issues.
You may find that these issues manifest themselves often when you’re running an application that uses a lot of CPU or you’re doing gaming. Be sure to check all of the fans and the airflow through your entire system and check the individual chips to make sure the heat sinks are on those chips and are working normally. You might have a way to monitor all of the fan status and the temperatures inside of your system through statistics available in the BIOS or from a third party application.
This problem might be occurring because of bad hardware in your computer. Once you boot your system back up, you may want to check Device Manager or check your Event Viewer to see if there’s anything in the logs. It might also be useful to run a hardware diagnostics of all of your different components to make sure that everything is working properly.
These types of shutdown issues can be challenging to troubleshoot because there’s so little information available. If you’ve recently made any changes to your system, you may want to remove some of that new hardware and see if the problem is still occurring.
Another challenging problem to resolve is one where you’re using your computer normally, the mouse is working, the keyboard is working, and then suddenly everything freezes. You’re still able to see information on the screen, but your mouse is no longer moving, and your keyboard doesn’t appear to be providing any input to the operating system. With some lockup problems, the issue is so abrupt that even after you reboot, there’s nothing in the logs to provide you with any additional information.
While the lockup is occurring, it might be useful to determine the extent of the lockup. Check to see if any drive lights are flickering or if there are any activity lights on your network interface card. And if you’re using Windows, you may want to press Ctrl+Alt+Delete to see if you can prompt the system to provide you with a Task Manager.
This might be a good time to determine if any changes have occurred in your system. Have there been updated drivers or updated operating system patches? How often have those patches been added, and could we see any changes between the time before the patches were added and the time that we’re having these lockup issues?
Sometimes these lockup problems occur because the operating system has no other choice but to stop its activity due to low resources. So if you find your system is very low on memory or very low on storage space, this may have an impact on the operating system’s ability to function. If you’re still not able to determine why this particular lockup is occurring, it might be worthwhile to perform a hardware diagnostic to make sure that everything is working properly with the underlying system itself.
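As a rough illustration of the low-resource check described above, here is a minimal Python sketch that reports whether a volume is running low on free storage. The function names and the 10 percent threshold are assumptions chosen for this example, not values from the video; the only library call used is the standard `shutil.disk_usage`.

```python
import shutil


def is_low(free, total, min_free_fraction=0.10):
    """Return True when `free` is below `min_free_fraction` of `total`.

    The 10% default is an illustrative assumption, not an official limit.
    """
    if total <= 0:
        raise ValueError("total must be a positive number")
    return (free / total) < min_free_fraction


def storage_is_low(path=".", min_free_fraction=0.10):
    """Check the volume that holds `path` for low free space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return is_low(usage.free, usage.total, min_free_fraction)


if __name__ == "__main__":
    print("Low on storage:", storage_is_low("."))
```

The comparison is split into its own function so the same rule could be applied to memory figures gathered from another tool.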
When you turn your computer on, it performs a basic test of the hardware. We call this basic diagnostic the POST or the Power On Self Test. It’s going to check to make sure that you have a CPU in your system, that there’s video available for you to see what’s coming out on the screen, and that your system has memory to be able to function.
If there’s a problem with any of those basic components, your system will begin to beep, and this series of beeps will help you determine what part of the power on self test process is having a problem. There are usually a different number of beeps depending on what the problem happens to be, or there may be longer or shorter beeps to help you understand where the problems may be occurring. You want to check the documentation for your system to determine what those beeps might mean.
These beep codes are very different between different BIOS manufacturers, so it’s not very useful to begin memorizing what a specific beep code might be. Instead, you should have an overall understanding of why the system might be beeping at you and then understand where you can go to get more information about that beep code.
If you turn on your computer and there’s no video on the screen at all, no messages, and everything stays completely blank, then you probably have some type of video problem. If the power on self test determines that there’s a video issue, you may have beeps that occur that can tell you that it wasn’t able to find any video on your computer. If you do have a bad external video card, you may want to replace that adapter, and you may want to check the BIOS configuration, especially if there are multiple options for video in your system to make sure that none of those might be accidentally disabled.
You might also find during the boot process that your system complains that the date and time is not accurate. And if that has occurred, then you probably need to replace the battery that’s on the motherboard. This battery maintains the date and time on the motherboard, especially if you remove the power from the computer. If you replace the battery, then your system will be able to maintain the date and time going forward.
You might also find that your system is trying to boot from an invalid device. For example, it should be booting from a hard drive, but instead, it’s trying to boot from a DVD ROM drive. If that’s the case, you may want to look at the boot order that’s in your BIOS configuration and set it so the most appropriate device is set closer to the top of the boot order.
Of course, you want to make sure that the startup device does have a valid operating system to boot from. And you may want to check to see if there’s any removable media in a startup device. You might have a startup device defined to be the DVD ROM drive, and if there is a non-bootable disc in that drive, then it won’t be able to boot an operating system. By removing that disc from the drive, your system won’t try to boot from that drive and will instead go to the next device in the list.
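To make the boot order behavior easier to picture, here is a small Python model of it. This is only a simplified sketch of what was just described, not how any particular BIOS is implemented: the `BootDevice` class and `first_boot_attempt` function are hypothetical names, and the model assumes the firmware skips a device with no media but stops at the first device that has media, whether or not that media is bootable.

```python
from dataclasses import dataclass


@dataclass
class BootDevice:
    name: str
    has_media: bool  # is there a disc or drive to read from?
    bootable: bool   # does that media hold a valid operating system?


def first_boot_attempt(boot_order):
    """Return (device name, success) for the first device the firmware tries.

    Devices without media are skipped. A device with non-bootable media
    stops the process, which models the non-bootable disc problem.
    """
    for device in boot_order:
        if not device.has_media:
            continue  # nothing to read, fall through to the next device
        return device.name, device.bootable
    return None, False  # no device in the list had any media


if __name__ == "__main__":
    order = [
        BootDevice("DVD-ROM", has_media=True, bootable=False),
        BootDevice("Hard drive", has_media=True, bootable=True),
    ]
    print(first_boot_attempt(order))  # the non-bootable disc blocks the boot
    order[0].has_media = False        # remove the disc from the drive
    print(first_boot_attempt(order))  # now the hard drive is tried
```

Moving the hard drive to the top of the list has the same effect as removing the disc, which is why checking the boot order is a useful first step.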
Another challenging problem to troubleshoot is when your system is continually rebooting. So it boots up, performs some type of startup process, and then begins rebooting all over again. You need to determine where this reboot process is occurring. Are you getting a BIOS screen only, or do you find that you get the operating system splash screen and then find the system is rebooting? Or does it go all the way to the desktop of your operating system to only then reboot the system?
This might be caused by a bad driver or bad configuration for your operating system. If you’re running Windows, you can press F8 during the startup process and choose the option to boot from the Last Known Good Configuration. This will boot your system with the same configuration that you had the last time you had a successful login.
Another option you may want to try is starting in Safe Mode. Press that F8 key during the startup process and select the Safe Mode inside of Windows. This will load a basic configuration of Windows. And once you’re inside there, you can even choose to disable any automatic restarts in the System Properties of Windows.
If you think this problem might be caused by hardware, you can try removing that hardware or replace it with known good hardware. It might also be useful to check all of the connections to all of the different systems to make sure that you’ve got good solid connections on all of your adapter cards and all of your power connectors.
One problem that’s difficult to work around is when you have no power at all. You press the power button on your system and you’re not getting any lights or any activity. The question then is whether there’s no power at the power source or the power outlet, or whether there’s no power coming from your power supply. One way to tell is to get your multimeter and test the AC power at your outlets and the DC power coming off of your power supply on the inside of your computer.
The problem might also be that some components inside of your computer appear to be getting power but other components are not. For example, your fans might be spinning, but the entire system is not booting up. One thing you might want to check is to see where the fan power is connected. You can sometimes connect fans directly to the power supply, or you may be connecting them to a fan controller on your motherboard.
If your fans are connected to the power supply and they’re starting up, that might indicate that at least some part of the power supply is working. If you’re getting no power on self test to your motherboard, the issue still could be power supply-related or it might be related to a bad motherboard.
If the problem is related to the power supply, we might be seeing lower voltages come right off of the power supply, which are enough to turn the fans but not start the motherboard. The best way to check is to get your multimeter or to get a power supply tester and check all of the different voltages coming from that power supply.
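Once you have the multimeter readings, comparing them against the expected values is simple arithmetic. The Python sketch below checks the three main positive rails against a tolerance. The 5 percent tolerance is an assumption based on commonly cited ATX guidance, and the function and variable names are made up for this example, so check the documentation for your own power supply before relying on these numbers.

```python
# Nominal DC voltages for the main positive rails of a typical PC power supply.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}

# Assumed tolerance (5%); confirm against your power supply's specification.
TOLERANCE = 0.05


def out_of_range_rails(measured, tolerance=TOLERANCE):
    """Return the rails whose measured voltage is outside the tolerance.

    `measured` maps a rail name to a multimeter reading in volts.
    Rails without a reading are ignored.
    """
    problems = {}
    for rail, nominal in NOMINAL_RAILS.items():
        reading = measured.get(rail)
        if reading is None:
            continue
        if abs(reading - nominal) > nominal * tolerance:
            problems[rail] = reading
    return problems


if __name__ == "__main__":
    readings = {"+3.3V": 3.31, "+5V": 4.98, "+12V": 10.9}
    print(out_of_range_rails(readings))
```

With these sample readings, only the +12V rail is reported, because 10.9 volts is more than 0.6 volts away from 12. That would fit the situation where the fans turn but the motherboard does not start.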
If you don’t have good airflow through your computer, then you’ll find that the temperature rises very quickly. We have heat that’s coming from your CPUs, from video adapters, from memory, and all of the other components that are inside of your computer. You want to check the cooling systems. Make sure that your fans are working and that air is being pulled through the computer. Check your heat sinks and make sure that they are attached properly to the chips, and make sure that everything is clean and there’s no dust that might be hindering the airflow through your computer.
Most computers these days have sensors that will tell us what the temperature is inside of their computer and the temperature of individual components. You can often see these numbers in your BIOS, or you can try some third party software to be able to see these. A good example might be HWMonitor in Windows. You can find that at cpuid.com.
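If you record the temperatures that the BIOS or a monitoring tool reports, a short script can flag anything that looks too hot. This Python sketch does not read any sensors itself. The readings are passed in by hand, and the 80 degree default limit and the names used here are illustrative assumptions, since safe limits vary from one component to another.

```python
# Assumed fallback limit in degrees Celsius; real limits depend on the part.
DEFAULT_LIMIT_C = 80.0


def components_over_limit(readings_c, limits_c=None, default_limit_c=DEFAULT_LIMIT_C):
    """Return the components whose temperature meets or exceeds its limit.

    `readings_c` maps a component name to a temperature in Celsius.
    `limits_c` optionally maps a component name to its own limit.
    """
    limits_c = limits_c or {}
    over = {}
    for name, temperature in readings_c.items():
        limit = limits_c.get(name, default_limit_c)
        if temperature >= limit:
            over[name] = temperature
    return over


if __name__ == "__main__":
    readings = {"CPU": 91.0, "GPU": 67.5, "Motherboard": 42.0}
    print(components_over_limit(readings, limits_c={"CPU": 90.0}))
```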
Checking the amount of airflow through your system can be done very quickly just by putting your hand on the outside of the case. And you can also perform a visual check to make sure there’s nothing in the way that might be preventing that air flowing through your computer.
Our computers are not designed to be noisemakers. They should simply have a low hum that’s coming from the fans and the other components inside of your computer. If you’re hearing any grinding noises or rattling, then you may want to check and make sure that nothing is loose inside of your computer case.
If you’re hearing a scraping noise, then you may be having a problem with your hard drive. Make sure you have a good backup of all of your data, and you may want to run a hardware diagnostic to check on that device. If there’s a constant clicking noise coming from your computer, it might be hard drive-related, or it might be related to a bad fan or some debris that might be in the fan itself.
And if you happen to hear a loud pop and smell smoke, then you might have a problem with a blown capacitor. With the larger capacitors, it might be very easy to see if you’re running into problems. The top of the capacitors should be very flat, but if you see any capacitors that are bulged or capacitors that are missing the outside completely, then you’ll have to replace those components.
Sometimes you may find that your computer is working perfectly normally, but intermittently, you might have problems with one particular component. Sometimes the system is working and other times it isn’t, and it seems to be random when these particular problems are happening. Sometimes this is related to a bad adapter card installation, so you may want to check the adapter cards and perhaps reseat them inside of those adapter slots.
Also make sure that all of your adapter cards are screwed down to make sure they’re not going to move once they’ve been installed. Sometimes this is caused by bad hardware or hardware that may have a bad connection. This can be caused by the vibrations that are occurring constantly with the fans that are blowing through your system, and of course, there’s heat inside of your system that may be causing these problems as well. You may be able to tighten the connectors or you may have to replace the hardware to solve these intermittent problems.
You can also gather a lot of information about the hardware inside of your computer from indicator lights. Some motherboards have lights that will give you information as the computer is booting, and others have a display that shows a numeric code indicating the progress of the boot and any errors that occur.
The power lights are also very useful to be able to see if any of the components inside of your system are receiving power from the power supply. If you look at the network interface card, there may be a number of different lights that you could see. One of these lights might be the link light, which lights up whenever it sees another Ethernet component at the other end of the cable.
There might also be another light that tells us what speed we’ve connected to the network. Sometimes this is the same light as the link light and it changes colors depending on if there is a link and what speed the link might be. There might also be another light on the network interface card that blinks whenever there is activity seen across the network.
If you smell smoke or there’s a burning smell coming from your computer, that would be a significant sign of some type of hardware issue. If you do smell something burning or you visually see smoke coming from your computer, you should unplug your system immediately. This might be able to prevent any additional damage from occurring to your system.
Once the system is cooled down, you may be able to look at the individual components on your motherboard or your adapter cards and see if you can identify where that smoke or the burn smell may have been coming from. This may be a case where you are literally sniffing around inside of your computer to determine where the problem area might be. And once you’ve replaced that hardware, you can then plug everything back in and get your system back up and running.
If you’re using your computer and you get a display like this, then you know that you’ve hit a significant problem. This is a Windows stop screen, and it’s a message that tells you that it found a problem that it was not able to recover from and it has stopped the entire operating system from going any further. Some people call this the Blue Screen of Death, because once you hit this screen, your operating system must be restarted to get everything going again.
If you’re troubleshooting this system, the information on the screen can be very important, so you want to be sure to make a note of all of the different messages this screen might be giving you. This information is also written to the event log inside of Windows. So if somebody has rebooted the system and didn’t make a note of this information, you’ll be able to go back to the event log and find this same data.
The information that’s on the stop screen can be very valuable during the troubleshooting process, but it does contain very esoteric information, such as memory addresses and device driver details. You may find that the manufacturer of the problematic application or device driver may find this information much more useful.
There are going to be details on this stop screen that you’re going to want to make a note of so that you can help resolve this problem. This stop screen says that the problem seems to be caused by a particular file that seems to be a device driver. And the error message is that there is a page fault in non-paged area. We then have a stop error message, a series of memory addresses, and then more detail about that specific device driver. Your stop message may look a little different than this, but you want to make sure that you get all of the important details when you begin the troubleshooting process.
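If you have copied the stop line down as text, the key values can be pulled apart with a few lines of code. This is a minimal Python sketch that assumes the classic `*** STOP: 0x... (..., ..., ..., ...)` layout. The sample line is illustrative rather than taken from a real crash, the function name is made up for this example, and the lookup table holds only the one code discussed here.

```python
import re

# Matches the stop code and the comma-separated parameters in parentheses.
STOP_LINE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")

# Deliberately tiny lookup: only the error mentioned in this section.
KNOWN_CODES = {0x50: "PAGE_FAULT_IN_NONPAGED_AREA"}


def parse_stop_line(text):
    """Return the stop code, its name if known, and its parameters.

    Returns None when the text contains no recognizable stop line.
    """
    match = STOP_LINE.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    parameters = [p.strip() for p in match.group(2).split(",") if p.strip()]
    return {
        "code": code,
        "name": KNOWN_CODES.get(code, "unknown"),
        "parameters": parameters,
    }


if __name__ == "__main__":
    # Illustrative sample text, not output from a real system.
    sample = "*** STOP: 0x00000050 (0xFD3094C2, 0x00000001, 0xFBFE7617, 0x00000000)"
    print(parse_stop_line(sample))
```

Keeping the parameters as the original text strings means nothing is lost if you later hand the details to the driver manufacturer.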
If you’re using macOS, you might run into another type of hardware problem where you’re moving your mouse around the screen and then suddenly your mouse turns into a spinning ball that’s full of colors. Some people call this the Spinning Ball of Death, but this is really called the macOS spinning wait cursor. And this is giving you feedback that something inside the system is holding up the entire operating system from moving forward.
Sometimes the spinning ball goes away and you regain control of your operating system, but sometimes, once the ball begins spinning, it never stops and you’re never able to regain any type of access to the OS. There are many possible reasons for the spinning wait cursor. One might be related to the application that you’re using, and there might be a bug that needs to be resolved. Or there might be bad hardware in your computer, and the system’s not able to access that hardware and is instead putting up the spinning wait cursor while it tries to access that bad hardware. It could also be related to an overall slowdown. As information is paging in or out of disk, for example, the spinning ball might show up as that paging process is occurring.
If you’re not able to get any type of recovery, you’ll have to restart the system. And once you do restart the computer, you’ll have access to the console logs inside of macOS and you might be able to determine why this system was hung on that spinning ball of death.
Throughout this video, we’ve been talking about gathering information from the logs, so let’s look at where these logs are stored in the different operating systems. In Windows, there, of course, is Event Viewer where a centralized repository of logs is available for application, security, the operating system, and all other components of the OS. When you’re booting, there are a series of logs that are created during the boot process. You can turn this on inside of System Configuration. There is the option for boot log. And Windows will store that log under C:\Windows\ntbtlog.txt.
Inside of Linux, you’ll find most logs for the operating system and for applications under /var/log. And in macOS, you can view a consolidated view of logs in the Utilities folder using the Console app.
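When you are working with plain text logs, such as many of the files under /var/log, a quick scan for suspicious lines can save some reading. The Python sketch below is one simple way to do that. The function name, the keyword list, and the example path are assumptions for illustration, and reading some system logs may require elevated permissions.

```python
def find_error_lines(log_path, keywords=("error", "fail", "critical")):
    """Return (line number, text) pairs for lines that mention a keyword.

    The match is case-insensitive. Undecodable bytes are replaced so that
    one odd character does not stop the scan.
    """
    matches = []
    with open(log_path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            lowered = line.lower()
            if any(word in lowered for word in keywords):
                matches.append((number, line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    import sys

    # Example: python scan_log.py /var/log/syslog  (path is an assumption)
    for number, text in find_error_lines(sys.argv[1]):
        print(f"{number}: {text}")
```

A keyword scan like this only narrows down where to look. The surrounding lines in the log usually carry the context you need.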
When an error message appears on the screen, the information contained in that error message can be very helpful for resolving the issue. Even if the error message seems very plain and something that’s very obvious, you should still document that that’s the error that you received. It will help during the troubleshooting process to know that you received that error versus something that might be more detailed or more complex. If you see an error message occur, it’s important to write down or take a picture or video of that error message on the screen, and it’s useful to always tell your users that if they ever see an error message to always make a note of what that message is saying.
We’ve all seen applications that provide us with error messages that may not make a lot of sense or they might have codes or messages that might only help the developer of the application. But that information is still very important to document, so make sure you make a note of any messages like that that appear on the screen. You’ll be able to take that information to the internet or to the third party developer and they’ll be able to tell you exactly what that means. This is going to save you a lot of time during your troubleshooting process, and you’ll be able to narrow down exactly where the problem might be because you have the exact error message.