Image © Maksim Kabakou, 123RF.com

The Fine Art of Troubleshooting

Welcome

Article from ADMIN 53/2019
By
System troubleshooting is an art. It is a science. And, sometimes, it's brute force.

Junior system administrators have often asked, "How do you troubleshoot a problem when you have no clue where to start?" My answer has never changed: Start with the simple things first. This advice has helped me resolve every problem I've ever encountered over the past 20 years. Sure, some problems are difficult to solve, and some even seem impossible, but if you start with the simple things first, your chances of success are very high.

People in general tend to complicate problems and solutions. They reach for the least probable cause of a problem and then apply the least likely solution to resolve it. I guess it's just human nature to assume that there is no easy problem or easy solution. I have found just the opposite. Most of the problems that I've seen have a reasonable cause and a relatively simple solution. I've been on many root cause analysis and postmortem calls where I said, "I rebooted the system and everything came back as it should." Of course, I always had to explain why that resolution was the correct one, and it was usually met with unhealthy skepticism and much criticism.

I can't count the number of times I heard, "Well, rebooting fixed the issue temporarily, but you didn't really resolve the problem or apply a permanent fix to it." My task was to restore service and not to spend days or weeks researching a memory leak in an application. A reboot fixed the problem. Subsequent reboots will continue to resolve the problem. Until the developers fix the application, rebooting is the correct response to the problem.

System administrators, especially junior admins, love to see long uptimes for systems. It is impressive to see a system that has an uptime of 500+ days. Everyone loves the bragging rights that come with long uptimes. I once worked on a system that had an uptime of more than 1,300 days – a Sun Enterprise 450

...