Posted on 06/12/2013
SL has been a supplier of monitoring solutions for complex applications for a very long time. In this role, we’ve asked hundreds of highly-skilled developers of these apps as well as the business owners about problems they face on a day-to-day basis. While their answers are all over the map, I’ve seen a common thread running through them all.
Good software developers can build just about anything. That’s why they are so darned expensive. If you are responsible for delivering a critical application to your business, the last thing you want to do is waste your developers’ time.
These developers can debug anything, too! They usually have domain expertise and can be extremely resourceful. There is no shortage of tools to help them, whether home-grown or acquired from a vendor. They pull together data from many different sources, determine where a problem lies, and recommend or implement a solution. They are good at this and they do it a lot … day after day after day.
And that’s the problem … they never get a break!
They are always “on-call” … always getting pinged. You almost want to warn them: Be careful what you volunteer for because you might be doing it for the rest of your life.
So how do we help these people?
Let’s give them a way to take their expert knowledge about dependencies within these systems, along with the current operating state … and make it readily visible to others in the organization. If the people who need this information can see it for themselves in a self-service way, then the developers are less likely to get pinged every time someone suspects that something might be wrong. Application users can check for themselves whether systems are running OK, and only the really big problems should come to the experts.
We see this problem a lot with applications that are based on complex middleware, such as messaging or distributed caching (Oracle Coherence, for example). These subsystems are viewed as mysterious black boxes and get blamed first when anything isn’t working correctly. The developers are often the only ones capable of investigating and confirming that things are OK.
SL has recognized that this is an important requirement of a good solution: don’t collect and analyze data just for developers to use, but provide ways to present the data in a relevant and self-service manner to others in the organization. This way, highly-skilled and expensive developers can work on stuff that really counts: new applications, performance improvements, better ways of doing things.
They might even be able to get a day at the beach!
Practically speaking, they are never going to be 100% free of their role as expert support for complex critical applications. But if 50%, or maybe 70%, or even 90% of the issues can be resolved by other people because they have access to all the important information, then that can be called progress!