Part 2 – Building Better Monitoring for DotNetNuke Servers

11 December, 2012 | Cloud, Content Management, DotNetNuke, Hosting, Security, Software, Stability, Technology, Windows

This is Part 2 of a 4-part series on the construction and implementation of our new server monitoring service … included with all our Business and Enterprise server plans.

– Tony Valenti

Don’t Reinvent the Wheel … Find a Really Good Wheel and Make it Better 

In the last post we talked about the need for an updated monitoring solution.  Today we discuss the process we used to find the right solution.

We kicked off the project by researching existing monitoring solutions. There are a large number of monitoring packages available from a good number of companies, so we set out to see which one would suit us best.

One of the most popular monitoring solutions in the Linux hosting industry is a piece of software called Nagios. It is a free, open-source Perl/MySQL app. After taking a good look at Nagios, we were not fully comfortable with it. Yes, it was free and it was open source, but did I really want to mess with the source code of our new monitoring solution? Did I really want my important customers using a free monitoring product? Did I really want to introduce a mission-critical Perl/MySQL app that we were going to have to maintain? Was there a solid product roadmap and vision for Nagios?

I think you can guess those answers.

Nagios was the favorite of Linux hosting providers, but we are a Windows provider. Now, we don’t talk about the size of PowerDNN very often, but we are the 4th largest Windows hosting company in the world (right behind Rackspace, Softlayer, and GoDaddy). So it made sense to take a look at what our three bigger brothers were doing.

Well, believe it or not, they weren’t doing that much at all. In almost every case, the extent of their server monitoring was limited to simple ping monitoring. Ping can tell you if a server is up or down (well, actually it can only tell you whether the server is responding to ping, which usually, but not always, tracks whether it is up), but it doesn’t provide much useful information beyond that. I wanted information about all of the critical parts of a server: CPU, RAM, disk, and network.

Simply Speaking

To do this, we needed a tool that could use the Simple Network Management Protocol (SNMP) to pull that data. SNMP was built specifically for monitoring engines: it lets a tool send a request like “Get FtpServer.FtpStatistics.TotalFilesSent” to a server and process the response. One of the things that puts the “simple” in SNMP is that it normally runs over UDP, a connectionless protocol, rather than TCP, a connection-oriented one.
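To make that request/response idea concrete, here is a rough Python sketch of what an SNMP GET looks like on the wire. It is an illustration under assumptions, not the tooling we used: the helper names (`build_get_request`, `send_get`) are made up for this post, the community string `public` is just the common default, and the request uses a numeric OID (`1.3.6.1.2.1.1.1.0`, the standard `sysDescr.0` object) because real SNMP requests identify values by number rather than by a friendly name like the one above.

```python
import socket


def ber_length(n: int) -> bytes:
    """Encode a BER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(raw)]) + raw


def tlv(tag: int, payload: bytes) -> bytes:
    """Wrap a payload in a BER tag-length-value triple."""
    return bytes([tag]) + ber_length(len(payload)) + payload


def ber_int(value: int) -> bytes:
    """Encode a non-negative INTEGER (the extra bit keeps the sign bit clear)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return tlv(0x02, value.to_bytes(value.bit_length() // 8 + 1, "big"))


def ber_oid(dotted: str) -> bytes:
    """Encode a dotted OID such as '1.3.6.1.2.1.1.1.0'."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    body = bytearray([40 * arcs[0] + arcs[1]])  # first two arcs share one byte
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]                    # last byte: high bit clear
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))   # earlier bytes: high bit set
            arc >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))


def build_get_request(oid: str, community: str = "public",
                      request_id: int = 1, version: int = 1) -> bytes:
    """Build one GetRequest message (version 0 = SNMPv1, 1 = SNMPv2c)."""
    varbind = tlv(0x30, ber_oid(oid) + tlv(0x05, b""))  # OID paired with NULL
    pdu = tlv(0xA0, ber_int(request_id)                 # 0xA0 = GetRequest PDU
              + ber_int(0) + ber_int(0)                 # error-status, error-index
              + tlv(0x30, varbind))                     # varbind list
    return tlv(0x30, ber_int(version)
               + tlv(0x04, community.encode("ascii")) + pdu)


def send_get(host: str, oid: str, community: str = "public",
             timeout: float = 2.0, port: int = 161) -> bytes:
    """Send one datagram and return the raw (still BER-encoded) reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # UDP gives no delivery guarantee
        sock.sendto(build_get_request(oid, community), (host, port))
        reply, _ = sock.recvfrom(65535)
        return reply
```

Calling something like `send_get("192.0.2.10", "1.3.6.1.2.1.1.1.0")` (the address is a placeholder) sends a single 40-byte datagram and waits for a single datagram back. There is no connection to open or close, which is the point of the UDP discussion here. Decoding the reply is left out to keep the sketch short; in practice you would lean on an SNMP library or a monitoring product rather than hand-rolled encoding.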

The benefit of UDP in this context is that it requires less overhead: there is no connection to set up or tear down around each request. The Observer Effect dictates (in technology, psychology, and physics) that “to observe is to disturb.” When you are trying to monitor a server’s resource utilization and performance, you should avoid adding a bunch of new processes to accomplish the task. To be clear, I am not saying that SNMP should have been built on UDP, only that it was, and that there are some benefits to that approach in this instance.

After a bit of looking for a solid SNMP monitoring tool, I ran into SolarWinds. I was immediately impressed by all of the different things they had monitoring solutions for: networks, servers, users, configs, storage … you name it. After looking at their Network Performance Monitor, it was pretty clear it was the right choice. It was great for server and network monitoring, it was module based (just like DotNetNuke), and it was a .NET / SQL Server app. It also had other important pieces, such as users, permissions, views, and access rights. All of these were really great.

Nothing Is Perfect

However, there were some big pain points in Network Performance Monitor for us. It has no API, and it really isn’t designed for hosting companies. It is designed for a company like IBM to monitor all of IBM’s own servers, not for a provider that has to monitor Joe’s servers, Bob’s servers, and Mary’s servers separately. So Network Performance Monitor covered the types of tasks we needed, but missed the mark on the kind of scope that we needed at PowerDNN.

In the next post we will discuss how we took the bare spokes of Network Performance Monitor and turbo-charged it to fit a modern hosting environment.
