Monday, December 31, 2007

Some New Performance Monitoring Tools

collectd is a simple, extensible open source daemon written in C that writes to RRD files. It is an alternative to Orca/procallator for people who don't want the memory footprint of the Perl-based procallator. I'll check it out on my Gumstix millicomputer sometime.

Zabbix is yet another open source, full-function monitoring tool. It looks similar to Cacti in scope, possibly with more features, and uses a SQL database backend. A commercial company backs it with support contracts etc., somewhat like the XE Toolkit.

The most interesting commercial tool I saw at CMG earlier this month is a capacity monitoring tool called PAWZ from Perfcap Corporation. The key thing they have worked on is taking the human out of the loop as much as possible with sophisticated capacity modelling algorithms and a simple and scalable operational model. It is very similar in concept to the capacity planning research I was working on and publishing in 2002-2004. The core idea is that you care about "headroom" in a service, and anything that limits that headroom is taken into account. Running out of CPU power, network bandwidth, memory, threads etc. will increase response time of the service, so monitor them all, track trends in headroom and calculate the point in time where lack of headroom will impact service response time. At eBay we used to call this the "time to live" for a service. You can easily focus on the services that have the shortest time to live, and proactively make sure that you have a low probability of poor response time. I'm going to take a closer look at this one...
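
To make the "time to live" idea concrete, here is a minimal sketch in Python. It is not how PAWZ or the eBay tooling works. It assumes each resource is sampled as a fraction of its capacity, fits a straight-line trend to those samples, and reports the resource whose trend reaches the limit first. The function names, the sample format and the default limit of 1.0 are all made up for illustration. In practice response time degrades well before 100% utilization, so a real limit would be lower and the model would not be linear.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample format: (day number, utilization as a fraction of capacity)
Sample = Tuple[float, float]


def days_until_limit(samples: List[Sample], limit: float = 1.0) -> Optional[float]:
    """Fit a least-squares line to the samples and return the number of days
    after the last sample at which the trend reaches `limit`.
    Returns None when there is no upward trend to extrapolate."""
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None  # flat or shrinking usage: headroom is not running out
    intercept = mean_u - slope * mean_t
    t_limit = (limit - intercept) / slope
    last_t = max(t for t, _ in samples)
    # A trend that is already past the limit has zero time to live.
    return max(0.0, t_limit - last_t)


def time_to_live(resources: Dict[str, List[Sample]],
                 limit: float = 1.0) -> Optional[Tuple[str, float]]:
    """Return (resource name, days) for the resource that runs out of
    headroom first, or None if no resource is trending upward."""
    estimates = {}
    for name, samples in resources.items():
        days = days_until_limit(samples, limit)
        if days is not None:
            estimates[name] = days
    if not estimates:
        return None
    worst = min(estimates, key=estimates.get)
    return worst, estimates[worst]


if __name__ == "__main__":
    service = {
        "cpu": [(0, 0.50), (1, 0.60), (2, 0.70)],
        "memory": [(0, 0.20), (1, 0.25), (2, 0.30)],
    }
    print(time_to_live(service))
```

With the made-up numbers above, CPU is the limiting resource, with roughly three days of headroom left. Sorting services by this value gives the "shortest time to live first" view described above.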


  1. Hi Adrian,

    I was wondering whether you have ever checked out JXInsight, a performance management tool designed around the SPE process, with support for software execution models (contextual distributed trace paths) and system execution models (timeline analysis).

    We recently introduced a new approach to typical low-level code or block execution profiling and monitoring, based on a flexible resource metering (and billing) API that includes support for multiple metering strategies.

    Probe Metering Strategies

    Probes API Tutorial

    Probes in Action

    Probes is just one aspect of JXInsight; it also supports metric monitoring, contextual traces (across processes), true resource transaction analysis, and a powerful runtime state diagnostics solution for fast problem determination.

    Kind regards,

    William Louth
    JXInsight Product Architect


  2. I switched from Orca to Nagios with PnP.

    I have written some Bash scripts and a couple of Perl scripts that execute remote commands via SSH to gather performance data.

    Using Orca as a grocery list, I have written scripts to gather the same information that Orca did, but without Orca's footprint.