Monday, October 18, 2010

Getting Started with NServiceBus Part 7: Custom Performance Counters

Out of the box, NSB gives you a performance counter that tells you hold old the oldest message in the queue is. This gives you an idea as to how long messages are sitting in the queue. We decided to add a couple of new counters to tell us some other things about our performance.

The first one is what is the average processing time for a message? This is the time from when the message was pulled off the queue until the time the transaction completes. What this won't give us is granularity in determining which message handlers are moving slowly. Since most installations will have only a single or small amount of message handlers, this should be good enough for us.

The second counter is how many failures do I have per second? This is important since NSB has built-in retry logic. We have no visibility to failures unless we look at the log as processing occurs or waiting until the message moves to the error queue. We wanted to see quickly if we were processing slowing due to a lot of errors.

To get this going we can create a new profile similar to the NServiceBus.PerformanceCounters profile in NSB. First we create some classes to represent out monitoring:
    public class Monitoring : IProfile { }
    public class InstallMonitoring : IProfile { }
We have to have 2 profiles, one for running the counters and one for installing the counters. We'll cover the installation first. The first thing we need to do is remove the counters and reinstall them if they exist.
        public void ProfileActivated()
        {
            Logger.Debug("Starting installation of  monitoring");

            if (PerformanceCounterCategory.Exists(CategoryName))
            {
                Logger.Warn(String.Format("Category {0} already exists, going to delete it first", CategoryName));
                PerformanceCounterCategory.Delete(CategoryName);
            }

            CounterCreationDataCollection counterData = new CounterCreationDataCollection();

            counterData.AddRange(this.InstallAverageMessageProcessTimeCounter());
            counterData.AddRange(this.InstalledFailedMessageProcessingCounter());

            PerformanceCounterCategory.Create(CategoryName, "NServiceBus Monitoring", PerformanceCounterCategoryType.MultiInstance, counterData);

        }
Once we've ditched the counter we rebuild it by creating some counter data. I'll show one of the installs. To calculate the average time of something we need to do two things. We need to increment the time and also increment the basis of the event. The basis must be added after the average timer(you'll get an error if you don't).
        private CounterCreationDataCollection InstallAverageMessageProcessTimeCounter()
        {
            Logger.Debug("Starting installation of average process time monitoring");

            CounterCreationDataCollection counterData = new CounterCreationDataCollection();

            CounterCreationData avgTime = new CounterCreationData(AvgProcessTimeCounterName, 
                "Avg message processing time", 
                PerformanceCounterType.AverageTimer32);

            counterData.Add(avgTime);

            CounterCreationData avgBase = new CounterCreationData(AvgProcessTimeBaseCounterName, 
                "Avg message processing time base", 
                PerformanceCounterType.AverageBase);

            counterData.Add(avgBase);

            Logger.Debug("Installation of average processing time monitoring successful.");

            return counterData;
        }
Now that we have our counters installed, we need to create the profile to actual post to the counters. When your profile is activated, we'll tie into 2 key events in the NSB runtime, "StartedMessageProcessing" and "FinishedMessageProcessing". In doing so we can calculate the time that has elapsed and update our basis. We'll use a simple StopWatch to keep track of it all.
        private void MonitorAverageMessageProcessTime()
        {
            PerformanceCounter avgProcessTimeCounter = null;
            PerformanceCounter avgProcessTimeBaseCounter = null;
            Stopwatch watch = null;

            try
            {
                avgProcessTimeCounter = new PerformanceCounter(CategoryName, AvgProcessTimeCounterName, Program.EndpointId, false);
                avgProcessTimeBaseCounter = new PerformanceCounter(CategoryName, AvgProcessTimeBaseCounterName, Program.EndpointId, false);
            }
            catch (Exception e)
            {
                throw new InvalidOperationException("NServiceBus monitoring is not set up correctly. Running this process with the flag NServiceBus.InstallMonitoring should fix the problem.", e);
            }

            GenericHost.ConfigurationComplete +=
                (o, e) =>
                {
                    ITransport transport = Configure.Instance.Builder.Build<ITransport>();
                    transport.StartedMessageProcessing += (s, ea) => { watch = Stopwatch.StartNew(); };
                    transport.FinishedMessageProcessing +=
                        (s, ea) =>
                        {
                            watch.Stop();
                            avgProcessTimeCounter.IncrementBy(watch.ElapsedTicks);
                            avgProcessTimeBaseCounter.Increment();
                        };
                };
        }
Please note that since we are posting to a counter for each message this will somewhat degrade the performance. All we have to do now is run our process once with the "InstallMonitoring" profile to install the counters and then again with the "Monitoring" profile to post to the counters. You must have the process running before you see anything in PerfMon. When you fire PerfMon up, you should see something like this:

Since we installed the counters as Multi-Instance you should see multiple entries for each process running.  I have a simple Request/Response sample that I used to test this out.  Once you get sample running you can send a bunch of messages and start to see some data:

Sweet!  Now we have some additional ways we can monitor NSB.  As always, code can be found here: http://github.com/afyles/Blog/tree/master/NServiceBusExplorer/  Take a look at the "Profiles" project and the "RequestResponse" projects.

No comments:

Post a Comment