Thursday, December 18, 2014

[Pro]Active Sitecore Monitoring

When you hear the words "Application Lifecycle Management", the first thing that comes to mind is likely the Build / Deploy / Test workflow. But one thing is missing here, and it is called "Monitoring".

Once you've deployed the website, you'll see that its performance may vary a lot depending on many known and unknown factors. These can be external (bot traffic, daily spikes in visits, the performance of membership providers such as CRM or AD) or internal (Sitecore jobs, scheduled tasks, automations, etc.).

How do you know how the website performs? It's not something that you'll find in Sitecore logs, and it's not one hundred percent clear from IIS logs either.


Regarding the last part - IIS logs are actually incredibly useful for monitoring, as they contain information about every single request made to the website. So in this blog post, I'll show how to make them easy to read and useful for both monitoring and issue troubleshooting.

There are some awesome tools like AppDynamics and NewRelic which I also suggest checking out, but here I'll start by describing what you can get for free and scale without limits :)

The first thing you should do is download these three tools:

ElasticSearch will be used as the main storage for the log files. It can scale out automatically and handle terabytes of data in case you have lots of servers and want to keep the history forever.

Kibana is a tool that visualizes any data stored in ElasticSearch.

LogStash can parse almost any kind of log message and save it as a document in ElasticSearch with strongly-typed fields.
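To make "a document with strongly-typed fields" more concrete, here is a minimal Python sketch of the idea. It is not LogStash itself and not my actual configuration: the field list is an assumed example of an IIS W3C "#Fields:" header, and the function name parse_iis_line is hypothetical. The queries later in this post use friendlier field names (such as clientip), which would come from your own LogStash setup.

```python
from datetime import datetime

# Assumed field list, as it might appear in the "#Fields:" header of an IIS W3C log.
FIELDS_HEADER = ("#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query "
                 "s-port cs-username c-ip cs(User-Agent) sc-status time-taken")

# Fields that should be stored as integers rather than strings.
INT_FIELDS = {"s-port", "sc-status", "time-taken"}


def parse_iis_line(line, fields_header=FIELDS_HEADER):
    """Turn one space-separated IIS log line into a dict with typed values."""
    names = fields_header.split()[1:]  # drop the "#Fields:" marker
    values = line.split()
    if line.startswith("#") or len(values) != len(names):
        return None  # directive line or malformed entry
    doc = {}
    for name, value in zip(names, values):
        if value == "-":  # IIS writes "-" for empty fields
            doc[name] = None
        elif name in INT_FIELDS:
            doc[name] = int(value)
        else:
            doc[name] = value
    # Combine the separate date and time columns into a single timestamp.
    doc["@timestamp"] = datetime.strptime(
        doc["date"] + " " + doc["time"], "%Y-%m-%d %H:%M:%S")
    return doc


# Made-up sample line for illustration only.
sample = ("2014-12-18 10:15:42 10.0.0.5 POST /sitecore/Request.aspx - 80 - "
          "66.249.73.13 Mozilla/5.0+(compatible;+Googlebot/2.1) 200 5321")
doc = parse_iis_line(sample)
print(doc["cs-method"], doc["sc-status"], doc["time-taken"])
```

Because time-taken ends up as a number and not a string, range queries on it become possible once the document is indexed.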

I won't go deep into the installation process, as there are many guides about it. Here's a sample video: http://www.elasticsearch.org/videos/kibana-logstash/

After you set up Kibana to visualize the log files, the default dashboard will probably look like this:



You can create your own dashboards using the snippets below:

1) All requests that took more than 5 seconds to process:

time-taken:[5000 TO *] 

2) And were processed by the server with hostname "WEB4D1":

AND server:"WEB4D1"

3) And came from a specific IP address:

AND clientip:"66.249.73.13" 

4) Let's show only POST requests to all pages that contain the word "Request" in their name:

AND method:"POST" AND path:"*Request.aspx*"
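If you want to run the same filters outside of Kibana, the clauses can be joined into one query string and wrapped in an ElasticSearch query_string query. The sketch below only builds the request body; the helper name build_search_body and the size value are my own assumptions for illustration, and the field names are the ones used in the snippets above.

```python
import json


def build_search_body(clauses, size=50):
    """Join Lucene clauses with AND and wrap them in a query_string query."""
    query = " AND ".join(clauses)
    return {"size": size, "query": {"query_string": {"query": query}}}


clauses = [
    'time-taken:[5000 TO *]',    # slower than 5 seconds
    'server:"WEB4D1"',           # handled by one specific server
    'clientip:"66.249.73.13"',   # coming from one specific IP address
    'method:"POST"',             # POST requests only
]
body = build_search_body(clauses)
print(json.dumps(body, indent=2))
```

The resulting JSON is what you would send to the _search endpoint of the index that holds your logs.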

What else can you do? How about visualizing Email Campaign Manager (ECM) "Message Opened" events?

path:"RegisterEmailOpened.aspx"

It works using standard Lucene syntax (ElasticSearch is based on Lucene internally). Here are some more visual examples.

1. Want to know if there are any bots crawling the website right now? Facet the user agent field:


2. A similar facet block can be useful to see the most requested pages (or resources):


3. You can even show the requests on a world map, either all of them or only the ones you've filtered earlier, for example only ECM "Message Opened" events across the world:
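Conceptually, a facet on a field is just a count of documents per distinct value, sorted by frequency. Here is a tiny Python sketch of that idea over already-parsed log documents. The function name top_terms, the field name "agent" and the sample data are all made up for illustration; in the real setup ElasticSearch does this counting for you.

```python
from collections import Counter


def top_terms(docs, field, n=3):
    """Count documents per distinct value of `field`, most frequent first."""
    counts = Counter(doc[field] for doc in docs if doc.get(field))
    return counts.most_common(n)


# Made-up documents; "agent" is a hypothetical name for the user agent field.
docs = [
    {"agent": "Googlebot", "path": "/"},
    {"agent": "Googlebot", "path": "/news"},
    {"agent": "Googlebot", "path": "/"},
    {"agent": "Chrome", "path": "/"},
    {"agent": "bingbot", "path": "/contact"},
    {"agent": "bingbot", "path": "/"},
]
print(top_terms(docs, "agent"))
print(top_terms(docs, "path", n=1))
```

The same function covers both the user agent facet and the "most requested pages" facet; only the field changes.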



One more nice benefit of using ElasticSearch/Kibana is that you can put any data there and compare it side by side. Want to know whether your server slows down when your CRM is unresponsive? Just plot response times for both on the same chart and you'll see it:



There is so much stuff you can do with it that I could spend hours describing it. You can even calculate APDEX for your websites with Kibana:
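For reference, Apdex for a threshold T is (satisfied + tolerating / 2) / total, where a request is "satisfied" if it took at most T, "tolerating" if it took more than T but at most 4T, and "frustrated" otherwise. Below is a minimal Python sketch of that formula applied to time-taken values in milliseconds. The 500 ms threshold is just an assumed example, and this is not how the Kibana dashboard itself computes it.

```python
def apdex(response_times_ms, threshold_ms=500):
    """Return the Apdex score (0.0 to 1.0) for a list of response times."""
    if not response_times_ms:
        return None  # no samples, no score
    satisfied = sum(1 for t in response_times_ms if t <= threshold_ms)
    tolerating = sum(
        1 for t in response_times_ms if threshold_ms < t <= 4 * threshold_ms)
    return (satisfied + tolerating / 2) / len(response_times_ms)


# Made-up time-taken values: 2 satisfied, 2 tolerating, 1 frustrated.
score = apdex([100, 400, 900, 1800, 2500], threshold_ms=500)
print(score)
```

In Kibana terms, the three buckets correspond to three range queries on time-taken, for example time-taken:[0 TO 500] for the satisfied group with the threshold assumed above.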



I actually wanted to talk about this stuff at Sitecore Symposium, but unfortunately, my talk proposal wasn't approved.

If you have any questions regarding the setup, just let me know and I'll be glad to help make it work for your Sitecore website. It's an awesome set of tools: it has saved me days (maybe even weeks) of my time, it scales, and it's been running in production for several months already :)