May 13th, 2008

Using Packet Sniffing for Web Analytics

Packet Sniffing
Firstly, a packet sniffer is a simple application that passively listens to any network traffic that passes through or past a network card. When it ‘sniffs’ the network it picks up the packets for every protocol, such as TCP/IP and ARP, and it also picks up encrypted SSL packets.
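
To make "picks up the packets for every protocol" a little more concrete, here is a minimal Python sketch of how a sniffer might label a captured frame. It assumes untagged Ethernet II frames that are already in memory as bytes; the function name classify_frame is made up for this example, and actually capturing the frames (which needs elevated privileges and a capture library) is left out.

```python
import struct

# Well-known EtherType and IP protocol numbers (only a handful, for illustration).
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def classify_frame(frame: bytes) -> str:
    """Return a rough protocol label for one raw Ethernet II frame."""
    # An Ethernet II header is 14 bytes: destination MAC, source MAC, EtherType.
    if len(frame) < 14:
        return "truncated"
    ethertype = struct.unpack("!H", frame[12:14])[0]
    name = ETHERTYPES.get(ethertype, f"unknown(0x{ethertype:04x})")
    # For IPv4, byte 9 of the IP header holds the transport protocol number.
    if name == "IPv4" and len(frame) > 23:
        proto = frame[14 + 9]
        return f"IPv4/{IP_PROTOCOLS.get(proto, proto)}"
    return name
```

A real sniffer would go on to reassemble the TCP streams; this only shows the first step of telling the protocols apart.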

This all sounds very technical and worlds away from anything related to marketing or web analytics, so how does it fit in?

Well, using a packet sniffer you can pick up all the packets contained within an HTTP or HTTPS request. If it is HTTPS traffic then you can provide the server’s SSL private key to the packet sniffer (the certificate on its own is not enough) and access the requests in their unencrypted form.

Once the packet sniffer has recreated the HTTP and HTTPS traffic it can then create a log file, similar to one created by a web server. From this you can use your favourite web log analyzer to process the log files and provide you with website visitor data.
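
As a rough illustration of that last step, the Python sketch below turns one reconstructed request/response pair into a line in the Apache "combined" log format. It is a sketch under assumptions, not how any particular product does it: the helper names parse_head and to_log_line are invented here, and the client IP and timestamp are assumed to come from the captured packets.

```python
from datetime import datetime, timezone


def parse_head(raw: bytes):
    """Split the head of a raw HTTP message into its first line and a header dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    first_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return first_line, headers


def to_log_line(client_ip: str, when: datetime, raw_request: bytes, raw_response: bytes) -> str:
    """Build one combined-log-format line from a captured request and response."""
    request_line, req_headers = parse_head(raw_request)
    status_line, resp_headers = parse_head(raw_response)
    status = status_line.split(" ")[1]              # e.g. "200" from "HTTP/1.1 200 OK"
    size = resp_headers.get("content-length", "-")  # "-" when the size is unknown
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    referer = req_headers.get("referer", "-")
    agent = req_headers.get("user-agent", "-")
    return (f'{client_ip} - - [{timestamp}] "{request_line}" '
            f'{status} {size} "{referer}" "{agent}"')
```

Because the output looks like an ordinary web server log, any log analyzer that reads the combined format should be able to process it.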

So where does packet sniffing fit into the data collection methodologies?

You might already know that the main difference between page tags and log files is that page tag data is collected on the client side, whereas log files are generated on the web server. A packet sniffer also sits on the web server, or at least on the same Local Area Network (LAN). This means it has the same problems as log files with proxy caching, since requests answered from a cache never reach it, and so it is likely to be less accurate than page tags.

But there are advantages. Packet sniffers pick up every piece of TCP traffic, including form data that has been sent using the POST method, and packet sniffer applications can typically output that data. For technically minded web analysts there are loads of performance statistics about the network that are also output to the log files.
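
To show why POST data is visible to a sniffer when it never appears in a standard web server log, here is a small Python sketch that pulls the form fields out of a reconstructed request. It assumes the request is already reassembled and unencrypted, and that the form was sent as application/x-www-form-urlencoded; the name extract_post_fields is hypothetical.

```python
from urllib.parse import parse_qs


def extract_post_fields(raw_request: bytes) -> dict:
    """Return the form fields of a URL-encoded POST request, or {} for anything else."""
    head, _, body = raw_request.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method = lines[0].split(" ", 1)[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if method != "POST" or content_type != "application/x-www-form-urlencoded":
        return {}
    # Trust Content-Length when present so trailing bytes are not parsed as form data.
    length = int(headers.get("content-length", len(body)))
    return parse_qs(body[:length].decode("utf-8"), keep_blank_values=True)
```

Each field maps to a list of values, because a form can send the same field name more than once. Form data often includes personal details, so think about what you write to the log.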

Another extremely useful aspect of packet sniffers is the ability to amalgamate data from multiple web servers into one log file. For example, let’s say that a large content provider has 20 servers that are load balanced and in front of them there are 10 proxy servers. If we use standard log files then we need to either use the proxy logs, assuming the proxy servers are all on the same platform and can be configured correctly to output the required information, or cluster the 20 server log files during analysis. Using a packet sniffer in front of the proxies we can pick up all of the data from one point, and because it uses passive sniffing it will not slow down the network traffic.
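
For comparison, the "cluster the server log files during analysis" option could look roughly like the Python sketch below. It assumes every server writes combined-format lines with a bracketed timestamp and that each individual log is already in time order; merge_logs and log_timestamp are names made up for this example.

```python
import heapq
from datetime import datetime


def log_timestamp(line: str) -> datetime:
    """Parse the [13/May/2008:09:30:00 +0000] style timestamp out of a log line."""
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(*server_logs):
    """Merge several per-server logs (each already time-ordered) into one ordered list."""
    return list(heapq.merge(*server_logs, key=log_timestamp))
```

This works, but it has to be repeated for every analysis run and relies on all 20 servers having synchronized clocks, which is the overhead a single capture point avoids.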

In any other situation I would suggest page tags or log files, depending upon your preference. If you are currently using a packet sniffer (like Clipen) in your analytics environment, I would be interested to hear of your experiences, which you can detail in a comment below.

3 Responses to 'Using Packet Sniffing for Web Analytics'

  1. Doug Watt
    September 7th, 2007 at 7:34 pm

    We sell Packet Sniffing for Web Analytics, loading enterprise databases and IT forensics applications. Some of the biggest US e-commerce sites use the technology to report on all their sites (our biggest exceeded 120M pageviews in a single day). We find very few problems with caching; this technical issue is very minor in practice compared to tagging (and we have done comparisons).

    The massive advantage of packet sniffing (Passive Data Capture) is that no site changes are required and there is no 30K JavaScript file to download, which slows the page load while it initializes all the methods and is a hacker’s dream for getting at your analytics and site.

    When you have to do custom tagging, you are changing the operational site. If it’s a big site, you probably need an expensive test system which does not exactly represent the traffic. There is a huge QA and approval process because of the risk of breaking the site, so every analytics tweak is a big deal. We have clients that spent over a million dollars trying to tag the site and keep up with the maintenance for exactly this reason.

    With Passive Data Capture, you can test immediately against the real traffic. It’s easy to add test streams with different rules so you don’t affect the official version until the changes are done. They are easy to check against the official version. Of course there is QA and approval, but you don’t need an expensive system, there is no risk of breaking the site, and the whole process of implementation and changes is far simpler and quicker.

    Because we can emulate log files, including the ones produced by tag servers, we feed many Web Analytics packages including our own partners (Webabacus and Sawmill). Data is sessionized, filtered and transformed in real-time and can be streamed to an IP socket for real-time predictive analysis. On-line stats, response times, alerts, etc. are also available.

    If you are going to do anything more than basic tagging, look at Passive Data Capture.

    Doug Watt
    www.metronomelabs.com


  2. Matt Hopkins
    September 7th, 2007 at 8:37 pm

    Thanks Doug, that’s quite enlightening.

    I know that some large websites do spend a lot of money on tagging up websites and it can be a really big problem.


  3. www.mstaggart.com » sniff sniff
    October 16th, 2007 at 9:23 pm

    […] my promise to conduct a self-study of web analytics, I am logging my learning for the day — packet sniffers.  They seem like they rock.  In a nutshell, packet sniffers are exactly what they sound like […]

