Free MS Excel template for analyzing queries
Updated December 19, 2006 (originally posted 12/12/06; added summary spreadsheet, much more functionality)
If you're lucky, you're using Mondosoft, WebSideStory, or some other commercial application to generate reports from your query data. If not, we've developed a pair of simple and free Excel templates (20Kb) that you can use to help you come to grips with your queries.
You don't need to be an Excel guru to use these templates. And you'll find they work well with the simple PERL parsing script we've previously made available; take the output and paste it into our spreadsheet. Make a few minor tweaks (instructions are included), and you'll get something that looks like this:

If you have the data available, you could add columns for # of results, # of results clicked through, whether or not a best bet is associated, and so on.
With additional minor changes, you the template will generate a nifty Zipf Distribution to really beat people over the head with the long tail/short head/middle torso message. Here's one which shows just the first 5% of unique queries (and look how long the tail is even with this subset!):

And here's another version of the distribution in logarithmic format:

An additional template is uesful for maintaining running totals of your numbers:

Feel free to add all sorts of customizations. We'd really appreciate it you'd share your own cool innovations and additions. If we use them in the next version of the spreadsheet, we'll credit you here and in the book.
And of course, please comment below if you have any questions or suggestions for how we can make this tool better.
–Lou Rosenfeld
Comments
Use a log-log graph for Zipf. Makes a straight line. You can't really tell how Zipf-y the graph is when the axes are linear.
Also, I've used a cumulative column. I'd like to know how many queries make up 50% of my traffic, for example. Do 20% of them really make 80%? Make a cumulative count column and a percentage column. Very useful for sizing your caches. Might be useful for coverage of manual Best Bets. You can add a column for queries with Best Bets (yes/no) and use IFSUM to sum those, then divide by the total queries.
Posted by: Walter Underwood | December 13, 2006 06:41 PM
Walter, we already had cumulatives, and were adding log charts when you commented. This and many more new features just posted; thanks for the input!
Posted by: Lou Rosenfeld | December 19, 2006 11:26 AM
You can get Excel-friendly reports form Ultraseek, too. Choose your report, and then check the box to generate it as tab-separated values. You'll get a downloadable .tsv file.
Posted by: Walter Underwood | December 21, 2006 06:16 PM
While there is a lot that can be accomplished with this data, it is insufficient (that is, it needs to be agumented with other design research): 1) it does not determine what results best statisfy the various search terms (terms do not fully imply intent -- the same search term could be used for entirely different scenarios) and 2) related to 1...the search terms do not suggest what the individual wants to 'do' with the content. Again, intent is relevant here.
In a an overall design research strategy, I would use this data to 1) point to items that need more in-depth research and 2) pair this data with ongoing feedback comments and/or continuous survey data (e.g. with a tool like WebIQ (http://www.usabilitysciences.com/research/index.html or their inline model that was design specifically for search feedback -- both highly recommended).
Posted by: Paula Thornton | January 3, 2007 08:18 PM
Paula, for certain, no single method is The Answer. (That's why I'm trying to publish books on a variety of methods!) We'll discuss how well SA fits (or doesn't) with a variety of UX methods in the book.
Posted by: Lou Rosenfeld | January 3, 2007 08:49 PM
To me the spreadsheet's dependance on perl is a real limitation. It means I have to get permission from my IT dept to install perl,it means the spreadsheet is in effect limited to the machine on which perl is available and assumes a level of knowledge of perl.
Is there a way you can make the spreadsheet standalone so that raw data gets pasted into the spreadsheet somewhere and it does all the processing?
I was hoping the spreadsheet would also tell me what the top queries with no results were.
I'm really disappointed with the availability of useful reporting in commercial search engines, despite the relatively low complexity of the reports required. Hopefully your spreadsheets can fill that gap.
Cheers
Charlie.
Posted by: Charlie Ward | February 26, 2007 12:58 AM
Hi Charlie; we'd love to see versions of the parsing script in other languages than PERL. We're just hoping someone might be willing to write them. (Whoever is, please let me know.)
But you should definitely be able to paste the data from one file (which originates on one machine) to another where your copy of Excel lives. That's how we've done it so far.
We'll be releasing an updated version soon, BTW. Just got to find the cycles to do so...
Posted by: Lou | February 26, 2007 12:16 PM