Google Hacks Free Open Book

Google Hacks


Hack 63 Tracking Result Counts over Time


Query Google for each day of a specified date range, counting the number of results at each time index.

Sometimes the results of a search aren't of as much interest as knowing the number thereof. How popular is a particular keyword? How many times is so-and-so mentioned? How do differing phrases or spellings stack up against each other?

You may also wish to track the popularity of a term over time to watch its ups and downs, spot trends, and notice tipping points. Combining the Google API and daterange: [Hack #11] syntax is just the ticket.

This hack queries Google for each day over a specified date range, counting the number of results for each day. This leads to a list of numbers that you could enter into Excel and chart, for example.

There are a couple of caveats before diving right into the code. First, the average keyword will tend to show more results over time as Google adds more pages to its index. Second, Google doesn't stand behind its date-range search; results shouldn't be taken as gospel.

This hack requires the Time::JulianDay (http://search.cpan.org/search?query=Time%3A%3AJulianDay) Perl module.
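
The script builds its daterange: queries from Julian day numbers rather than calendar dates, which is why it needs a date-conversion module. To show what that conversion involves, here is a minimal standalone sketch using a standard integer formula for Gregorian dates. The helper name gregorian_to_jdn is hypothetical, and the formula only illustrates the arithmetic; it is not presented as the module's own implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert a Gregorian calendar date to a Julian day
# number with a standard integer formula. Illustration only; the hack
# itself calls julian_day() from Time::JulianDay.
sub gregorian_to_jdn {
    my ($year, $month, $day) = @_;
    my $adj = int((14 - $month) / 12);   # 1 for January and February, else 0
    my $y   = $year + 4800 - $adj;
    my $m   = $month + 12 * $adj - 3;    # March = 0 ... February = 11
    return $day + int((153 * $m + 2) / 5) + 365 * $y
         + int($y / 4) - int($y / 100) + int($y / 400) - 32045;
}

# Build the kind of single-day query string the hack sends to Google
my $jdn        = gregorian_to_jdn(2002, 8, 24);
my $full_query = "OS X Jaguar daterange:$jdn-$jdn";
print "$full_query\n";   # prints: OS X Jaguar daterange:2452511-2452511
```

Using the same number on both sides of the hyphen restricts the search to a single day, which is what the hack does for every day in the range.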

63.1 The Code

#!/usr/local/bin/perl
# goocount.pl
# Runs the specified query for every day between the specified
# start and end dates, returning date and count as CSV.
# usage: goocount.pl query="{query}" start={date} end={date}
# where dates are of the format: yyyy-mm-dd, e.g. 2002-12-31

# Your Google API developer's key
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file
my $google_wdsl = "./GoogleSearch.wsdl";

use SOAP::Lite;
use Time::JulianDay;
use CGI qw/:standard/;

# For checking date validity
my $date_regex = '(\d{4})-(\d{1,2})-(\d{1,2})';

# Make sure all arguments are passed correctly
( param('query') and param('start') =~ /^(?:$date_regex)?$/
  and param('end') =~ /^(?:$date_regex)?$/ ) or
  die qq{usage: goocount.pl query="{query}" start={date} end={date}\n};

# Julian date manipulation
my $query = param('query');
my $yesterday_julian = int local_julian_day(time) - 1;
my $start_julian = (param('start') =~ /$date_regex/)
  ? julian_day($1,$2,$3) : $yesterday_julian;
my $end_julian = (param('end') =~ /$date_regex/)
  ? julian_day($1,$2,$3) : $yesterday_julian;

# Create a new Google SOAP request
my $google_search  = SOAP::Lite->service("file:$google_wdsl");

print qq{"date","count"\n};

# Iterate over each of the Julian dates for your query
foreach my $julian ($start_julian..$end_julian) {
  my $full_query = "$query daterange:$julian-$julian";
  # Query Google
  my $result = $google_search ->
    doGoogleSearch(
      $google_key, $full_query, 0, 10, "false", "",  "false",
      "", "latin1", "latin1"
    );

  # Output
  print
    '"',
    sprintf("%04d-%02d-%02d", inverse_julian_day($julian)),
    qq{","$result->{estimatedTotalResultsCount}"\n};
}

63.2 Running the Hack

Run the script from the command line, specifying a query, a start date, and an end date. Perhaps you'd like to track mentions of the latest Macintosh operating system (code name "Jaguar") leading up to, on, and after its launch (August 24, 2002). The following invocation sends its results to a comma-separated (CSV) file for easy import into Excel or a database:

% perl goocount.pl query="OS X Jaguar" \
start=2002-08-20 end=2002-08-28 > count.csv

Leaving off the > and CSV filename sends the results to the screen for your perusal:

% perl goocount.pl query="OS X Jaguar" \
start=2002-08-20 end=2002-08-28

If you want to track results over time, you could run the script every day (using cron under Unix or the scheduler under Windows) with no dates specified; the script then defaults both the start and end dates to the previous day. Just use >> filename.csv to append to the file instead of overwriting it. Or you could have the results emailed to you for your daily reading pleasure.
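
One wrinkle with daily appending is that the script prints its "date","count" header on every run, so the header would repeat throughout the file. A minimal sketch of one way around that is a small filter that appends rows and writes the header only when the file is new or empty. The helper name append_rows, the script name append.pl, and the file name counts.csv are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: append CSV lines to $file, keeping the header line
# only when the file is new or empty.
sub append_rows {
    my ($file, @lines) = @_;
    my $has_content = -s $file;   # true if the file exists and is non-empty
    open my $out, '>>', $file or die "Cannot append to $file: $!";
    for my $line (@lines) {
        # Skip the repeated header once the file already holds data
        next if $has_content and $line =~ /^"date","count"/;
        print {$out} $line;
    }
    close $out or die "Cannot close $file: $!";
}

# When given a file name, read goocount.pl output from standard input
if (@ARGV) {
    my $file = shift @ARGV;
    append_rows($file, <STDIN>);
}
```

Under these assumptions, a daily job could pipe the script's output through the filter, for example: perl goocount.pl query="OS X Jaguar" | perl append.pl counts.csv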

63.3 The Results

Here's that search for Jaguar, the new Macintosh operating system:

% perl goocount.pl query="OS X Jaguar" \
start=2002-08-20 end=2002-08-28
"date","count"
"2002-08-20","18"
"2002-08-21","7"
"2002-08-22","21"
"2002-08-23","66"
"2002-08-24","145"
"2002-08-25","38"
"2002-08-26","94"
"2002-08-27","55"
"2002-08-28","102"

Notice the expected spike in new finds on release day, August 24th.
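
For longer date ranges, the spike can be picked out programmatically instead of by eye. The following minimal sketch scans CSV lines in the format shown above and reports the day with the highest count; the helper name peak_day is hypothetical, and the data is the sample output from this section.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given goocount.pl CSV lines, return the date and
# count of the row with the highest count (the first one wins on ties).
sub peak_day {
    my @lines = @_;
    my ($best_date, $best_count);
    for my $line (@lines) {
        # Match rows like "2002-08-24","145"; the header row is skipped
        next unless $line =~ /^"(\d{4}-\d{2}-\d{2})","(\d+)"/;
        if (!defined $best_count or $2 > $best_count) {
            ($best_date, $best_count) = ($1, $2);
        }
    }
    return ($best_date, $best_count);
}

# The sample output from the Jaguar search
my @csv = (
    qq{"date","count"\n},
    qq{"2002-08-20","18"\n},
    qq{"2002-08-21","7"\n},
    qq{"2002-08-22","21"\n},
    qq{"2002-08-23","66"\n},
    qq{"2002-08-24","145"\n},
    qq{"2002-08-25","38"\n},
    qq{"2002-08-26","94"\n},
    qq{"2002-08-27","55"\n},
    qq{"2002-08-28","102"\n},
);

my ($date, $count) = peak_day(@csv);
print "Peak: $date with $count results\n";   # prints: Peak: 2002-08-24 with 145 results
```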

63.4 Working with These Results

If you have a fairly short list, it's easy to just look at the results and see if there are any spikes or particular items of interest about the result counts. But if you have a long list or you want a visual overview of the results, it's easy to use these numbers to create a graph in Excel or your favorite spreadsheet program.

Simply save the results to a file, open the file in Excel, and use the chart wizard to create a graph. You'll have to do some tweaking, but even the default chart gives an interesting overview, as shown in Figure 6-3.

Figure 6-3. Excel graph tracking mentions of OS X Jaguar
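
If no spreadsheet is handy, a rough text-only chart gives a similar overview at the command line. This is a minimal sketch, not a replacement for a real graph: the helper name text_chart and the 40-character bar width are arbitrary choices made for illustration, and the three rows are taken from the sample results.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn goocount.pl CSV lines into text bars, scaling
# the largest count to $width characters.
sub text_chart {
    my ($width, @lines) = @_;
    my (@rows, $max);
    for my $line (@lines) {
        next unless $line =~ /^"(\d{4}-\d{2}-\d{2})","(\d+)"/;
        push @rows, [$1, $2];
        $max = $2 if !defined $max or $2 > $max;
    }
    return () unless $max;   # no data rows, or every count is zero
    my @out;
    for my $row (@rows) {
        my ($date, $count) = @$row;
        # Round each bar to the nearest whole character
        my $bar = '#' x int($count / $max * $width + 0.5);
        push @out, sprintf("%s %4d %s", $date, $count, $bar);
    }
    return @out;
}

my @csv = (
    qq{"2002-08-23","66"\n},
    qq{"2002-08-24","145"\n},
    qq{"2002-08-25","38"\n},
);
print "$_\n" for text_chart(40, @csv);
```

The bars are relative to the busiest day in the input, so the release-day spike fills the full width and the surrounding days scale down from it.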

63.5 Hacking the Hack

You can render the results as a web page by altering the code ever so slightly, replacing the CSV print statements with the CGI module's HTML shortcuts as shown here, and directing the output to an HTML file (> filename.html):

...
print
  header(  ),
  start_html("GooCount: $query"),
  start_table({-border=>undef}, caption("GooCount:$query")),
  Tr([ th(['Date', 'Count']) ]);

foreach my $julian ($start_julian..$end_julian) {
  my $full_query = "$query daterange:$julian-$julian";
  my $result = $google_search ->
    doGoogleSearch(
      $google_key, $full_query, 0, 10, "false", "",  "false",
      "", "latin1", "latin1"
    );

  print
    Tr([ td([
      sprintf("%04d-%02d-%02d", inverse_julian_day($julian)),
      $result->{estimatedTotalResultsCount}
    ]) ]);
}

print
  end_table(  ),
  end_html;  