Google Hacks

Hack 62 Permuting a Query

Run all permutations of query keywords and phrases to squeeze the last drop of results from the Google index.

Google, ah, Google. Search engine of over 3 billion pages and 3 zillion possibilities. One of Google's charms, if you're a search engine geek like me, is trying various tweaks with your Google search to see what exactly makes a difference to the results you get.

It's amazing what makes a difference. For example, you wouldn't think that word order would have much of an impact, but it does. In fact, buried in Google's documentation is the admission that the word order of a query will affect search results.

While that's an interesting thought, who has time to generate and run every possible ordering of a multiword query? The Google API to the rescue! This hack takes a query of up to four keywords or "quoted phrases" (special syntaxes are supported too) and runs every permutation, at most 4! = 24 queries, showing the result count for each permutation and the top results across all of them.

You'll need the Algorithm::Permute Perl module for this program to work (http://search.cpan.org/search?query=algorithm%3A%3Apermute&mode=all). The code also loads SOAP::Lite and expects a copy of GoogleSearch.wsdl in the same directory.
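
To get a feel for what "every permutation" means before installing anything, here is a minimal sketch in plain Perl. The permutations helper is a hypothetical stand-in written for illustration; it is not part of Algorithm::Permute, which the hack itself uses, and it makes no attempt to be fast.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every ordering of the given items as a list of array references.
# Hypothetical helper for illustration; the hack uses Algorithm::Permute.
sub permutations {
    my @items = @_;
    return ([]) unless @items;                 # exactly one way to order nothing
    my @orderings;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($first) = splice(@rest, $i, 1);    # pick one item to go first
        push @orderings, [ $first, @$_ ] for permutations(@rest);
    }
    return @orderings;
}

my @keywords = ('applescript', 'google', 'api');
my @queries  = map { join ' ', @$_ } permutations(@keywords);

print scalar(@queries), " permutations:\n";
print "  $_\n" for @queries;
```

Three keywords yield 6 orderings and four yield 24, which is why the hack caps the query at four keywords or phrases: every ordering costs one call to the Google API.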

62.1 The Code

#!/usr/local/bin/perl
# order_matters.cgi
# Queries Google for every possible permutation of up to 4 query keywords,
# returning result counts by permutation and top results across permutations.
# order_matters.cgi is called as a CGI with form input

# Your Google API developer's key
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file
my $google_wdsl = "./GoogleSearch.wsdl";

use strict;

use SOAP::Lite;
use CGI qw/:standard *table/;
use Algorithm::Permute;

print
  header(  ),
  start_html("Order Matters"),
  h1("Order Matters"),
  start_form(-method=>'GET'),
  'Query:   ', textfield(-name=>'query'),
  '   ',
  submit(-name=>'submit', -value=>'Search'), br(  ),
  '<font size="-2" color="green">Enter up to 4 query keywords or "quoted phrases"</font>',
  end_form(  ), p(  );

if (param('query')) {

 # Glean keywords
 my @keywords = grep !/^\s*$/,  split /([+-]?".+?")|\s+/, param('query');

 # "last" cannot leave an if block, so finish the page and stop here
 if (@keywords > 4) {
  print '<font color="red">Only 4 query keywords or phrases allowed.</font>',
   end_html(  );
  exit;
 }

 my $google_search = SOAP::Lite->service("file:$google_wdsl");

 print 
  start_table({-cellpadding=>'10', -border=>'1'}),
  Tr([th({-colspan=>'2'}, ['Result Counts by Permutation' ])]),
  Tr([th({-align=>'left'}, ['Query', 'Count'])]);
 
 my $results = {}; # keep track of what we've seen across queries
 
 # Iterate over every possible permutation
 my $p = new Algorithm::Permute( \@keywords );
 while (my $query = join(' ', $p->next)) {

  # Query Google
  my $r = $google_search -> 
   doGoogleSearch(
    $google_key, 
    $query,
    0, 10, "false", "",  "false", "", "latin1", "latin1"
   );
  print Tr([td({-align=>'left'}, [$query, $r->{'estimatedTotalResultsCount'}] )]);
  @{$r->{'resultElements'}} or next;
   
  # Assign a rank
  my $rank = 10;
  foreach (@{$r->{'resultElements'}}) {
   $results->{$_->{URL}} = {
    title => $_->{title},
    snippet => $_->{snippet},
    seen => ($results->{$_->{URL}}->{seen}) + $rank
   };
   $rank--;
  }
}

print 
  end_table(  ), p(  ),
  start_table({-cellpadding=>'10', -border=>'1'}),
  Tr([th({-colspan=>'2'}, ['Top Results across Permutations' ])]),
  Tr([th({-align=>'left'}, ['Score', 'Result'])]);

foreach ( sort { $results->{$b}->{seen} <=> $results->{$a}->{seen} } keys %$results ) {
  print Tr(td([
    $results->{$_}->{seen},
    b($results->{$_}->{title}||'no title') . br(  ) .
    a({href=>$_}, $_) . br(  ) .
    i($results->{$_}->{snippet}||'no snippet')
  ]));
}

  print end_table(  );
}
print end_html(  );
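
The "Glean keywords" line is the densest part of the listing, so it may help to try it on its own. The sketch below wraps the same split pattern in a hypothetical glean_keywords function (the name is an assumption, not something from the hack) and adds a defined() check so it runs cleanly under warnings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a query into keywords and "quoted phrases" with the listing's pattern.
# split() also returns whatever the capture group matched, so a quoted phrase
# (with an optional leading + or -) survives as one item. When the separator
# was plain whitespace the group is undef, hence the defined() check.
sub glean_keywords {
    my ($query) = @_;
    return grep { defined($_) && !/^\s*$/ }
           split /([+-]?".+?")|\s+/, $query;
}

my @keywords = glean_keywords('applescript "google api" -java');
print scalar(@keywords), " keywords: ", join(' | ', @keywords), "\n";
```

For the sample query this yields three items: applescript, "google api" (quotes intact) and -java, so a quoted phrase counts as a single keyword toward the limit of four.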

62.2 Running the Hack

The hack runs via a web form that is integrated into the code. Call the CGI and enter the query you want to check (up to four words or phrases). The script first runs a search for every possible ordering of the words and phrases and lists the estimated result count for each, as Figure 6-1 shows.

Figure 6-1. List of permutations for applescript google api
figs/gooH_0601.gif

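The script then displays the top results across all permutations of the query, as Figure 6-2 shows. Each result's score is cumulative: within every permutation's top 10, the first hit earns 10 points, the second 9, and so on down to 1, so pages that rank well regardless of word order rise to the top.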

Figure 6-2. Top results for permutations of applescript google api
figs/gooH_0602.gif
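
The scoring behind the second table can be sketched without touching the Google API at all. The URLs below are hypothetical stand-ins for two permutations' results, and score_results is an illustrative name, not a function from the hack; like the listing, it assumes no more than 10 results per permutation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Accumulate a score per URL: within each permutation's result list the
# first hit earns 10 points, the second 9, and so on, as in the listing.
sub score_results {
    my @result_lists = @_;      # one array ref of URLs per permutation
    my %score;
    for my $urls (@result_lists) {
        my $rank = 10;
        $score{$_} += $rank-- for @$urls;
    }
    return %score;
}

# Hypothetical result lists standing in for two permutations' top hits.
my %score = score_results(
    [ 'http://example.com/a', 'http://example.com/b', 'http://example.com/c' ],
    [ 'http://example.com/b', 'http://example.com/a', 'http://example.com/d' ],
);

# Highest cumulative score first; ties broken alphabetically by URL.
for my $url (sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score) {
    printf "%3d  %s\n", $score{$url}, $url;
}
```

A page that appears near the top for every ordering collects points from each permutation, while one that shows up for a single ordering scores at most 10.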

62.3 Using the Hack

At first blush, this hack looks like a novelty with few practical applications. But if you're a regular researcher or a web wrangler, you might find it of interest.

If you're a regular researcher—that is, there are certain topics that you research on a regular basis—you might want to spend some time with this hack and see if you can detect a pattern in how your regular search terms are impacted by changing word order. You might need to revise your searching so that certain words always come first or last in your query.

If you're a web wrangler, you need to know where your page appears in Google's search results. If your page loses a lot of ranking ground because of a shift in a query arrangement, maybe you want to add some more words to your text or shift your existing text.
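
For the web wrangler's case, one possible approach is to record where a given page lands for each ordering of the query. The sketch below works on canned, hypothetical data rather than live results; position_of, %results_for, and the example URLs are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 1-based position of $target in a list of result URLs,
# or undef if it is not among them.
sub position_of {
    my ($target, @urls) = @_;
    for my $i (0 .. $#urls) {
        return $i + 1 if $urls[$i] eq $target;
    }
    return;
}

# Hypothetical data: query permutation => that permutation's top result URLs.
my %results_for = (
    'google api applescript' => [ 'http://example.com/', 'http://example.org/' ],
    'applescript google api' => [ 'http://example.org/', 'http://example.net/',
                                  'http://example.com/' ],
    'api google applescript' => [ 'http://example.net/' ],
);

my $my_page = 'http://example.com/';
for my $query (sort keys %results_for) {
    my $pos = position_of($my_page, @{ $results_for{$query} });
    printf "%-24s %s\n", $query,
        defined($pos) ? "position $pos" : 'not in top results';
}
```

In the real hack, the URL lists would come from each permutation's resultElements; a page that drops several places, or out of the list entirely, under one ordering is the signal to revisit your text.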
