Google Hacks Free Open Book

Google Hacks


Hack 83 Searching Google Topics


A hack that runs a query against some of the available Google API specialty topics.

Google doesn't talk about it much, but it does make specialty web searches available. And I'm not just talking about searches limited to a certain domain. I'm talking about searches that are devoted to a particular topic. The Google API makes four of these searches available: U.S. Government, Linux, BSD, and Macintosh.

In this hack, we'll look at a CGI program that takes a query from a web form and reports, for each specialty topic and for all of Google, the estimated number of results along with the top result.

83.1 Why Topic Search?

Why would you want to search by topic? Because Google currently indexes over 3 billion pages. Unless your search is very specific, you might find yourself with far too many results. If you narrow your search down by topic, you can get good results without having to pin down your query exactly.

You can also use it to do some decidedly unscientific research. Which topic contains more occurrences of the phrase "open source"? Which contains the most pages from .edu (educational) domains? Which topic, Macintosh or FreeBSD, has more on user interfaces? Which topic holds the most for Monty Python fans?

83.2 The Code

#!/usr/local/bin/perl
# gootopic.cgi
# Queries across Google Topics (and All of Google), returning 
# number of results and top result for each topic.
# gootopic.cgi is called as a CGI with form input

# Your Google API developer's key
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file
my $google_wsdl = "./GoogleSearch.wsdl";

# Google Topics
my %topics = (
  ''       => 'All of Google',
  unclesam => 'U.S. Government',
  linux    => 'Linux',
  mac      => 'Macintosh',
  bsd      => 'FreeBSD'
);

use strict;

use SOAP::Lite;
use CGI qw/:standard *table/;

# Display the query form
print
  header(  ),
  start_html("GooTopic"),
  h1("GooTopic"),
  start_form(-method=>'GET'),
  'Query: ', textfield(-name=>'query'), '   ',
  submit(-name=>'submit', -value=>'Search'),
  end_form(  ), p(  );

my $google_search  = SOAP::Lite->service("file:$google_wsdl");

# Perform the queries, one for each topic area
if (param('query')) {
  print 
    start_table({-cellpadding=>'10', -border=>'1'}),
    Tr([th({-align=>'left'}, ['Topic', 'Count', 'Top Result'])]);

  # Sort the keys so the rows appear in a stable order,
  # with '' (All of Google) first
  foreach my $topic (sort keys %topics) {

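    # scalar() forces a single value; param() in list context could
    # return several values and shift the remaining arguments
    my $results = $google_search -> 
      doGoogleSearch(
        $google_key, scalar param('query'), 0, 10, "false", $topic,  "false",
        "", "latin1", "latin1"
      );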

    my $result_count = $results->{'estimatedTotalResultsCount'};

    my $top_result = 'no results';

    if ( $result_count ) {
      # First element of the result list (a plain dereference, not a slice)
      my $t = $results->{'resultElements'}->[0];
      $top_result = 
        b($t->{title}||'no title') . br(  ) .
        a({href=>$t->{URL}}, $t->{URL}) . br(  ) .
        i($t->{snippet}||'no snippet');
    }
   
    # Output
    print Tr([ td([
      $topics{$topic},
      $result_count,
      $top_result
      ])
    ]);
  }

  print 
    end_table(  );
}

print end_html(  );
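
The call to doGoogleSearch in the listing passes ten positional arguments, and the topic travels in the sixth one. The sketch below labels each position so it is easier to see where the topic goes. The helper search_args and its option names (key, query, restrict) are hypothetical, introduced only for illustration, and the per-position comments reflect the Google Web API parameter names as best I recall them; treat them as an assumption and check the API reference.

```perl
use strict;
use warnings;

# Hypothetical helper: build the positional argument list used by the
# doGoogleSearch call in gootopic.cgi from a few named options.
sub search_args {
  my (%opt) = @_;
  my $restrict = defined $opt{restrict} ? $opt{restrict} : '';
  return (
    $opt{key},    # developer's key
    $opt{query},  # the query string
    0,            # start: index of the first result
    10,           # maxResults
    'false',      # filter
    $restrict,    # restrict: the topic ('' means all of Google)
    'false',      # safeSearch
    '',           # lr: language restriction
    'latin1',     # ie: input encoding
    'latin1',     # oe: output encoding
  );
}

my @args = search_args(
  key      => 'insert key here',
  query    => 'open source',
  restrict => 'linux',
);
print scalar(@args), " arguments, topic in position 6: '$args[5]'\n";
```

With a helper like this, the loop body could call doGoogleSearch(search_args(...)) and vary only the restrict option from one topic to the next.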

83.3 Running the Hack

The form code is built into the hack, so just call the hack with the URL of the CGI script. For example, if I were running the program on researchbuzz.com and it was called gootopic.cgi, my URL might look like http://www.researchbuzz.com/cgi-bin/gootopic.cgi.
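
Because the form submits with GET and the script only checks the query parameter, it should also be possible to skip the form and put the query straight into the URL. The following is a minimal sketch of building such a URL. The helper names uri_escape_simple and gootopic_url are made up for this example, and the escaping assumes plain ASCII input.

```perl
use strict;
use warnings;

# Percent-encode every character except letters, digits, and - _ . ~
# (assumes ASCII input; not a general-purpose encoder)
sub uri_escape_simple {
  my ($text) = @_;
  $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
  return $text;
}

# Append the query to the CGI script's URL as the 'query' parameter,
# the field name used by the form in gootopic.cgi
sub gootopic_url {
  my ($base, $query) = @_;
  return $base . '?query=' . uri_escape_simple($query);
}

print gootopic_url(
  'http://www.researchbuzz.com/cgi-bin/gootopic.cgi',
  '"user interface"'
), "\n";
```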

Provide a query and the script will search for your query in each special topic area, providing you with an overall ("All of Google") count, topic area count, and the top result for each. Figure 6-21 shows a sample run for "user interface" with Macintosh coming out on top.

Figure 6-21. Google API topic search for "user interface"
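
The script prints the topics in no particular ranking, so you have to read the counts to see which topic comes out on top. A small sketch of sorting the topics by count follows. The function rank_topics is hypothetical, and the counts in the example are invented placeholders, not real Google results.

```perl
use strict;
use warnings;

my %topics = (
  ''       => 'All of Google',
  unclesam => 'U.S. Government',
  linux    => 'Linux',
  mac      => 'Macintosh',
  bsd      => 'FreeBSD',
);

# Return the topic keys ordered by descending count. The '' key
# (All of Google) is skipped, since it is not a specialty topic.
sub rank_topics {
  my ($counts) = @_;
  my @ranked =
    sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b }
    grep { $_ ne '' } keys %$counts;
  return @ranked;
}

# Placeholder counts, invented for illustration only
my %counts = ('' => 1000, unclesam => 40, linux => 120, mac => 300, bsd => 75);

print "$topics{$_}: $counts{$_}\n" for rank_topics(\%counts);
```

In the CGI script itself, the counts would come from estimatedTotalResultsCount, collected into a hash inside the loop before any table rows are printed.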

83.4 Search Ideas

Trying to figure out how many pages each topic finds for particular top-level domains (e.g., .com, .edu, .uk) is rather interesting. You can query for inurl:xx site:xx, where xx is the top-level domain you're interested in. For example, inurl:va site:va searches for any of the Vatican's pages in the various topics; there aren't any. inurl:mil site:mil finds an overwhelming number of results in the U.S. Government special topic—no surprise there.
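
If you want to try several top-level domains in a row, the inurl:xx site:xx pattern is easy to generate. Here is a minimal sketch; the function name tld_query is an assumption of this example, and each string it produces would be typed into the form or passed as the query.

```perl
use strict;
use warnings;

# Build the "inurl:xx site:xx" query for a top-level domain,
# accepting the domain with or without its leading dot
sub tld_query {
  my ($tld) = @_;
  $tld =~ s/^\.//;
  return "inurl:$tld site:$tld";
}

print tld_query($_), "\n" for qw(.com .edu .uk va mil);
```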

If you are in the mood for a party game, try to find the weirdest possible searches that appear in all the special topics. "Papa Smurf" is as good a query as any other. In fact, at this writing, that search has more results in the U.S. Government specialty search than in the others.
