Google Hacks

Hack 57 Programming the Google Web API with Python

Programming the Google Web API with Python is simple and clean, as these scripts and interactive examples demonstrate.

Programming to the Google Web API from Python is a piece of cake, thanks to Mark Pilgrim's PyGoogle wrapper module (http://diveintomark.org/projects/pygoogle/). PyGoogle abstracts away much of the underlying SOAP, XML, and request/response layers, leaving you free to spend your time with the data itself.

57.1 PyGoogle Installation

Download a copy of PyGoogle and follow the installation instructions (http://diveintomark.org/projects/pygoogle/readme.txt). Assuming all goes to plan, this should be nothing more complex than:

% python setup.py install

Alternatively, if you want to give this a whirl without installing PyGoogle, or you don't have permission to install it globally on your system, simply put the included SOAP.py and google.py files into the same directory as the googly.py script itself.
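
Whichever route you take, a quick check along these lines can confirm that Python is able to find the modules before you run googly.py. This is only a sketch: the helper name module_available is hypothetical and not part of PyGoogle, while google and SOAP are the module names of the two files mentioned above.

```python
def module_available(name):
    """Return True if Python can import the named module, else False."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True

# 'google' and 'SOAP' are the module names of the files mentioned above
for name in ('google', 'SOAP'):
    if module_available(name):
        print('%s: found' % name)
    else:
        print('%s: not found; install PyGoogle or copy %s.py next to googly.py'
              % (name, name))
```

Python searches the directory containing the script it is running, which is why copying the two files next to googly.py works without an install step.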

57.2 The Code

#!/usr/bin/python
# googly.py
# A typical Google Web API Python script using Mark Pilgrim's
# PyGoogle Google Web API wrapper 
# [http://diveintomark.org/projects/pygoogle/]
# Usage: python googly.py <query>

import sys, codecs

# Use the PyGoogle module
import google

# Grab the query from the command-line
if sys.argv[1:]:
  query = sys.argv[1]
else:
  sys.exit('Usage: python googly.py <query>')

# Your Google API developer's key
google.LICENSE_KEY = 'insert key here'

# Query Google
data = google.doGoogleSearch(query)

# Teach standard output to deal with utf-8 encoding in the results
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

# Output each result's title, URL, and snippet, followed by a blank line
for result in data.results:
  print "\n".join((result.title, result.URL, result.snippet)) + "\n"

57.3 Running the Hack

Invoke the script on the command line as follows:

% python googly.py "query words"
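
The quotes matter: googly.py reads only sys.argv[1], so any unquoted words after the first would be ignored. One possible variation, sketched here with the hypothetical helper query_from_args (not part of the original script), joins every argument into a single query instead.

```python
def query_from_args(argv):
    """Join every argument after the script name into one query string.

    Returns None when no query words were supplied.
    """
    words = argv[1:]
    if not words:
        return None
    return ' '.join(words)

# Simulated command lines; inside googly.py you would pass sys.argv
print(query_from_args(['googly.py', 'learning', 'python']))
print(query_from_args(['googly.py', 'learning python']))
```

Both calls yield the same query string, so the script would behave the same with or without the quotes.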

57.4 The Results

% python googly.py "learning python"
oreilly.com -- Online Catalog: <b>Learning</b> 
<b>Python</b>
http://www.oreilly.com/catalog/lpython/
<b>Learning</b> <b>Python</b> is an 
introduction to the increasingly popular interpreted programming
language that's portable, powerful, and remarkably easy to use in both 
<b>...</b>   
...
Book Review: <b>Learning</b> <b>Python</b>
http://www2.linuxjournal.com/lj-issues/issue66/3541.html
<b>...</b> Issue 66: Book Review: <b>Learning</b> 
<b>Python</b> <b>...</b> Enter 
<b>Learning</b> <b>Python</b>. My executive summary 
is that this is the right book for me and probably for many others 
as well. <b>...</b>   
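
As the output shows, titles and snippets arrive with <b> tags highlighting the query words. If you would rather print plain text, a small helper along these lines could be applied to each field before output. It is a minimal sketch: strip_bold is a hypothetical name, not a PyGoogle function, and it removes only the <b> markup seen above rather than arbitrary HTML.

```python
import re

# Matches the opening and closing bold tags seen in the results above
_BOLD_TAG = re.compile(r'</?b>', re.IGNORECASE)

def strip_bold(text):
    """Return text with <b> and </b> tags removed."""
    return _BOLD_TAG.sub('', text)

# Sample title copied from the results shown above
title = 'Book Review: <b>Learning</b> <b>Python</b>'
print(strip_bold(title))
```

In googly.py, this could be called on result.title and result.snippet inside the output loop.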

57.5 Hacking the Hack

Python has a marvelous interactive interpreter. It's a good place to experiment with modules such as PyGoogle, querying the Google API on the fly and digging through the data structures it returns.

Here's a sample interactive PyGoogle session demonstrating use of the doGoogleSearch, doGetCachedPage, and doSpellingSuggestion functions.

% python
Python 2.2 (#1, 07/14/02, 23:25:09) 
[GCC Apple cpp-precomp 6.14] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import google
>>> google.LICENSE_KEY = 'insert key here'
>>> data = google.doGoogleSearch("Learning Python")
>>> dir(data.meta)
['__doc__', '__init__', '__module__', 'directoryCategories', 
'documentFiltering', 'endIndex', 'estimateIsExact', 
'estimatedTotalResultsCount', 'searchComments', 'searchQuery', 
'searchTime', 'searchTips', 'startIndex']
>>> data.meta.estimatedTotalResultsCount
115000
>>> data.meta.directoryCategories
[{u'specialEncoding': '', u'fullViewableName': "Top/Business/Industries/
Publishing/Publishers/Nonfiction/Business/O'Reilly_and_Associates/
Technical_Books/Python"}]
>>> dir(data.results[5])
['URL', '__doc__', '__init__', '__module__', 'cachedSize', 
'directoryCategory', 'directoryTitle', 'hostName', 
'relatedInformationPresent', 'snippet', 'summary', 'title']
>>> data.results[0].title
'oreilly.com -- Online Catalog: <b>Learning</b> <b>Python'
>>> data.results[0].URL
'http://www.oreilly.com/catalog/lpython/'
>>> google.doGetCachedPage(data.results[0].URL)
'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">\n
<BASE HREF="http://www.oreilly.com/catalog/lpython/"><table border=1
...
>>> google.doSpellingSuggestion('lurn piethon')
'learn python' 
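
The dir() calls in the session above generalize into a small helper for dumping whatever fields an object carries. This is a minimal sketch under stated assumptions: public_attrs and FakeResult are hypothetical names, and FakeResult merely stands in for a PyGoogle result object so the example runs without a license key. Its field values are copied from the results shown earlier.

```python
def public_attrs(obj):
    """Return a dict of an object's attributes, skipping __dunder__ names."""
    return dict((name, getattr(obj, name))
                for name in dir(obj)
                if not name.startswith('__'))

# Stand-in for a PyGoogle result object (assumption: only two of the
# fields listed by dir(data.results[5]) are modeled here)
class FakeResult:
    URL = 'http://www2.linuxjournal.com/lj-issues/issue66/3541.html'
    title = 'Book Review: <b>Learning</b> <b>Python</b>'

attrs = public_attrs(FakeResult())
for name in sorted(attrs):
    print('%s: %s' % (name, attrs[name]))
```

With a real search, the same helper could be pointed at data.meta or any entry of data.results.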