Google Hacks Free Open Book

Google Hacks


Hack 20 Searching Article Archives


Google serves as a handy searchable archive for back issues of online publications.

Not all sites have their own search engines, and even the ones that do are sometimes difficult to use. Complicated or incomplete search engines are more pain than gain when attempting to search through archives of published articles. If you follow a couple of rules, Google is handy for finding back issues of published resources.

The trick is to use a common phrase to find the information you're looking for. Let's use the New York Times as an example.

20.1 Articles from the NYT

Your first intuition when searching for previously published articles from NYTimes.com might be to simply use site:nytimes.com in your Google query. For example, if I wanted to find articles on George Bush, why not use:

"george bush" site:nytimes.com

This will indeed find articles mentioning George Bush that are published on NYTimes.com and indexed by Google. What it won't find are the articles produced by the New York Times but republished elsewhere.

While doing research, keep credibility firmly in mind. If you're doing casual research, maybe you don't need to double-check a story to make sure it actually comes from the New York Times, but if you're researching a term paper, double-check the veracity of every article you find that isn't actually on the New York Times site.

What you actually want is a clear identifier, no matter the site of origin, that an article comes from the New York Times. Copyright disclaimers are perfect for the job. A New York Times copyright notice typically reads:

Copyright 2001 The New York Times Company

Of course, this would only find articles from 2001. A simple workaround is to replace the year with a Google full-word wildcard [Hack #13]:

Copyright * The New York Times Company
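If you save pages locally and want to check them for the same notice, a regular expression can stand in for the wildcard. This is only a rough sketch: the function name has_nyt_notice is made up for illustration, and the pattern accepts exactly one word where the year goes, which is my assumption about what a typical notice looks like rather than a description of how Google's wildcard behaves.

```python
import re

# One non-space token (normally the year) sits between "Copyright" and the
# company name, mirroring the single * in the Google query.
# Assumption: the notice is on one line with ordinary whitespace.
NOTICE = re.compile(r"Copyright\s+\S+\s+The New York Times Company")


def has_nyt_notice(page_text: str) -> bool:
    """Return True if the text contains an NYT-style copyright notice."""
    return NOTICE.search(page_text) is not None


print(has_nyt_notice("Story text. Copyright 2001 The New York Times Company"))
print(has_nyt_notice("Story text with no notice at all."))
```

A match only tells you the notice text is present; it says nothing about whether the page really reproduces a New York Times article.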

Let's try that George Bush search again, this time using the snippet of copyright disclaimer instead of the site: restriction:

"Copyright * The New York Times Company" "George Bush"

At this writing, you get over three times as many results for this search as for the earlier attempt.
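If you build queries like this often, a few lines of Python can assemble them for you. Treat the following as a sketch rather than a finished tool: the helper names copyright_query and search_url are my own, and the URL shape (a q parameter on https://www.google.com/search) is an assumption about Google's ordinary search form, not anything specific to this hack.

```python
from urllib.parse import urlencode


def copyright_query(notice: str, *terms: str) -> str:
    """Join a quoted copyright phrase with the search terms.

    Multi-word terms are quoted so Google treats them as phrases.
    """
    parts = [f'"{notice}"']
    parts += [f'"{t}"' if " " in t else t for t in terms]
    return " ".join(parts)


def search_url(query: str) -> str:
    """Build a search URL for the query (assumed URL shape)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


nyt = copyright_query("Copyright * The New York Times Company", "George Bush")
print(nyt)
print(search_url(nyt))
```

The first line printed is the same query shown above; the second is a URL you can paste into a browser.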

20.2 Magazine Articles

Copyright disclaimers are also useful for finding magazine articles. For example, Scientific American's typical copyright disclaimer looks like this:

Scientific American, Inc. All rights reserved.

(The date appears before the disclaimer, so I just dropped it to avoid having to bother with wildcards.)

Using that disclaimer as a quote-delimited phrase along with a search word—hologram, for example—yields the Google query:

hologram "Scientific American, Inc. All rights reserved."

At this writing, you'll get one result, which seems like a small number for a general query like hologram. When you get fewer results than you'd expect, fall back on the site: syntax to search the originating site itself:

hologram site:sciam.com

In this example, you'll find several results that are no longer available on the Scientific American site but that you can still grab from Google's cache.
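The try-the-disclaimer-first, then-fall-back routine is easy to express in code. In the sketch below, archive_query and its min_results threshold are illustrative inventions, and count_results is a hypothetical function you would have to supply yourself (by hand, or with whatever search interface you have access to); nothing here calls Google.

```python
from typing import Callable


def archive_query(
    term: str,
    disclaimer: str,
    domain: str,
    count_results: Callable[[str], int],
    min_results: int = 5,  # arbitrary cutoff; tune it to taste
) -> str:
    """Prefer the copyright-disclaimer query; fall back to site: if it is thin."""
    primary = f'{term} "{disclaimer}"'
    if count_results(primary) >= min_results:
        return primary
    return f"{term} site:{domain}"


# Stand-in counter that reports the single result mentioned in the text.
def one_result(query: str) -> int:
    return 1


print(archive_query(
    "hologram",
    "Scientific American, Inc. All rights reserved.",
    "sciam.com",
    one_result,
))
```

With only one result for the disclaimer query, the function hands back the site: version, which is the same choice made by hand above.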

Most publications that I've come across have some kind of common text string that you can use when searching Google for their archives. Usually it's a copyright disclaimer, and most often it's at the bottom of a page. Use Google to search for that string along with whatever query words you're interested in, and if that doesn't work, fall back on searching for the query words with the site: syntax and the publication's domain name.
