Google Hacks

Hack 100 Removing Your Materials from Google

How to remove your content from Google's various web properties.

Some people are more than thrilled to have Google's properties index their sites. Other folks don't want Googlebot anywhere near them. If you fall into the latter category and the bot's already done its worst, there are several things you can do to remove your materials from Google's index. Each of Google's properties (Web Search, Google Images, Google Groups, and Google Phonebook) has its own removal procedure.

100.1 Google's Web Search

Here are several tips to avoid being listed.

100.1.1 Making sure your pages never get there to begin with

While you can take steps to remove your content from the Google index after the fact, it's always much easier to make sure the content is never found and indexed in the first place.

Google's crawler obeys the "robots exclusion protocol," a set of instructions you put on your web site that tells the crawler how to behave when it comes to your content. You can implement these instructions in two ways: via a META tag that you put on each page (handy when you want to restrict access to only certain pages or certain types of content) or via a robots.txt file that you place in your site's root directory (handy when you want to block some spiders completely or want to restrict access to certain kinds or directories of content). You can get more information about the robots exclusion protocol and how to implement it at http://www.robotstxt.org/.
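
As a rough illustration of how a robots.txt file is interpreted, the following Python sketch uses the standard library's urllib.robotparser module to check which paths a given crawler may fetch. The rules, the /private/ directory, the "BadSpider" name, and the example.com URLs are all hypothetical, and a real crawler may treat edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of one directory and
# block a made-up spider ("BadSpider") from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: BadSpider
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Googlebot", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/private/notes.html"),
        ("BadSpider", "http://www.example.com/index.html"),
    ]
    for agent, url in checks:
        print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

Running a check like this against your own rules before you publish them can catch a directory you meant to block but mistyped.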

100.1.2 Removing your pages after they're indexed

There are several things you can have removed from Google's results.

These instructions are for keeping your site out of Google's index only. For information on keeping your site out of all major search engines, you'll have to work with the robots exclusion protocol.

Removing the whole site

Use the robots exclusion protocol, probably with robots.txt.

Removing individual pages

Use the following META tag in the HEAD section of each page you want to remove:

<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">

Removing snippets

A "snippet" is the little excerpt of a page that Google displays in its search results. To remove snippets, use the following META tag in the HEAD section of each page for which you want to prevent snippets:

<META NAME="GOOGLEBOT" CONTENT="NOSNIPPET">

Removing cached pages

To prevent Google from keeping cached versions of your pages in its index, use the following META tag in the HEAD section of each page for which you want to prevent caching:

<META NAME="GOOGLEBOT" CONTENT="NOARCHIVE">

100.1.3 Removing that content now

Once you implement these changes, Google will remove or limit your content according to your META tags and robots.txt file the next time your web site is crawled, usually within a few weeks. But if you want your materials removed right away, you can use the automatic remover at http://services.google.com:8882/urlconsole/controller. You'll have to sign in with an account (all an account requires is an email address and a password). Using the remover, you can either request that Google crawl your newly created robots.txt file or enter the URL of a page that contains exclusionary META tags.

Make sure you have your exclusion tags all set up before you use this service. Going to all the trouble of getting Google to pay attention to a robots.txt file or exclusion rules that you've not yet set up will simply be a waste of your time.
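
One way to confirm that a page really carries the exclusion tags before you submit it is to parse the page and list its GOOGLEBOT directives. The sketch below does this with Python's standard html.parser module; the class name, function name, and sample page are hypothetical, and it only inspects HTML you pass in as a string rather than fetching anything from your site.

```python
from html.parser import HTMLParser


class GooglebotMetaParser(HTMLParser):
    """Collect directives from <META NAME="GOOGLEBOT" CONTENT="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "googlebot":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().upper())


def googlebot_directives(html: str) -> set:
    """Return the set of GOOGLEBOT directives found in the given HTML."""
    parser = GooglebotMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """<HTML><HEAD>
<TITLE>Example page</TITLE>
<META NAME="GOOGLEBOT" CONTENT="NOINDEX, NOFOLLOW">
</HEAD><BODY>Hello</BODY></HTML>"""

if __name__ == "__main__":
    print(sorted(googlebot_directives(SAMPLE_PAGE)))
```

An empty result for a page you expected to exclude means the tag is missing or misspelled, and the removal request would be wasted.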

100.1.4 Reporting pages with inappropriate content

You may like your content fine, but you might find that even if you have filtering activated you're getting search results with explicit content. Or you might find a site with a misleading title tag and content completely unrelated to your search.

You have two options for reporting these sites to Google. Bear in mind that there's no guarantee that Google will remove the sites from the index, but it will investigate them. At the bottom of each page of search results, you'll see a "Help Us Improve" link; follow it to a form for reporting inappropriate sites. You can also send the URLs of explicit sites that show up in SafeSearch-filtered results, but probably shouldn't, to safesearch@google.com. If you have more general complaints about a search result, you can send an email to search-quality@google.com.

100.2 Google Images

Google Images' database of materials is separate from that of the main search index. To remove items from Google Images, use robots.txt to tell Google's image crawler, Googlebot-Image, to stay away from your site. Add these lines to your robots.txt file:

User-agent: Googlebot-Image
Disallow: /

You can use the automatic remover mentioned in the web search section to have Google remove the images from its index database quickly.
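
To see that these two lines affect only the image crawler, you can run them through Python's urllib.robotparser, as in the sketch below. The helper name and the example.com URLs are hypothetical, and the result reflects how this particular parser matches user-agent names, which may not be identical to Google's own matching.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines given above for Google's image crawler.
IMAGE_RULES = [
    "User-agent: Googlebot-Image",
    "Disallow: /",
]


def allowed(rules, user_agent, url):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The image crawler is shut out of the whole site...
    print(allowed(IMAGE_RULES, "Googlebot-Image",
                  "http://www.example.com/photos/cat.jpg"))
    # ...while the ordinary web crawler is not covered by this entry.
    print(allowed(IMAGE_RULES, "Googlebot",
                  "http://www.example.com/index.html"))
```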

There may be cases where someone else has put images for which you own the copyright on their own server. In other words, you don't have access to that server to add a robots.txt file, but you need to stop Google from indexing your content there. In this case, you need to contact Google directly. Google has instructions for situations just like this at http://www.google.com/remove.html; look at Option 2, "If you do not have any access to the server that hosts your image."

100.3 Removing Material from Google Groups

As with Google's web index, you can both prevent material from being archived in Google Groups and remove it after the fact.

100.3.1 Preventing your material from being archived

To prevent your material from being archived on Google, add the following line to the headers of your Usenet posts:

X-No-Archive: yes

If you do not have the option of editing the headers of your post, make that line the first line of the post itself.
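
If you compose posts programmatically, both variants can be sketched with Python's standard email.message module. The build_post function, its parameters, and the addresses and newsgroup used here are hypothetical; the sketch only builds the message and does not send it to a news server.

```python
from email.message import EmailMessage


def build_post(sender, newsgroup, subject, body, can_edit_headers=True):
    """Build a Usenet-style post that asks not to be archived."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Newsgroups"] = newsgroup
    msg["Subject"] = subject
    if can_edit_headers:
        # Preferred: put the request in the headers.
        msg["X-No-Archive"] = "yes"
    else:
        # Fallback: make it the first line of the post itself.
        body = "X-No-Archive: yes\n" + body
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    post = build_post("poster@example.com", "alt.test",
                      "Test post", "Please do not archive this.\n")
    print(post["X-No-Archive"])
```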

100.3.2 Removing materials after the fact

If you want materials removed after the fact, you have a couple of options:

  • If the materials you want removed were posted under an address to which you still have access, you may use the automatic removal tool mentioned earlier in this hack.

  • If the materials you want removed were posted under an address to which you no longer have access, you'll need to send an email to groups-support@google.com with the following information:

    • Your full name and contact information, including a verifiable email address.

    • The complete Google Groups URL or message ID for each message you want removed.

    • A statement that says "I swear under penalty of civil or criminal laws that I am the person who posted each of the foregoing messages or am authorized to request removal by the person who posted those messages."

    • Your electronic signature.

100.4 Removing Your Listing from Google Phonebook

You may not wish to have your contact information made available via the phonebook searches on Google. You'll have to follow one of two procedures, depending on whether the listing you want removed is for a business or for a residential number.

If you want to remove a business phone number, you'll need to send a request on your business letterhead to:

Google PhoneBook Removal
2400 Bayshore Parkway
Mountain View, CA 94043

You'll also have to include a phone number where Google can reach you to verify your request.

If you want to remove a residential phone number, the process is much simpler: fill out the form at http://www.google.com/help/pbremoval.html. The form asks for your name, city and state, phone number, email address, and reason for removal (a multiple-choice field: incorrect number, privacy issue, or "other").
