Pdf Hacks Free Open Book

Hack 7 Copy Data from PDF Pages

Extract data from PDF files and use it in your own documents or spreadsheets.

Copying data from one electronic document to paste into another should be painless and predictable, such as the process depicted in Figure 1-7. Trying to copy data from a PDF, however, can be frustrating. The solution for Acrobat 6 and Adobe Reader users (on Windows, anyway) comes from an unlikely source: Acrobat 5.

Figure 1-7. TAPS faithfully copying formatted text and tables using Acrobat or Reader

Acrobat 5 includes the excellent TAPS text/table selection plug-in. Acrobat 6 does not. Because Acrobat plug-ins are modular, you can copy the TAPS folder (named Table) from the Acrobat 5 plug_ins folder [Hack #4] and paste it into the Acrobat 6 plug_ins folder. Voilà! Don't have Acrobat 5? The TAPS license permits liberal distribution, so visit http://www.pdfhacks.com/TAPS/ to view the license and download a copy. Don't have Acrobat 6, either? Use Adobe Reader instead. TAPS works in both Acrobat and Reader. Who would have guessed?

1.8.1 Adobe Reader 5 and 6

Adobe Reader gives you a single, simple Text Select tool that works well on single lines of text but not on tables or paragraphs. Sometimes it selects more text than you want. For greater control, hold down the Alt key (Version 6) or the Ctrl key (Version 5) and drag out a selection rectangle. Multiline paragraphs copied with this tool do not preserve their flow. Pasted into Word, each line is a single paragraph. Yuck!

You need the TAPS plug-in, which copies paragraphs and tables with fidelity. Copy the entire Table folder from your Acrobat 5 plug-ins directory (e.g., C:\Program Files\Adobe\Acrobat 5.0\Acrobat\plug_ins\Table) into your Reader plug-ins directory (e.g., C:\Program Files\Adobe\Acrobat 6.0\Reader\plug_ins). Restart Reader.
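If you would rather script the copy than drag folders around, the step above can be sketched in a few lines of Python. This is only an illustration, not part of TAPS or Acrobat: the function name install_plugin is hypothetical, and the paths in the comments are the example paths from the text, which may differ on your machine.

```python
import shutil
from pathlib import Path


def install_plugin(source_dir: Path, plugins_dir: Path) -> Path:
    """Copy a plug-in folder (such as Table) into a plug_ins directory.

    Hypothetical helper: it only copies files, exactly as you would by hand.
    """
    if not source_dir.is_dir():
        raise FileNotFoundError(f"plug-in folder not found: {source_dir}")
    if not plugins_dir.is_dir():
        raise FileNotFoundError(f"plug_ins directory not found: {plugins_dir}")

    target = plugins_dir / source_dir.name
    if target.exists():
        # Refuse to overwrite a plug-in that is already installed.
        raise FileExistsError(f"already installed: {target}")

    shutil.copytree(source_dir, target)
    return target


# Example call, using the sample paths from the text (adjust to your system):
# install_plugin(
#     Path(r"C:\Program Files\Adobe\Acrobat 5.0\Acrobat\plug_ins\Table"),
#     Path(r"C:\Program Files\Adobe\Acrobat 6.0\Reader\plug_ins"),
# )
```

Writing into Program Files may require administrator rights, and you still need to restart Reader afterward.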

If you don't have Acrobat 5, visit http://www.pdfhacks.com/TAPS/ and download Acrobat_5_TAPS.zip. Unzip, and then move the resulting TAPS folder into your Reader plug_ins directory. Restart Reader. You'll now have the Table/Formatted Text Select Tool, as shown in Figure 1-8.
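The unzip-and-move step can also be scripted. The sketch below assumes that the downloaded archive unpacks to a single top-level folder named TAPS, as the text describes; the function name install_from_zip and the paths in the comments are hypothetical, so treat it as a starting point rather than a tested installer.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_from_zip(zip_path: Path, folder_name: str, plugins_dir: Path) -> Path:
    """Unzip an archive and move one top-level folder into a plug_ins directory.

    Assumption: the archive contains a top-level folder called folder_name.
    """
    if not plugins_dir.is_dir():
        raise FileNotFoundError(f"plug_ins directory not found: {plugins_dir}")

    target = plugins_dir / folder_name
    if target.exists():
        raise FileExistsError(f"already installed: {target}")

    # Unpack into a scratch directory first so a bad archive
    # never leaves half-extracted files inside plug_ins.
    with tempfile.TemporaryDirectory() as scratch:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(scratch)
        extracted = Path(scratch) / folder_name
        if not extracted.is_dir():
            raise FileNotFoundError(f"no {folder_name} folder inside {zip_path}")
        shutil.move(str(extracted), str(target))
    return target


# Example call (the download location is a placeholder; adjust both paths):
# install_from_zip(
#     Path(r"C:\Downloads\Acrobat_5_TAPS.zip"),
#     "TAPS",
#     Path(r"C:\Program Files\Adobe\Acrobat 6.0\Reader\plug_ins"),
# )
```

As with the manual steps, restart Reader once the folder is in place.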

Figure 1-8. TAPS adding the Table/Formatted Text Select Tool under your Select Text button

The next section provides tips on how to use TAPS.

1.8.2 Acrobat 5

Acrobat 5 provides the same simple Text Select tool that Reader has. Use this basic tool for copying small amounts of unformatted text, as described previously in this hack.

For copying large amounts of formatted text, use the Table/Formatted Text Select (a.k.a. TAPS) tool. You can use it on paragraphs, columns, and tables. It preserves paragraph flow and text styles. Check its preferences (Edit > Preferences > Table/Formatted Text...) to be sure you are getting the best performance for your purposes.

Activate the TAPS tool, then click and drag a rectangle around the text you want copied. Release the mouse and your rectangle turns into a resizable zone. There are two types of zones: Table (blue) and Text (green). If the tool's autodetection creates the wrong type of zone, right-click the zone and a context menu opens where you can configure it manually.

Copy the selection to the clipboard or drag-and-drop it into your target program.

1.8.3 Acrobat 6

Something went wrong with Acrobat 6 text selection. Adobe dropped the Table/Formatted Text Select tool (a.k.a. TAPS) and added the Select Table tool (a.k.a. TablePicker). This new tool is slow and performs poorly on many PDFs.

The solution is to get a copy of TAPS and install it into Acrobat 6. Section 1.8.1 explains how to find and install TAPS. Section 1.8.2 explains how to use TAPS.

A PDF owner can secure his document to prevent others from copying the document's text. In such cases, the text selection tools will be disabled. See [Hack #52] for a discussion on PDF security.


1.8.4 Selecting Text from Scanned Pages

If your document pages are bitmap images instead of text, try using Acrobat's Paper Capture OCR tool. It will convert page images into live text, though the quality of the conversion varies with the clarity of the bitmap image. You can tell when a page is a bitmap image by activating the Text Select tool and then selecting all text (Edit > Select All). If the page has any text on it, the tool will highlight it. If nothing gets highlighted, yet the page appears to contain text, it is probably a bitmap image.

Sometimes, page text is created using vector drawings. This kind of text is not live text (so you can't copy it) and it also does not respond to OCR.

Acrobat 6 users can begin capturing a PDF by selecting Document > Paper Capture > Start Capture... from the menu. Unlike Acrobat 5, Acrobat 6 has no built-in limit on the number of pages you can OCR.

Acrobat 5 users (on Windows) must download the Paper Capture plug-in from Adobe. Select Tools > Download Paper Capture Plug-in, and a web page will open with instructions and a download link. Or, download it directly from http://www.adobe.com/support/downloads/detail.jsp?ftpID=1907. This plug-in will OCR only 50 pages per PDF document.
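One way to live with the 50-page ceiling is to break a long scan into pieces of 50 pages or fewer and capture each piece separately. That workaround is an assumption on our part, not something the plug-in does for you. The small Python sketch below only works out the page ranges; the function name capture_batches is hypothetical, and splitting the PDF itself is left to whatever tool you prefer.

```python
def capture_batches(page_count: int, limit: int = 50) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) ranges.

    Each range covers at most `limit` pages; the default of 50 matches
    the Acrobat 5 Paper Capture plug-in limit mentioned in the text.
    """
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    if limit < 1:
        raise ValueError("limit must be at least 1")

    return [
        (first, min(first + limit - 1, page_count))
        for first in range(1, page_count + 1, limit)
    ]


# A 120-page scan needs three passes:
# capture_batches(120) returns [(1, 50), (51, 100), (101, 120)]
```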
