MySQL Cookbook Free Open Book

MySQL Cookbook

Previous Section Next Section

10.35 Using Temporary Tables for Data Transformation

10.35.1 Problem

You want to preprocess input data for MySQL, but you don't have access to external utilities for doing so.

10.35.2 Solution

Load the data into a temporary table, reformat it using SQL statements, then copy the records into the final destination table.

10.35.3 Discussion

To work with information that must be checked or transformed before it's ready to be added to a table, it's sometimes helpful to load a datafile into a temporary table first for validation purposes. (It's generally easier to work with a dataset that is isolated into its own table rather than combined with other records.) After you have made sure that the temporary table's contents are satisfactory, copy its rows to the main table and then drop it. (Note that the use of "temporary" in this context doesn't necessarily imply that you must use the keyword TEMPORARY when creating the table.[6] If you process the table in multiple phases over the course of several server connections, you'll need to create a non-TEMPORARY table, then drop it explicitly when you're done with it.)

[6] CREATE TEMPORARY TABLE is discussed in Recipe 3.25.

The following example illustrates how to use a temporary table to solve a common problem: loading data into a table when the values do not have the format required by the table structure. Suppose you have a table main that contains three columns, name, date, and value, where date is a DATE column requiring values in ISO format (CCYY-MM-DD). Suppose also that you're given a datafile newdata.txt to be imported into the table, but the contents look like this:

name1   01/01/99    38
name2   12/31/00    40
name3   02/28/01    42
name4   01/02/03    44

Here the dates are in MM/DD/YY format and must be converted to ISO format to be stored as DATE values in MySQL. One way to do this would be to run the file through the cvt_date.pl script shown earlier in the chapter:

% cvt_date.pl --iformat=us --add-century newdata.txt >tmp

Then you can load the tmp file into the main table. But this task also can be accomplished entirely in MySQL with no external utilities by importing the data into a temporary table and using SQL to perform the reformatting operations. Here's how:

  1. Create an empty table in which to load the test data. The following statements create the table tmp as an empty copy of main and add a cdate column to hold the dates from the datafile as character strings:

    mysql> CREATE TABLE tmp SELECT * FROM main WHERE 1 < 0;
    mysql> ALTER TABLE tmp ADD cdate CHAR(8);
  2. Load the datafile into the temporary table, storing the date values in the cdate column rather than in date:

    mysql> LOAD DATA LOCAL INFILE 'newdata.txt' INTO TABLE tmp (name,cdate,value);
  3. Transform the cdate values from MM/DD/YY format to YY-MM-DD format and store the results in the date column:

    mysql> UPDATE tmp
        -> SET date = CONCAT(RIGHT(cdate,2),'-',LEFT(cdate,2),'-',MID(cdate,4,2));

    MySQL will convert the two-digit years to four-digit years automatically, so the original MM/DD/YY values in the cdate column end up in the date column as ISO values in CCYY-MM-DD format. The following query shows what the original cdate values and the transformed date values look like after the UPDATE statement has been performed:

    mysql> SELECT cdate, date FROM tmp;
    +----------+------------+
    | cdate    | date       |
    +----------+------------+
    | 01/01/99 | 1999-01-01 |
    | 12/31/00 | 2000-12-31 |
    | 02/28/01 | 2001-02-28 |
    | 01/02/03 | 2003-01-02 |
    +----------+------------+
  4. Finally, copy the records from tmp to main (using the transformed date values rather than the original cdate values) and drop the temporary table:

    mysql> INSERT INTO main (name, date, value)
        -> SELECT name, date, value FROM tmp;
    mysql> DROP TABLE tmp;

This procedure assumes that MySQL's automatic conversion of two-digit years to four digits produces the correct century values. This means that the year part of the values must correspond to years in the range from 1970 to 2069. If that's not true, you'd need to convert the year values some other way. (See Recipe 10.30.)

The procedure also assumes that the cdate values are always exactly eight characters so that LEFT( ), MID( ), and RIGHT( ) can be used to extract the pieces. If this assumption is invalid, you'd have to modify the conversion procedure. One possibility would be to use SUBSTRING_INDEX( ) to break apart the strings at the / delimiters:

mysql> UPDATE tmp
    -> SET date =
    -> CONCAT(SUBSTRING_INDEX(cdate,'/',-1),'-',
    -> SUBSTRING_INDEX(cdate,'/',1),'-',
    -> SUBSTRING_INDEX(SUBSTRING_INDEX(cdate,'/',2),'/',-1));

Another application for post-import processing is name splitting. If you import values consisting of a first name, a space, and a last name into a column full_name, you can reformat the column into separate first_name and last_name columns with these statements:

UPDATE tbl_name SET first_name = SUBSTRING_INDEX(full_name,' ',1);
UPDATE tbl_name SET last_name = SUBSTRING_INDEX(full_name,' ',-1);

However, this task can easily become more difficult if any of the names have middle initials, or trailing words like Jr. or Sr. If that's the case, you're probably better off preprocessing the names prior to import, using a pattern matching utility that's smarter about breaking full names into components.

    Previous Section Next Section
    Index: [SYMBOL][A][B][C][D][E][F][G][H][I][J][K][L][M][N][O][P][Q][R][S][T][U][V][W][X][Y][Z]


         Main Menu
    Main Page
    Table of content
    Copyright
    Preface
    Chapter 1. Using the mysql Client Program
    Chapter 2. Writing MySQL-Based Programs
    Chapter 3. Record Selection Techniques
    Chapter 4. Working with Strings
    Chapter 5. Working with Dates and Times
    Chapter 6. Sorting Query Results
    Chapter 7. Generating Summaries
    Chapter 8. Modifying Tables with ALTER TABLE
    Chapter 9. Obtaining and Using Metadata
    Chapter 10. Importing and Exporting Data
    10.1 Introduction
    10.2 Importing Data with LOAD DATA and mysqlimport
    10.3 Specifying the Datafile Location
    10.4 Specifying the Datafile Format
    10.5 Dealing with Quotes and Special Characters
    10.6 Importing CSV Files
    10.7 Reading Files from Different Operating Systems
    10.8 Handling Duplicate Index Values
    10.9 Getting LOAD DATA to Cough Up More Information
    10.10 Don't Assume LOAD DATA Knows More than It Does
    10.11 Skipping Datafile Lines
    10.12 Specifying Input Column Order
    10.13 Skipping Datafile Columns
    10.14 Exporting Query Results from MySQL
    10.15 Exporting Tables as Raw Data
    10.16 Exporting Table Contents or Definitions in SQL Format
    10.17 Copying Tables or Databases to Another Server
    10.18 Writing Your Own Export Programs
    10.19 Converting Datafiles from One Format to Another
    10.20 Extracting and Rearranging Datafile Columns
    10.21 Validating and Transforming Data
    10.22 Validation by Direct Comparison
    10.23 Validation by Pattern Matching
    10.24 Using Patterns to Match Broad Content Types
    10.25 Using Patterns to Match Numeric Values
    10.26 Using Patterns to Match Dates or Times
    10.27 Using Patterns to Match Email Addresses and URLs
    10.28 Validation Using Table Metadata
    10.29 Validation Using a Lookup Table
    10.30 Converting Two-Digit Year Values to Four-Digit Form
    10.31 Performing Validity Checking on Date or Time Subparts
    10.32 Writing Date-Processing Utilities
    10.33 Using Dates with Missing Components
    10.34 Performing Date Conversion Using SQL
    10.35 Using Temporary Tables for Data Transformation
    10.36 Dealing with NULL Values
    10.37 Guessing Table Structure from a Datafile
    10.38 A LOAD DATA Diagnostic Utility
    10.39 Exchanging Data Between MySQL and Microsoft Access
    10.40 Exchanging Data Between MySQL and Microsoft Excel
    10.41 Exchanging Data Between MySQL and FileMaker Pro
    10.42 Exporting Query Results as XML
    10.43 Importing XML into MySQL
    10.44 Epilog
    Chapter 11. Generating and Using Sequences
    Chapter 12. Using Multiple Tables
    Chapter 13. Statistical Techniques
    Chapter 14. Handling Duplicates
    Chapter 15. Performing Transactions
    Chapter 16. Introduction to MySQL on the Web
    Chapter 17. Incorporating Query Resultsinto Web Pages
    Chapter 18. Processing Web Input with MySQL
    Chapter 19. Using MySQL-Based Web Session Management
    Appendix A. Obtaining MySQL Software
    Appendix B. JSP and Tomcat Primer
    Appendix C. References
    Colophone
    Index


    More Books
    PHP Hacks
    Processing Xml With Java - A Guide To Sax, Dom, Jdom, Jaxp, And Trax
    The Koran (Holy Qur'an)
    Macromedia Flash 8 Bible
    Search Engine Optimization for Dummies
    YouTube Traffic
    PHP 5 for Dummies
    Harry Potter and The Chamber of Secrets
    Harry Potter and the Sorcerer's Stone
    The Pilgrim's Progress
    Wireless Hacks
    Flash Hacks. 100 Industrial-Strength Tips & Tools
    PayPal Hacks. 100 Industrial-Strength Tips and Tools
    Amazon Hacks
    Pdf Hacks
    The Da Vinci Code
    Google Hacks
    The Holy Bible
    Windows XP For Dummies
    Harry Potter and the Half-Blood Prince
    Seo Book
    Upgrading and Repairing Networks
    Macromedia Dreamweaver 8 UNLEASHED
    Windows XP Annoyances
    Windows XP Hacks
    Microsoft Windows XP Power Toolkit
    Teach Yourself MS Office In 24Hours
    iPod & iTunes Missing Manual
    PC Hacks 100 Industrial-Strength Tips and Tools
    PC Overclocking, Optimization, and Tuning - 2th Edition
    PC Hardware In A Nutshell 3rd Edition
    PC Hardware in a Nutshell, 2nd Edition
    Upgrading and Repairing PCs
    Google for Dummies
    MySQL Cookbook
    Teach Yourself Macromedia Flash 8 In 24 Hours
    PHP CookBook
    Sams Teach Yourself JavaScript in 24 Hours
    PHP5 Manual
    Free Games Paper Airplanes
    500 Juegos Gratis 500 Giochi Gratis 500 Jeux Gratuits 500 Jogos Gratis 500 Kostenlose Spiele