10.4 Specifying the Datafile Format
10.4.1 Problem
You have a datafile that's not in LOAD DATA's default format.
10.4.2 Solution
Use FIELDS and LINES clauses to tell LOAD DATA how to interpret the file.
10.4.3 Discussion
By default, LOAD DATA assumes that datafiles contain lines that are
terminated by linefeeds (newlines) and that data values within a line
are separated by tabs. The following statement does not specify
anything about the format of the datafile, so MySQL assumes the
default format:
mysql> LOAD DATA LOCAL INFILE 'mytbl.txt' INTO TABLE mytbl;
To specify a file format explicitly, use a FIELDS clause to describe
the characteristics of fields within a line, and a LINES clause to
specify the line-ending sequence. The following LOAD DATA statement
specifies that the datafile contains values separated by colons and
lines terminated by carriage returns:
mysql> LOAD DATA LOCAL INFILE 'mytbl.txt' INTO TABLE mytbl
    -> FIELDS TERMINATED BY ':'
    -> LINES TERMINATED BY '\r';
Each clause follows the table name. If both are present, the FIELDS
clause must precede the LINES clause. The line and field termination
indicators can contain multiple characters. For example, \r\n
indicates that lines are terminated by carriage return/linefeed pairs.
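If you're not sure which line-ending sequence a file uses, it can help to look at its raw bytes before writing the LINES clause. The following is a minimal sketch, assuming a POSIX shell with printf and od available. The filename sample.txt and its contents are made up for illustration:

```shell
# Write two records with colon-separated fields and CRLF line endings.
# sample.txt is a hypothetical name used only for this illustration.
printf 'a:1\r\nb:2\r\n' > sample.txt

# Dump the file one character at a time; the carriage return and
# linefeed at the end of each record appear as \r and \n.
od -c sample.txt
```

A file like this one would call for FIELDS TERMINATED BY ':' and LINES TERMINATED BY '\r\n'.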
If you use mysqlimport, command-line options provide the format
specifiers. mysqlimport commands that correspond to the preceding two
LOAD DATA statements look like this:
% mysqlimport --local cookbook mytbl.txt
% mysqlimport --local --fields-terminated-by=":" --lines-terminated-by="\r" \
    cookbook mytbl.txt
The order in which you specify the options doesn't matter for
mysqlimport, except that they should all precede the database name.
As of MySQL 3.22.10, you can use hex notation to specify arbitrary
format characters for FIELDS and LINES clauses. Suppose a datafile has
lines with Ctrl-A between fields and Ctrl-B at the end of lines. The
ASCII values for Ctrl-A and Ctrl-B are 1 and 2, so you represent them
as 0x01 and 0x02:
FIELDS TERMINATED BY 0x01 LINES TERMINATED BY 0x02
mysqlimport understands hex constants for format specifiers as of
MySQL 3.23.30. You may find this capability helpful if you don't like
remembering how to type escape sequences on the command line, or how
to quote them when quoting is necessary. Tab is 0x09, linefeed is
0x0a, and carriage return is 0x0d. Here's an example that indicates
that the datafile contains tab-delimited lines terminated by CRLF
pairs:
% mysqlimport --local --lines-terminated-by=0x0d0a \
    --fields-terminated-by=0x09 cookbook mytbl.txt
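To try out the Ctrl-A/Ctrl-B format described earlier, you need a datafile that actually contains those control characters. The following sketch builds one, assuming a POSIX shell whose printf accepts octal escapes (\001 and \002 are the octal forms of 0x01 and 0x02). The two sample records are invented for illustration, and the commented-out mysqlimport command assumes a mytbl table with two columns in the cookbook database:

```shell
# Build a two-record datafile with Ctrl-A (octal 001) between fields
# and Ctrl-B (octal 002) at the end of each line.
# The records (a, b) and (c, d) are made-up sample values.
printf 'a\001b\002c\001d\002' > mytbl.txt

# Inspect the bytes to confirm the delimiters are where you expect.
od -c mytbl.txt

# With a MySQL server available, the file could then be loaded with
# a command along these lines (not executed here):
#   mysqlimport --local --fields-terminated-by=0x01 \
#       --lines-terminated-by=0x02 cookbook mytbl.txt
```

Checking the file with od first makes it easier to tell a wrong format specifier from a malformed datafile if the import produces unexpected rows.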