Recipe 13.10 Reading Records with a Pattern Separator
13.10.1 Problem
You want to read in records from a file in which each record is
separated by a pattern you can match with a regular expression.
13.10.2 Solution
Read the entire file into a string and then split on the regular
expression:
$filename = '/path/to/your/file.txt';
$fh = fopen($filename, 'r') or die("Can't open $filename");
$contents = fread($fh, filesize($filename));
fclose($fh);
$records = preg_split('/[0-9]+\) /', $contents);
13.10.3 Discussion
This breaks apart a numbered list and places the individual list
items into array elements. So, if you have a list like this:
1) Gödel
2) Escher
3) Bach
You end up with a four-element array, with an empty opening element.
That's because preg_split() assumes the delimiters are between items,
but in this case, the numbers are before the items:
Array
(
[0] =>
[1] => Gödel
[2] => Escher
[3] => Bach
)
From one point of view, this can be a feature, not a bug, since the
nth element holds the nth item. But, to compact the array, you can
eliminate the first element:
$records = preg_split('/[0-9]+\) /', $contents);
array_shift($records);
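As an alternative to array_shift(), preg_split() accepts a flags argument, and PREG_SPLIT_NO_EMPTY tells it to discard empty pieces. The following is a minimal sketch that uses an inline string as assumed sample data in place of the file contents:

```php
<?php
// Assumed sample data standing in for the contents of the file.
$contents = "1) Gödel\n2) Escher\n3) Bach\n";

// A limit of -1 means "no limit". PREG_SPLIT_NO_EMPTY drops empty
// pieces, including the one before the first delimiter.
$records = preg_split('/[0-9]+\) /', $contents, -1, PREG_SPLIT_NO_EMPTY);

// Each record still carries its trailing newline at this point.
print_r($records);
```

Note that this flag drops every empty piece, not just the first, so it is only appropriate when a genuinely empty record carries no meaning in your data.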
Another modification you might want is to strip newlines from the
elements by replacing them with the empty string:
$records = preg_split('/[0-9]+\) /', str_replace("\n",'',$contents));
array_shift($records);
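Removing every newline also joins the lines of any record that spans more than one line, and it leaves carriage returns from Windows-style line endings in place. If that matters for your data, one possible variation is to trim each record after splitting instead. The sample data below is hypothetical:

```php
<?php
// Hypothetical sample data: Windows line endings and a two-line record.
$contents = "1) Gödel\r\n2) Escher,\r\nM. C.\r\n3) Bach\r\n";

$records = preg_split('/[0-9]+\) /', $contents);
array_shift($records);

// trim() removes leading and trailing whitespace, including "\r" and
// "\n", but leaves line breaks inside a record untouched.
$records = array_map('trim', $records);

print_r($records);
```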
PHP's line-oriented reading functions, such as fgets() and file(),
break input only on newlines, so this technique is also useful for
breaking apart records divided by other strings. However, if you find
yourself splitting on a fixed string instead of a regular expression,
substitute explode() for preg_split() for a more efficient operation.
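For a fixed separator, the split might look like the following sketch. The "%%" separator line and the sample records are assumptions made for illustration:

```php
<?php
// Assumed sample data: records divided by a line containing only "%%".
$contents = "first record\n%%\nsecond record\n%%\nthird record\n";

// rtrim() removes the final newline so the last record comes out clean.
// explode() takes the separator first and the string to split second.
$records = explode("\n%%\n", rtrim($contents));

print_r($records);
```

Because explode() compares plain strings, the separator needs no escaping, unlike a regular expression pattern.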
13.10.4 See Also
Recipe 18.6 for reading from a file; Recipe 1.12 for parsing CSV files.