coderholic

Beware PHP's split()

Many programming languages have a join function that takes an array of strings and concatenates them, and a corresponding split function that does the opposite: it takes a string and splits it up into an array.

Python has join and split, C# has join and split, JavaScript has join and split, and Java has a split (although it's sadly missing a join). PHP, though, has implode and explode. They do the same thing as join and split in other languages; it's just that PHP has decided not to follow the standard naming convention for these functions.

Fortunately for developers like me who are used to the "join" and "split" names, PHP has aliases. So you can happily call join and split in PHP and it'll simply call implode and explode for you... or so I thought! Unfortunately this isn't quite the case! Join is an alias of implode. From the PHP documentation:

join — Alias of implode()

And here's the documentation for implode:

implode — Join array elements with a string
string implode ( string $glue , array $pieces )
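
As a quick sanity check of the alias, here's a minimal sketch that calls both names with the same arguments. The variable names are made up for illustration.

```php
<?php
// Illustrative data; the variable names are hypothetical.
$pieces = array("One", "Two", "Three");

// join() is an alias of implode(), so both calls build the same string.
$viaImplode = implode(", ", $pieces);
$viaJoin    = join(", ", $pieces);

var_dump($viaImplode);              // string(15) "One, Two, Three"
var_dump($viaImplode === $viaJoin); // bool(true)
```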

That's pretty clear. It's not quite the case for split though:

split — Split string into array by regular expression
array split ( string $pattern , string $string [, int $limit ] )

Notice how it isn't defined as an alias of explode! Here's the documentation for explode:

explode — Split a string by string
array explode ( string $delimiter , string $string [, int $limit=-1 ] )

Notice how it's ever so slightly different? The explode function takes a string delimiter, but split takes a regular expression string! This means that in the majority of cases you'll get exactly the same result from both functions, but you'll suffer a slight performance hit with split. Using my profile class I worked out that explode is just over twice as fast as split.
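
To see the "same result, more work" point in practice, here's a rough sketch. It assumes a current PHP version, where split() is no longer available (it was deprecated in PHP 5.3 and removed in PHP 7), so preg_split() stands in as the regular-expression splitter. The time_calls helper is hypothetical and is not the profile class mentioned above; the timings it prints will vary by machine and PHP version, so treat them as an illustration only.

```php
<?php
$csv = "alpha,beta,gamma,delta";

// A plain delimiter with no regex metacharacters: both approaches agree.
$byString = explode(",", $csv);
$byRegex  = preg_split("/,/", $csv);
var_dump($byString === $byRegex); // bool(true)

// Crude timing helper (hypothetical, for illustration only).
function time_calls($callback, $iterations = 100000) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $callback();
    }
    return microtime(true) - $start;
}

$explodeSeconds = time_calls(function () use ($csv) { explode(",", $csv); });
$regexSeconds   = time_calls(function () use ($csv) { preg_split("/,/", $csv); });

// No expected output is shown here because the numbers depend on the environment.
printf("explode: %.4fs, preg_split: %.4fs\n", $explodeSeconds, $regexSeconds);
```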

The real problem arises when your delimiter string contains characters that mean something different when interpreted as a regular expression (such as ".", "|", "?", "+" and "*"). See the following example with a ".":

<?php
$sentence = "One. Two.";
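// split() treats "." as a regular expression that matches ANY character,
// so every one of the 9 characters becomes a delimiter.
var_dump(split(".", $sentence));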
/* Output:
array(10) {
  [0]=>
  string(0) ""
  [1]=>
  string(0) ""
  [2]=>
  string(0) ""
  [3]=>
  string(0) ""
  [4]=>
  string(0) ""
  [5]=>
  string(0) ""
  [6]=>
  string(0) ""
  [7]=>
  string(0) ""
  [8]=>
  string(0) ""
  [9]=>
  string(0) ""
} */
// explode() treats "." as a literal string, which is what we wanted.
var_dump(explode(".", $sentence));
/* Output:
array(3) {
  [0]=>
  string(3) "One"
  [1]=>
  string(4) " Two"
  [2]=>
  string(0) ""
} */
?>
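
If you do need a regular-expression splitter but want the delimiter taken literally, one option is to escape it first. The sketch below assumes a current PHP version and uses preg_split() with preg_quote() rather than split(); the helper name split_literal is made up for this example. When the delimiter is always a fixed string, plain explode() is still the simpler choice.

```php
<?php
// Hypothetical helper: split on a literal delimiter using a regex function.
function split_literal($delimiter, $string) {
    // preg_quote() escapes regex metacharacters; "/" is the pattern delimiter used here.
    $pattern = "/" . preg_quote($delimiter, "/") . "/";
    return preg_split($pattern, $string);
}

$sentence = "One. Two.";

// Unescaped, "." matches any character, so every character is a separator.
var_dump(count(preg_split("/./", $sentence))); // int(10)

// Escaped, the pattern matches only a literal ".", just like explode().
var_dump(split_literal(".", $sentence) === explode(".", $sentence)); // bool(true)
```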

So you've been warned! Feel free to use join instead of implode, but use split with caution!

Posted on 29 Jan 2009