PHP and Backslash Hell

Backslashes are a fun way to escape certain characters in languages like C (I don’t know whether earlier languages used the idea, not that I care anyway). For example, "\"\\" represents a string containing a double-quote and a backslash character.

Well, it makes sense to use the same idea in the regular expression (preg_*) functions too.  It’s just that… it gets a little awkward at times.  Once, I wanted to use a regular expression to match a double backslash (\\) in a string.  For just that, you need to put 8 backslashes in the regular expression: '\\\\\\\\' (between the pattern delimiters).  There are two levels of escaping here: first PHP string parsing (8 -> 4 backslashes), then regular expression parsing (4 -> 2 backslashes).
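To make the two levels concrete, here is a minimal sketch. The `~` delimiter and the variable names are my own choices for illustration, not anything from the original code. It checks that the 8 typed backslashes end up as 4 characters in the PHP string, and that the resulting pattern matches exactly two literal backslashes.

```php
<?php
// The pattern as typed in the source: "~" delimiters around 8 backslashes.
$pattern = '~\\\\\\\\~';

// Level 1: PHP string parsing turns each typed pair into one backslash
// (8 -> 4), so the string holds 2 delimiters + 4 backslashes.
$patternLength = strlen($pattern);

// Level 2: the regex engine reads the 4 backslashes as 2 literal ones.
$subject = 'a\\\\b';   // actual characters: a, backslash, backslash, b
$matched = preg_match($pattern, $subject, $m);

// A subject with only one backslash should not match.
$single = 'a\\b';      // actual characters: a, backslash, b
$matchedSingle = preg_match($pattern, $single);

// preg_quote() can do the regex-level escaping for you, which leaves
// only the PHP string level to write by hand (4 typed backslashes).
$quoted = preg_quote('\\\\', '~');
$matchedQuoted = preg_match('~' . $quoted . '~', $subject);
```

Here `$patternLength` is 6, `$matched` and `$matchedQuoted` are 1, and `$matchedSingle` is 0.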

Nothing major, but dealing with backslashes in regular expressions gets crazy sometimes, especially as most regexes look really confusing anyway.  For example, here is a bit of code used in my Reverse MyCode Parser to match captured patterns in regexes:

while(preg_match('~.*(?:^|[^\\\\](?:\\\\\\\\)*)(\(([^?\\\\]|[^?].*?(?:[^\\\\](?:\\\\\\\\)*))\)([.*]\\??|\\?|\\{\d+(?:,\d*)?\\})?)~s', $pattern, $match, PREG_OFFSET_CAPTURE))
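One fragment keeps recurring in that pattern: [^\\\\](?:\\\\\\\\)*. As far as I can tell, it means "a non-backslash character followed by zero or more pairs of backslashes", which is a way of checking that whatever comes next is not itself escaped. Below is a rough sketch that isolates only that fragment in front of a literal opening parenthesis. The variable names and test strings are made up for illustration and are not part of the parser.

```php
<?php
// Fragment lifted from the pattern above, followed by a literal "(".
// After PHP string parsing the regex is: (?:^|[^\\](?:\\\\)*)\(
$unescapedParen = '~(?:^|[^\\\\](?:\\\\\\\\)*)\(~';

// "a(b)": the parenthesis has no backslash in front of it.
$plain = preg_match($unescapedParen, 'a(b)');

// "a\(b)": one backslash in front, so the parenthesis is escaped.
$escaped = preg_match($unescapedParen, 'a\\(b)');

// "a\\(b)": two backslashes escape each other, so the parenthesis
// is a real one again.
$doubled = preg_match($unescapedParen, 'a\\\\(b)');
```

With these inputs, `$plain` and `$doubled` are 1 and `$escaped` is 0.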
