Perl Basics

Lesson 8: Regular expressions
Objective: Familiarize yourself with Perl's usage of regular expressions.

Perl Regular Expressions

The regular expressions that Perl supports are documented in many places. The purpose of this section is to familiarize you with features of Perl's regular expression usage that are particularly useful for web programming. Perl uses a number of extensions to the egrep-style regular expressions that are common in Unix utilities. In fact, Perl 5 has introduced some new extensions that are helpful in making regular expressions more readable. Mastering regular expressions will be an extremely useful skill for you. The more you know about them, the better you will be able to solve common programming problems. Regular expressions will be used in virtually every program we write, including both programs in the course project.
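As an illustration of the readability extensions mentioned above, here is a minimal sketch that assumes the /x modifier is one of the extensions meant. The name=value pair and the variable names are hypothetical and chosen only for the example.

```perl
use strict;
use warnings;

# Hypothetical input: a single name=value pair, as found in a query string
my $pair = 'color=blue';

# Compact form of the pattern
my $compact = qr/^(\w+)=(\w+)$/;

# The same pattern with /x, which ignores whitespace and allows comments
my $readable = qr/
    ^        # start of string
    (\w+)    # name: one or more word characters
    =        # literal equals sign
    (\w+)    # value: one or more word characters
    $        # end of string
/x;

if ( $pair =~ $readable ) {
    print "name=$1 value=$2\n";
}
```

Both patterns match the same strings; the /x form only changes how the pattern is written down.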

The Regular Expressions of Perl

There are a number of general documents available about regular expressions. Here is one:
  1. dispersednet Regular-expressions

Perl's regular expressions are documented in these places (as well as many others):
  1. Perl Regular Expressions Man Page
  2. Tom Christiansen's excellent (and very technical) PERL5 Regular Expression Description


Perl Quantifiers

If you just want to match an exact string, using the index() builtin is faster:
my $word = 'dabchick';
if ( index( $word, 'abc' ) >= 0 ) {
    print "Found 'abc' in $word\n";
}

But sometimes you want to match more or less of a particular string. That is when you use quantifiers in your regular expression. For example, to match the letter a followed by an optional letter b, and then the letter c, use the ? quantifier to show that the b is optional. The following matches both abc and ac:

if ( $word =~ /ab?c/ ) { ... }

The * shows that you can match zero or more of a given letter:
if ( $word =~ /ab*c/ ) { ... }

The + shows that you can match one or more of a given letter:
if ( $word =~ /ab+c/ ) { ... }
To experiment with these quantifiers, you can use the qr() quote-like operator. It lets you properly quote a regular expression without trying to match it against anything before you're ready.
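A minimal sketch of how the three quantifiers behave when the patterns are stored with qr() and matched later. The hash name %regex_for and the test words are illustrative assumptions, not part of the original lesson.

```perl
use strict;
use warnings;

# Each pattern is quoted with qr() now and matched only later.
my %regex_for = (
    'ab?c' => qr/ab?c/,    # a, an optional b, then c
    'ab*c' => qr/ab*c/,    # a, zero or more b, then c
    'ab+c' => qr/ab+c/,    # a, one or more b, then c
);

foreach my $word (qw( ac abc abbc )) {
    foreach my $name ( sort keys %regex_for ) {
        my $result = $word =~ $regex_for{$name} ? 'matches' : 'does not match';
        print "$word $result /$name/\n";
    }
}
```

Here ac should fail only against ab+c, abc should match all three patterns, and abbc should fail only against ab?c.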


Common Regular Expressions
Table 2-8: Common Regex Modifiers

/g modifier

You already know about the /x and /i modifiers, so now look at the /g modifier. It enables you to match globally, that is, to find every match in a string rather than only the first. For example, to collect every non-digit character in a string and print the result, use this code:
my $string = '';
while ( "a1b2c3dddd444eee66" =~ /(\D+)/g ) {
    $string .= $1;
}
print $string;
And that prints abcddddeee, as you expect. You can also use this to count things, if you are so inclined.
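One way counting with /g might look, as a sketch that reuses the same sample string. The variable names $digit_runs and $non_digits are assumptions made for the example.

```perl
use strict;
use warnings;

my $string = 'a1b2c3dddd444eee66';

# In list context, /g returns every match. Assigning that list to an
# empty list and reading the assignment in scalar context gives the count.
my $digit_runs = () = $string =~ /\d+/g;

# In scalar context, /g finds one match per evaluation, so a while loop
# can increment a counter once for each match.
my $non_digits = 0;
$non_digits++ while $string =~ /\D/g;

print "Runs of digits: $digit_runs\n";
print "Non-digit characters: $non_digits\n";
```

For this string the first count is 5 (1, 2, 3, 444, 66) and the second is 10, matching the length of abcddddeee.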