Tuesday, August 6, 2013

Regular Expressions: Action, Pattern, Modifier

Introduction to Regular Expressions

In our introductory article on Perl Regular Expressions, we showed how regular expressions can cut down the code that you have to write. This article discusses the format of regular expressions.


The regular expression format is like this:

action/pattern/modifier


The action can be any of the following:
| Action | Description | Format |
| --- | --- | --- |
| (none) | Search; same as m | /pattern/modifier |
| s | Substitute | s/pattern/replacement/modifier |
| m | Match | m/pattern/modifier |
| y | Translate, character by character | y/searchcharacters/replacementcharacters/ |
| q | Single quote; similar to 'string' | q/string/ |
| qq | Double quote; similar to "string" | qq/string/ |
| qr | Compile the regular expression | qr/regexp/ |
| qw | Quote a list of words | qw/item1 item2 item3 item4 item5 .../ |
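As a rough illustration of the actions in the table, the sketch below applies each one to a sample string. The variable names and the replacement word Services are assumptions made for the example, not part of the original article.

```perl
use strict;
use warnings;

my $text = 'Concept*Solutions*Corporation';

# No action letter and m// are the same operation: a match.
my $found_plain = $text =~ /Solutions/  ? 1 : 0;
my $found_m     = $text =~ m/Solutions/ ? 1 : 0;

# s/// substitutes; here it works on a copy so $text stays intact.
(my $substituted = $text) =~ s/Solutions/Services/;

# y/// maps single characters one-to-one; it is not a regular expression.
(my $translated = $text) =~ y/*/-/;

# q// behaves like single quotes, qq// like double quotes (it interpolates).
my $len    = length $text;
my $single = q/$text stays literal/;
my $double = qq/Length is $len/;

# qr// compiles a pattern once so it can be reused later.
my $re       = qr/corp/i;
my $found_qr = $text =~ $re ? 1 : 0;

# qw// builds a list of words without quotes or commas.
my @items = qw/item1 item2 item3/;

print "$substituted\n$translated\n$single\n$double\n";
print "plain=$found_plain m=$found_m qr=$found_qr items=", scalar(@items), "\n";
```

Note that q, qq and qw are quoting operators rather than pattern operations; they share the same delimiter style, which is presumably why they appear in the same table.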


The pattern is made up of literal characters and metacharacters. Perl gives each metacharacter a special meaning within the pattern. Listed below are the more common metacharacters.

| Metacharacter | Description | Example | Explanation |
| --- | --- | --- | --- |
| \ | Generally, treat the next character as is | \*\+ | Represents the characters *+ as is |
| * | Previous character 0 or more times | A* | The letter A 0 or more times |
| + | Previous character 1 or more times | A+ | The letter A 1 or more times |
| ? | Previous character 0 or 1 time | N? | The letter N 0 or 1 time |
| ^ | String being matched should start with the pattern following the ^ | ^Concept | String should begin with the string Concept |
| . | Represents any character (except a newline, by default) | ... | Any 3 characters |
| $ | String being matched should end with the pattern preceding the $ | Corporation$ | String should end with the string Corporation |
| [ ] | Any one of the characters enclosed between the two brackets | [1358] | Character should be 1, 3, 5 or 8 |
| {x,y} | Previous character should occur at least x times and not more than y times | 4{1,3} | The character 4 occurs 1 to 3 times |
| {x} | Previous character should occur exactly x times | 4{3} | The character 4 occurs 3 times |
| {x,} | Previous character should occur at least x times | 4{4,} | The character 4 occurs at least 4 times |
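To see the metacharacters at work, the following sketch tests a handful of patterns against one sample string and reports which ones match. The patterns and labels are chosen for illustration only.

```perl
use strict;
use warnings;

my $text = 'Concept*Solutions*Corporation';

# Each entry pairs a description with a compiled pattern (qr//).
my @checks = (
    [ 'literal asterisk',         qr/\*/           ],
    [ 'starts with Concept',      qr/^Concept/     ],
    [ 'starts with Solutions',    qr/^Solutions/   ],
    [ 'ends with Corporation',    qr/Corporation$/ ],
    [ 'S, any character, then l', qr/S.l/          ],
    [ 'C or S followed by o',     qr/[CS]o/        ],
    [ 'l, one or more u, then t', qr/lu+t/         ],
    [ 'optional C, then or',      qr/C?or/         ],
    [ 'exactly two o in a row',   qr/o{2}/         ],
    [ 'one to three p in a row',  qr/p{1,3}/       ],
);

# Record and print whether each pattern matches the sample string.
my %matched;
for my $check (@checks) {
    my ($label, $pattern) = @$check;
    $matched{$label} = $text =~ $pattern ? 1 : 0;
    printf "%-26s %s\n", $label, $matched{$label} ? 'matches' : 'no match';
}
```

Only the second and ninth patterns should fail here: the string does not begin with Solutions, and it never contains two o characters side by side.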


Modifiers define how a pattern is to be used. Below are the more common modifiers.

| Modifier | Used in | Explanation |
| --- | --- | --- |
| i | Match and substitute | Case insensitive |
| g | Match and substitute | Search all occurrences of the pattern |
| e | Substitute | Evaluate the replacement as a Perl expression |
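A small sketch of the three modifiers follows. The sample string matches the one used in this article; the variable names and the choice of length() for the e example are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = 'Concept*Solutions*Corporation';

# i: ignore case while matching.
my $sensitive   = $text =~ /solutions/  ? 1 : 0;   # lowercase s does not match
my $insensitive = $text =~ /solutions/i ? 1 : 0;

# g: in list context, a match returns every occurrence.
my @all_o = $text =~ /o/g;

# e: the replacement is Perl code; here each word becomes its length.
(my $lengths = $text) =~ s/(\w+)/length($1)/ge;

print "case sensitive:   $sensitive\n";
print "case insensitive: $insensitive\n";
print "number of o:      ", scalar(@all_o), "\n";
print "word lengths:     $lengths\n";
```

Modifiers can be combined, as in the /ge pair above, and their order does not matter.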


Each example below is applied to a fresh copy of this string:

$_ = 'Concept*Solutions*Corporation';

| Regular expression | Explanation | Result |
| --- | --- | --- |
| /*/ | Invalid regular expression because * should be preceded by a character | Script error |
| /\*/ | Search for an asterisk (*) | Result is 1 |
| /Solutions/ | Search for the string 'Solutions' | Result is 1 |
| /solutions/ | Search for the string 'solutions'; not found because the search is case sensitive | Result is empty (false) |
| /solutions/i | Search for the string 'solutions'; found because the search is NOT case sensitive | Result is 1 |
| s/C/K/ | Substitute the first uppercase C with a K | Result is 1; $_ will have: Koncept*Solutions*Corporation |
| s/C/K/g | Substitute all uppercase C with a K | Result is 2; $_ will have: Koncept*Solutions*Korporation |
| s/C/K/gi | Substitute all C with a K regardless of case | Result is 3; $_ will have: KonKept*Solutions*Korporation |
| y/con/KUD/ | Change c to K, change o to U, change n to D | Result is 10; $_ will have: CUDKept*SUlutiUDs*CUrpUratiUD |
| m/^[con]/gi | First character of the string should be c, o or n; case is NOT sensitive | Result is 1 |
| s/.n/\+\^/gi | Wherever any character is followed by an n, change the two characters to a + and a ^ | Result is 3; $_ will have: C+^cept*Soluti+^s*Corporati+^ |
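The substitution and translation rows can be checked with a short script. In this sketch, apply() is a hypothetical helper (not from the original article) that runs one operation on a fresh copy of the string and returns both the count and the modified text.

```perl
use strict;
use warnings;

my $original = 'Concept*Solutions*Corporation';

# apply() is a hypothetical helper: it runs $code against a fresh copy of
# the string held in $_ and returns (count, resulting string).
sub apply {
    my ($code) = @_;
    local $_ = $original;
    my $count = $code->();
    return ($count, $_);
}

my @first = apply(sub { s/C/K/ });          # first uppercase C only
my @all   = apply(sub { s/C/K/g });         # every uppercase C
my @any   = apply(sub { s/C/K/gi });        # every C or c
my @trans = apply(sub { y/con/KUD/ });      # character-by-character mapping
my @dot_n = apply(sub { s/.n/\+\^/gi });    # any character followed by n

for my $row (\@first, \@all, \@any, \@trans, \@dot_n) {
    printf "count=%-3s result=%s\n", @$row;
}
```

Because each call starts from the original string, the counts reflect a single operation rather than the cumulative effect of earlier ones.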

These are just some of the basic regular expressions. The best way to learn them is to try out combinations yourself. To do this, you can write a simple Perl script based on this:

$_ = 'Concept*Solutions*Corporation';
$a = your-regular-expression;
print "Result: $a: $_\n";
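Here is one way the template might be filled in, with the placeholder replaced by a concrete substitution. The second half is an addition that goes beyond the template: it assumes you may want to try a pattern held in a string, and wraps the compilation in eval so that an invalid pattern such as a lone * is reported instead of stopping the script.

```perl
use strict;
use warnings;

# The template with a concrete expression: replace every asterisk with a space.
$_ = 'Concept*Solutions*Corporation';
my $result = s/\*/ /g;
print "Result: $result: $_\n";

# Trying a pattern stored in a string. qr// compiles it at run time, and
# eval traps the error raised by an invalid pattern.
my $candidate = '*';
my $compiled  = eval { qr/$candidate/ };
my $error     = $@;

if (defined $compiled) {
    print "Pattern '$candidate' compiled\n";
}
else {
    print "Pattern '$candidate' rejected: $error";
}
```

A pattern written directly between slashes in the script is checked when the script is compiled, which is why /*/ stops the script before anything runs.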

Have fun!
