Tuesday, August 6, 2013

Regular Expressions: Action, Pattern, Modifier


Introduction to Regular Expressions

In our introductory article on Perl Regular Expressions, we showed how regular expressions can cut down the amount of code you have to write. This article discusses the format of a regular expression: its action, pattern and modifier.

Format

The regular expression format is like this:

action/pattern/modifier

The action can be any of the following:
Action  Description                         Format
(none)  Search (match); same as m           /pattern/modifier
s       Substitute                          s/oldpattern/newpattern/modifier
m       Match                               m/pattern/modifier
y       Transliterate; same as tr           y/replacecharacters/withthese/
q       Single quote; similar to 'string'   q/string/
qq      Double quote; similar to "string"   qq/string/
qr      Compile the regular expression      qr/regexp/
qw      Quote a list of words               qw/item1 item2 item3 item4 item5 .../
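
As a rough illustration of the actions in the table, here is a minimal sketch that uses each one on the sample string from the Examples section. The variable names are made up for this example; only the operators themselves come from Perl.

```perl
use strict;
use warnings;

my $text = 'Concept*Solutions*Corporation';

# Match: the leading m is optional when the delimiter is /
my $found_plain = ($text =~ /Solutions/)  ? 1 : 0;
my $found_m     = ($text =~ m/Solutions/) ? 1 : 0;

# Substitute: work on a copy so $text stays unchanged
(my $substituted = $text) =~ s/Corporation/Inc/;

# Transliterate: y maps single characters, here every * to a space
(my $translated = $text) =~ y/*/ /;

# qr compiles a pattern once so that it can be reused later
my $pattern = qr/^Concept/;
my $starts_with_concept = ($text =~ $pattern) ? 1 : 0;

# q, qq and qw build strings and lists; they do not match anything
my $name  = q/Concept/;                          # like 'Concept'
my $hello = qq/Hello, $name/;                    # like "Hello, $name"
my @words = qw/Concept Solutions Corporation/;   # a list of three words

print "match: $found_plain $found_m\n";
print "substitute: $substituted\n";
print "transliterate: $translated\n";
print "qr: $starts_with_concept\n";
print "qq: $hello\n";
print "qw: ", scalar(@words), " items\n";
```

Working on copies keeps $text intact, so every line starts from the same input.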

Pattern

The pattern is made up of literal characters, which match themselves, and metacharacters, which have a special meaning within the pattern. Listed below are the more common metacharacters.


Metacharacter  Description                                     Example       Explanation
\              Generally, treat the next character as is       \*\+          Matches the characters *+ as is
*              Previous character 0 or more times              A*            The letter A, 0 or more times
+              Previous character 1 or more times              A+            The letter A, 1 or more times
?              Previous character 0 or 1 time                  N?            The letter N, 0 or 1 time
^              String must start with the pattern after the ^  ^Concept      String should begin with Concept
.              Any single character (except a newline)         ...           Any 3 characters
$              String must end with the pattern before the $   Corporation$  String should end with Corporation
[ ]            Any one character listed between the brackets   [1358]        Character should be 1, 3, 5 or 8
{x,y}          Previous character between x and y times        4{1,3}        The digit 4 occurs 1 to 3 times
                                                               g{3}          The letter g occurs exactly 3 times
                                                               L{4,}         The letter L occurs at least 4 times
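
To see the metacharacters at work, here is a small sketch that tests one pattern per row of the table. The matches helper and the test strings are hypothetical, invented for this illustration; each call returns 1 for a match and 0 otherwise.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $pattern, otherwise 0
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

my $sample = 'Concept*Solutions*Corporation';

my %result = (
    'escaped'       => matches('2*+3',  qr/\*\+/),          # literal * and +
    'star_zero'     => matches('BC',    qr/^A*BC$/),        # A* accepts no A at all
    'plus_zero'     => matches('BC',    qr/^A+BC$/),        # A+ needs at least one A
    'plus_two'      => matches('AABC',  qr/^A+BC$/),
    'optional'      => matches('O',     qr/^N?O$/),         # N? accepts a missing N
    'optional_two'  => matches('NNO',   qr/^N?O$/),         # but not two of them
    'starts_with'   => matches($sample, qr/^Concept/),
    'ends_with'     => matches($sample, qr/Corporation$/),
    'three_chars'   => matches('abc',   qr/^...$/),
    'in_class'      => matches('5',     qr/^[1358]$/),
    'not_in_class'  => matches('4',     qr/^[1358]$/),
    'range_ok'      => matches('444',   qr/^4{1,3}$/),
    'range_too_big' => matches('4444',  qr/^4{1,3}$/),
    'exactly_three' => matches('ggg',   qr/^g{3}$/),
    'at_least_four' => matches('LLLLL', qr/^L{4,}$/),
);

print "$_: $result{$_}\n" for sort keys %result;
```

Most of the patterns are anchored with ^ and $ so that the whole test string has to match, which makes the quantifier limits visible.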

Modifiers

Modifiers define how a pattern is to be used. Below are the more common modifiers.

Modifier  Used in               Explanation
i         Match and substitute  Case insensitive
g         Match and substitute  Search for all occurrences of the pattern
e         Substitute            Evaluate the replacement as a Perl expression
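
The following sketch shows one possible use of each modifier. The variable names and the 'price 21' string are made up for this example.

```perl
use strict;
use warnings;

my $text = 'Concept*Solutions*Corporation';

# i: ignore case when matching
my $case_sensitive   = ($text =~ /solutions/)  ? 1 : 0;
my $case_insensitive = ($text =~ /solutions/i) ? 1 : 0;

# g: act on every occurrence instead of only the first one
(my $all_replaced = $text) =~ s/C/K/g;

# g in list context returns every match; assigning to () counts them
my $count_o = () = $text =~ /o/g;

# e: run the replacement as Perl code and substitute its value
(my $doubled = 'price 21') =~ s/(\d+)/$1 * 2/e;

print "without i: $case_sensitive, with i: $case_insensitive\n";
print "s///g: $all_replaced\n";
print "number of o: $count_o\n";
print "s///e: $doubled\n";
```

With the e modifier, the replacement side of s/// is ordinary Perl code, so $1 * 2 is computed before it is put back into the string.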

Examples

$_ = 'Concept*Solutions*Corporation';

Each example below is applied to the original value of $_.

Regular Expression  Explanation                                     Result
/*/                 Invalid: * must be preceded by a character      Script error
/\*/                Search for an asterisk (*)                      1
/Solutions/         Search for the string 'Solutions'               1
/solutions/         Search for 'solutions'; fails, case sensitive   0 (false; prints as an empty string)
/solutions/i        Search for 'solutions'; found, case ignored     1
s/C/K/              Substitute the first upper case C with a K      1; $_ will have Koncept*Solutions*Corporation
s/C/K/g             Substitute every upper case C with a K          2; $_ will have Koncept*Solutions*Korporation
s/C/K/gi            Substitute every C with a K regardless of case  3; $_ will have KonKept*Solutions*Korporation
y/con/KUD/          Change c to K, o to U and n to D                10; $_ will have CUDKept*SUlutiUDs*CUrpUratiUD
m/^[con]/gi         First character must be c, o or n, any case     1
s/.n/\+\@/gi        Change any character followed by n to +@        3; $_ will have C+@cept*Soluti+@s*Corporati+@

These are just some of the basic regular expressions. The best way to learn them is to try out combinations yourself. To do this, you can write a simple Perl script based on this:

$_ = 'Concept*Solutions*Corporation';
$result = s/C/K/g;    # replace s/C/K/g with the regular expression you want to try
print "Result: $result: $_\n";
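
If you want to try several expressions in one run, a sketch along these lines may help. It is only one way to do it: the @experiments list and the report format are invented for this illustration. Each expression is wrapped in an anonymous subroutine and applied to a fresh copy of the string.

```perl
use strict;
use warnings;

my $original = 'Concept*Solutions*Corporation';

# Each entry pairs a label with a subroutine that applies one expression to $_
my @experiments = (
    [ '/\*/',         sub { /\*/ } ],
    [ '/solutions/i', sub { /solutions/i } ],
    [ 's/C/K/g',      sub { s/C/K/g } ],
    [ 'y/con/KUD/',   sub { y/con/KUD/ } ],
);

my @report;
for my $experiment (@experiments) {
    my ($label, $code) = @$experiment;
    local $_ = $original;       # every experiment starts from the same string
    my $result = $code->();     # scalar context: 1 or '' for a match, a count for s and y
    push @report, "$label => result: $result, \$_: $_";
}

print "$_\n" for @report;
```

Resetting $_ inside the loop means a substitution in one experiment does not affect the next one.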

Have fun!
