Perl
Practical Extraction and Reporting Language
Welcome
Marcos Rebelo [email protected]
Perl - Larry Wall
Larry Wall: programmer, linguist, author, born September 27, 1954 in Los Angeles, California. He created the Perl computer language in 1987.
Quotes:
    The three chief virtues of a programmer are: Laziness, Impatience and Hubris
    There's More Than One Way to Do It
What is Perl?
Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including web development, system administration, network programming, code generation and more.
It's intended to be practical rather than beautiful.
It supports both procedural and OO programming.
It has powerful built-in support for text processing.
It has one of the world's most impressive collections of third-party modules (http://www.cpan.org).
Perl Pros
Quick development phases
Open Source, and free licensing
Excellent text handling and regular expressions
Large, experienced, active user base
Fast, for an interpreted language
Code developed is generally cross-platform
Very easy to write powerful programs in a few lines of code
Perl Cons
Limited GUI support
Can look complicated, initially, particularly if you're not familiar with regular expressions
Consider using for
Web applications
System administration
Most command line applications
Batch processing
Reports
Database access and manipulation
Any application involving significant text processing
Hello World
In a file:
    print "Hello, World!\n"
On the shell:
    #> perl -e 'print "Hello, World!\n"'
With perl 5.10:
    #> perl -E 'say "Hello, World!"'
Basic syntax overview
Perl statements end in a semicolon.
    say("Hello, world!");
You can use parentheses for subroutine arguments or omit them according to your personal taste.
    say "Hello, world!";
Comments start with a # and run to the end of the line.
    # This is a comment
Basic syntax overview
Double quotes or single quotes may be used around literal strings:
    say("Hello, world!");
    say('Hello, world!');
However, only double quotes interpolate variables and special characters such as newlines (\n):
    print("Hello, $name!\n"); # interpolate $name
    print('Hello, $name!\n'); # prints $name!\n literally
Basic syntax overview
Whitespace is irrelevant:
    print(
        "Hello world");
...except inside quoted strings:
    print("Hello
    world"); # this would print with a
             # line break in the middle.
Use feature qw(say)
Perl 5.10 added a new feature:
    use feature qw(say);
    say 'Hello, world!';
Is equivalent to:
    print 'Hello, world!', "\n";
We will see later that the correct expression is:
    { local $\ = "\n"; print 'Hello, world!' }
Running Perl programs
To run a Perl program from the Unix command line:
    #> perl progname.pl
Alternatively, put this shebang in your script:
    #!/usr/bin/perl
and run your executable script:
    #> progname.pl
-e: allows you to define on the command line the Perl code to be executed; use -E to also get the 5.10 features.
    #> perl -e 'print("Hello, World!\n")'
Perl variable types
Scalars
    my $animal = 'camel';
    my $answer = 42;
    my $scalar_ref = \$scalar;
Arrays
    my @animals = ('camel', 'llama');
    my @numbers = (23, 42, 69);
    my @mixed = ('camel', 42, 1.23);
Associative Arrays / Hash Tables
    my %fruit_color = (
        apple => 'red',
        banana => 'yellow'
    );
Scalar Type
Scalar values can be strings, integers, floating point numbers or references. The variable name starts with a $.
    my $animal = 'camel';
    my $answer = 42;
    my $scalar_ref = \$scalar;
Perl will automatically convert between types as required.
    print('3' + 4.5); # prints 7.5
Array Type
An array represents a list of scalars. The variable name starts with a @.
    my @animals = ('camel', 'llama');
    my @numbers = (23, 42, 69);
    my @mixed = ('camel', 42, 1.23);
Arrays are zero-indexed; negative indexes count back from the end of the array, starting at -1.
    print($animals[0]); # prints "camel"
    print($numbers[-1]); # prints 69
Array Slice
To get multiple values from an array:
    @numbers[0,1]; # gives (23, 42);
    @numbers[0..2]; # gives (23, 42, 69);
    @numbers[1..$#numbers]; # gives every element but the first
A single element taken from the array starts with a $; a slice (sub-array) starts with a @.
    @numbers[2, 1] = @numbers[1, 2];
    ($scalar1, $scalar2) = @numbers;
Array Type
The special variable $#array tells you the index of the last element of an array:
    print($mixed[$#mixed]); # prints 1.23
You might be tempted to use $#array+1 to tell you how many items there are in an array. There is no need: using @array in a scalar context will give you the number of elements:
    if (@animals < 5) { ... }
    if (scalar(@animals) < 5) { ... }
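A minimal sketch of the difference between the last index and the element count. The array contents and the names $last_index and $count are illustrative, not part of the slides:

```perl
use strict;
use warnings;

my @animals = ('camel', 'llama', 'owl');

my $last_index = $#animals;         # index of the last element
my $count      = scalar(@animals);  # number of elements (scalar context)

print "last index: $last_index, count: $count\n";

# An empty array has last index -1 and count 0.
my @empty = ();
print "empty: ", $#empty, ", ", scalar(@empty), "\n";
```

The two values always differ by one, which is why scalar(@array) is the clearer way to ask for the size.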
Associative Array Type Hash table
A hash represents a set of key/value pairs:
    my %fruit_color = (
        'apple', 'red',
        'banana', 'yellow'
    );
You can use whitespace and the => operator to lay them out more nicely:
    my %fruit_color = (
        apple => 'red',
        'banana' => 'yellow'
    );
Associative Array Type Hash table
To get at hash elements:
    $fruit_color{'apple'}; # "red"
    $fruit_color{banana}; # "yellow"
You can get at lists of keys and values with keys() and values().
    my @fruits = keys(%fruit_color);
    my @colors = values(%fruit_color);
Hashes have no particular internal order, though you can sort the keys and loop through them.
Hash Slice
To get/set multiple values from a hash:
    @fruit_color{'watermelon', 'orange'} = ('green', 'orange');
Removing repetition from an array:
    my @array = (1,2,9,5,2,5);
    my %hash;
    @hash{@array} = ();
    @array = keys(%hash);
    say "@array"; # 1 9 2 5 (in no particular order)
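The hash-slice trick above loses the original order, because hash keys come back in no particular order. A common alternative, sketched here as an assumption rather than as part of the slides, keeps the first occurrence of each value by combining a %seen hash with grep (grep is covered in the core subroutines part):

```perl
use strict;
use warnings;

my @array = (1, 2, 9, 5, 2, 5);

# %seen counts how often each value was met;
# grep keeps a value only the first time it shows up.
my %seen;
my @unique = grep { !$seen{$_}++ } @array;

print "@unique\n"; # 1 2 9 5
```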
Variable Reference
Scalar references
    \$scalar, \'', \123
Array references
    \@array, ['camel', 42, 1.23]
Hash references
    \%hash, {apple => 'red', banana => 'yellow'}
Unref a variable (dereference)
    $$scalar_ref, @$array_ref, %$hash_ref
Complex data types
More complex data types can be constructed using references:
    my $cities = {
        IT => [ 'Milan', 'Rome', ...],
        PT => [ 'Lisbon', 'Oporto', ...],
        ...
    };
You can access it through:
    print($cities->{'IT'}->[1]);
    print($cities->{'IT'}[1]);
    my @citiesPT = @{$cities->{PT}};
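A short sketch of walking such a structure. The city lists are cut down to two entries each, and 'Faro' is only an illustrative value:

```perl
use strict;
use warnings;

my $cities = {
    IT => [ 'Milan',  'Rome' ],
    PT => [ 'Lisbon', 'Oporto' ],
};

# Loop over the outer hash, dereferencing each inner array.
my @lines;
foreach my $country ( sort keys %$cities ) {
    my @names = @{ $cities->{$country} };
    push @lines, "$country: " . join( ', ', @names );
}
print "$_\n" foreach @lines;

# The inner arrays can be changed through the reference as well.
push @{ $cities->{PT} }, 'Faro';
```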
Lexical Variables
Throughout the previous slides the examples have used the syntax:
    my $var = 'value';
The my is actually not required; you could use:
    $var = 'value';
However, the above usage will create global variables throughout your program.
my creates lexically scoped variables instead. The variables are scoped to the block in which they are defined.
Lexical Variables
Not having the variables scoped is usually a bad programming practice.
    my $a = 'foo';
    if ($some_condition) {
        my $b = 'bar';
        print($a); # prints "foo"
        print($b); # prints "bar"
    }
    print($a); # prints "foo"
    print($b); # prints nothing; $b has
               # fallen out of scope
Lexical Variables
It's recommended to use my in combination with a use strict at the top of your programs. Using the pragma strict is highly recommended.
    use strict;
    my $first = "first"; # OK
    $second = "second"; # NOT OK
Also highly recommended is the pragma warnings, which gives additional warnings. You can enable or disable these warnings.
Magic variables
There are some magic variables. These special variables are used for all kinds of purposes.
    $_: is the default variable.
    @ARGV: the command line arguments to your script.
    $ARGV: contains the name of the current file when reading from <> or readline().
    @_: the arguments passed to a subroutine.
    $a, $b: special package variables when using sort().
    $/: the input record separator, newline by default.
Most common operators
Arithmetic
    +    +=    addition
    -    -=    subtraction
    *    *=    multiplication
    /    /=    division
Boolean logic
    &&    and
    ||    or
    !     not
Most common operators
Miscellaneous
    =     assignment
    .     string concatenation
    x     string multiplication
    ..    range operator (creates a list of numbers)
Most common operators
                             Numeric    String
    equality                 ==         eq
    inequality               !=         ne
    less than                <          lt
    greater than             >          gt
    less than or equal       <=         le
    greater than or equal    >=         ge
Conditional constructs
Perl has most of the usual conditional constructs:
    if COND BLOCK
    if COND BLOCK else BLOCK
    if COND BLOCK elsif COND BLOCK
    if COND BLOCK elsif COND BLOCK else BLOCK
The COND shall be a conditional statement surrounded by ( and ), and BLOCK shall be one or more statements surrounded by { and }.
    if ( is_valid( $value ) ) { ... }
Conditional constructs
There's also a negated version of if (don't use it):
    unless ( is_valid( $value ) ) { ... }
This is provided as a 'more readable' version of
    if ( not( is_valid( $value ) ) ) { ... }
0, '0', '', () and undef are all false in a boolean context. All other values are true.
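The rule about false values can be checked directly. This is a small sketch with a hand-picked list of candidates; note that strings such as '0.0' and '00' are true, because only the exact string '0' and the empty string are false:

```perl
use strict;
use warnings;

my @candidates = ( 0, '0', '', undef, '0.0', '00', ' ', 'a' );

# Evaluate each candidate in boolean context.
my @verdicts = map { $_ ? 'true' : 'false' } @candidates;

foreach my $i ( 0 .. $#candidates ) {
    my $shown = defined $candidates[$i] ? "'$candidates[$i]'" : 'undef';
    print "$shown is $verdicts[$i]\n";
}
```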
Conditional constructs
Note that the braces are required in Perl, even if you've only got one line in the block. However, there is a clever way of making your one-line conditional blocks more English-like:
The traditional way
    if ($zippy) {
        print("Yow!");
    }
The Perlish postcondition way
    print("Yow!") if $zippy;
    print("No cubes") unless $cubes;
while looping constructs
Perl has most of the usual loop constructs.
While loop:
    while ( is_valid( $value ) ) { ... }
There's also a negated version (don't use it):
    until ( is_valid( $value ) ) { ... }
You can also use while in a postcondition:
    print("xpto\n") while condition;
Going through a hash:
    while (my ($key,$value) = each(%ENV)){
        print "$key=$value\n";
    }
for looping constructs
    for (my $i=0; $i <= $max; $i++) { ... }
The C-style for loop is rarely needed in Perl since Perl provides the more friendly foreach loop.
    foreach my $i (0 .. $max) { ... }
foreach looping constructs
Passing through all elements of an array; foreach is an alias to for.
    foreach my $var (@array) { ... }
    for my $value (values(%hash)) { ... }
By default the value goes on $_:
    foreach (@array) { print "$_\n" }
Changing the variable changes the value inside the array. $var is an alias.
    for my $var (@array) { $var += $var }
Jump on loops
Jump on loops:
    LINE: while ( defined(my $line = <>) ) {
        next LINE if not is_valid( $line );
        #...
    }
    last LABEL: immediately exits the loop in question
    next LABEL: starts the next iteration of the loop
    redo LABEL: restarts the loop block without evaluating the conditional again
If the LABEL is omitted, the command refers to the innermost enclosing loop.
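Labels matter most with nested loops. The sketch below uses made-up data (@rows) and the label ROW to show next and last acting on the outer loop from inside the inner one:

```perl
use strict;
use warnings;

my @rows = ( [ 1, 2, 3 ], [ 4, -1, 6 ], [ 7, 8, 9 ] );
my @kept;

ROW: foreach my $row (@rows) {
    foreach my $value (@$row) {
        next ROW if $value < 0;    # give up on this row, go to the next one
        last ROW if $value > 7;    # leave both loops at once
        push @kept, $value;
    }
}

print "@kept\n"; # 1 2 3 4 7
```

Without the label, next and last would only affect the inner foreach.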
Inverting the cycle
With a single command:
    print foreach (0..9)
With multiple commands:
    do { ... } while ($true)
Warning: last, next, and redo don't work in this case.
Exercises 1 - Scalars
1) Implement the "Guess the lucky number" game. The program shall choose a random number between 0 and 100 and ask the user for a guess. Tell the user if the guess is bigger than, smaller than or equal to the random number. If the guess is correct the program shall exit; otherwise it shall ask for a number again.
Exercises 1 - Array
2) Create an array that contains the names of 5 students of this class.
2a) Print the array.
2b) Create a new array shifting the elements left by one position (element 1 goes to 0, ...) and setting the first element in the last position. Print the array.
2c) Ask the user to input a number. Print the name with that index.
Exercises 1 - Hash
3) Homer Family
    my %relatives = (
        Lisa   => 'daughter',
        Bart   => 'son',
        Maggie => 'daughter',
        Marge  => 'mother',
        Homer  => 'father',
        Santa  => 'dog');
3a) Print all the characters' names.
3b) Ask for a name and print its position.
3c) Print all the characters' positions, with no repetitions.
3d) Ask for a position and print the name.
Subroutines
Subroutines
The Perl model for subroutine call and return values is simple:
    all subroutines are passed as parameters one single flat list of scalars (goes to @_)
    all subroutines likewise return to their caller one scalar or a list of scalars.
    lists or hashes in these lists will collapse, losing their identities, but you may always pass a reference.
The subroutine name starts with a &; for simplification it can be omitted.
Subroutine - example
    sub max {
        my $mval = shift(@_);
        # my ($mval, @rest) = @_; # big copy
        foreach my $foo (@_) {
            if ( $mval < $foo ) {
                $mval = $foo;
            }
        }
        return $mval;
    }
    my $my_max = max(1, 9, 3, 7);
    print $my_max; # prints 9
Subroutine input and output
The parameters are passed to the subroutine in the array @_; changing the values of the array will change the values outside the subroutine call.
    sub double {
        $_[0] *= 2;
    }
    my $b = 5;
    double($b);
    print($b); # prints 10
    print(double($b)); # prints 20
The last statement value is returned by default.
Subroutine input and output
Using @_ directly is dangerous and shall be carefully considered. It's always possible to make a copy.
    sub double {
        my ($a) = @_;
        $a *= 2;
        return $a;
    }
    my $b = 5;
    print( double( $b ) ); # prints 10
    print($b); # prints 5
Persistent Private Variables
A lexical variable is statically scoped to its enclosing block, eval, or file. A subroutine defined inside the same block can still reach it, so the variable persists between calls while staying private.
    {
        my $secretVal = 0;
        sub gimmeAnother {
            return ++$secretVal;
        }
    }
    print gimmeAnother; # OK
    ++$secretVal; # NOT OK
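The same idea works with anonymous subroutines, which gives each caller its own private variable. This is a sketch only: make_counter is a hypothetical helper, and code references are not introduced in the slides up to this point.

```perl
use strict;
use warnings;

# Returns a new anonymous subroutine; each one keeps its own $value.
sub make_counter {
    my ($start) = @_;
    my $value = $start;
    return sub { return $value++ };
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

my @seen = ( $from_one->(), $from_one->(), $from_ten->(), $from_one->() );
print "@seen\n"; # 1 2 10 3
```

The two counters do not interfere, because every call to make_counter creates a fresh lexical $value.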
state variables
From perl 5.10 you may use static variables.
    use feature 'state';
    sub gimmeAnother {
        state $secretVal = 0;
        return ++$secretVal;
    }
    print gimmeAnother; # OK
Some features in perl 5.10 have to be activated to avoid collisions with old code. Activating all of them:
    use feature ':5.10';
    use v5.10; # use perl 5.10
Named Parameters
We can implement named parameters using a hash. This provides an elegant way to pass in parameters without having to define them formally.
    sub login {
        my %param = @_;
        ...
    }
    login( user=>'User', pass=>'Pass' );
Named Parameters
We may pass a hash reference directly.
    sub login {
        my ($param) = @_;
        ...
    }
    login({ user=>'User', pass=>'Pass' });
Named Parameters
We can easily give default values by checking the hash.
    sub login {
        my ($param) = @_;
        $param->{user} = $DEFAULT_USER
            if not exists $param->{user};
        $param->{pass} = $DEFAULT_PASS
            if not exists $param->{pass};
        $param->{host} = $DEFAULT_HOST
            if not exists $param->{host};
        ...
    }
Named Parameters
We can also give default values by listing the defaults first and letting the passed hash override them.
    sub login {
        my $param = {
            'user' => $DEFAULT_USER,
            'pass' => $DEFAULT_PASS,
            'host' => $DEFAULT_HOST,
            %{shift(@_)}
        };
        ...
    }
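A runnable sketch of the default-merging idea. The %DEFAULTS values and the helper name login_params are assumptions made for illustration; later pairs in a list override earlier ones when the list is assigned to a hash, which is what makes the merge work:

```perl
use strict;
use warnings;

# Illustrative defaults, not taken from the slides.
my %DEFAULTS = ( user => 'guest', pass => '', host => 'localhost' );

sub login_params {
    my ($args) = @_;
    # Defaults come first, so the caller's values win on duplicate keys.
    my %param = ( %DEFAULTS, %{ $args || {} } );
    return \%param;
}

my $param = login_params( { user => 'User', pass => 'Pass' } );
print join( ', ', map { "$_=$param->{$_}" } sort keys %$param ), "\n";
# host=localhost, pass=Pass, user=User
```

Because the merge builds a new hash, the caller's hash is left untouched, unlike the version that assigns into $param directly.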
Named Parameters
We can also write the subroutine so that it accepts both named parameters and a simple list.
    sub login {
        my $param;
        if ( ref($_[0]) eq 'HASH' ) {
            $param = $_[0];
        } else {
            @{$param}{qw(user pass host)} = @_;
        }
        ...
    }
    login('Login', 'Pass');
    login({user => 'Login', pass => 'Pass'});
Exercises 2
1) Create a new subroutine that calculates the Fibonacci series. Using this subroutine, write a program that receives multiple numbers as arguments and prints the Fibonacci value of each.
    F(0) = 0
    F(1) = 1
    F(n) = F(n-1) + F(n-2)
1a) with a persistent variable
1b) with a state variable
IO
Read a File
    open(FH, '<', 'path/to/file')
        or die "can't open file: $!";
    while ( defined( my $line = <FH> ) ) {
        chomp($line);
    }
    close(FH);
Open a Filehandle
Opening a file for input.
    open(INFH, "<", "input.txt") or die $!;
    open(INFH, "<input.txt") or die $!;
    open(INFH, "input.txt") or die $!;
Opening a file for output.
    open(OUTFH, ">", "output.txt") or die $!;
    open(OUTFH, ">output.txt") or die $!;
Opening a file for appending.
    open(LOGFH, ">>", "my.log") or die $!;
    open(LOGFH, ">>my.log") or die $!;
Open a Filehandle
You can also use a scalar variable as filehandle:
    open(my $inFH, "input.txt") or die $!;
A lexical scalar variable closes at the end of the block if it wasn't closed before.
It's possible to pipe from a command:
    open(my $net, "netstat |")
        or die "Can't netstat: $!";
It's possible to pipe to a command:
    open(my $sendmail, "| sendmail -t")
        or die "Can't open sendmail: $!";
Write to a Filehandle
We've already seen how to print to standard output using print(). However, print() can also take an optional first argument specifying which filehandle to print to:
    print STDERR ('Are you there?');
    print OUTFH $record;
    print { $FH } $logMessage;
Note: There is no , between the filehandle and the text.
Closing filehandles:
    close($inFH);
Read from a Filehandle
You can read from an open filehandle using the <> operator or the readline() subroutine.
Line by line:
    my $line = <$fh>;
    my $line = readline($fh);
Slurp:
    my @lines = <$fh>;
    my @lines = readline($fh);
Read from a Filehandle
Slurping a file can be useful but it may be a memory hog (if you are programming in Perl that is usually not a problem, since current computers already have a lot of memory). Most text file processing can be done a line at a time with Perl's looping constructs.
    while ( defined(my $line = <$fh>) ) {
        print "Just read: $line";
    }
    foreach my $line (<$fh>) { # slurps
        print "Just read: $line";
    }
Input Record Separator
$/ is the input record separator (newline by default). You may set it to a multi-character string to match a multi-character terminator, or to undef to read through the end of file.
    open(my $fh, '<', $myfile) or die $!;
    my $txt = do { local $/ = undef; <$fh> };
    close($fh);

    my $txt = do {
        local (@ARGV, $/) = ($myfile);
        readline();
    };
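A sketch of the multi-character case. To stay self-contained it reads from a string through an in-memory filehandle (opening a reference to a scalar) instead of a real file; the record layout with "--" lines is an invented example:

```perl
use strict;
use warnings;

my $text = "first record\n--\nsecond\nrecord\n--\nthird\n";

# Opening a reference to a scalar reads from the string instead of a file.
open( my $fh, '<', \$text ) or die "can't open in-memory file: $!";

my @records;
{
    local $/ = "\n--\n";    # a record ends with a line holding only "--"
    while ( defined( my $record = <$fh> ) ) {
        chomp($record);     # chomp removes the current value of $/
        push @records, $record;
    }
}
close($fh);

print scalar(@records), " records\n"; # 3 records
```

Using local inside a block restores the previous separator as soon as the block ends, so the rest of the program keeps reading line by line.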
Read from @ARGV file list
The magic filehandle *ARGV can be used to easily process the input. Perl actually does an implicit open on each file in @ARGV (process parameters). Usually <ARGV> is simplified to <>. In this case:
    while (my $line = <>) {
        # use $line
    }
    while (<>) {
        # use $_
    }
If @ARGV is empty when the loop first begins, Perl pretends you've opened STDIN.
$ARGV has the name of the open file, or '-' if reading STDIN.
*DATA Filehandle
This special filehandle refers to anything following either the __END__ token or the __DATA__ token in the current file.
The __END__ token always opens the main::DATA filehandle, and so is used in the main program.
The __DATA__ token opens the DATA handle in whichever package is in effect at the time, so different modules can each have their own DATA filehandle, since they (presumably) have different package names.
Exercises 3
Create a program to print a file to stdout. The program shall receive two flags (--file, --line) followed by a list of file names:
    --file: prints the file name before the line.
    --line: prints the line number before the line.
The program shall be called like:
    perl -w printFile.pl (--line|--file)* files
Create a version with open/close and one without.
Note: $. is the line number for the last filehandle accessed.
Exercises 3
Create a program that has a mapping of integer to integer in the DATA section, 2 integers on each line separated by a space. From ARGV/STDIN it will get a TSV file. The program shall print that file, adding a second column with:
    If the first column value is a key in the DATA section, the value from the DATA section.
    Otherwise, 'MAPPING NOT FOUND'.
To split a string use: split(/\t/, $str)
Regular Expressions
Regular Expressions
The simplest regex is simply a word.
    "Hello World" =~ m/World/
Or simply:
    "Hello World" =~ /World/
Expressions like this are useful in conditionals:
    print "Matches" if $str =~ /World/;
The sense of the match can be reversed by using the !~ operator:
    print "No match" if $str !~ /Word/;
Regular Expressions
Regexps are treated mostly as double-quoted strings, so variable substitution works:
    $foo = 'house';
    'cathouse' =~ /cat$foo/; # matches
    'housecat' =~ /${foo}cat/; # matches
qr// compiles the regular expression:
    foreach my $regexp (@regexps) {
        my $comp = qr/$regexp/;
        foreach my $str (@strs) {
            print "$str\n" if $str =~ /$comp/;
        }
    }
To specify where the regex should match, we would use the anchor metacharacters ^ and $.
The ^ means match at the beginning of the string.
    print 'Starts with Hello' if /^Hello/;
The $ means match at the end of the string or before a newline at the end of the string.
    print 'Ends with World!' if /World!$/;
Character Classes
A character class allows a set of possible characters, rather than just a single character. Character classes are denoted by brackets [].
    /[bcr]at/ matches 'bat', 'cat', or 'rat'
    /[yY][eE][sS]/ matches 'yes' case-insensitively.
    /yes/i also matches 'yes' in a case-insensitive way.
The special character '-' acts as a range operator, so [0123456789] becomes [0-9]:
    /item[0-9]/ matches 'item0' or ... or 'item9'
    /[0-9a-f]/i matches a hexadecimal digit
Character Classes
The special character ^ in the first position of a character class denotes a negated character class, which matches any character but those in the brackets. Both [...] and [^...] must match one character, or the match fails. Then:
    /[^a]at/ doesn't match 'aat' or 'at', but matches all other 'bat', 'cat', ....
    /[^0-9]/ matches a non-numeric character
    /[a^]at/ matches 'aat' or '^at'; here '^' is ordinary
Character Classes
Character classes also have ordinary and special characters, but the sets of ordinary and special characters inside a character class are different than those outside a character class. The special characters for a character class are - ] \ ^ $ and are matched using an escape:
    /[\]c]def/ matches ']def' or 'cdef'
    $x = 'bcr';
    /[$x]at/ matches 'bat', 'cat', or 'rat'
    /[\$x]at/ matches '$at' or 'xat' and not 'bat', ...
    /[\\$x]at/ matches '\at', 'bat', 'cat', or 'rat'
Character Classes
Perlhasseveralabbreviationsforcommoncharacter classes:
    \d is a digit character [0-9]
    \D is a non-digit character [^\d]
    \s is a whitespace character [\ \t\r\n\f]
    \S is a non-whitespace character [^\s]
    \w is a word character [0-9a-zA-Z_]
    \W is a non-word character [^\w]
    The period '.' matches any character but "\n", like [^\n]
Character Classes
The \d \s \w \D \S \W abbreviations can be used both inside and outside of character classes. Here are some in use:
    /\d\d:\d\d:\d\d/ matches a hh:mm:ss time format
    /[\d\s]/ matches any digit or whitespace character
    /..rt/ matches any two chars, followed by 'rt'
    /end\./ matches 'end.'
    /end[.]/ same thing, matches 'end.'
Exercises 4
Create a regular expression that matches a string that contains a or b followed by any 2 characters followed by a digit. The strings '(adc34)' and 'rdb850' match, but 'alfa' doesn't match.
Create a regular expression that matches a 5-digit integer in octal format.
Create a program that receives one regexp as the first parameter and a file list, and prints the lines matching the regexp.
Quantifiers
Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the metacharacters listed above, or a group of characters or metacharacters in parentheses.
    *        zero or more
    +        one or more
    ?        zero or one
    {3}      matches exactly 3
    {3,6}    matches between 3 and 6
    {3,}     matches 3 or more
Quantifiers
If you want it to match the minimum number of times possible, follow the quantifier with a ?.
    'a,b,c,d' =~ /,(.+),/;  # match 'b,c'
    'a,b,c,d' =~ /,(.+?),/; # match 'b'
Avoid unnecessary backtracking (possessive quantifier, from perl 5.10):
    '[1234567890]' =~ /\[\d++\]/
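A small comparison of the greedy and minimal forms on an invented HTML-like string. The third pattern uses a negated character class, a common alternative to the minimal quantifier that is not taken from the slides:

```perl
use strict;
use warnings;

my $html = '<b>bold</b> and <i>italic</i>';

my ($greedy) = $html =~ /<(.+)>/;     # runs on to the last '>'
my ($lazy)   = $html =~ /<(.+?)>/;    # stops at the first '>'
my ($class)  = $html =~ /<([^>]+)>/;  # cannot cross a '>' at all

print "$greedy\n$lazy\n$class\n";
```

The greedy version captures almost the whole string, while the other two capture only 'b'.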
Matching this or that
We can match different character strings with the alternation metacharacter '|'. Perl will try to match the regex at the earliest possible point in the string.
    'cats or dogs' =~ /cat|dog/; # matches 'cat'
    'cats or dogs' =~ /dog|cat/; # matches 'cat'
Even though dog is the first alternative in the second regex, cat is able to match earlier in the string.
At a given character position, the first alternative that allows the regex match to succeed will be the one that matches.
    'cats' =~ /c|ca|cat|cats/; # matches 'c'
    'cats' =~ /cats|cat|ca|c/; # matches 'cats'
Grouping things and hierarchical matching
The grouping metacharacters () allow a part of a regex to be treated as a single unit. Parts of a regex are grouped by enclosing them in parentheses.
    /(a|b)b/ matches 'ab' or 'bb'.
    /(^a|b)c/ matches 'ac' at start of string or 'bc' anywhere.
    /house(cat|)/ matches either 'housecat' or 'house'.
    /house(cat(s|)|)/ matches either 'housecats' or 'housecat' or 'house'.
Extracting matches
The grouping metacharacters () also allow the extraction of the parts of a string that matched. For each grouping, the part that matched inside goes into the special variables $1, $2, etc.
    # extract time in hh:mm:ss format
    $time =~ /(\d\d):(\d\d):(\d\d)/;
    my ($hour, $min, $sec) = ($1,$2,$3);
In list context, a match /regex/ with groupings will return the list of matched values ($1, $2, ...).
    my ($hour, $min, $sec) =
        ($time =~ /(\d\d):(\d\d):(\d\d)/);
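When a match fails, $1, $2, ... keep the values of the last successful match, so it is safer to read them only after checking the result. A minimal sketch, where parse_time is a hypothetical helper name:

```perl
use strict;
use warnings;

# Returns (hour, min, sec) or an empty list when no time is found.
sub parse_time {
    my ($text) = @_;
    if ( $text =~ /(\d\d):(\d\d):(\d\d)/ ) {
        return ( $1, $2, $3 );
    }
    return;
}

my @ok  = parse_time('started at 09:15:42');
my @bad = parse_time('no time here');

print "@ok\n";              # 09 15 42
print scalar(@bad), "\n";   # 0
```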
Extracting matches
To get a list of matches we can use:
    my @listOfNumber = ($txt =~ /(\d+)/g);
If the groupings in a regex are nested, $1 gets the group with the leftmost opening parenthesis, $2 the next opening parenthesis, etc. For example, here is a complex regex and the matching variables indicated below it:
    /(ab(cd|ef)((gi)|j))/;
     1  2      34
Named capture
Identical to normal capturing parentheses () but %+ or %- may be used after a successful match to refer to a named buffer.
    'Michael Jackson' =~ /(?<NAME>\w+)\s+(?<NAME>\w+)/
    %+ is ('NAME' => 'Michael')
    %- is ('NAME' => ['Michael','Jackson'])
    $1 is 'Michael'
    $2 is 'Jackson'
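In everyday use each group usually gets its own name, which makes %+ read like a record. A sketch assuming perl 5.10 or later; the group names first and last are illustrative:

```perl
use strict;
use warnings;
use 5.010;    # named captures need perl 5.10 or later

my %person;
if ( 'Michael Jackson' =~ /(?<first>\w+)\s+(?<last>\w+)/ ) {
    # %+ maps each group name to the text it captured.
    %person = ( first => $+{first}, last => $+{last} );
}

print "$person{last}, $person{first}\n"; # Jackson, Michael
```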
Search and replace
Search and replace is performed using s/regex/replacement/. The replacement replaces in the string whatever is matched with the regex.
    $x = "'Good cat!'";
    $x =~ s/cat/dog/; # "'Good dog!'"
    $x =~ s/'(.*)'/$1/; # "Good dog!"
With the global modifier, s///g will search and replace all occurrences of the regex in the string:
    $x = $y = '4 by 4';
    $x =~ s/4/four/; # 'four by 4'
    $y =~ s/4/four/g; # 'four by four'
The split operator
split(/regex/, string) splits the string into a list of substrings and returns that list. The regex determines the character sequence that separates the substrings.
    split(/\s+/, 'Calvin and Hobbes');
    # ('Calvin', 'and', 'Hobbes')
If the empty regex // is used, the string is split into individual characters.
    split(//, 'abc');
    # ('a', 'b', 'c')
The split operator
If the regex has groupings, then the list produced contains the matched substrings from the groupings as well:
    split(m!(/)!, '/usr/bin');
    # ('', '/', 'usr', '/', 'bin')
Since the first character of string matched the regex, split prepended an empty initial element to the list.
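Two further behaviours of split are worth a quick sketch: an optional third argument limits the number of fields, and trailing empty fields are dropped by default. The input strings are made up for the example:

```perl
use strict;
use warnings;

# A limit of 2 stops after the first separator,
# so any later '=' stays inside the value.
my ( $key, $value ) = split( /=/, 'motto=TMTOWTDI=yes', 2 );

# Leading and inner empty fields are kept; trailing ones are dropped.
my @fields = split( /,/, ',a,,b,,' );

print "$key -> $value\n";            # motto -> TMTOWTDI=yes
print scalar(@fields), " fields\n";  # 4 fields
```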
Magic Variables
We have already seen $1, $2, ... There is also:
    $` The string preceding whatever was matched.
    $& The string matched.
    $' The string following whatever was matched.
These variables are read-only and dynamically scoped to the current BLOCK.
    'abcdef' =~ /cd/;
    print("$`:$&:$'\n"); # prints ab:cd:ef
Using them makes the regexp slow; capturing groups give the same values:
    'abcdef' =~ /^(.*?)(cd)(.*)$/; # $1, $2, $3
Switch
    use feature qw(switch say);
    given($foo) {
        when (undef) {say '$foo is undefined'}
        when ('foo') {say '$foo is the str "foo"'}
        when (/Milan/) {say '$foo matches /Milan/'}
        when ([1,3,5,7,9]) {
            say '$foo is an odd digit';
            continue; # Fall through
        }
        when ($_ < 100) {say '$foo less than 100'}
        when (\&check) {say 'check($foo) is true'}
        default {die 'what shall I do with $foo?'}
    }
Smart Matching
given(EXPR) will assign the value of EXPR to $_ within the lexical scope of the block.
    when($foo)
is equivalent to
    when($_ ~~ $foo)
Smart Matching
    $a        $b        Matching Code
    Code      Code      $a == $b # not empty prototype if any
    Any       Code      $b->($a) # not empty prototype if any
    Hash      Hash      [sort keys %$a]~~[sort keys %$b]
    Hash      Array     grep {exists $a->{$_}} @$b
    Hash      Regex     grep /$b/, keys %$a
    Hash      Any       exists $a->{$b}
    Array     Array     arrays are identical, value by value
    Array     Regex     grep /$b/, @$a
    Array     Num       grep $_ == $b, @$a
    Array     Any       grep $_ eq $b, @$a
    Any       undef     !defined $a
    Any       Regex     $a =~ /$b/
    Code()    Code()    $a->() eq $b->()
    Any       Code()    $b->() # ignoring $a
    Num       numish    $a == $b
    Any       Str       $a eq $b
    Any       Num       $a == $b
    Any       Any       $a eq $b
Exercises 5
Create a program that prints out the title inside an HTML file.
Create a program that prints out all the lines in a file, substituting the numbers by #.
In printFile.pl add one additional flag like:
    --regexp=REGEXP: the program shall only print the lines that match the REGEXP.
Exercises 5
For each of the strings, say which of the patterns it matches. Where there is a match, what would be the values of $MATCH, $1, $2, etc.?
    'The Beatles (White Album) - Ob-La-Di, Ob-La-Da'
    'Tel: 212945900'
    '(c) (.+)\s*\1'
RegExp
    /(\(.*\))/ and /(\(.*?\))/
    /\d{4,}/
    /(\w\w)-(\w\w)-(\w\w)/
    /\W+/
Core subroutines
Useful scalar subroutines
lc(EXPR), lcfirst(EXPR), uc(EXPR), ucfirst(EXPR): lowercase, lowercase first, uppercase and uppercase first.
length(EXPR): Returns the length in characters of the value of EXPR.
sprintf(FORMAT, LIST): Returns a string formatted by the usual printf conventions of C.
abs(EXPR), cos(EXPR), int(EXPR), log(EXPR), sin(EXPR), sqrt(EXPR): normal numeric subroutines.
chop/chomp
chop(VARIABLE), chop(LIST): Chops off the last character of a string and returns the character chopped.
chomp(VARIABLE), chomp(LIST): Removes any trailing string that corresponds to the current value of $/.
    chomp( $line );
    chomp( $line = <> );
    chomp( @lines = <> );
substr
    substr EXPR,OFFSET,LENGTH,REPLACEMENT
    substr EXPR,OFFSET,LENGTH
    substr EXPR,OFFSET
Extracts a substring out of EXPR and returns it.
    my $var = 'Good dog';
    say substr($var, 5); # 'dog'
    substr($var, 5) = 'cat'; # 'Good cat'
    substr($var, 5, 5, 'cow'); # 'Good cow'
Useful scalar subroutines
oct(EXPR): Interprets EXPR as an octal string and returns the corresponding value. If EXPR happens to start off with:
    0x: interprets it as a hex string.
    0b: interprets it as a binary string.
defined(EXPR): Returns a boolean value telling whether EXPR has a value other than the undefined value undef.
Note: If EXPR is omitted, uses $_.
Useful list subroutines
pop(ARRAY), push(ARRAY, LIST): Pops/pushes a value from/to the end of the ARRAY.
shift(ARRAY), unshift(ARRAY, LIST): Pops/pushes a value from/to the start of the ARRAY.
Note: in pop and shift, if ARRAY is omitted, it pops the @ARGV array in the main program, and the @_ array in subroutines. Avoid relying on it.
join
join(EXPR, LIST): Joins the separate strings of LIST, separated by the value of EXPR.
    join(':', (1..5)) eq '1:2:3:4:5'
reverse
reverse(LIST)
In list context, returns a list value consisting of the elements of LIST in the opposite order.
    print reverse(1,'ab'); # prints "ab1"
In scalar context, concatenates the elements of LIST and returns a string value with all characters in the opposite order.
    print reverse(1,"ab").""; # prints "ba1"
    print scalar(reverse(1,"ab")); # prints "ba1"
Inverting the keys/values in a hash.
    %by_name = reverse(%by_address);
map
map BLOCK LIST or map(EXPR, LIST): Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value composed of the results of each such evaluation.
    @double = map {$_*2} @nums;
The same with foreach:
    my @double;
    foreach (@nums) {
        push(@double, $_*2)
    }
Note: $_ is an alias to the list value.
grep
grep BLOCK LIST or grep(EXPR, LIST): Filters the elements in LIST using the BLOCK or EXPR for each element of LIST (locally setting $_ to each element). In scalar context, returns the number of elements that passed the filter.
    my @even = grep {$_ % 2 == 0} (1..100);
The same with foreach:
    my @even;
    foreach (1..100) {
        push(@even, $_) if $_ % 2 == 0;
    }
Note: $_ is an alias to the list value.
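map and grep are often chained. The following sketch, with illustrative variable names, filters first and transforms afterwards, and also shows grep in scalar context:

```perl
use strict;
use warnings;

my @nums = ( 1 .. 10 );

# grep filters first, then map transforms what is left.
my @even_squares = map { $_ * $_ } grep { $_ % 2 == 0 } @nums;

# In scalar context grep returns how many elements passed the filter.
my $big_count = grep { $_ > 5 } @nums;

print "@even_squares\n"; # 4 16 36 64 100
print "$big_count\n";    # 5
```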
sort
sort(LIST): In list context, this sorts the LIST and returns the sorted list value.
By default compares the elements as strings.
    sort(10,9,20) # 10, 20, 9
Providing a comparison block, the elements come in $a and $b.
    sort {$a <=> $b} (10,9,20) # 9, 10, 20
Schwartzian transform
    @sorted = map { $_->[0] }
              sort { $a->[1] cmp $b->[1] }
              map { [ $_, foo($_) ] }
              @unsorted;
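A runnable instance of the Schwartzian transform, read from the bottom up. Here the sort key is length($_), standing in for the slide's foo($_), with the word itself as a tie-breaker; the word list is invented:

```perl
use strict;
use warnings;

my @unsorted = qw(pear fig banana kiwi);

my @sorted =
    map  { $_->[0] }                                       # 3. unwrap the value
    sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }    # 2. sort by key, then by name
    map  { [ $_, length($_) ] }                            # 1. pair each value with its key
    @unsorted;

print "@sorted\n"; # fig kiwi pear banana
```

The key is computed once per element instead of once per comparison, which pays off when the key function is expensive.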
each
each(HASH): When called in list context, returns a 2-element list consisting of the key and value for the next element of a hash, so that you can iterate over it. When called in scalar context, returns only the key for the next element in the hash.
    while (my ($key,$val) = each(%ENV)) {
        print("$key=$val\n");
    }
exists
exists EXPR: Given an expression that specifies a hash element or an array element, returns true if the specified element has ever been initialized, even if the corresponding value is undefined.
    my @a = (1, undef);
    $a[3] = undef;
    exists($a[1]) # true
    exists($a[3]) # true
    exists($a[2]) # false
    my %a = ('a' => 1);
    exists($a{'a'}) # true
    exists($a{'b'}) # false
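For hashes it helps to keep three questions apart: does the key exist, is its value defined, and is its value true. A sketch with an invented %stock hash:

```perl
use strict;
use warnings;

my %stock = ( apples => 3, pears => 0, plums => undef );

my %report;
foreach my $fruit (qw(apples pears plums kiwis)) {
    $report{$fruit} =
          !exists( $stock{$fruit} )  ? 'no such key'
        : !defined( $stock{$fruit} ) ? 'key without a value'
        : !$stock{$fruit}            ? 'false value'
        :                              'true value';
}

print "$_: $report{$_}\n" foreach sort keys %report;
```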
delete
delete(EXPR): Given an expression that specifies a hash element or array element, deletes the specified element(s) from the hash or array.
    my @array = (a => 1, b => 2, c => 3);
    delete($array[2]); # ('a', 1, undef, 2, 'c', 3);
    my %hash = (a => 1, b => 2, c => 3);
    delete($hash{b}); # (a => 1, c => 3);
In the case of an array, if the array elements happen to be at the end, the size of the array will shrink to the highest element that tests true for exists().
eval / die
eval EXPR: compiles and evaluates the expression and catches the exception.
eval BLOCK: evaluates the block and catches the exception.
die EXPR: if outside of an eval, prints the error and exits with the value of $!; otherwise sets the value in $@ and exits the eval with undef.
    eval { $answer = $a / $b; };
    warn $@ if $@;
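A slightly more defensive sketch of the same pattern. The helper name safe_divide is hypothetical; ending the eval block with 1 lets the caller tell success from failure by the return value of eval rather than by inspecting $@ alone:

```perl
use strict;
use warnings;

# Returns (result, undef) on success or (undef, error message) on failure.
sub safe_divide {
    my ( $x, $y ) = @_;
    my $result;
    my $ok = eval { $result = $x / $y; 1 };
    return ( undef, $@ ) unless $ok;
    return ( $result, undef );
}

my ( $value, $error ) = safe_divide( 10, 4 );
my ( $none,  $why )   = safe_divide( 1,  0 );

print "value: $value\n";
print "error: $why";
```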
Exercises 6
Create a subroutine that receives an array and returns a hash reference with the elements as keys and the reversed strings as values.
Create a subroutine that receives an array of scalars and returns a new one with just the strings with length smaller than 20.
Create a program that reads a file and prints all the strings capitalized. This means:
    First character in uppercase.
    The remaining in lowercase.
Modules and OO
Package
Theideaistoprotectedeachpackagevariables.
package Dog; our $c = 1; my $d = 1; sub inc {$c++; $d++} package main; our $c = 0; my $d = 0; sub inc {$c++; $d++} print("$d-$c-$Dog::d-$Dog::c\n");# "0-0--1" Dog::inc(); print("$d-$c-$Dog::d-$Dog::c\n");# "0-0--2" inc(); print("$d-$c-$Dog::d-$Dog::c\n");# "1-1--2"
use Modules
The module is just a package defined in a file of the same name with .pm on the end.
Perl modules are typically included by saying:
use MODULE LIST;
which is equivalent to:
BEGIN { require MODULE; MODULE->import( LIST ); }
For example:
use Data::Dumper; # exports Dumper
print(Dumper({1 => 3}));
use Modules
Any double colons in the module name are translated into your system's directory separator, so if your module is named My::Mod, Perl might look for it as My/Mod.pm.
Perl will search for modules in each of the directories listed in the @INC array. To add directories you can use the lib pragma.
use lib '/path/to/modules';
Write Modules
To start a traditional module, create a file called Some/Module.pm and start with this template:
package Some::Module; # package name
use strict; use warnings; # use always
use Exporter;
our @ISA = qw(Exporter);
our $VERSION = '5.22'; # quoted: a bare 05.22 is read as octal 05 concatenated with 22
our @EXPORT = qw($var1 &func1);
our @EXPORT_OK = qw($var2 &func2);
our ( $var1, $var2 ) = ( 1, 2 );
sub func1() {print("func1\n");}
sub func2() {print("func2\n");}
1; # has to finish with a true value
Write Modules
use Exporter; our @ISA = qw(Exporter);
Imports the Exporter module and inherits its methods. Each element of the @ISA array is just the name of another package; the packages are searched for missing methods in the order that they occur.
use base qw(Exporter); # is similar
our $VERSION = '5.22';
Sets the version number. A caller can request a minimum version when importing, and Perl dies if $VERSION is lower than the one requested:
use Some::Module 6.15;
Write Modules
our @EXPORT = qw($var1 &func1);
List of symbols that are going to be exported by default (avoid using it).
our @EXPORT_OK = qw($var2 &func2);
List of symbols that are going to be exported by request (better practice).
our ( $var1, $var2 ) = ( 1, 2 );
sub func1() {print("func1\n");}
sub func2() {print("func2\n");}
Definition of the symbols.
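Export on request can be tried in a single file with the sketch below. The module My::Util and its double() function are hypothetical; the package is declared inside a BEGIN block and registered in %INC only so that the example runs without a separate My/Util.pm file.

```perl
use strict;
use warnings;

# Hypothetical module, kept in this file so the sketch is self-contained.
# Normally this block would be the content of My/Util.pm.
BEGIN {
    package My::Util;
    use Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT_OK = qw(double);     # exported only when asked for

    sub double { return 2 * $_[0] }

    $INC{'My/Util.pm'} = __FILE__;   # mark the module as already loaded
}

use My::Util qw(double);             # import on request

my $result = double(21);
print("$result\n");
```

With @EXPORT_OK, a plain "use My::Util;" imports nothing, so the caller's namespace only gets the names it lists.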
Write Modules
This is ugly but can be used to call subroutines as well.
my $name = "Some::Module";
The package name is passed as the first parameter of the subroutine:
Some::Module->func();
$name->func();
This will not pass the module name to the subroutine:
Some::Module::func();
&{"${name}::func"}(); # symbolic reference, needs: no strict 'refs';
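The difference between the two call styles shows up in the argument list. In this sketch func() is a stand-in that simply joins whatever it receives, so the extra first argument becomes visible.

```perl
use strict;
use warnings;

package Some::Module;
# Illustrative body: returns its arguments joined by commas.
sub func { return join(',', @_) }

package main;

my $name = "Some::Module";

my $as_method   = Some::Module->func('x');   # class name is the first argument
my $by_name     = $name->func('x');          # same, class given as a string
my $as_function = Some::Module::func('x');   # plain call, no class name passed

print("$as_method\n$by_name\n$as_function\n");
```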
Perl Objects
There are three very simple definitions.
An object is simply a reference that happens to know which class it belongs to.
A class is simply a package that happens to provide methods to deal with object references.
A method is simply a subroutine that expects an object reference (or a package name, for class methods) as the first argument.
Object constructor
Perl doesn't provide any special syntax for constructors. A constructor is merely a subroutine that returns a reference to something blessed into a class, usually the same class the subroutine is defined in.
package Animal;
sub new { bless({}) }
That word new isn't special:
package Animal;
sub Animal { bless({}) }
Objects Inheritance
If you care about inheritance then you want to use the two-arg form of bless so that your constructors may be inherited.
package Animal;
sub new { return bless({}, shift @_); }
package Dog;
use base qw(Animal); # roughly: use Animal; push @ISA, 'Animal';
This would be called like:
my $dog = Dog->new();
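A minimal sketch of why the two-argument form of bless matters: the inherited constructor receives the name of the class it was called on, so the object ends up in the subclass. Setting @ISA directly is used here only to keep everything in one file.

```perl
use strict;
use warnings;

package Animal;
sub new {
    my ($class) = @_;
    return bless({}, $class);    # bless into whichever class was asked for
}

package Dog;
our @ISA = ('Animal');           # Dog inherits new() from Animal

package main;

my $animal = Animal->new();
my $dog    = Dog->new();

print(ref($animal), "\n");
print(ref($dog), "\n");
```

With the one-argument form, bless({}), both objects would belong to Animal, because the reference is blessed into the package where new() was compiled.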
get/put method
The get/put method in Perl
sub property {
    my ($self, $value) = @_;
    $self->{property} = $value if @_ > 1;
    return $self->{property};
}
Use like:
$obj->property(1);
print $obj->property();
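The same pattern inside a complete class, as a sketch. The Person class and its name field are hypothetical. Testing @_ > 1 rather than defined($value) is what allows a caller to store undef explicitly.

```perl
use strict;
use warnings;

package Person;   # hypothetical class for illustration
sub new { my ($class) = @_; return bless({}, $class); }

sub name {
    my ($self, $value) = @_;
    $self->{name} = $value if @_ > 1;   # a value was passed, even if it is undef
    return $self->{name};
}

package main;

my $person = Person->new();
$person->name('Larry');
my $got = $person->name();              # plain get, leaves the value untouched
print("$got\n");

$person->name(undef);                   # explicit reset
my $cleared = $person->name();
print(defined($cleared) ? "still set\n" : "cleared\n");
```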
Method overriding
The most common way to call a method from an object is:
print("Dog: ".$dog->fly()."\n");
print("Bat: ".$bat->fly()."\n");
Perl will look at the scalar reference and see the package name of the blessed reference.
Method implementation:
package Animal;
sub fly { return 0; }
package Bat;
use base qw(Animal);
sub fly { return 1; }
Method overriding
If you need to, you can force Perl to start looking in some other package:
$bat->Insect::fly(); # dangerous
As a special case of the above, you may use the SUPER pseudo-class to tell Perl to start looking for the method in the packages named in the current class's @ISA list:
$bat->SUPER::fly();
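SUPER is resolved relative to the package where the call is written, so it is normally used from inside a method of the subclass. A small sketch, with a hypothetical describe() method:

```perl
use strict;
use warnings;

package Animal;
sub new      { my ($class) = @_; return bless({}, $class); }
sub describe { return "an animal" }

package Bat;
our @ISA = ('Animal');

sub describe {
    my ($self) = @_;
    # Written inside package Bat, so SUPER starts the search in @Bat::ISA.
    return $self->SUPER::describe() . " that flies";
}

package main;

my $bat  = Bat->new();
my $text = $bat->describe();
print("$text\n");
```

This lets the subclass extend the parent's behaviour instead of replacing it completely.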
Object Destroy
The object is automatically destroyed when the last reference to an object goes away. If you want to capture control just before the object is freed, you may define a DESTROY method in your class. It will automatically be called and you can do any extra cleanup you need to do. Perl passes a reference to the object under destruction as the first argument.
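The timing of DESTROY can be observed with a sketch like the following. The class TempResource and the @log array are hypothetical and only record the order of events.

```perl
use strict;
use warnings;

my @log;   # shared by both packages in this file, records the order of events

package TempResource;   # hypothetical class for illustration
sub new     { my ($class, $name) = @_; return bless({ name => $name }, $class); }
sub DESTROY { my ($self) = @_; push(@log, "destroyed $self->{name}"); }

package main;

{
    my $res = TempResource->new('first');
    push(@log, 'inside block');
}   # $res held the only reference, so DESTROY runs here
push(@log, 'after block');

print(join("\n", @log), "\n");
```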
Exercises 7
Create a module with the subroutines min, max and in.
Create a set of classes to represent the animal fly capabilities. Each shall have two methods, fly and name (get/put); the constructor receives the animal name. Consider the following rules:
  dog is an animal
  bird is an animal
  penguin is a bird
  animal doesn't fly
  bird flies
  penguin doesn't fly
Create a program to test them.
Standard Modules
pragma
The usual ones
strict: Restricts unsafe constructs
warnings: Control optional warnings
lib: Manipulate @INC at compile time
base: Establish an ISA relationship with base class
constant: Declare constants (consider Readonly)
subs: Predeclare sub names
integer: Forces integer math
Perl 5.10 gives new features, see perlpragma
Usual suspects
Data::Dumper: stringified Perl data structures
Carp: better die() and warn()
Cwd: pathname of current working directory
Exporter: Implements default import method for modules
POSIX
IPC::Open3: open a process for reading, writing, and error handling
Time::HiRes: High resolution alarm, sleep, gettimeofday, interval timers
World-Wide Web
LWP: The World-Wide Web library
LWP::UserAgent: Web user agent class
HTTP::Request: HTTP style request message
HTTP::Response: HTTP style response message
LWP::Simple: Simple procedural interface to LWP
LWP::Simple::Post: Single-method POST requests
HTTP::Async: process multiple HTTP requests in parallel without blocking
World-Wide Web
WWW::Mechanize: Handy web browsing in a Perl object
Net::SLP: accessing the Service Location Protocol
Net::POP3: Post Office Protocol 3 Client
Net::SMTP: Simple Mail Transfer Protocol Client
MIME::Lite: low-calorie MIME generator
JSON: JSON encoder/decoder
JSON::XS: fast JSON serialising/deserialising
Apache/mod_perl packages
Apache2::Const: Perl Interface for Apache Constants
Apache2::RequestIO: Perl API for Apache request record IO
Apache2::RequestRec: Perl API for Apache request record accessors
Apache2::RequestUtil: Perl API for Apache request record utils
CGI: Handle Common Gateway Interface requests and responses
Security
Data::UUID: Generating GUIDs/UUIDs
MIME::Base64: Encoding and decoding of base64 strings
Digest::SHA: Perl extension for SHA-1/224/256/384/512
Digest::MD5: Perl interface to the MD5 Algorithm
Crypt::DES: Perl DES encryption module
Net::SSH: Perl extension for secure shell
Test
Test: provides a simple framework for writing test scripts
Test::More: yet another framework for writing test scripts
Test::Exception: Test exception-based code
Test::Output: Utilities to test STDOUT and STDERR messages
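For a feel of what a Test::More script looks like, here is a minimal sketch. The add() function is a hypothetical stand-in for the code under test; ok, is and is_deeply are standard Test::More functions.

```perl
use strict;
use warnings;
use Test::More tests => 3;   # declare how many tests are expected to run

# Hypothetical function under test.
sub add { return $_[0] + $_[1] }

ok(add(1, 1) > 0, 'result is positive');                       # plain truth test
is(add(2, 3), 5, 'two plus three');                            # compares with eq
is_deeply([add(1, 2), add(3, 4)], [3, 7], 'list of results');  # compares structures
```

Such scripts are usually kept in a t/ directory and run with the prove command.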
Other
DBI: Database independent interface for Perl
DBD::*: Perl DBI Database Driver (DBD::SQLite, DBD::CSV, DBD::Google, ...)
Template: Template toolkit
HTML::Template: Perl module to use HTML Templates
Advanced Perl
DBI
use DBI;
use Data::Dumper; # provides Dumper, used below
my $dsn = "DBI:mysql:database=$database;" . "host=$hostname;port=$port";
my $dbh = DBI->connect($dsn, $user, $password) or die $DBI::errstr;
my $sth = $dbh->prepare("SELECT * FROM person WHERE name = ?") or die $dbh->errstr;
$sth->execute('oleber') or die $sth->errstr;
while (my $ref = $sth->fetchrow_hashref()) {
    print Dumper $ref;
}
$sth->finish();
$dbh->disconnect;
AUTOLOAD a method
When the method isn't found, AUTOLOAD will be called.
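our $AUTOLOAD; # set by Perl to the fully qualified name of the missing method
sub AUTOLOAD {
    my ($self, @params) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*://; # strip the package part of the name
    return if $name eq 'DESTROY'; # object destruction also lands here
    die "Can't access '$name' field" if not exists $self->{_p}->{$name};
    ( $self->{$name} ) = @params if @params;
    return $self->{$name};
}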
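Put into a complete class, the idea might look like the sketch below. The Record class is hypothetical, and treating _p as a hash of permitted field names is an assumption based on the exists check in the slide.

```perl
use strict;
use warnings;

package Record;   # hypothetical class for illustration
our $AUTOLOAD;

sub new {
    my ($class) = @_;
    # Assumption: _p holds the names of the fields that may be accessed.
    return bless({ _p => { name => 1, age => 1 } }, $class);
}

sub AUTOLOAD {
    my ($self, @params) = @_;
    my $name = $AUTOLOAD;
    $name =~ s/.*://;                      # keep only the method name
    return if $name eq 'DESTROY';          # do not treat destruction as a field
    die "Can't access '$name' field\n" if not exists $self->{_p}->{$name};
    ($self->{$name}) = @params if @params;
    return $self->{$name};
}

package main;

my $rec = Record->new();
$rec->name('Larry');                       # no sub name exists, AUTOLOAD handles it
my $value = $rec->name();
print("$value\n");

my $ok    = eval { $rec->salary(); 1 };    # not a permitted field
my $error = $@;
print("error: $error") if !$ok;
```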
tie variable
package SConst;
sub TIESCALAR {
    my ($pkg, $val) = @_;
    bless \$val, $pkg;
    return \$val;
}
sub FETCH { # read
    return ${shift()};
}
sub STORE { # write
    die "No way";
}
1;

use SConst;
my $var;
tie $var, 'SConst', 5;
print "$var\n";
$var = 6; # dies
Q/A