
#!/usr/bin/perl
###########################################################################
################ Professor Liang's Perl Tutorial In Perl ##################
################ Original Version 10/2004. Modified 10/2007 ###############
###########################################################################

# This document is meant for relatively advanced computer science students
# to quickly begin using Perl. Our purpose is to study the characteristics
# of the Perl language in comparison with other languages. The reader is
# urged to consult what I consider to be the standard reference,
# "Programming Perl" by Larry Wall, et al., as well as other online
# tutorials and references to learn all the capabilities and uses of the
# language. Perl is available from www.perl.org.

# You should run the code in this tutorial with "perl perltutorial.txt" so
# you can see exactly what each code fragment is doing. Before each example
# I print a number in the form "1: ", "2: ", etc., so you can correlate the
# output with the source code. You should also experiment by making changes
# to the code.

######## I. Strings and More Strings

print "1: ";
print "2"*3;

# In most other languages, you would think me crazy: how can you multiply a
# string by an integer? Surely this is a mistake only a beginner will make.
# However, in Perl, you will in fact see 6! How? Because strings are the
# basic data type in Perl, and Perl converts freely between strings and
# numbers: the * operator asks for numbers, so the string "2" is read as
# the number 2. Perl is best in practice for text processing, such as with
# the html source of web pages. It is actually quite awful for manipulating
# binary data (C would be best for that purpose). Strings therefore have a
# special status in Perl.

######## II. Scalar Variables and Static Scope

# Perhaps one reason for the popularity of Perl is all those dollar signs!
# Most variables (except for file handles) in Perl require a special symbol
# in front to designate their type. The most common symbol is "$". The $
# symbol in Perl signifies the presence of a scalar value. Scalar values
# include numerical values, strings, and perhaps most importantly, pointers
# (memory addresses).
# All variables containing scalar values must be prefixed with $. This is a
# characteristic that Perl inherited from Unix scripting languages, though
# it has transcended that simple role long ago. If you are familiar with
# Java, scalars correspond to fixed, primitive types (plus strings).

print "\n2: ";
$x = 2;                        # assigns 2 to global scalar variable $x
{
    my $x = 3;                 # assigns 3 to a scoped, "local" variable
    print "\nmy x is $x\n";    # prints local variable 3
}
print "...and mine is $x\n";   # prints global variable 2

# As the above program segment indicates:
#
# 1. Variables do not have to be declared as in "int x;" before being used.
# 2. "my" introduces a lexically scoped variable whose scope is defined by
#    the enclosing {}s. It is a bit ironic to call it a "local" variable
#    because "local" is a keyword used for something else.
# 3. When a scalar var such as $x is placed inside string ""s, Perl will in
#    fact expand its value. To prevent this from happening, use

print '3: x is still $x inside single quoted strings!', "\n";

# 4. There's no "main" in Perl. It's run as a script.

# If you're new to Perl, it's natural to forget the $ before variables. I
# still do sometimes. But they are necessary.

######## III. Booleans and if-else

# Like C (and unlike Java), Perl has no special type for booleans. The
# undefined value (Perl's null pointer), 0, "", or even "0" all represent
# false. Everything else represents true. You'll sometimes see Perl code
# such as

print "4: ";
if ($val) { print $val; } else { print "\$val is undefined\n"; }

# $val evaluates to false if it was not defined.

# It is important that, in the if-else statement, {}'s enclose the
# two cases. They are not optional. Why? Think you know C++/Java?
# What does the following code print?
#
#   if (2<1)
#   if (1<2) cout << "me";
#   else cout << "no, me";
#
# It won't print anything, but you'll have to know that the "else" is by
# convention associated with the closest (innermost) "if", otherwise you
# may get confused. This is the classic "dangling else" problem. The
# required {}'s of Perl help to eliminate this confusion.

######## IV. Arrays and Hash Tables

# In addition to scalar values, the @ symbol is used to prefix arrays, and
# the % symbol prefixes hash arrays. Perl arrays are not really arrays in
# the sense of C or even Java in that they don't necessarily represent a
# fixed segment of memory.
# In fact, arrays in Perl behave more like lists in that they can be
# expanded and shrunk.

print "5: ";
my @l = (3,5,7);   # declare an array or list of three integer elements
@l = (2,@l);       # adds an element in front of l
push(@l,4);        # adds element (4) destructively onto right end of list
pop(@l);           # deletes rightmost element and returns it.
print "@l\n";      # you don't have to write a loop to print an array

# Why are these things called arrays and not just lists? Because Perl
# gives the user the convenient syntax of accessing list elements using
# the familiar bracket notation:

$l[2] += 4;        # increments the third element of the array by four.

# NOT FAIR! If l is an array/list then why did we still put $ in front of
# it? This is a point of contention in the Perl community, and may change
# in the future. The reason for the $ is that, although l is an array, l[2]
# is an integer, which is a scalar. That's just the way things are now with
# Perl. An array element is always a scalar: a nested array is stored as a
# pointer (see Section VI) and recovered with @{$l[2]}. Note that @l[2]
# means something else, namely a one-element slice of @l.

# Here's how you use a for loop to print an array backwards
print "6: ";
for(my $i=$#l; $i>=0; $i--) { print $l[$i], " "; }
print "\n";

# The expression $#l is the last valid index of l, or the length of l minus
# one. Note that $i is local (by virtue of "my") within the loop. An
# alternative is just to say my $i = @l-1; Perl will automatically infer
# from the given context of assigning an array to a scalar that by @l you
# really mean the length of the list.

## Hash tables are also very basic data structures in Perl. Here's a table
# one might use to store those Hofstra student id's:

my %id;                     # declares hash table
$id{"larry"} = 700123456;
$id{"mary"} = 700654321;    # etc ...
print "7: mary's id is ", $id{"mary"}, "\n";

# The % symbol prefixes the hash table, while the {}'s (as opposed to []'s)
# signify that you're accessing a hash table instead of an ordinary array.
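
# The sigil and context rules of this section can be tried in isolation.
# What follows is only a minimal sketch: the names @nums, @rows and the
# other variables are made up for illustration and are not used elsewhere
# in this tutorial. It shows an array read in scalar and in list context,
# a nested array stored as a pointer, and an array slice.

```perl
my @nums = (2, 3, 5, 11);

my $count   = @nums;    # scalar context: the number of elements
my $last    = $#nums;   # the last valid index
my ($first) = @nums;    # list context: the first element

print "count=$count last=$last first=$first\n";   # count=4 last=3 first=2

# An array holds only scalars, so a nested array is stored as a pointer.
my @rows = ([1, 2], [3, 4]);
my @second_row = @{ $rows[1] };    # dereference the whole inner array
print "second row: @second_row; one cell: $rows[1][0]\n";

# @nums[1,2] is an array slice: a list of the elements at indices 1 and 2.
my @slice = @nums[1, 2];
print "slice: @slice\n";           # slice: 3 5
```

# The parentheses in "my ($first) = @nums" are what make the assignment a
# list assignment; without them $first would receive the length instead.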
# The function "keys" returns a list containing all the keys of a hash table:

print "8: here are my keys: ", keys(%id), "\n";
print "9: they look better separated by a comma: ", join(", ", keys(%id)), "\n";

# The "join" built-in function separates the elements of a list using a
# given string (in this case ", "). It's commonly used for formatting
# output.

##### The _ variable

# Perl has a special variable "_" which basically represents "whatever
# is most relevant in the current context." For example, inside a
# procedure, the array @_ holds the list of parameters passed to the
# procedure, and inside a foreach loop the scalar $_ holds the current
# element.

print "10: ";   # try this:
foreach (keys(%id)) { print $id{$_}, "--"; }

# The foreach loop goes through every element of a list, and inside the
# body of the loop $_ refers to the current value of the list being examined.
# The above foreach loop can also be written by associating a variable
# with each element, as in foreach $x (keys(%id)) { print $id{$x}, "--"; }.

######## V. Subroutines: Lambda Terms by Another Name

# Here's the non tail-recursive ("naive") fibonacci function:

sub fib1
{
    my $n = $_[0];
    if ($n<2) {1} else {fib1($n-1) + fib1($n-2)}
}

# Several things are important to point out. The parameters of the
# subroutine "fib1" are contained in the implicit array "_". Thus $_[0] is
# the first argument, and $_[1] would be the second, and so on. Secondly,
# the "return" keyword is optional in Perl: whatever is the last expression
# evaluated determines the value returned by the function. You might be
# wondering: why did I have to declare a local variable $n? Can't I just
# use $_[0] throughout? Well, for this function it doesn't matter, but Perl
# passes variables to a function in a different way than what you might
# expect:

sub swap
{
    my $temp;
    $temp = $_[0];
    $_[0] = $_[1];
    $_[1] = $temp    # The semicolon is optional on the last line in {}'s
}

$x = 2; $y = 3;
swap($x,$y);
print "\n11: the values of \$x and \$y are now $x and $y: they got swapped!\n";

# I use the swap function to remind people that, conventionally, a
# function's parameters are local variables within the function. The swap
# function wouldn't swap anything in Java, but the Perl program above does!
# By default, Perl passes parameters by REFERENCE. If you know C++, it's
# the same as saying
#
#   void swap(int& x, int& y)
#
# That is, whatever you do to a parameter variable WILL be persistent
# even after the function exits. This may be a desirable behavior, such
# as with the swap function above. However, in general, the standard
# call-by-value method is recommended. By assigning $_[0] to a locally
# declared variable (via "my $n=$_[0]"), I am making the function behave in
# the "conventional" way. That is, as a self-contained programmatic unit.
# Whatever you do to $n will be local within the function.

# Here's a nice feature of Perl. I don't have to declare variables one
# at a time:

my ($x,$y);   # declares two variables $x and $y at once using a list.

# Similarly, here's an easier way to swap two values:

($x,$y) = ($y,$x);

# Here's the tail-recursive (iterative) fibonacci function:

sub fib2
{
    my ($n,$a,$b) = @_;   # localize all arguments
    if ($n<2) {$b} else {fib2($n-1,$b,$a+$b)}
}

print "\n12: the 100th fibonacci number is ", fib2(100,1,1), ".\n";

# The naive fibonacci function will give you the same answer, but
# you'll have to wait around 20,000 years to see it.
# print fib1(100);   #uncomment at your own risk

# At the end of this tutorial we will use Perl's extraordinary power
# to make the naive fibonacci function almost as fast as the tail-recursive
# version.

# You might be wondering: what if I only passed one or two arguments to a
# function like fib2, which expects 3? The answer is that the missing
# parameters are simply left undefined (and count as 0 in arithmetic), so
# the function silently computes the wrong result instead of reporting an
# error. This is a contrast between statically typed languages (Java) and
# dynamically typed ones (Perl, Scheme). You can expect fewer errors to be
# caught at compile-time with Perl. That's the price you pay for the
# dexterity of dynamically typed languages.

# Just to be complete, here's "fib3", which uses a while loop (happy now?)

sub fib3
{
    my $n = shift;          # alternative to my $n = $_[0];
    my ($a,$b) = (1,1);     # initial values for $a and $b
    while ($n>1)
    {
        ($a,$b) = ($b,$a+$b);
        $n--;
    }
    $b    # the ; is optional for the last line inside {}'s
}

# Perl's design philosophy is to give programmers a variety of styles to
# choose from.
# For example,

print "13: ";
print "1<2\n" unless (1>2);

# is the same as if (!(1>2)) {print "1<2\n"}.

# Perl is for experienced programmers who love programming. Beginners
# should stay away from Perl as they would end up using it in only
# uninteresting ways and develop lots of bad habits.

# Finally, you'll see function application sometimes written as
# &fib3(100). & is the symbol that prefixes function variables just as $,
# @ and % prefix scalars, arrays and hash tables respectively. You may
# also see (fib3 100) sometimes, which is application in the lambda
# calculus/scheme style.

######## VI. Pointers

# Pointers (aka references or memory addresses) are an important datatype
# in Perl. For Java programmers who are not familiar with the generic use
# of pointers, this section may seem a bit difficult. It is possible,
# however, to avoid trouble with pointers by using them in a uniform way,
# just as in Java. The next section, on pointers to functions, will adopt
# this approach.

my $x = 3;
my $z = \$x;   # sets $z to point to $x. "\" works like "&" in C.
$$z += 1;      # to buy back the value from the pointer, you need two dollars :-)
print "14: the value that $z points to is ", $$z, "\n";

# You can also have pointers to complex structures:

my $x = \%id;  # points x to the id hash array we used earlier.
print "15: $x points to ", %$x, "\n";

# Note that $, not %, still prefixes x. The pointer itself is a scalar,
# that is, a memory address. To dereference a pointer back to its value, as
# the above examples indicate, we use another $, %, or @ in front,
# depending on the type of the item being pointed to.

# To illustrate when pointers are needed, let's first look at a function that
# does NOT require them. The following function returns the index of an
# element $x inside a list @L, returning -1 if it doesn't exist:

sub indexof
{
    my ($x,@L) = @_;   # returns position of x inside L
    my $i = 0;
    while (($i <= $#L) && ($x != $L[$i])) {$i++;}
    if ($i<=$#L) {$i} else {-1}   # return -1 if $x not found in list.
}

# indexof(3,(4,3,6,8,7)) will return 1, the index of the "3" inside the list.
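
# How Perl hands lists to a subroutine can be observed directly. The sketch
# below is not part of the original tutorial: count_args, @evens and @odds
# are hypothetical names introduced only for this illustration. It counts
# the entries of @_ when arrays are passed directly and when they are
# passed as pointers.

```perl
sub count_args { return scalar(@_); }   # how many scalars arrived in @_

my @evens = (2, 4, 6);
my @odds  = (1, 3);

# Passed directly, both arrays are flattened into one list of five scalars,
# so the callee cannot tell where @evens ends and @odds begins.
print "flattened: ", count_args(@evens, @odds), " arguments\n";       # 5

# Passed as pointers, each array arrives as a single scalar.
print "by reference: ", count_args(\@evens, \@odds), " arguments\n";  # 2

# One leading scalar followed by ONE list is still unambiguous.
my ($head, @rest) = (9, @evens);
print "head=$head rest=@rest\n";   # head=9 rest=2 4 6
```

# An array on the left of a list assignment absorbs everything that
# remains, which is why only the last parameter can be a plain list.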
# The indexof function did not need pointers because Perl nicely separates
# the head (or "car") of the list from the rest ("cdr") of the list in the
# way you'd expect. However, sometimes you may want to pass in something
# else AFTER the list, or pass two distinct lists to a function. The next
# function returns the intersection of two lists. Note the use of pointers,
# and deduce for yourself why they're needed.

sub intersection
{
    my ($A,$B) = @_;   # assigns two POINTERS to the args
    my @I = ();        # intersection list to be constructed, initially null
    foreach my $x (@$A)        # for each element x in A,
    {
        foreach my $y (@$B)    # check if it's also in B
        {
            if ($x == $y) { @I=($x,@I) }   # add x to I list (can also use push)
        } # inner loop
    } # outer loop
    @I;   # return the I list that was built.
}

my @l = (1,3,4,7,2,8);
my @m = (3,9,6,4,1);
print "16: the intersection of @l and @m is ", intersection(\@l, \@m), "\n";

# Look at the code carefully to see where pointers are making a difference:
# For example, @$B retrieves the list from the pointer $B. Also, $$B[$i] is
# required to access the (scalar) values of the list. In order to pass a
# complex structure to a function, in general you'll have to use pointers.
# In the above function, at least the first list had to be passed in as a
# pointer.

# Here's another way to have hash tables, using pointers:

$myhash->{"key1"} = "value1";
$myhash->{"key2"} = "value2";

# Perl infers from {} and -> that $myhash is a pointer to a hash table.
# C/C++ programmers should know that "A->B" is really "(*A).B". That is, it
# dereferences the pointer A and at the same time retrieves the field B
# from the dereferenced struct/object. Perl expands this meaning of "->" to
# the case of arrays, hash tables (and as you will see in the next section,
# even functions). For arrays you can similarly have

$A->[0] = 1;
$A->[1] = 2;
print "17: referenced array: ", @$A, "\n";   # prints contents of array

# Just as in C/C++, one way to avoid confusion with pointers is to use them
# in a CONSISTENT manner. In fact, this observation led to the uniform
# treatment of pointers in the Java language. That is, if you adopt the
# policy to:
#
# 1. Never use pointers to scalar values
# 2. Always use pointers to complex structures
#
# then you'll be emulating the approach of Java (except for strings).

######## VII. Pointers to Functions

# Now we finally get to what I consider to be the funnest part of Perl: its
# ability to be used as a fully general, higher-order language that's
# (nearly) as expressive as Church's lambda calculus.
# A function (or "subroutine") in Perl can be used like any other value. It
# can be passed to another function, returned by a function, and assigned
# to a variable. Here's the (naive) fibonacci function expressed as a Perl
# lambda term assigned to a variable:

$fib = sub
{
    my $n=shift;   # same as my $n = $_[0];
    if ($n<2) {1} else {$fib->($n-1) + $fib->($n-2)}
};

# Note the ";" at the end, since this is just an assignment statement!
# No name follows "sub" - it's just "lambda" for Perl. To apply the
# function pointed to by $fib, we use $fib->(args). So now you see:
#
#   $A->[$i]   accesses the array pointed to by $A at index $i
#   $A->{$i}   accesses the hash array pointed to by $A for key $i
#   $A->($i)   accesses the function pointed to by $A and applies it to $i

# A characteristic of a well-designed language is generality. Once you
# get used to all the $#%@ (not an expletive) you'll see that most
# everything in Perl simply MAKES SENSE.

# Having said that however, I should point out one subtlety: the definition
# of $fib wouldn't have worked if I had used my $fib = ...; because the
# recursive calls to $fib would refer to something not defined yet. "my" in
# Perl corresponds to "let" in functional languages such as Scheme.
# However, Scheme contains another construct "letrec" that allows one to
# bind recursive definitions. Perl lacks this construct, but fortunately
# it's not a big deal. To bind $fib to a local var, simply declare it first
# on a separate line: my $fib; $fib = sub { ... };

# When we assign a function to a local variable inside a function, we
# effectively get a locally defined function. The following tail-recursive
# fibonacci function uses this ability to hide the fact that you need to
# initially pass two additional values (1's) to the recursive function:

$tfib = sub
{
    my $f;   # local recursive function
    $f = sub
    {
        my ($n,$a,$b) = @_;
        if ($n<2) {$b} else {$f->($n-1,$b,$a+$b)}
    };
    $f->($_[0], 1, 1);   # call internal function
};

# Now to call $tfib, we can just say $tfib->(10), without having to pass in
# the two 1's.

# Defining local functions, in addition to hiding implementation detail,
# can in fact also give us a form of object-orientation.
# However, I will leave that discussion out of this tutorial. You may
# consult my document "Bank Accounts in Perl" to see how this is done.

######## VIII. Higher Order Functions

# Being able to pass a function as an argument to another function can
# be a very useful feature. In fact, graphical user interface API's
# commonly rely on them in defining "callback" functions that handle
# asynchronous events. The following function is a classic: it applies
# a given function to a list of values:

sub mapfun
{
    my ($f,@L) = @_;   # separate car,cdr of @_ into function and list
    my @M =();         # new list to be built
    foreach my $x (@L) { push( @M, $f->($x) ); }
    @M                 # return new list
}

$f = sub { 2**$_[0]; };   # function to return 2 to nth power (** = Math.pow)
@powers = mapfun($f,(1,2,3,4,5,6,7,8,9,10));
print "18: this is how computer people count: ", join(" ",@powers), "\n";

# It's also possible to inline a function when passing it, without defining
# it first:

@squares = mapfun(sub{$_[0]*$_[0]}, (1,2,3,4,5));
print "19: squares: @squares \n";

# A function can also return a function. The following example composes
# two functions that are passed in as arguments:

sub compose
{
    my ($f, $g) = @_;          # parameters are functions $f and $g
    sub { $f->($g->(@_)); }    # fog(x) = f(g(x))
}
# compose returns a function that applies g, then f to its arguments.

$f = sub { $_[0] * $_[0] };    # lambda x. x*x
$g = sub { $_[0] + 1 };        # lambda x. x+1
$fog = compose($f,$g);
print "20: applying a dynamically generated function: ", $fog->(4), "\n";

#### Automatic Memory Management.

# note that:

@l = mapfun($f,@l);

# will effectively replace @l with a new list, namely the list built by
# mapfun. If you're a C/C++ programmer, you may be wondering what happened
# to the original @l list - doesn't it need to be deallocated? The answer
# is that like Java, Perl is a modern programming language that does
# automatic memory management or "garbage collection". Lisp, the ancestor
# of Scheme, was the language for which this important technology was
# developed. Far from just a convenience, memory management frees the
# programmer to think at a higher level, and gives rise to a style of
# programming previously considered impractical.

# To (temporarily) bring an end to this tutorial, I will now write a function
# that can optimize the performance of a function passed to it.
# The idea is to avoid redundant computation by storing the results of
# function calls in a hash table. Then, the next time the function is
# called on the same arguments, the hash table is first checked to see if a
# result already exists. It's important to point out that this technique
# only works for a certain kind of function: it doesn't work for functions
# that change some external state, e.g., it won't work for any "void"
# functions. But it works wonderfully on recursive functions such as the
# naive fibonacci function, as it would eliminate all redundant recursive
# calls. The function takes a function as an argument and returns an
# optimized version of it:

sub makehashfun
{
    my $f = shift;   # function to be optimized
    my $hash;        # local hash table to store results
    sub              # new version of function
    {
        my @args = @_;
        my $jargs = join ",",@args;   # join multiple args into hash key
        my $val = $hash->{$jargs};    # look up hash table
        if (defined $val) {$val;}     # if value exists (even 0 or ""), we're done!
        else                          # need to call function
        {
            $val = $f->(@args);       # calls function
            $hash->{$jargs} = $val;   # store result in hash table
            $val;                     # return value
        } # else
    } # returned subroutine of makehashfun
} # makehashfun

$fib = makehashfun($fib);   # optimizes naive fibonacci function (see Sec. VII)
# comment out the above line at your own risk!
print "21: Now you won't have to wait 20,000 years to see ", $fib->(100), "\n";

# The function returned by makehashfun is called a "closure". In addition
# to being a lambda term, it also carries with it an "environment", namely
# its hashtable. The hash table is "stateful" - that is, it retains its
# values between separate calls to the function. In this sense, the
# returned subroutine behaves more like a "method" in an oop language than
# a pure "function". This topic, however, is out of the scope of this
# "kick start" tutorial.

# Having seen higher-order functions, you are now ready to read my
# document "Lambda Calculus in Perl" for the greatest spiritual journey
# in computer science.
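
# The "stateful environment" of a closure can also be seen in a much
# smaller setting than makehashfun. The following is only a sketch:
# make_counter, $c1 and $c2 are hypothetical names that do not appear
# elsewhere in this tutorial. Each call to make_counter creates a fresh
# private variable, and the returned function keeps it alive between calls.

```perl
sub make_counter
{
    my $count = shift;                 # a fresh $count for every call
    return sub { return $count++; };   # the closure captures this $count
}

my $c1 = make_counter(10);
my $c2 = make_counter(100);

# $c1 and $c2 each remember their own $count between calls.
print $c1->(), " ", $c1->(), " ", $c2->(), " ", $c1->(), "\n";   # 10 11 100 12
```

# Nothing outside the two closures can reach either $count, which is the
# same kind of information hiding that the hash table inside makehashfun
# enjoys.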
#########################################################################

# There's a lot more to talk about in Perl. As my purpose here is to
# introduce the essential characteristics of the programming language, I've
# not touched on some features that make Perl so popular in practice, such
# as its I/O model and its facility with regular expressions and parsing.
# I've also not touched on the recently-added support for a rudimentary
# form of object-orientation (Perl packages). A large number of ready-made
# Perl modules are available from www.cpan.org. You may find these topics
# in other Perl references, or stay tuned for a future, expanded edition of
# this tutorial.

# In addition, I also have a number of programs that further illustrate
# the uses and characteristics of Perl. You should consult the following
# files on my programming languages class homepage:
#
#   lambdaperl2.txt : Lambda Calculus in Perl
#   dynamic.pl      : Explains the use of "local" in contrast to "my".
#   perlbank.txt    : Uses closures, alluded to above, for a style of OOP
#   blessed.txt     : Uses the new Perl Package-based OOP
#   webclient.pl    : TCP client to download html pages from a web server
#   byteordering.pl : Binary data manipulation in Perl (not for the weak)
#   submitprog2.txt : CGI form for uploading files to a web server

print "\n ... Do cool stuff with Perl ...\n";
