A Romp around addn

The romp articles explore Perl 6 by setting out to explain some language feature, and furthermore to explain all the features encountered along the way, down to the primitives. Starting with a simple example, each one drills down and explains it in terms of the most elementary code. So, the writing will meander a bit and is intentionally not “focused” like a normal article.

I’m inspired by a section in the 1995 book ANSI Common Lisp by Paul Graham.

New Tools

Why learn Lisp? Because it lets you do things that you can’t do in
other languages.  If you just wanted to write a function to return
the sum of the numbers less than n, say, it would look much the
same in Lisp and C:

 ; Lisp                   /* C */
 (defun sum (n)           int sum(int n){
   (let ((s 0))             int i, s = 0;
     (dotimes (i n s)       for(i = 0; i < n; i++)
       (incf s i))))          s += i;
                            return(s);
                          }

If you only need to do such simple things, it doesn’t really matter
which language you use.  Suppose instead you want to write a function
that takes a number n, and returns a function that adds n to its
argument:

 ; Lisp 
 (defun addn (n)
   #'(lambda (x)
       (+ x n)))

What does addn look like in C?  You just can’t write it.

You might be wondering, when does one ever want to do things like
this?  Programming languages teach you not to want what they cannot
provide.  You have to think in a language to write programs in it,
and it’s hard to want something you can’t describe. …

He goes on to explain that lexical closures are one of the abstractions missing from many other popular languages. You may know that Perl 5 has them, and that feature alone sets it so far apart from C++, Java, C# (prior to 2.0), etc. that it makes me wish I could use Perl for more things.

our Int sub sum(Int $n)
 {
  my Int $i; my Int $s=0;
  loop ($i=0; $i < $n; $i++) {
     $s += $i;
     }
  return $s;
 }

Point is, you can write addn in Perl. But first things first. Look at the sum program. Translating that literally to Perl 6 gives the function above. This is as close to a literal translation of the C listing as I could make it. The keyword our or my comes before the type when declaring things, the Int type is capitalized, variables have sigils, and the keyword for changes to loop. But it looks and reads the same. So Perl 6 isn’t hard at all, is it‽

In case you are wondering, the our keyword before the return type is like saying extern in C. It is a scoping declarator, and its meaning isn’t exactly the same but it is the same kind of thing, and in the broadest sense it is the direct translation for “this function is not just for the stuff that follows but is available elsewhere”. In C you usually omit the extern keyword in this case. But in the Perl 6 grammar, you have to have a declarator keyword before a type, so if I left off the our (which I could, without changing the meaning) I could not declare the return type before the function name.

The use of my before the variables is the same as declaring them auto in C. Do you even use auto in C, or always leave it off? In Perl 6 you can’t leave it off.

sub sum($n)
 {
  my $s=0;
  loop (my $i=0; $i < $n; $i++) {
     $s += $i;
     }
  return $s;
 }

The Lisp version of the code doesn’t declare any types. So the listing above shows that in Perl 6 you don’t have to declare the types either. It also moves the declaration of $i into the loop initializer, as you can do in C++, which is more like the Lisp code, where i is local to the loop body.

A call to sum will still check that exactly one argument is passed (unlike in Perl 5), but does not care what type of value you pass. The function will work correctly with Int and Num (Num is floating point) and specialized flavors thereof, works if you pass a string that contains a formatted number, but will give bogus results if you pass a string like "Bob", and will throw an exception if you pass it an object that can’t be converted to a Num at all (when it gets to the < operator). So there is something to be said for declaring types, for documentation and automatic checking. But in Perl 6 it is optional, and you can type things when you want to and leave it off when you want to.

The first examples are a direct and fairly literal translation, which shows you that you can write C in Perl if you want. That can be very handy when translating old functions, since you don’t have to redesign everything up front. However, there are easier ways to do things in Perl 6.

How many times have you written this counting loop in C or C++? In Perl 6 there is a more direct way to express that: for ^$n -> $i { $s += $i }. The prefix ^ means count from 0 up to, but not including, the specified value; written out in full that would be 0..($n-1), using the dot-dot range operator. The range operator also has the forms ^..^, ^.., and ..^, where the caret means to exclude that endpoint from the range. So rather than needing to write the −1 with added parens, you could simply say 0..^$n which means “a list of all the numbers from 0 up to but not including $n”. So, you can see that ^$n is a further shortcut for that, when the starting point is zero.
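The range forms just described can be checked side by side. This is only a small sketch: the value of $n and the array names are made up for the illustration.

```raku
# Illustrative only: $n and the array names are not from the article.
my $n = 5;

my @prefix      = ^$n;           # shortcut: 0 up to, but excluding, $n
my @excl-end    = 0 ..^ $n;      # caret on the right excludes the end point
my @spelled-out = 0 .. ($n - 1); # the long way round
my @excl-start  = 0 ^.. $n;      # caret on the left excludes the start point
my @excl-both   = 0 ^..^ $n;     # both end points excluded

say @prefix;       # [0 1 2 3 4]
say @excl-end;     # [0 1 2 3 4]
say @spelled-out;  # [0 1 2 3 4]
say @excl-start;   # [1 2 3 4 5]
say @excl-both;    # [1 2 3 4]
```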

Looking at the for construct again, you can pretty much guess what the -> $i means. Obviously, that is how you specify the loop variable that gets each value in turn. But, this syntax is not a special part of the for statement syntax. Rather, it is a general thing that can be used anywhere. It declares the parameter list of the block that follows. The for calls the block on each iteration, so declaring $i as a parameter, in exactly the same way as a function has a parameter list, gets you the loop value.
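To see that the arrow really is independent of for, here is a sketch that stores such a block in a variable and calls it by hand before using the same syntax in a loop. The names $total and $add-to-total are mine, chosen only for this example.

```raku
# A pointy block is an ordinary value; $add-to-total is an illustrative name.
my $total = 0;
my $add-to-total = -> $i { $total += $i };

# Called by hand, exactly like a function with one parameter...
$add-to-total(10);

# ...or written after for, which calls the block once per element.
for ^5 -> $i { $total += $i }

say $total;   # 10 + (0+1+2+3+4) = 20
```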

If you don’t want to declare $i at all, you can just use the default which is the so-called topic in Perl 6, and is $_. That is a default variable that is used in a lot of places. So you could write for ^$n { $s += $_ } and mean the same thing.

In C or C++ you might leave off the braces since the body of the loop is only one statement. In Perl, the loop body must be a block. But, there is an alternate form in which you put a statement modifier after a single statement. And for can be used as such a modifier. That gives you $s += $_ for ^$n;. First you say up front “do this”, and then you modify it with “do that n times”. Putting the modifier last rather than first can be clearer to read. In Perl, you have a choice. I don’t mind using $_ in this case, since it is set and used in such a small region. It is exactly like the word “it” in English. Use it only when the antecedent is both close and clear. Just for completeness, let me show you that besides using the default name you can use implicitly-declared parameters, and write: for ^$n { $s += $^i } where the ^ twigil indicates that this is an implicit block parameter, that would be in the parameter list signature if I had bothered to write it. That is useful for short blocks that have more than one parameter. As I said, in this case I’ll stick with $_.
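Here is a short sketch of both ideas, the statement modifier and placeholders in a block with more than one parameter. The variable names are only for illustration. Note that placeholder parameters are put into the signature in name order, not in the order they happen to appear in the body.

```raku
# Statement-modifier form with the topic variable.
my $s = 0;
$s += $_ for ^5;
say $s;                   # 0+1+2+3+4 = 10

# Two placeholders: the signature is effectively ($a, $b),
# because placeholders are sorted by name.
my $diff = { $^b - $^a };
say $diff(10, 3);         # $a = 10, $b = 3, so 3 - 10 = -7
```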

sub sum($n)
 {
  my $s=0;
  $s += $_  for ^$n;
  return $s;
 }

sub sum($n)
 {
  [+] ^$n
 }

The first listing above shows the current state of the code. But I can do one better. Why write an explicit looping construct at all? Perl 6 has reduction operators which operate on a list. Basically, [+] @list will put the + between the items in the list. For example, [+] (1,2,3) becomes 1 + 2 + 3, and [*] ($a,$b,$c,$d) becomes $a * $b * $c * $d. So rather than a for loop, just use the reduction operator to say “add up all these numbers”. That means you don’t need to declare $s either, and the whole function collapses into one line, as seen in the second listing above.

I don’t need the return keyword because if you fall off the end of a function, the last thing is returned. The rules are different from Perl 5, but the idea is the same. Normally I use return for clarity, as in the earlier listings which don’t need them either. But for a one-liner, I leave it off.
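A quick sketch of reduction operators in use, relying on the implicit return of the last expression. The product helper and the sample values are my own additions for illustration, not part of the original example.

```raku
sub sum($n)     { [+] ^$n }    # the one-liner from above
sub product(@v) { [*] @v }     # hypothetical helper, not from the article

say sum(5);                    # 0+1+2+3+4 = 10
say product([1, 2, 3, 4]);     # 1*2*3*4 = 24
say ([max] 3, 9, 4);           # reduction works with other infix operators too: 9
```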

Since Paul Graham is preaching that shorter means more powerful and is better for programmer productivity, the last Perl 6 example has topped his Lisp example. However, I don’t think he pulled out all the stops. If Common Lisp doesn’t have a reduction operator concept as part of the standard library, it is easy enough to create one. The Lisp program would still be longer, only because it has more parentheses.

After all that, realize that it is not a real example. Executing that loop billions of times would be slow, and there is a better way to get the answer. The sum of the first m integers is easily computed as (m+1)×(m/2). Since I don’t want to include n itself, set m equal to n−1 and get n×((n−1)/2). Do a little algebra so I can do the division last, and I can compute with integers and not have to worry about fractions. In real life, I would have the function:

sub sum(Int $n)
returns Int
is inline
 {
  return ($n ** 2 - $n) div 2;
 }

So even with a more powerful and expressive language, you still get the most mileage, performance-wise, out of optimizing the algorithm.
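As a sanity check, the closed form can be compared against the reduction for a handful of values. This is only a sketch: both function names are made up here, and I am assuming the integer division operator div so that the result stays an Int.

```raku
# Both names are illustrative; neither appears in the article.
sub sum-by-reduction(Int $n --> Int) { [+] ^$n }
sub sum-closed-form(Int $n --> Int)  { ($n * ($n - 1)) div 2 }

# The two must agree for every n; die loudly if they ever differ.
for 0, 1, 2, 10, 1000 -> $n {
    die "mismatch at $n" unless sum-by-reduction($n) == sum-closed-form($n);
}

say sum-closed-form(10);   # 0+1+...+9 = 45
```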

Now let’s go on to the function that can’t be written in C, “that takes a number n, and returns a function that adds n to its argument”. The straightforward way of doing that in Perl 6 is:

sub addn($n is copy)
 {
  my $retsub = sub ($x) { $x + $n }
  return $retsub;
 }

The body of the inner sub is a closure which remembers the actual $n that was around when the assignment statement is executed. Each time addn is called, it will create and return a different sub. You can write this in Perl 5 as well.

The parameter $n is declared is copy because the default parameter passing would be a read-only alias. If the caller changes the original variable, it would affect the closure, and that is not what was intended.
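To make the “different sub each time” point concrete, here is a small usage sketch. The variable names $add3 and $add10 are mine.

```raku
sub addn($n is copy) {
    return sub ($x) { $x + $n };
}

# Each call to addn builds a new sub with its own captured $n.
my $add3  = addn(3);
my $add10 = addn(10);

say $add3(4);     # 7
say $add10(4);    # 14
say $add3(100);   # 103: $add3 still remembers its own $n
```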

To be more terse, just write:

sub addn($n is copy)
 {
  sub { $^x + $n }
 }

Here I did use the implicit parameter notation $^x rather than just using $_, because the former will generate a signature for me that takes one parameter named $x, and thus I will get error checking when calling it. Using $_ would allow anything to be passed without complaint, as with Perl 5.
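The generated signature can be inspected, and the argument check seen in action, with a sketch like the following. The names $add5 and $outcome are illustrative, and try is used only to keep the failed call from ending the program.

```raku
sub addn($n is copy) { sub { $^x + $n } }

my $add5 = addn(5);
say $add5.arity;        # 1: the placeholder produced a one-parameter signature
say $add5(2);           # 7

# Passing two arguments is rejected when the call is made.
my $outcome = try $add5(1, 2);
say $outcome.defined;   # False: the call failed instead of returning a number
```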

Again, this is shorter than the Lisp code even though it says exactly the same thing, because so much can go unsaid. Matching up the items, defun is sub, parameter list and function name are the same, the lambda is the second sub, and the inner body is the same. But I don’t need the #' or the declaration of the inner parameter list, and there is one less pair of delimiters beyond that.

Closing Thoughts

The lexical closure is a powerful language feature that is sorely missing from many contemporary mainstream programming languages. People who don’t know about it don’t miss it, but the rest of us do. Closures have been in Perl since Perl 5, which was released more than a decade ago. In Perl 6, they are more uniform and consistent, and are promoted as a concept to a core part of the language philosophy. Every block is a closure. Control structures such as the for loop are not ad-hoc features, but more like functions that operate on a block. The syntax for blocks, such as you saw for block parameters, can be used naturally and the for loop does not require its own special syntax.
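As a sketch of “every block is a closure”, a bare block with no sub keyword at all can capture a lexical variable and keep it alive. The make-counter helper and the variable names are hypothetical, introduced only for this illustration.

```raku
# make-counter is a hypothetical helper, not from the article.
sub make-counter() {
    my $count = 0;
    return { ++$count };      # a bare block, closing over $count
}

my $first  = make-counter();
my $second = make-counter();

$first();
$first();
say $first();    # 3
say $second();   # 1: each call to make-counter captured a fresh $count
```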

Interestingly, Perl 6 programs are shorter than Lisp programs even when written the same way. An analysis of these examples shows several reasons why:

The Lisp notation requires delimiters for every function call. Infix notation does not use delimiters, except parens to change the precedence. In Perl, you don’t even need parens for passing arguments in function notation, either.

Perl can leave off delimiters (curly braces in this case) around the contents of a loop or other control structure, when using the statement-modifier form.

Perl 6 can leave off the declared parameter list, and its associated delimiters, from a block. Either use the default name, or the placeholder syntax. The placeholder syntax is tamer than languages that don’t have you declare things at all, because you still are explicit about marking the name as being a parameter. You just don’t have to collect them all manually and list them at the top.

More subtly, Perl 6 is more context sensitive. Why does the Perl translation not have an equivalent for the #' Lisp construct? Because Perl knows to make a closure and save the block for later execution, while Lisp can’t tell the difference between a block and a function call. Having the list in the scope of the lambda won’t make it think to do something different. Perl has no problem with the meaning being what you generally want in context.