Failure

Failure vs undef

If you know Perl 5 or earlier, you are familiar with the related but different concepts of undef and false. A function may return a false value such as numeric 0 or an empty string, and in addition a function may return undef. Since undef also tests as false, you can write code like:

foo or die;

if (!foo) { die }

$x ||= $default;

But sometimes you specifically want to test for undef only, not for other false values.

die unless defined foo;

This is common enough that Perl 6 (and Perl 5.10) has the // operator, which works like || or “or” except that it tests for undef rather than any kind of false value. Perl 6 also defines “orelse” and “andthen” operators that test for undef.

foo orelse die;

$x //= $default;  # doesn't clobber zero.
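
The difference between the two tests can also be sketched outside Perl. The following is a minimal illustration in Python (chosen only because the Perl 6 syntax on this page is still speculative), with None standing in for undef. The helper names or_default and defined_or_default are made up for this example.

```python
def or_default(value, default):
    """Like Perl's ||: replaces any false value, including 0 and ''."""
    return value or default


def defined_or_default(value, default):
    """Like Perl's //: replaces only the undefined value (None here)."""
    return default if value is None else value


print(or_default(0, 42))             # prints 42: zero is false, so it is clobbered
print(defined_or_default(0, 42))     # prints 0: zero is defined, so it survives
print(defined_or_default(None, 42))  # prints 42: only None triggers the default
```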

In Perl 6, we have to take this a step further. A function can “fail”, and failure can either throw an exception or return a special Failure object (which is an unthrown exception), depending on the caller’s preference.

In either case, the same object is presented back to the caller. The choice is whether it will be thrown to the CATCH block, or stapled onto an undef and returned from the call normally. (See more on but in A Romp Through Infinity.)

[Figure: nested sets, from outermost to innermost: false, undef, Failure, unhandled Failure]

As drawn here, all Failure objects also test as undef, just as all undef values test as false. Just as a regular boolean test can’t distinguish the number 0 from undef, the definedness test can’t distinguish a plain undef from a Failure.

sub f1 ($n)
 {
  given ($n) {
     when 1 { return false }
     when 2 { return 0 }
     when 3 { return }
     when 4 { fail }
     when 0 { return "hello world" }
     }
 }

no fatal;
for ^5 -> $n {
   my $result = f1($n);
   my @tests;
   # Test the value that came back from f1, not the loop counter.
   @tests.push("false") unless $result;
   @tests.push("undef") unless defined $result;
   @tests.push("failure") if $result ~~ Failure;
   say $n, ": ", @tests.join(", ");
   }

This will report:

0: 
1: false
2: false
3: false, undef
4: false, undef, failure

The caller gets to specify whether a fail statement returns a Failure object or throws it. The latter provides different code paths so you don’t need to explicitly tell them apart:

use fatal;
try {
   f1($n);
   say "Function returned.";
   CATCH {
      say "Function failed.";
      }
   }

If you end up writing code where every individual call is wrapped in its own try block, you will find it annoying and lament the inefficiency of using exceptions for “common” cases. This happens with reusable code because there can be substantial differences in how it gets used and in what, in context, counts as “exceptional”. So Perl 6 lets the caller decide, and you can specify that failures are returned rather than thrown. But that means the caller then needs to distinguish a failure result from a non-failure result.

Sometimes, in this situation, checking for undef is good enough. The caller simply wants to distinguish getting a valid value from not. But, sometimes undef is used to extend the domain in non-error ways. That is, undef might be a perfectly good value for the result. So in general you want to correctly distinguish failure from everything else.

In the above code, you can see that the smart match operator lets you do that. But, really, it’s the false vs undef issue all over again! Programmers wished for //. But here we find that most of the Perl 5 uses for it won’t apply, because functions call fail rather than returning undef to report non-applicable situations. So, I propose that there be a /// operator that specifically checks for failure, and that infix:<orelse> and infix:<andthen> notice Failure only rather than any undef.

In fact, you never need to say foo orelse die because foo could be told to throw in the first place.

Failure as a return code

Sometimes, the best way to indicate a problem is with a special return value, not by throwing an exception or using any kind of alternate status channel. Consider this example, using floating-point arithmetic:

sub f1 (float @a, float @b, float @c, float @d) returns float @
 {
  use float intermediate => float;
  my float @result = @a »*« @b »+« @c »*« @d;
  return @result;
  }

The use float pragma configures it so the intermediate results in the expression are also held as float precision, as opposed to double precision. This is to enable the use of vector or SIMD instructions on what follows; e.g. the XMM instructions ADDPS and MULPS. The alternative would be to keep all intermediate results in double or extended precision and only convert back to single precision at the end of the expression.

For those not used to the hyper operators, the logic is the same as:

   my $n = @a.elems;
   fail unless $n == @b.elems == @c.elems == @d.elems;
   for ^$n {
      @result[$_] = @a[$_] * @b[$_] + @c[$_] * @d[$_];
      }

So, what happens if, at index 129 out of a thousand iterations, the first multiplication overflows? @a[129] * @b[129] will produce a value of +∞. Meanwhile, the @c[129] * @d[129] term is also computed, and, without any conditional code inserted for error checking, the final addition produces either +∞ or, in the case where the right-hand term turned out to be -∞, the “indefinite value” NaN. The error did not cause any different code path to be taken, either in this expression or in the overall loop. @result[129] will contain +∞ or NaN and that’s that.

The reason that a lengthy expression does not have to do error checking after each step is that arithmetic operations return NaN if given NaN as input. So if one of several terms is NaN, the whole sum becomes NaN and you don’t have to interrupt the efficient vector processing or floating-point pipeline. The error (NaN) propagates through the normal parameter passing and return flow.

So, at some point, the caller of f1 will use the various result elements for something. Does it contain code to check for and notice a NaN value? If so, then it can treat them specially. If not, then any expression involving that value will itself produce a NaN, and the code is not interrupted. The issue is simply further propagated.
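
This propagation is easy to observe directly. The following is a small sketch in Python rather than Perl 6, using ordinary double-precision floats instead of the single-precision vectors discussed above. The function name combine and the sample data are made up for the illustration.

```python
import math


def combine(a, b, c, d):
    """Element-wise a*b + c*d with no error checks inside the loop."""
    return [w * x + y * z for w, x, y, z in zip(a, b, c, d)]


big = 1e308
a = [1.0, big, big]
b = [2.0, 10.0, 10.0]   # big * 10.0 overflows to +inf
c = [3.0, 1.0, -big]
d = [4.0, 1.0, 10.0]    # -big * 10.0 overflows to -inf

result = combine(a, b, c, d)
print(result)           # prints [14.0, inf, nan]

# A caller that ignores the problem just propagates it further.
total = sum(result)     # nan

# A caller that cares can check afterwards, outside the hot loop.
bad = [i for i, v in enumerate(result) if not math.isfinite(v)]
print(bad)              # prints [1, 2]
```

No branch was taken inside combine. The overflow shows up only as data in the result.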

Using Failure as a return value generalizes this idea to work with any type. Obviously, with any kind of arithmetic type and operations you could do it in just the same way. But what about something that is not doing simple arithmetic, but dealing with complicated objects?

Consider an example of reading an XML file, accessing some information from it, and then summarizing those results. Something like this:

my %summary;
for @filenames {
   my XMLDoc $doc = read_document($_);
   my $name = $doc.getname;
   my $ordernum = $doc.get-order-number;
   push(%summary{$name}, $ordernum);
   }

In this case, you could put the whole body of the loop inside a try, and any problem would simply skip the rest of the steps in the loop. But what happens under no fatal, when errors return Failure objects rather than throwing them?

Suppose the read_document call fails, because the XML is malformed, the disk is bad, or for whatever other reason. It will return a properly typed but undefined XMLDoc that has the Failure (the unthrown exception, with the details of what the problem is) attached as a property, and that has all methods of the object overridden to fail with that same error if called.

So, you could check the result in $doc to see if it is undef or, more specifically, a Failure object, and take some action. But this code does not. So, when $doc.getname is called, that method immediately fails with the same error. The variable $name now contains the Failure object, too. Likewise for $ordernum. And finally, the push gets Failure for its arguments and itself fails in turn, but you ignore any return value.

If the problem wasn’t with $doc, and getname worked correctly, but get-order-number failed, then the push would append a Failure object to the list associated with that name, which propagates it further downstream, and that seems like a very useful thing to do.
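
The “every method fails with the same error” behaviour can be imitated in other languages. The following is a rough sketch in Python, not the Perl 6 mechanism itself. The Failure and XMLDoc classes, the documents dictionary, and this version of read_document are all hypothetical stand-ins written for the illustration.

```python
class Failure:
    """An unthrown error that answers every method call with itself."""

    def __init__(self, reason):
        self.reason = reason

    def __bool__(self):
        return False  # a Failure tests as false

    def __getattr__(self, name):
        # Reached only for attributes not found normally, that is, for any
        # method the real document object would have had.
        return lambda *args, **kwargs: self


class XMLDoc:
    def __init__(self, name, order_number):
        self._name = name
        self._order_number = order_number

    def getname(self):
        return self._name

    def get_order_number(self):
        return self._order_number


def read_document(filename, documents):
    """Hypothetical reader that returns a Failure instead of raising."""
    if filename not in documents:
        return Failure(f"cannot read {filename}")
    return documents[filename]


documents = {"a.xml": XMLDoc("alice", 17)}

doc = read_document("missing.xml", documents)
name = doc.getname()               # no exception: the same Failure comes back
ordernum = doc.get_order_number()  # likewise
print(name is doc, ordernum is doc, name.reason)
```

Nothing interrupts the three statements at the end. The original reason is still available from whichever variable the caller eventually inspects.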

This is similar to exception handling, but instead of being interrupted to propagate straight up out of the containing block, the error propagates through the regular flow of code. What’s the difference? For the floating-point example, it was important not to have an alternate flow in order to take advantage of vector hardware. In the @filenames example, a try block would not be so awkward. But, consider the case of performing operations in parallel on multiple threads. If the error handling had to be back in the caller’s main thread, not in the individual worker threads, then it would be awkward indeed. With the return mode of error propagation, the Failure will be collected along with the normal results, and no separate mechanism is needed to marshal that back or figure out how to reconcile the error with the specific loop iteration.

??? how to specify asynchronous / parallel looping?

Suppose we had a parallel-map function that did the same thing as map, except that instead of iterating the loop itself, it used the block to form an iterator and used that to produce a lazy list.

@lazy_list := parallel-map { get_info($_) } @filenames;

So the logic of the algorithm is the same, but the program specifies that the bulk of the work be done lazily when that specific result is needed, or on background worker threads. We don’t want an exception with one iteration to interfere with the others, and we want to deal with the errors later, not during the background thread execution.

It is certainly possible to include a CATCH inside the braces, so it becomes part of the worker thread too. The stuff in that CATCH can refer to local variables back in this main routine, thanks to closures. But it may be very inefficient to do that, or the error handling may be interactive and we want to prompt the user or summarize the errors in the main routine. Jumping out of a regular loop via an exception would stop the other iterations, and throwing exceptions out of worker threads is a thorny issue at best.

In short, sometimes having multiple execution paths is a problem, so indicating failure by return is preferable to throwing exceptions.
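
As a concrete illustration of collecting failures alongside normal results, here is a sketch in Python using a thread pool. It is not the hypothetical parallel-map above. The Failure class, the returning_failures wrapper, and this get_info are assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class Failure:
    """A bare-bones unthrown error: it just carries the exception."""

    def __init__(self, error):
        self.error = error


def returning_failures(func):
    """Wrap func so that errors come back as values instead of being raised."""
    def wrapper(item):
        try:
            return func(item)
        except Exception as error:
            return Failure(error)
    return wrapper


def get_info(filename):
    """Hypothetical worker: rejects anything that is not an .xml file."""
    if not filename.endswith(".xml"):
        raise ValueError(f"not an XML file: {filename}")
    return filename.upper()


filenames = ["a.xml", "notes.txt", "b.xml"]

with ThreadPoolExecutor(max_workers=3) as pool:
    # pool.map returns results in input order, so each Failure stays
    # lined up with the filename that caused it.
    results = list(pool.map(returning_failures(get_info), filenames))

# Error handling happens here, in the main routine, after the workers finish.
for filename, result in zip(filenames, results):
    if isinstance(result, Failure):
        print(f"{filename}: failed ({result.error})")
    else:
        print(f"{filename}: {result}")
```

One bad file does not disturb the other iterations, and no separate channel is needed to carry the error back from the worker thread.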

Immediate Checking for Errors

Perhaps you’ve had the misfortune to write code that essentially puts every statement in its own try block.

try {
   statement1;
   CATCH {
      when LibError {
         given .errorcode {
            when 1 { ... }
            when 2 { ... }
            when 3 { ... }
            }
         }
      }
   }
try {
   statement2;
   CATCH {
      when LibError {
         given .errorcode {
            when 3 { ... }
            when 4 { ... }
            when 7 { ... }
            }
         }
      }
   }
try {
   statement3;
   CATCH {
      when LibError {
         given .errorcode {
            when 8 { ... }
            when 20 { ... }
            when 21 { ... }
            }
         }
      }
   }

At least, in Perl 6, assuming the library writer used fail rather than explicit calls to die, you have a choice as to whether it will throw or return.

my $err = statement1;
if ($err ~~ LibError) {
   given $err.errorcode {
      when 1 { ... }
      when 2 { ... }
      when 3 { ... }
      }
   }
$err = statement2;
if ($err ~~ LibError) {
   given $err.errorcode {
      when 3 { ... }
      when 4 { ... }
      when 7 { ... }
      }
   }
$err = statement3;
if ($err ~~ LibError) {
   given $err.errorcode {
      when 8 { ... }
      when 20 { ... }
      when 21 { ... }
      }
   }

At least this simplifies the syntax a little, and prevents the more expensive “stack unwinding” operation for common cases.

The real problem is that users of a library won’t agree on what is “exceptional”. It is impossible to decide in advance, when the reusable code is written, which problems are anomalies that definitely deserve exceptions and which are errors that the caller may halfway expect and be willing to deal with. Code will be reused in new situations, even extended. So, in Perl 6 the concept of failure is separated from the transport mechanism by which the caller wants to work with failure cases.

But what if some errors are considered, by the caller, to be exceptions, and others need to be dealt with immediately? You can combine the two approaches and decide at the caller’s site which are which. You might also be able to group multiple calls together more like you do inside try blocks, rather than having to write conditional code after each one:

{
   statement1;
   statement2;
   statement3;
   CATCH {
      when LibError { 
         given .errorcode() {
            when 1 { ... }
            when 2 { ... }
            when 3 { ... }
            when 4 { ... }
            when 7 { ... }
            when 8 { ... }
            when 20 { ... }
            when 21 { ... }
            }
         .fail-as-return();
         }
      }
   }

That is, if any of the three statements throws a LibError, some corrective action is taken and then, unlike a normal throw/catch situation, it continues with the next statement rather than jumping out of the block and skipping the rest. This takes advantage of the situation that the handling code is common to all three calls. But even without that benefit, this shows that we want the LibError to return a failure code, and all others to throw.
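
The intended control flow can be approximated in a language without resumable handlers by running the statements through a helper. The following Python sketch is only an analogy for the proposed behaviour. LibError, run_all, and the three statement functions are hypothetical names invented for the example.

```python
class LibError(Exception):
    """Stand-in for the library's error type, carrying an error code."""

    def __init__(self, errorcode):
        super().__init__(f"library error {errorcode}")
        self.errorcode = errorcode


def run_all(statements, handle):
    """Run each statement in turn with one shared handler.

    A LibError is handled and then returned as that statement's value, and
    execution continues with the next statement. Any other exception is
    not caught here and propagates to the caller as usual.
    """
    results = []
    for statement in statements:
        try:
            results.append(statement())
        except LibError as error:
            handle(error)
            results.append(error)
    return results


def statement1():
    return "ok"


def statement2():
    raise LibError(4)


def statement3():
    return "also ok"


handled = []
results = run_all([statement1, statement2, statement3],
                  lambda error: handled.append(error.errorcode))
print(handled)  # prints [4]
```

Here statement3 still runs after statement2 fails, which is the difference from wrapping all three in a single ordinary try block.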

How this mechanism lets you choose which things you want returned and which you want thrown is (will be) explained on the page on exceptions.

Footnotes

The use float pragma

Although the need has been identified to make the detailed semantics of floating-point arithmetic not only well specified but configurable, no proposal has been produced yet. My point is to ensure that the semantics are suitable for using the vector hardware, not to illustrate the exact command to do so. I expect it to look something like this.