A Romp Through Infinity

In Perl 6, Infinite is a distinct type and Inf is a constant of that type.

my $a = Inf;
my $b = 5;
my $c = 3.34567;

say "\$a is of type $a.WHAT().";  # Tells us Infinite.
say "\$b is of type $b.WHAT().";  # Tells us Int.
say "\$c is of type $c.WHAT().";  # Tells us Num.

You can compare Infinite against other types, and that is not too surprising. After all, you can code $b > $c even though $b and $c are of different types, and asking whether $a > $b is no more difficult. Presumably the > operator works between different types of arguments, and indeed it does.

Specifically, the single built-in > operator takes arguments of any type and simply defers to the general <=> comparison operator. Like strcmp in C, <=> indicates the ordering relationship between its two arguments, but instead of a bare -1, 0, or +1 it returns the enumeration values Increase, Same, or Decrease respectively. Increase means the left argument sorts before the right one, and Decrease means it sorts after.

proto sub infix:«>» ($left, $right, *%adverbs)
 returns Bool
 is inline
 {
  # <=> yields Order::Decrease (+1) when $left sorts after $right
  return infix:«<=>»($left, $right, |%adverbs) eqv Order::Decrease;
 }

Let’s take a look at that a piece at a time.

The keyword sub defines a subroutine, the same as in earlier versions of Perl. The keyword proto means that multiple subs of the same name will be allowed; if a call passes named arguments, or passes them in a different order, this declaration is also the one used to decide how to map the arguments to parameters before figuring out which function to call based on their types. In this case, the proto just means that additional overloaded definitions will be allowed. For your own class, you might want to write a > operator directly rather than only writing a <=> operator to do everything. So, if you were to declare a sub infix:«>» (Dog $left, Dog $right, authority=>KennelClub) {...} it would work and be taken in preference to the general form if you passed in two Dogs.

The name of the subroutine is infix:«>», which is the syntax for naming operators. This is analogous to operator> in C++. The word or symbols forming the operator are wrapped in delimiters, and normally we use <> for that, but the symbol we are defining conflicts with the closing delimiter, so «» is used instead. The syntax is actually the same as for any hash access: <whatever> is just shorthand for {'whatever'}, and <xxx> and «xxx» generally mean the same thing if xxx does not contain any variable names.

Now look at the declared arguments: ($left, $right, *%adverbs). The first two are obvious enough given knowledge of other programming languages: they declare $left and $right to be the first two positional parameters, which will be named $left and $right within the scope of the function. In Perl, the $ sigil is for scalar (item) variables. No other type information is stated, so these two variables may hold anything, hence you can pass any two values to this function.

The name %adverbs by itself would mean a hash value is to be passed in that position, as % is the sigil for hashes (associative arrays or dictionaries). But we are not passing a third parameter, and there is that extra * before it.

The leading * means “slurpy” and the syntax means that the variable %adverbs will collect any additional named parameters that were passed but not declared in the formal parameter list. So, if you called infix:«>»($x, 5, authority=>'AKC') or infix:«>»($x, 5, :authority<AKC>) then the body of the function would see $left bound as a read-only alias to the caller’s $x, $right containing 5, and %adverbs containing a single element whose key is "authority" and value is "AKC". Calling infix:«>»(4,5,6) would be a compile-time error because it won’t accept a third positional parameter. Only extra named arguments are collected in %adverbs.

But what good are extra arguments to an infix operator, anyway? You see that the function can be called using the function-call notation and the operator’s funny name, but that’s not how infix operators are generally used. Well, it turns out that you can indeed pass extra named arguments to operators that are invoked with the ordinary operator notation. To do so, you use the colon form and put them after the main part of the expression.

if infix:«>»($fido, $scooter, :authority<AKC>)
   { say "$fido is ranked higher than $scooter" }
#is the same as
if $fido > $scooter :authority<AKC>
   { say "$fido is ranked higher than $scooter" }

The presence of the colon form of a literal pair in an expression like that is taken as an adverb for the most recent operator that it follows, at the same bracketing level. The adverb modifies the meaning of the verb, so this is used to provide behavior options to operators. You’ll most often find them used in string comparisons, such as if $x ge $y :lc to do a case-insensitive comparison. In Perl, strings are compared with gt rather than >, but still as an infix operator.

In any case, the > operator (and other relational operators) doesn’t care about the types of the normal arguments, and doesn’t care about any adverbs. It just passes everything along to the <=> operator.

The next line in the listing is returns Bool. The meaning should be clear. But note that there are many other ways this could have been written: using the keyword of instead of returns, putting the return type(s) in the argument list inside the parentheses by using the --> symbol, or putting the return type before the function name as in C.

The line is inline is also self-explanatory. We are asking the compiler to inline this function, essentially rewriting the call to > with the call to <=>, and optimizing across the expanded code. This is just a hint to the compiler though, and does not change the semantics.

Now, the body of the function. It calls infix:«<=>»($left, $right, |%adverbs), passing all the arguments to the other function. But notice the syntax for the third argument. Writing foo($left, $right, %adverbs) would mean pass the whole %adverbs object (a dictionary, or hash as it is called in Perl) as a third positional parameter. That is not what we want; rather, we must take all the entries out of %adverbs and turn them into named arguments in the new call. The | prefix does exactly that. Parameter lists can be manipulated as first-class objects, created using introspective code and not just written literally. The prefix | operator interpolates something into a Capture object, which is what is used to pass parameters to a function. In older versions of Perl, an ordinary List was used and everything had to be flattened into it. Now, Capture is a distinct type and there is no confusion about passing arrays as items. (Try that in C# !)
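For readers who know Python, the slurpy hash and the | prefix behave much like **kwargs on the two sides of a call. The following is only a rough analogy, not Perl 6: compare and greater_than are hypothetical names, compare returns -1, 0, or +1 in the strcmp manner, and the lc adverb is borrowed from the string comparison example above.

```python
def compare(left, right, *, lc=False):
    """Hypothetical stand-in for <=>: returns -1, 0, or +1 like strcmp.

    'lc' plays the role of an adverb and asks for a case-insensitive
    comparison of strings.
    """
    if lc:
        left, right = left.lower(), right.lower()
    return (left > right) - (left < right)


def greater_than(left, right, **adverbs):
    """Hypothetical stand-in for the generic > operator.

    '**adverbs' in the signature collects any extra named arguments, as
    the slurpy hash does; '**adverbs' in the call turns the entries back
    into named arguments, as the prefix | does.
    """
    return compare(left, right, **adverbs) > 0


print(greater_than("B", "a"))           # compared by code point
print(greater_than("B", "a", lc=True))  # compared case-insensitively
```

As in the text, a third positional argument is rejected: greater_than(4, 5, 6) raises a TypeError, because only extra named arguments are collected.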

The result of the call to <=> is compared against the enumeration constant Order::Decrease, the value <=> returns when the left argument is the greater one. As explained in detail in this article, the infix eqv is for value equality comparison, just like operator== in C++. We want to be careful of our use of relational operators inside the generic implementation of the relational operators! We must be sure that the implementation of eqv with an argument of type Order is not so generic that we wind up in an infinite recursion calling the <=> operator! The enumeration types inherit an implementation of eqv from role Enumeration, and that calls cmp, which works just like <=> in that it returns an Order object that must in turn be compared. So Order must at least implement eqv directly, returning a Bool, rather than leaving it to call the generic form. Generic code can’t pass the buck forever; eventually there must be a case that is actually implemented.

It’s interesting that the implementation of > is untyped, but the implementation of <=> is typed. That means, essentially, that this is “generic” code, even though no special syntax or declarations were used.

There are several functions named infix:«<=>», in what would be called “overloading” in C++. But, the decision of which function to call is made at run time, not compile time, just like with the receiver object of a virtual function call. Only it works for any and all of the arguments, not just the first. This is known as MMD, or Multi-Method Dispatch, in Perl 6.

So, the following functions (among others) are present:

multi sub infix:«<=>» (Int $left, Int $right --> Order) { ... }
multi sub infix:«<=>» (Num $left, Num $right --> Order) { ... }
multi sub infix:«<=>» (Inf $left, Inf $right --> Order) { ... }

multi sub infix:«<=>» (Infinite $left, £ ComparableNumeric --> Order) { ... }
multi sub infix:«<=>» (£ ComparableNumeric, Infinite $right --> Order) { ... }
multi sub infix:«<=>» (Infinite $left, Infinite $right --> Order) { ... }

Actually, many of them are probably written as methods in the class so that their implementation can access private data. But that doesn’t matter here, since the “self” argument is not so special as in other languages. Run-time dispatch takes place on the first argument even if it is not a “self”, and can take place on any of the arguments, not just the first. Regardless of how it was defined, the list of candidates is normalized into a list of functions as shown above, before finding the best match.

If you code $b > $c, with the variables from the first listing, this ends up asking whether $b <=> $c eqv Order::Decrease, which needs to call infix:«<=>»($b,$c). So the MMD mechanism looks for a match for parameters of (Int,Num) and decides to promote the Int to a Num implicitly and call the second one. Note that you can disable the implicit conversion if you prefer, or make it show a warning. Anyway, this is like overloading, only the types are determined at run time, using the actual types of the values. That’s why the > operator could be generic without needing to expand out templates for every possible case.

Now look at the case of $a > $b. This calls infix:«<=>»($a,$b), which will try to dispatch on arguments of type (Infinite,Int). Since Int does the ComparableNumeric role, this will dispatch to the 4th form as an exact match with “like” relationships. (Without the £ symbol it would match only on stricter “isa” relationships, as explained in another article.) It does not need to convert the Inf to an Int first. It works even if you compare Inf against something that can’t hold infinity itself. The implementation is something like:

multi sub infix:«<=>» (Infinite $left, £ ComparableNumeric) 
returns Order
is inline
 {
   return $left.positive ?? Order::Decrease !! Order::Increase;
 }

We don’t bother naming the right hand argument because we don’t care what it is. Inf is simply greater than anything, or -Inf is less than anything, no matter what. There is a more specialized form for comparing two Infinite values, which not only removes ambiguity but assures us that the other value, although anything, is actually anything but Infinite.
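The dispatch just described can be imitated in Python with a small candidate table. This is a sketch under loud assumptions, not how any Perl 6 implementation works: multi, compare, and the Infinite class are hypothetical, object stands in for the £ ComparableNumeric constraint, the "best match" is simply the smallest total inheritance distance, and the implicit Int-to-Num promotion is left out.

```python
from enum import IntEnum


class Order(IntEnum):
    Increase = -1   # left sorts before right
    Same = 0
    Decrease = 1    # left sorts after right


class Infinite:
    """Hypothetical stand-in for the Infinite type."""
    def __init__(self, positive=True):
        self.positive = positive


INF, NEG_INF = Infinite(True), Infinite(False)

_candidates = []    # (left type, right type, function)


def multi(left_type, right_type):
    """Register a function as one candidate of compare()."""
    def register(fn):
        _candidates.append((left_type, right_type, fn))
        return fn
    return register


def _distance(value, wanted):
    """Inheritance steps from the value's class up to 'wanted', or None."""
    mro = type(value).__mro__
    return mro.index(wanted) if wanted in mro else None


def compare(left, right):
    """Choose the closest candidate using the run-time types of BOTH arguments."""
    best = None
    for left_type, right_type, fn in _candidates:
        dl, dr = _distance(left, left_type), _distance(right, right_type)
        if dl is None or dr is None:
            continue
        if best is None or dl + dr < best[0]:
            best = (dl + dr, fn)
    if best is None:
        raise TypeError("no matching candidate")
    return best[1](left, right)


@multi(int, int)
def compare_ints(left, right):
    return Order((left > right) - (left < right))


@multi(Infinite, object)
def compare_inf_left(left, right):
    return Order.Decrease if left.positive else Order.Increase


@multi(object, Infinite)
def compare_inf_right(left, right):
    return Order.Increase if right.positive else Order.Decrease


@multi(Infinite, Infinite)
def compare_infs(left, right):
    if left.positive == right.positive:
        return Order.Same
    return Order.Decrease if left.positive else Order.Increase
```

With this table, compare(INF, 5) never converts either argument: the (Infinite, object) candidate matches directly. For compare(INF, NEG_INF) both mixed candidates match equally well, and the more specific (Infinite, Infinite) candidate wins, which is the ambiguity-removing role the text gives to the specialized form.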

The ?? !! syntax works like ? : in earlier versions of Perl, which in turn is the same as in C. The symbols have been doubled to free the single characters for other uses: a prefix ? converts to Bool and establishes Bool context, and the : is all over the Perl 6 grammar, so doubling it to :: just would not do.

If this worked like the first mixed-type case, it would mean converting Inf to an Int and then calling the (Int,Int) form. As it turns out, the Int type can indeed hold an infinite value. But it is not required that all numeric types be able to represent infinity. And, the same concept is used for non-numeric types. For example,

my @words = read_word_file;
my $min = Inf;
for @words {
   $min = $_  if $_ lt $min :lc;
   }
say "The first word is $min."

Even though the Str type cannot be infinite, the string comparison operator lt is written with MMD forms as above, so any string compares less than Inf (or greater than -Inf). That means that Inf is a useful “bigger than any possible value” initializer and the loop does not need a special case. (Oh, and remember what the :lc is there for? It’s an adverb to indicate case-insensitive comparisons and modifies the meaning of the lt operator.)
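The same "bigger than anything" initializer can be approximated in Python, where a comparison the string type does not understand is handed to the other operand. This is an analogy only: the Infinite class, fold, and first_word are hypothetical names, and fold plays the part of the :lc adverb.

```python
class Infinite:
    """Hypothetical sentinel that orders above (or below) every other value."""

    def __init__(self, positive=True):
        self.positive = positive

    def __gt__(self, other):
        if isinstance(other, Infinite):
            return self.positive and not other.positive
        return self.positive

    def __lt__(self, other):
        if isinstance(other, Infinite):
            return other.positive and not self.positive
        return not self.positive


INF, NEG_INF = Infinite(True), Infinite(False)


def fold(value):
    """Case-fold strings only; the sentinel passes through untouched."""
    return value.lower() if isinstance(value, str) else value


def first_word(words):
    smallest = INF                      # no special case for the first word
    for word in words:
        # str does not know how to compare itself with Infinite, so Python
        # falls back to Infinite.__gt__ with the operands swapped.
        if fold(word) < fold(smallest):
            smallest = word
    return smallest


print(first_word(["pear", "Apple", "banana"]))   # prints Apple
```

If the list is empty, first_word returns the sentinel itself, so the caller can still tell that nothing was found.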

This is a good time for a break

In the first part, we looked at how infinity works with untyped variables. When writing $a = Inf; the container $a simply points to the Inf constant, which is of type Infinite. If you later write $a = 1; then $a will point to an object of type Int. No conversions or coercions take place because $a can hold any type of object.

my Int $ivar = Inf;
my Num $nvar = Inf;
my Str $svar = Inf;  # error

Here we see that Inf can be assigned to an Int variable and can also be assigned to a Num variable. Both of those types are capable of representing ∞. All types that can represent infinity can share the same Inf constant because each type that wants to use it defines an implicit conversion from the Infinite type to itself.

class Int {
   ...
   multi sub conversion:<Int> (Infinite $x)
   is implicit
   is export
    {
     my Int $retvalue;
     #Sets internal bits of the object to represent infinity
     ...
     return $retvalue;
    }
}

The implicit trait means that the implementation may use this function automatically without requiring an explicit cast. The export trait means that this function, which is defined in the Int class and thus the Int package (namespace), will be exported into the package that’s using the Int class. That is, it will be visible in your scope when you need it.

So, the compiler knows how to automatically convert Infinite to Int, and Inf is a constant of type Infinite. Likewise with Num, and any other types that want to do this.

The third line is an error because Str lacks such a conversion. There is no defined mechanism to convert Infinite to Str, implicit or otherwise, because Str simply doesn’t have the concept of infinity in its domain.

my Str @words = read_word_file;
my Str $min = Inf;  # breaks!
for @words {
   $min = $_  if $_ lt $min :lc;
   }
say "The first word is $min."

Yet it is still possible to compare a Str value against Inf, because the comparison functions are overloaded. That is why the words example works. The variable $min can hold an Infinite value one moment and a Str value the next, because it is untyped. So do we give that up if we add typing?

In Perl 6, typing is not all-or-nothing. You might run into such issues when adding type declarations, due to ripple effects. Rather than refactoring or figuring out how to write a full-blown solution using generics, you can simply stop typing when you reach such an issue, or introduce partial type information.

The easiest thing here, other than not adding a type declaration to $min at all, is to simply say that it can hold either type!

my Str|Infinite $min = Inf;

Rather than only Str objects, or anything at all, you specify that $min can hold either Str objects or Infinite objects. Partial typing might not give you the same performance optimization benefits as full strong typing, but it adds error checking and allows you to bridge the problem area better than turning off type checking completely.

And you still might get some performance benefits. In the case of the lt operator, the implementation can figure out at compile time that there are only two candidates, and emit a much simpler run-time check to determine which function to call, rather than having to consider all forms of the function at run time every time.
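To make the idea of partial typing concrete, here is a small Python sketch of a container that accepts values of either of two types and nothing else. It is an illustration only; Slot, assign, and the Infinite class are hypothetical names, and Python itself performs no such check on ordinary variables.

```python
class Infinite:
    """Hypothetical stand-in for the Infinite type."""


INF = Infinite()


class Slot:
    """Hypothetical typed container: holds only instances of the allowed types."""

    def __init__(self, allowed, value):
        self.allowed = tuple(allowed)
        self.assign(value)

    def assign(self, value):
        if not isinstance(value, self.allowed):
            names = " | ".join(t.__name__ for t in self.allowed)
            raise TypeError(f"expected {names}, got {type(value).__name__}")
        self.value = value


smallest = Slot((str, Infinite), INF)   # like: my Str|Infinite $min = Inf;
smallest.assign("apple")                # a Str is fine too
```

Declaring the slot with (str,) alone and initializing it with INF raises a TypeError, just as the plain Str declaration breaks, and assigning 42 to the two-type slot is still caught.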

There is another way you can get a Str to hold an infinite value, while (mostly) keeping its type. You can assign properties to values arbitrarily.

my Str $min = undef but Inf;

Nominally, the value of $min is undef. But, the “but” is stuck onto it like a sort of footnote. Any code that wants to check for this property can sense its presence. To see exactly what it is doing, let’s start with a slightly simpler case:

my Str $min = "-empty-" but Inf;

The but operator will copy the Str object into a new object of a type that has another role mixed in, via multiple inheritance. Let’s examine how to define such a thing manually:

role Helper
  {
   method conversion:<Infinite> ()
    { return $.infvalue; }
   has Infinite $.infvalue is rw;
  }
  
class Str_but_Inf
  is Str
  does Helper
  {
  }

my Str_but_Inf $min = Str_but_Inf.new("-empty-");
$min.infvalue = Inf;

# later...
my Str $s = $min;
my Infinite $test = $s;
if defined $test { say "it worked!" }

This starts with the Str class and adds some other members to make the Str_but_Inf class. It does it by defining that new stuff in Helper rather than putting it directly in the new class, because Helper is reusable now. Don’t worry about the difference between is and does in this case; the contents of both Str and Helper are inherited into the Str_but_Inf class.

Normally you create a Str object by using a string literal. "xxx" produces an object of type Str at compile time. So writing my Str_but_Inf $min = "-empty-"; will not work, as it tries to assign a Str object to a container that wants a Str_but_Inf. The way to create an object of this type is to call the built-in .new class method. Str objects are immutable, so the only way to set the string bits into it is at construction time. Here the Str class’s .new method can take a single positional argument for such a purpose, and the derived class inherits that ability.

Then, once an object of type Str_but_Inf exists, you can set its .infvalue property. This is intended to be set to either Inf or -Inf, or some exotic value that is not covered in this article.

Later, trying to convert the object to type Infinite will succeed, as a method for that is present.

You might wonder why we use a conversion operator rather than just checking the .infvalue property. If you wrote $s.infvalue you would get a compile-time error because $s is of type Str. It is actually holding something that “isa” Str that has more abilities, but with static typing the compiler only knows that $s is of type Str, nothing more. A type conversion, on the other hand, always does a run-time check because it allows you to downcast. So, by using a type conversion rather than a regular method call, we bypass the strong typing and get the extra information easily. It also nicely returns undef if the conversion could not be performed, which is perfect since we are seeing if it might be possible with our call.

There is another reason why it’s done this way, and that can be seen with the construct:

my Int $x = 0 but True;
if $x { say "’tis true." }
else { say "Nope." }

This idiom is expressly designed for Perl 6, because of experience in earlier versions of Perl. Basically, you can override the normal rule that any non-zero value is true, so a function can return any number and simultaneously whichever True/False value it wants. If that doesn’t impress you, don’t worry about it. But look at how $x is used later: it is converted to Bool as part of the if statement. True and False are constants of type Bool, and using but to add a property of that type causes the convert-to-Bool to be overridden. Inf is a constant of type Infinite, so adding that as a property will supply a conversion to that type. It is the same idea: using but to add a value will store the value as a property and also add a conversion operator to that value’s type that will return that value.
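Python has no but operator, but the effect of 0 but True can be imitated by deriving a one-off subclass whose conversion to bool is overridden. A minimal sketch, with but_bool as a hypothetical helper name:

```python
def but_bool(value, truth):
    """Return a copy of 'value' whose truthiness is forced to 'truth'.

    A new subclass of the value's own type is created on the fly, so the
    result still behaves like the original number or string everywhere else.
    """
    truth = bool(truth)
    base = type(value)
    mixed = type(f"{base.__name__}_but_{truth}", (base,),
                 {"__bool__": lambda self: truth})
    return mixed(value)


x = but_bool(0, True)
if x:
    print("'tis true.")
else:
    print("Nope.")
print(x + 5)   # still the number zero for arithmetic
```

Code that only does arithmetic with x sees an ordinary zero; only a conversion to bool, such as the if statement, notices the override.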

The object returned still behaves exactly like the Str of the value "-empty-" and code that is unaware of the new properties and is not polymorphic on the actual type will be none the wiser.

Now that you see how it’s done in detail, look at taking shortcuts.

my $min = "-empty-" does Helper;
$min.infvalue = Inf;
my Str $s = $min;

Using does as an operator like this will create an anonymous class that “is” derived from the type of the left hand argument, doing the same thing we did with Str_but_Inf. Then it clones the left argument into the base subobject of a new object of that anonymous type. So, it copied the Str object into a Str_but_Inf object with the same value, as far as the Str portion of the object is concerned. Since the type of this new object is anonymous, $min is not declared with a type because I can’t say it!

A further refinement would be:

my Str $min = "-empty-" does Helper(Inf);

Which combines the assignment to the .infvalue property with the main part of the does functionality. If you provide an argument like this, and the role being mixed in has exactly one attribute, it will assign the argument to that attribute after cloning the left hand side. Since it’s done internally, I don’t have to worry about the type of this new object.
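What does is described as doing (derive an anonymous class, clone the left argument into it, and optionally fill in the role's single attribute) can be mimicked in Python with the three-argument form of type(). This is a sketch of the mechanism only; does and Helper here are hypothetical Python stand-ins, and math.inf stands in for Inf.

```python
import math


class Helper:
    """Hypothetical role with exactly one attribute."""
    infvalue: float = None


def does(value, role, init=None):
    """Clone 'value' into a new class derived from its type plus 'role'."""
    base = type(value)
    mixed = type(f"{base.__name__}_does_{role.__name__}", (role, base), {})
    clone = mixed(value)                      # copies the base portion
    if init is not None:
        (attribute,) = role.__annotations__   # fails unless exactly one
        setattr(clone, attribute, init)
    return clone


smallest = does("-empty-", Helper, math.inf)
print(smallest, smallest.infvalue)   # the Str part and the mixed-in part
```

The result still passes isinstance(smallest, str) and answers every string method, which mirrors why the refined form can be assigned to a variable declared as Str.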

Finally, we can see how "-empty-" but Inf works. Basically, we want the infix operator but to expand into the proper “does” call.

multi sub infix:<but> ($left, Infinite $right)
 is inline
 {
  my $retval = $left does Helper($right);
  return $retval;
 }

The code in this function knows what role needs to be mixed in, and this code is called when the right-hand argument is of type Infinite. We can also give the function a return type that matches the left argument, by using generics:

multi sub infix:<but> (::T $left, Infinite $right)
 returns T
 is inline
 {
  my T $retval = $left does Helper($right);
  return $retval;
 }

The $left argument is declared with a type ::T. Using the :: sigil declares T as a generic type. Rather than saying that anything passed to $left must be of type T, it is saying that T becomes whatever type is actually bound to $left. Then, T can be used elsewhere, as with the return type.

The code for providing 5 but False is similar. So similar in fact that a single function definition can be written that handles both the Bool and the Infinite cases, and any but in general as described above!


role ValueButHelper [ ::V ]
 {
  method conversion:<V> ()
   { return $.mixedvalue; }
  has V $.mixedvalue is rw;
 }

multi sub infix:<but> (::T $left, ::V $right)
 returns T
 is inline
 {
  my $retval = $left does ValueButHelper[V]($right);
  return $retval;
 }

That is, make the right hand side a generic type as well, and use the same type (whatever that was) in the definition of ValueButHelper by passing it as a type parameter in square brackets.
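A Python rendering of the same idea needs one helper role per value type and a conversion that is checked at run time. The sketch below is an assumption-laden analogy: value_but_helper, but, and convert are hypothetical names, convert returns None where the text's conversion would return undef, and math.inf (a float) stands in for Inf.

```python
import math

_roles = {}   # one helper role per mixed-in value type


def value_but_helper(value_type):
    """Return the role for 'value_type', creating it on first use."""
    if value_type not in _roles:
        _roles[value_type] = type(f"ValueButHelper_{value_type.__name__}",
                                  (), {"mixed_type": value_type})
    return _roles[value_type]


def but(left, right):
    """Clone 'left' into a class that also carries 'right' as a property."""
    role = value_but_helper(type(right))
    base = type(left)
    mixed = type(f"{base.__name__}_but_{type(right).__name__}",
                 (role, base), {})
    result = mixed(left)
    result.mixedvalue = right
    return result


def convert(obj, wanted):
    """Run-time conversion: the object itself, its mixed-in value, or None."""
    if isinstance(obj, wanted):
        return obj
    if isinstance(obj, value_but_helper(wanted)):
        return obj.mixedvalue
    return None


smallest = but("-empty-", math.inf)
print(convert(smallest, float))   # the mixed-in infinity
print(convert("plain", float))    # None: nothing was mixed in
```

Because convert inspects the actual class of the object, it works even when the caller only knows the value as a plain string, which is the point the text makes about using a conversion instead of a method call.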

Now, back to the line

my Str $min = undef but Inf;

The call to the but operator mixes Inf into the undef value, which is not of type Str.

my Dog $spot = undef;
my $x = $spot;
$x.new("fido");  # knows what to do!

undef used as a term produces an object of type Undef, and Undef can be converted to any object type to become an undefined yet typed object. An object can be typed but uninitialized by using protoobjects. That is why you can call class methods on values that are undef, without strong type information known at compile time.

The code that converts Undef to a protoobject is smart enough to copy any added properties that were added to it already. So it gives the same result as writing:

my Str $temp = undef;  # first convert undef to Str protoobject
my Str $min = $temp but Inf; # then mix in Infinite

And since the name of a type is actually a list operator that constructs an object of that type, and with no arguments returns the undefined protoobject of that type, you can also write:

my Str $min = Str but Inf;

Now we are ready to go back to the words example.

my Str @words = read_word_file;
my Str $min = undef but Inf;
for @words {
   $min = $_  if $_ lt $min :lc;
   }
say "The first word is $min."

The initial value of $min is of type Str, so no problem. It is actually an object that is derived from the usual undefined Str protoobject that has the Infinite property mixed in. Generally, nothing will care or even know that Inf has been mixed in.

Only code that checks, by trying to convert the object to Infinite, will be aware of the extra property, and code written to do that must suspect it in the first place.

As it turns out, the lt operator does check. Or rather, lt, gt, le, etc. all call the leg operator, which still manages to pass the buck. These operations are specific to strings, as leg (less-equal-greater) is just:

proto sub infix:<leg> ($left, $right)
 returns Order
 is inline
 {
  return ~$left cmp ~$right;
 }

That is, it converts both arguments to Str and calls cmp, the “native” comparison operator of the object type, on the resulting strings.

All Comparable types must implement the cmp operator. You might suppose that your implementation of cmp must check for the Infinite mix-in. After all, the standard library decrees that any Comparable type that does not support infinite values may have Inf/-Inf but-ed in.

But, your implementation of cmp does not need to handle this at all! It will be taken care of by the MMD mechanism automatically, and not even call your code for this case.

That works because the cmp function has a form that matches the role that was mixed in: ValueButHelper[Infinite]. So it will see that at run time, and the MMD matching considers a role added via the run-time does operator to be a better match than the original class, as if it were matching on the anonymous created class directly.

multi sub infix:<cmp> 
   (Infinite $left, £ Comparable) | 
   (£ Comparable ::T ValueButHelper[Infinite] $left, T)
returns Order
is inline
 {
  my Infinite $leftInf = $left;
  return $leftInf.positive ?? Order::Decrease !! Order::Increase;
 }

multi sub infix:<cmp> 
   (£ Comparable, Infinite $right) | 
   (£ Comparable ::T, T ValueButHelper[Infinite] $right)
returns Order
is inline
 {
  my Infinite $rightInf = $right;
  return $rightInf.negative ?? Order::Decrease !! Order::Increase;
 }

So, even types that don’t have an infinity in their domain can still have it bolted on! As far as the various kinds of relational operators are concerned, all types can hold Inf.

You may notice that each cmp function has two parameter list signatures! This allows you to write the body once and it is as if you wrote two separate functions with the different parameter lists. That is handy when the parameters differ only in type, so the body of the code will be identical.

Here, I made sure the body worked in both cases by defining a local variable to make sure the parameter is converted to Infinite. In the first signature this is redundant and can be optimized away. In the second signature it will access the mixed-in value.

Look at the parameter matching of (£ Comparable ::T, T ValueButHelper[Infinite] $right). The left argument says that the parameter must be something that does the Comparable role, and the £ is necessary because Comparable may have methods with covariant parameter types. Whatever the actual type was, it is remembered as T. Then, the right argument must match T, that is, be the same as the thing on the left. But it also must simultaneously match ValueButHelper[Infinite], so it will match the anonymous class created by but, and is smart enough to make sure the original type matches as well.

The other one, (£ Comparable ::T ValueButHelper[Infinite] $left, T), relies on ::T in the middle of the list remembering the type thus far, not including the constraints named after it.
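Putting the pieces together in Python terms: a comparison that first looks for a mixed-in infinity on either side, and only then falls back to the type's own ordering. This is a hedged sketch rather than the dispatch mechanism itself; InfiniteMixin, but_inf, compare, fold, and first_word are hypothetical names, and an explicit isinstance test replaces the MMD match on the mixed-in role.

```python
import math


class InfiniteMixin:
    """Hypothetical marker role: the instance carries .infvalue (+inf or -inf)."""


def but_inf(value, infvalue=math.inf):
    """Clone 'value' into a subclass of its type that also carries an infinity."""
    base = type(value)
    mixed = type(f"{base.__name__}_but_Inf", (InfiniteMixin, base), {})
    result = mixed(value)
    result.infvalue = infvalue
    return result


def compare(left, right):
    """Return -1, 0, or +1; a mixed-in infinity overrides the native order."""
    l = left.infvalue if isinstance(left, InfiniteMixin) else None
    r = right.infvalue if isinstance(right, InfiniteMixin) else None
    if l is not None or r is not None:
        # An ordinary value counts as 0 next to +inf or -inf.
        l = 0 if l is None else l
        r = 0 if r is None else r
        return (l > r) - (l < r)
    return (left > right) - (left < right)   # the type's own comparison


def fold(word):
    """Case-fold, but keep a mixed-in value intact (lower() would drop it)."""
    return word if isinstance(word, InfiniteMixin) else word.lower()


def first_word(words):
    smallest = but_inf("")    # still a str, yet compares as +Inf
    for word in words:
        if compare(fold(word), fold(smallest)) < 0:
            smallest = word
    return smallest


print(first_word(["pear", "Apple", "banana"]))   # prints Apple
```

The native comparison in the last line of compare never sees the mixed-in case, which is the property claimed for a user-written cmp.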

Summary

In Perl 6, Infinite is a distinct type and Inf is a constant of that type.

You can compare any (Comparable) type against Inf, and MMD handles the mixed-type comparison.

Or, you can store Inf in any type that can be infinite. Comparing objects with that value should still work.

Or, even if the type does not support infinity, you can force Inf into a variable of that type, and comparisons will still treat it as infinite.

Orthodoxy

The details of the Infinite class and Inf constants are a current proposal and in my specdoc. The design is faithful to the descriptions in the Synopses.

The details of P6 Standard Library classes in general have not been designed. Some class or role names are mentioned in the Synopses, and I have organized them in a straightforward manner with some eye toward supporting generics. But somebody needs to make a serious coherent design, and these details are all subject to change.

The details of what but does are elaborated and expanded from the brief hints that are in the Synopses. The fleshed-out design is entirely consistent with what is shown in the Synopses.

Details of parametrized classes are not in the Synopses other than to say that square brackets are used. The usage here is simple enough that there should be no questions.

The “like” vs “isa” type matching is not in the Synopses. The consensus seems to be that more general polymorphism is needed, and the specific proposal Advanced Polymorphism in Perl 6 — Features of a second-generation type system can be read here.

The details of conversions and coercions are not given in the Synopses; only a mention that the class name is used as a list-op. How do you tell the system what conversions are possible and how to do so? A partially written proposal is in specdoc.

How to inherit from Str and still initialize it is not specified in the Synopses and there is no proposal for details of this class. So I just gloss over it and assume that a suitable constructor works out by itself.

The type of an untyped undef is not explained clearly in the Synopses. It might be specified to be Object, which is also the base class of everything. But it needs to have the implicit conversion ability that is not inherited, so it should really be a distinct type for that purpose. It is, really; the actual concrete class is derived from Object and has whatever traits mixed in that make it a protoobject; it is not directly of type Object even if the Synopses said that it “isa” Object (everything is!). I'm pointing out that the more-derived class needs to be exposed, so the undef constant can be recognised distinctly from typed protoobjects.

The code that converts Undef to a typed protoobject is smart enough to copy any added properties that were added to it already. That was inferred from examples in the Synopses, but it is not explicit in the document.

The MMD matching considers a role added via the run-time does operator to be a better match than the original class, as if it were matching on the anonymous created class directly. Details of MMD have not been designed, but consider this example an identification of a requirement. A full proposal will follow.

Putting a generic type parameter in the middle of a list binds to the type thus far, not including the constraints named after it. That is not in the Synopses. Consider this a use-case of why it is needed.

Footnotes

the implementation can figure out at compile time

It's actually more difficult to optimize, because there may be classes found at run time that had not been defined yet, and more MMD forms of a function may be added not only later in the compilation but at run time! Still, a clever implementation can save a lot of work, or validate its assumptions at run-time and fall back to the general case only if you did do something tricky and unusual.

inline is a hint to the compiler

Actually, it will be difficult to inline a multi because the dispatch is done at run time. Generally, the compiler needs to be sure of the actual (not just declared) types of the parameters and find an exact match or do global analysis to make sure that a better match is not added later. Global analysis can also help with the actual type vs declared type, if it determines that a class is never derived from.

We can always hope. And the built-in standard library functions might make a special effort.