Friday, July 8, 2011

Fun with Symbolic Reasoning

So I'm about 3-4 weeks into the "great type solver re-write". It really wasn't my plan to re-write the type solver, but things have kind of snowballed. Originally my goal was to allow you to define an anonymous function without having to explicitly specify the types of the parameters - so you could say "fn (x, y) { ... }" instead of "fn (x:int, y:int) { ... }". However, it turns out that in order to make this work, I had to fix a lot of problems in the solver. There were a lot of places in the code where I had hacked around a particular problem instead of thinking it through and doing the correct thing. The new code is more abstract, but in many ways its easier to understand, which is always a good sign.

One interesting aspect of all this is that the type solver now does a kind of deductive reasoning: the input expression gets converted into a set of Constraint objects, and the solver attempts to optimize the constraints before it searches for a solution. An example would be two constraints such as "T is a subtype of List[X]" and "T is a subtype of ArrayList[X]". Since the first constraint is implied by the second, it is unnecessary and can be removed. A different example would be "T is a subtype of ArrayList[X]" and "T is a supertype of ArrayList[X]". Regardless of what X is, these two constraints can be satisfied only if T is exactly ArrayList[X].

The data model for the solver now looks like this:
  • Template
    • TypeVariable
  • Environment
    • TypeAssignment
      • Constraint / ConstraintSet
        • Provision / ProvisionSet
A TypeVariable is essentially a template parameter. Type variables can be bound to values within an environment. For each type variable, a TypeAssignment is created which represents the value of that variable within the environment - essentially it's a slot that is filled in with the value.

Associated with each TypeAssignment are an arbitrary number of Constraints. There are three types of Constraints: UPPER_BOUND, LOWER_BOUND, and EXACT. So UPPER_BOUND(x) says that the value of the TypeAssignment must be X or a subtype of X.

With each constraint there are zero or more "Provisions" (think of provisions of a contract). All provisions must be true, otherwise the constraint has no effect.

Here's an example of how this all works:

var a:int8;
var b:int16;
var c:int32

let d = min(a, b + c);

There are two function calls in the last line: A call to 'min' and a call to '+'. The type solver first identifies all of the overloads of these two functions:

def min[%T](lhs:T, rhs:T) -> T;

def infixAdd(a:int8, b:int8) -> int8;
def infixAdd(a:int16, b:int16) -> int16;
def infixAdd(a:int32, b:int32) -> int32;
(and so on)

After unification, the TypeAssignment for %T looks like this:

%T:
  >= int8
  >= int8  provided [infixAdd(a:int8, b:int8) -> int8]
  >= int16 provided [infixAdd(a:int16, b:int16) -> int16]
  >= int32 provided [infixAdd(a:int32, b:int32) -> int32]

Each '>=' represents a LOWER_BOUND constraint. The first constraint says that T must be at least an 'int8' if not larger - because the first argument to 'min' is 'a', which is an int8.

The second constraint says that T must be at least an int8 if the int8 version of infixAdd is chosen as the overload. Similarly, the third constraint says T must be at least int16 if the int16 version of infixAdd is chosen.

We can immediately eliminate the second constraint because it's implied by the first constraint.

After minimizing the constraints, we can now search for a solution by trying different values for 'T' (as suggested by the constraints), and seeing which solution has the highest ranking (where the ranking is calculated from a number of factors, including whether the assignment would produce a warning message.)

No comments:

Post a Comment