Wednesday, December 30, 2009

The problem with null pointers is that they are too darned useful

It would be nice if null pointers could be eliminated from the language, or at least integrated into the type system in a way that allows null pointer dereferences to be detected in advance. Unfortunately, 'null' tends to get used a lot - it's a great sentinel value for lazy construction, for example. So far all my ideas for making null type-safe also have the side-effect of making the code more complex and verbose.

Right now, the Tart compiler treats the value 'null' as a distinct type 'Null' (which is different from 'void'). You can compare any pointer or object reference with Null, but you can't assign Null to a variable of reference type. What you can do, however, is create a union of type 'SomeType or Null'.

var n:SomeType or Null = null;

In C++, if we wanted to lazily construct 'n', we'd write something like this:

if (n == NULL) {
  n = new SomeType();
}

return n;

However, the equivalent won't work in Tart because the final 'return n' returns a 'SomeType or Null', when what we want it to return is 'SomeType'.

We could insert an explicit type cast, except that in Tart type casts are always checked - which means that we'd be checking for null twice.

Another way to do this in Tart is to use the classify statement:

classify n {
  as r:SomeType { return r; }
  else {
    let r = SomeType();
    n = r;
    return r;
  }
}

This works, but seems fairly clumsy by comparison to the C++ version.
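For comparison, here's a rough sketch of the same lazy-construction pattern in Python, whose Optional type plays a role similar to 'SomeType or Null' (the names here are invented for illustration):

```python
from typing import Optional

class SomeType:
    pass

_cached: Optional[SomeType] = None  # plays the role of 'SomeType or Null'

def get_instance() -> SomeType:
    global _cached
    if _cached is None:       # the single null check
        _cached = SomeType()  # lazy construction
    return _cached            # narrowed to SomeType after the check
```

The type checker narrows Optional[SomeType] to SomeType after the None test, which is essentially the flow-sensitive behavior the classify statement makes explicit.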

--
-- Talin

Tuesday, December 29, 2009

Status update

Lots of filling in of details lately:
  • Reflection of default constructors now works, allowing objects to be created via reflection.
  • Started implementing the standard hash functions (using MurmurHash). This is needed so I can write a HashTable collection.
  • Sketched out a few more i/o classes and interfaces.
  • Added a new linker pass that attempts to calculate the overhead of reflection. The largest chunk is the strings representing class and method names.
  • Started work on Tart implementation of String.format.
  • Tracking down bugs.
Unfortunately, there are a number of issues that are vexing me and won't go away. These include:
  • CMake making me crazy.
    • I have two libraries, almost identical - yet when I declare both libs to be a dependency of a custom target, only one gets rebuilt. I can force CMake to build the second one by explicitly giving the name on the command line, but it fails as a dependency.
    • Another CMake problem is that I can't seem to make a rule that causes dsymutil to run on OS X but not on Linux.
    • And yet another issue is that CMake's lack of extensibility means that there's no way to scan a Tart module for dependencies. I'm constantly running into problems where a module is out of sync with one that it is based on. (Eventually I am going to make my own Tart-based build tool, "mint", but that's a long way off.) I do have the dependency info embedded in the generated module (using metadata nodes), but there's nothing that can read it.
    • However, being able to generate an eclipse / MSVC / makefile project is just too useful to give up. So I don't want to migrate away from CMake.
  • Several of the unit tests fail when LLVM optimization is turned on, and I can't figure out why.
  • Still haven't been able to generate correct DWARF info for parameters and local vars. Also, I sort of figured out how to get my exception personality function to "see" the stack trace (using dlsym), but unfortunately none of the LLVM-generated functions appear on it - only the ones written in C.
  • LLVM IR is so hard to read; what the world really needs is a graphical browser for global variables and functions.
--
-- Talin

Saturday, December 26, 2009

Method reflection and unit tests

I just finished v2 of the simple unit test framework. The v1 version simply looked for global functions beginning with "test" and called them. The v2 version now supports JUnit-like test fixtures, where unit tests derive from tart.testing.Test.

Here's a short example of a test class:

import tart.collections.ArrayList;
import tart.reflect.ComplexType;
import tart.testing.Test;

@EntryPoint
def main(args:String[]) -> int {
  return ArrayListTest().runTests(ArrayListTest);
}

class ArrayListTest : Test {
  def testConstruct() {
    let a = ArrayList[int](1, 2, 3);
    assertEq(3, a.length);
    assertEq(1, a[0]);
    assertEq(2, a[1]);
    assertEq(3, a[2]);
  }
  def testImplicitType() {
    let a = ArrayList(1, 2, 3);
    assertEq(3, a.length);
    assertEq(1, a[0]);
    assertEq(2, a[1]);
    assertEq(3, a[2]);
  }
}

This demonstrates a number of interesting language features:
  • Reflection of class methods.
  • "import namespace". The 'assertEq' methods are defined in class "Asserts". Class Test does an "import namespace Asserts", which means that all of the assert methods are available without needing a qualifier (i.e. Asserts.assertEq).
The test output looks like this:

[Running] ArrayListTest.testConstruct
[Running] ArrayListTest.testImplicitType
[OK     ]

Tuesday, December 22, 2009

More progress on tuples

I've made a lot of progress on tuple types in the last several days, including the "unpacked" or "destructuring" assignment:

def returnTupleLarge() -> (int, String, int, String) {
  return 3, "Hello", 2, "World";
}

var d1, e1, d2, e2 = returnTupleLarge();

Unfortunately the "Python swap idiom" doesn't quite work yet:

a, b = b, a;

For some reason the code generator isn't generating the store to 'b'. Still tracking this one down.
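The intended semantics - build the entire right-hand tuple before performing any of the stores - are the same as in Python, where the idiom comes from. A minimal sketch of what the generated code should do:

```python
def swap(a, b):
    # the right-hand tuple is constructed in full before either
    # store occurs, which is what makes the idiom a correct swap
    a, b = b, a
    return a, b
```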

In order to fully support tuple types I had to make some significant changes to the way that the code generator treats l-values. You see, LLVM currently has a limitation on the way that it supports first-class aggregates: Only structs that can fit into two registers can be passed by value - if the struct is larger than that, it has to be passed by reference. Eventually LLVM will support arbitrary-sized parameters and return types, but that support is not there yet.

So the first problem is how to determine whether a struct will fit into two registers. Since my front-end is target-independent, I don't know how big a register is. For the moment I am using a somewhat dodgy heuristic for this.

The second issue is that we now have two kinds of structs as far as the codegen is concerned: "small" and "large". Similarly, we also have small and large tuples, since a tuple is very much like a struct. However, there is one important difference, which is that structs must always be addressable. The reason for this is the handling of 'self'. When you define a method on a struct, the 'self' argument has to be a pointer to the struct, not the value of the struct - otherwise, it would be impossible to write setter methods, for example.

Tuples, on the other hand, don't have methods, so they don't have to be l-values since there's never a 'self' pointer to worry about. However, if a tuple contains a struct, then that struct has to be addressable, which means that the tuple has to be treated like a struct.

I created a spreadsheet of all the combinations of Tart types and how they are represented. The columns represented various types - primitive, class, etc. The rows represented various forms of use - parameter, return val, variable, constant, intermediate value, member field, and so on. Using this, I was able to boil down all of the runtime representations into just 5 categories, which for lack of a better term I am calling "Shapes":

Shape_Primitive,          // float, int, pointer, etc.
Shape_Small_RValue,       // Small tuples
Shape_Small_LValue,       // Small structs
Shape_Large_Value,        // Large tuples or structs
Shape_Reference,          // Class or interface

Primitive types are always passed by value in a single register. Small_RValue shapes are also passed by value, but may require more than one register. Neither type is guaranteed to have a memory address, since the value may be contained in registers.

Small_LValue is passed and returned by value, but internally it is stored in memory and is guaranteed to have an address.

Large_Value is passed by reference. When used as a parameter, a local allocation is created and the value copied into it. For function return values, the compiler generates a 'struct return' parameter which holds a pointer to a buffer which the caller provides. The return value is copied into this buffer before exiting the function. Similarly, for variable assignments, the contents of the local allocation are copied from one buffer to another.

Reference types are always passed as a single pointer value. Assigning to a reference type merely assigns to the pointer, it does not affect the contents of the memory being pointed to.

The "shape" is determined for each type based on the contents of the type. For example, the shape of a union type may be different depending on what the types within the union are. So if a union contains a Large_Value shaped type, then the union must be Large_Value as well.

With each type now able to report its shape, I can define the various operations on types - such as load, store, return, argument, copy, and so on - in terms of these shapes. This means that rather than having to have a bunch of special cases for each particular type, the resulting code generator can be considerably simplified.
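As a rough sketch of the idea - not the actual compiler code; the kinds, sizes, and the register-size heuristic here are all invented for illustration - the shape classification might look like this:

```python
from enum import Enum, auto

class Shape(Enum):
    PRIMITIVE = auto()     # float, int, pointer, etc.
    SMALL_RVALUE = auto()  # small tuples
    SMALL_LVALUE = auto()  # small structs
    LARGE_VALUE = auto()   # large tuples or structs
    REFERENCE = auto()     # class or interface

WORD_SIZE = 8  # assumed register size; the real front-end is target-independent

def shape_of(kind: str, size: int, addressable: bool) -> Shape:
    # toy classifier mirroring the rules described above
    if kind == "class":
        return Shape.REFERENCE
    if kind == "primitive":
        return Shape.PRIMITIVE
    if size > 2 * WORD_SIZE:  # won't fit in two registers
        return Shape.LARGE_VALUE
    return Shape.SMALL_LVALUE if addressable else Shape.SMALL_RVALUE
```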

Saturday, December 19, 2009

Tuple progress

Made some terrific progress today on tuples. Here's the complete unit test so far:

import tart.reflect.Module;
import tart.testing.TestUtils;

@EntryPoint
def main(args:String[]) -> int {
  return TestUtils.runModuleTests(Module.thisModule());
}

def testTupleCreate() {
  var a = 1, 2;
  Debug.assertEq(1, a[0]);
  Debug.assertEq(2, a[1]);
}

def testTupleCreate2() {
  var a = 1, "Hello";
  Debug.assertEq(1, a[0]);
  Debug.assertEq("Hello", a[1]);
}

def testTupleCreateExplicitType() {
  var a:(int, int) = 1, 2;
  Debug.assertEq(1, a[0]);
  Debug.assertEq(2, a[1]);
}

def testTupleAssign() {
  var a = 1, 2;
  var b = a;
  Debug.assertEq(1, b[0]);
  Debug.assertEq(2, a[1]);
}

def testTupleReturn() {
  var a = returnTuple();
  Debug.assertEq(3, a[0]);
  Debug.assertEq("Hello", a[1]);
}

def returnTuple() -> (int, String) {
  return 3, "Hello";
}

Still to do:

-- Test passing tuples as parameters.
-- Test member-wise conversions.
-- Test tuples as members of structs / classes.
-- Write a compilation failure test to ensure tuple immutability.
-- Implement unpacking assignment.

I'm really looking forward to being able to write:

   for key, value in map {
     // Do stuff
   }

As opposed to Java style:

   for (Map.Entry<KeyType, ValueType> entry : map) {
     KeyType key = entry.getKey();
     ValueType value = entry.getValue();
     // Do stuff
   }

The former is so much nicer in my opinion.

Tuesday, December 15, 2009

Doc updates

I recently redid the way that the Tart documentation is being hosted.

Now that Google Code has the ability to host multiple Mercurial repositories per project, I have split off the compiled documentation into a separate repository. The documentation sources are still in the same repo as the compiler sources, as they should be. (Because one should ideally change with the other, although the docs are sadly out of date.)

So here's how it all works: on my laptop I have two local Mercurial repositories, 'tart' and 'tart-docs'. The 'build/html' directory in 'tart' is a symlink to 'tart-docs'. Thus, whenever I do a "make html", it updates the 'tart-docs' repository. Then all I have to do is an hg commit and hg push to update the documentation on the site. Since Google Code's Mercurial support is able to host the raw HTML files straight out of the repository (with images and CSS, no less!), it means that as soon as I check in, the updated docs are immediately visible. Really neat!

You can see the lovely results here:


BTW, did I mention how great Sphinx and Pygments are? And how awesome Georg Brandl is?

--
-- Talin

Monday, December 14, 2009

Status report

There's a pesky bug in the compiler which has been sucking up my time for several days. Essentially I'm having to play whack-a-mole with dependencies whenever I try to enable reflection - the data structures for reflection need certain classes to be imported into the module, and that's not happening for some reason.

A bit of good news is that the tartc compiler now runs under Windows, although the linker does not yet. Even if I get the linker working, there is still the issue of getting the resulting programs to actually run, which will require a lot of work, especially in the area of exception handling.

I have a new language idea that I have been thinking about:

Phil Gossett was trying to educate me recently as to the difference between "adhoc" polymorphism and "parametric" polymorphism, specifically with regard to my template syntax. (He was also trying to get me to drink the 'strictly functional' kool-aid, but I wasn't having any of it.)

Parametric polymorphism (which is what Phil considers the "good" kind) is where you have some form of generic or parameterized type that has one or more type parameters. An example is List[T], where T is a type parameter. In parametric polymorphism, you have to declare 'T' as having some sort of upper bound type, and you can only call methods which are defined in that upper bound. An example would be, say, List[Number], which could then be specialized to List[int], List[float] and so on, but not List[String].

Adhoc polymorphism (which he considers the "bad" kind - as in "that way leads to madness") is more like duck typing - once you define List[T], T can be anything, and if you call something like T.toString(), that works as long as T has a toString() method. There's no requirement that the type bound to T have any particular base class as long as it conforms to the implicit contract.

C++ "concepts" are somewhere in between these two - that is, you make the contract explicit rather than implicit (by requiring that T have certain methods) but you don't require that T derive from some particular base class.
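To make the distinction concrete, here's a rough Python sketch (using typing.Protocol for the structural case; the class and method names are invented). The parametric version may only use methods of its declared upper bound, while the adhoc version accepts any type that happens to have a matching method:

```python
from typing import Protocol, TypeVar

class HasToString(Protocol):
    # structural ("adhoc") bound: any type with a matching method qualifies
    def to_string(self) -> str: ...

class Number:
    def to_string(self) -> str:
        return "number"

N = TypeVar("N", bound=Number)  # nominal ("parametric") bound

def show_parametric(x: N) -> str:
    # may only use methods declared on the upper bound Number,
    # so one compiled body could serve every subclass
    return x.to_string()

def show_adhoc(x: HasToString) -> str:
    # duck-typed: no required base class, just a conforming method
    return x.to_string()
```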

Now, given my experience with Python, where adhoc polymorphism has historically been the only kind, I think that such fears are overblown. Just like dynamic typing, it's something that scares the "bondage and dominance" school of language designers - when in practice the kinds of disasters they worry about almost never happen. (I'm surprised that no one has yet done a PhD thesis on whether the lack of strict type checking makes dynamic languages less reliable in practice - and more interestingly, if the answer is "it doesn't", then why?)

However, it occurs to me that there's one concrete benefit to parametric polymorphism, and that is avoiding template bloat. You see, if a template parameter must derive from some class, and can only use methods of that class, then the exact same generated code will work for all subclasses, assuming that those classes have binary compatibility (which is true for all reference types, but not for all value types.) Thus, if a template only calls "toString()" then the same exact set of machine instructions can be used for any subclass of Object.

Note that the savings in space can in some cases cause a small loss in speed - by re-using the same template instance for both Object and String, say, it means that some method calls have to be dispatched dynamically, whereas if the compiler generated a template instance for class String only, it would then be able to resolve certain method calls at compile time and thus generate faster code.

So it seems to me that there is perhaps a use for both types of polymorphism. One way to implement this would be to have two versions of the "<:" (issubclass) operator - a 'strict' and a 'relaxed' version. By making it an operator, it allows you to specify the strictness for each template parameter without affecting the other template parameters. It means that programmers can decide on a case-by-case basis whether to optimize for size or speed.

How to specify this in syntax I really haven't thought too deeply about yet.

-- Talin

Tuesday, December 8, 2009

Enum.toString()

Last night I got the 'toString()' method working for enum constants:

var x = TestEnum.Rock;
Debug.assertEq("Rock", x.toString());

Speaking of toString, I haven't yet figured out what to call the "formatted toString" function. This is a version of toString that takes a single argument string which is the format spec, the equivalent of PEP 3101's '__format__' method. I don't want to use '__format__' because Tart doesn't use double underscores for special methods. (Rather, I just choose names that are sufficiently unique, such as "infixAdd".) Some variations I have considered are: toStringFmt, toStringF, toStringFormatted, but I'm not that happy with any of them.
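For reference, here's what the PEP 3101 mechanism looks like on the Python side - a __format__ method that receives the format spec as a plain string (the Temperature class is just an invented example):

```python
class Temperature:
    def __init__(self, celsius: float):
        self.celsius = celsius

    def __format__(self, spec: str) -> str:
        # the format spec arrives as a plain string argument
        if spec == "F":
            return f"{self.celsius * 9 / 5 + 32:.1f}"
        return f"{self.celsius:.1f}"

print(f"{Temperature(100):F}")  # prints 212.0
```

Whatever the method ends up being called in Tart, this is essentially the shape of the contract: toString plus one string parameter.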

I also fixed a couple of bugs in Type.of, so now you can say:

var ty:Type = Type.of(String);
Debug.assertEq("tart.core.String", ty.toString());
Debug.assertTrue(ty isa ComplexType);

... and so on.

Monday, December 7, 2009

TypeLiteral

I've been following Guido's suggestion about keeping the methods that relate to reflection in a separate namespace from the actual class methods. For each user-defined class, there is a separate instance of tart.reflect.Type which contains information about the type name, base classes, lists of methods, and so on.

This reflection structure is entirely separate from the regular "class" object which contains the jump tables and superclass array. In fact, the class contains no references to the reflection data at all, as the relationship is one-way only. This means that if you aren't using reflection, the various data structures can be deleted by the linker during optimization. Unfortunately, this also means that if you do need reflection, you have to explicitly register the classes that you want to reflect. Since you can register whole modules and/or packages with a single call, this is not that much of a hardship.

So now that we have these two data structures, how do you get from one to the other? Here's what I implemented today:

Normally, when a type name is used, it is either used as a constructor: Foo(), or as a namespace for statics: Foo.method. However, what happens if we attempt to use Foo as a value? Another way to ask this question is to say "what is the type of the word 'Foo' in the source code?" The answer is that Foo is a type literal. Specifically, its type is TypeLiteral[Foo], meaning that it is a type that has a template parameter which is the type represented by the literal.

There are a couple of different ways you can use a type literal. First, you can convert it into a pointer to the type reflection object using Type.of(). So for example, if I say Type.of(String), what I get is the reflection info for class String. (The 'of' method in class Type is an intrinsic that does this.) Note that this use of 'of' automatically pulls in the reflection data for that type, so there's no need to explicitly register that class.

The other way you can use a type literal is by binding its template argument to a type variable. For example:

   def newInstance[%T](type:TypeLiteral[T]) -> T { return T(); }

If we pass "String" as the argument, what actually gets passed is TypeLiteral[String]. This in turn causes the type variable T to be bound to 'String', at which point the definition of 'T' is now available in the body of the function. Note that we never actually use the value of the type literal in this case, only its type - which will most often be the case, since TypeLiterals have no properties other than the single type parameter.

Why did we not simply declare the argument as T rather than TypeLiteral[T]? Because if we're passing in, say, a String, then 'T' means an actual instance of a String, whereas TypeLiteral[T] means we are talking about the String *type*.
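Python's typing module has a close analogue - Type[T] - which makes the instance-versus-type distinction easy to see (the function name here is borrowed from the example above):

```python
from typing import Type, TypeVar

T = TypeVar("T")

def new_instance(type_: Type[T]) -> T:
    # the argument is the type object itself, not an instance of it -
    # roughly what TypeLiteral[T] expresses in the Tart signature
    return type_()

s = new_instance(str)  # passes the str *type*; returns a fresh str instance
```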

Sunday, November 29, 2009

What's next?

I still have a long TODO list, but things are steadily progressing. Here are a few of the items that are up next:

* Compile under Visual Studio. I've set up my new thinkpad to dual-boot both Windows and Ubuntu (my normal development machine is a Mac), and installed Visual Studio 2008 Express on it. Using cmake to generate a VS project for LLVM worked beautifully, and compiled without a hitch.

Unfortunately, Tart's cmake config files aren't in such good shape. My FindLLVM.cmake module relies rather heavily on the llvm-config script to generate header and library dependencies - and llvm-config is not available under Windows. I did some code searches, and it appears that what many projects do is to specify the dependencies manually for Windows, and automatically for other platforms. I don't really want to have to maintain the dependencies manually; part of why I migrated over to using llvm-config was to get away from that. (LLVM has like 30 or so separate libs, and keeping the dependencies straight is a pain.)

My current thought is to create a Python script that calls llvm-config when running under Unix, and then transforms the output to cmake syntax. This would then be used to keep FindLLVM.cmake up to date.

Note that all this has to be done before I can even *think* about trying to compile Tart. At this point, I have no idea if it can even be done - at the very least, I would assume that my exception personality function will have to be completely re-written.

* Get source-level debugging working. Most of the code to generate source-level debugging is done, but it's currently turned off because LLVM does wrong things when it's turned on.

* Finish up closures.

* Garbage collection. I started sketching out some Tart classes for the garbage collector (Originally I was going to write the GC in C++, but I think I can do it in Tart nearly as well.) The issue that needs to be solved is how to handle platform differences - I can call vmalloc() on Linux to get raw blocks of aligned core memory, but on Windows I'll need to call something else I guess. The important thing is that the decision can't be made at runtime, since the program won't link if it contains calls to APIs that aren't present on that platform.

For swapping out code on different platforms, there are generally two strategies: Conditional compilation (#ifdef in C++), and Makefile conditionals (i.e. replace entire source files based on the current platform). Unfortunately in a language like Tart where there's a relationship between the scope hierarchy and the arrangement of files on the filesystem, you have to be extra tricky to support whole-file swapping, although it can be done. (The easiest way is to define a separate package root for each platform and put them in the class path.)

For source-level conditional compilation, I had planned to implement something like D's "static if" - an if-statement that can appear outside of a function (or inside), in which the test expression must be a constant, and which causes the code within the 'if' to not even be compiled if the test is false.

As far as the test expressions themselves, my current thinking is to define a package, tart.config, that contains the configuration options for the current host. One issue is how much do we care about cross-compilation - it would be much easier if we didn't have to think about that just yet. For cross compilation we need lots of command-line options to the compiler; for simple host-based compilation, we can let the config script do most of the work.

* Some sort of command-line options package. I'd kind of like to do some sort of annotation-based command-line options, where each module can define its own options. Ideally something Guice-like. I know this approach can lead to problems with low-level modules exporting options that are of no interest to the user, but I think that problem is manageable.

One thing I haven't quite figured out is how to collect all of the various command-line variables. The brute force approach is to do reflection on every single field of every class, see if it has a "CmdLineParameter" annotation, and if so, process it. Another approach would be to not use annotations at all, but instead have the command-line options that are statically-constructed instances which register themselves at construction time with the command-line arguments processor. This would of course require me to get static constructors working :)

The third approach is to have some sort of link-time process that collects all fields which are annotated with some particular annotation class and then creates a list, which the program can simply iterate through upon startup. This has to be done carefully, because we only want to include in that list fields which are actually used by the program. You see, the linker currently traces through all of the symbols reachable from "main" and discards any that aren't reachable - sort of a link-time garbage collector. But this hypothetical list of annotated symbols is itself a reference, which would prevent the symbol from being discarded, even if no other part of the program referred to that symbol. So we would want to construct such a list after the dead-global elimination phase.
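The second (self-registration) approach is easy to sketch in Python - each option object appends itself to a global registry at construction time (all names here are invented; this illustrates the idea, not a proposed API):

```python
_registry = []

class Option:
    # self-registering option: appended to the global registry
    # at construction time (the "static constructor" approach)
    def __init__(self, name, default):
        self.name = name
        self.value = default
        _registry.append(self)

verbose = Option("verbose", False)

def parse(args):
    # the processor iterates the registry instead of reflecting
    # over every annotated field
    for opt in _registry:
        if "--" + opt.name in args:
            opt.value = True
```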

* Tuple types. These are partly done, but we need to finish them so we can do multiple return values and unpacking assignment.

* Reflection work - currently the only thing reflected is global functions, we need to get other declaration types reflected as well.

* Stack dumps - need a way to print out the current call stack. This one is going to be way hard.

* Finish up constructors - lots of work on inheritance, chaining, final checking, etc.

* Compiled imports. Right now when the compiler sees an "import" statement, it actually loads the source file and parses it, although it only analyzes the AST tree lazily (i.e. only stuff that is actually referred to by the module being compiled). What I'd like to do is take all of the high-level info that is lost when creating a .bc file, and encode that using LLVM metadata nodes. The idea is that if a file was already compiled, instead of recompiling it, it would just load the bitcode file and read the metadata. Assuming that the file had not changed, of course.

* "finally" statement. Although exceptions are working great, I haven't done 'finally'. However, this will be much easier now that LLVM allows for local indirect branches.

* Enum.toString(), Enum.parse().

* String formatting functions.

* I haven't even started on the i/o libraries. (No "Hello World" yet, unless you call Debug.writeLn()).

* yield / with statements.

* Docs, docs, docs...

--
-- Talin

Anonymous functions

Anonymous functions are now working, as shown in the following example:

def testAnonFn {
  let f = fn (i:int) -> int {
    return i + i;
  };
  
  Debug.assertEq(2, f(1));
  Debug.assertEq(4, f(2));
}

The keyword 'fn' basically means 'lambda'. In this case, the lambda function is being assigned to the variable 'f', which is then called in the two assert statements below.

Note however, that the lambda is not yet a true closure, although I have started work on that. For now, the anonymous function can't access local variables from the outer scope. (Although it can indeed access variables from global scopes, since otherwise the '+' operation would not compile.)
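For comparison, here's the same test in Python, plus a true closure of the kind Tart doesn't support yet - an inner function that captures and mutates a local from its enclosing scope:

```python
def test_anon_fn():
    f = lambda i: i + i   # 'fn (i:int) -> int { return i + i; }'
    assert f(1) == 2
    assert f(2) == 4

def make_counter():
    count = 0
    def bump():
        # a true closure: captures and mutates 'count'
        # from the enclosing scope
        nonlocal count
        count += 1
        return count
    return bump
```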

I have thought about allowing the 'fn' keyword to be dropped entirely if there are no arguments and no return values - essentially the equivalent of a clang block - which would be designated with just { and }. I'm not sure that this is a great idea, though. There are other possible uses of {} (map literals, perhaps, although those could easily be done via [ => ] as well.) From a parsing standpoint, there are no ambiguity problems - it's easy to distinguish a "block" from a "suite", except in the case of a standalone suite that is not part of some other statement, in which case semantically they are the same thing anyway.

The advantage would be that you could do things like "onExit({ print("Bye!"); })" as opposed to "onExit(fn { print("Bye!"); })". Not sure if the small increase in brevity is really worth it.

--
-- Talin

Status update

Lots of work done over the Thanksgiving weekend. The biggest news is that reflected methods are now callable. This makes it possible to write a very simple unit test runner.

Here's how the test runner is invoked:

import tart.reflect.Module;
import tart.testing.TestUtils;


@EntryPoint
def main(args:String[]) -> int {
  return TestUtils.runModuleTests(Module.thisModule());
}

'runModuleTests' simply iterates through all of the global methods defined in the module, and calls any that begin with the letters "test". Here's what the function looks like:


import tart.reflect.Module;


namespace TestUtils {


  /** Simple function to run all of the test functions in a module. */
  def runModuleTests(m:Module) -> int {
    try {
      for method in m.methods {
        if method.name.startsWith("test") {
          Debug.write("Running: ", method.name, "...");
          method.call(m);
          Debug.writeLn("DONE");
        }
      }
    } catch t:Throwable {
      Debug.fail("Unexpected exception: ", t.toString());
      return 1;
    }
  
    return 0;
  }
}


Obviously, there's a lot of room for refinement here, but it demonstrates the basic principle.
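The same reflection-driven loop is easy to express in Python, where module attributes play the role of the reflected method list (a simplified sketch, not a port of the real TestUtils):

```python
import traceback

def run_module_tests(module) -> int:
    # call every module-level function whose name starts with "test"
    try:
        for name in dir(module):
            fn = getattr(module, name)
            if name.startswith("test") and callable(fn):
                print("Running:", name, "...", end=" ")
                fn()
                print("DONE")
    except Exception:
        traceback.print_exc()
        return 1
    return 0
```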

The other thing I worked on is finishing up union types - and adding support for unconditional dynamic casting. More sample code:

  var x:int or float; // A union type


  // Integer type
  x = 1;


  // isa tests
  Debug.assertTrue(x isa int);
  Debug.assertFalse(x isa float);
  Debug.assertFalse(x isa String);


  // Type casts
  Debug.assertEq(1, typecast[int](x));
  try {
    Debug.assertEq(1.0, typecast[float](x));
    Debug.fail("union member test failed");
  } catch t:TypecastException {}


  // classify
  classify x as xi:int {
    Debug.assertTrue(xi == 1);
  } else {
    Debug.fail("union member test failed");
  }

The difference between 'classify' and 'typecast' is that the latter throws an exception if it fails, while the former transfers control to a different suite.
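A rough Python rendering of the two operations makes the difference clear - one branches on the member type, the other raises on a failed cast (function names invented for illustration):

```python
from typing import Union

IntOrFloat = Union[int, float]  # 'var x:int or float' analogue

def classify_describe(x: IntOrFloat) -> str:
    # 'classify' analogue: branch on the member type actually held
    if isinstance(x, int):
        return "int"
    if isinstance(x, float):
        return "float"
    raise TypeError("not a member of the union")

def typecast_int(x: IntOrFloat) -> int:
    # 'typecast' analogue: a checked cast that raises on failure
    if not isinstance(x, int):
        raise TypeError("typecast failed")
    return x
```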

Monday, November 23, 2009

Protocols and template conditions

I spent most of the weekend working on protocols and template conditions, and while they aren't working yet, they are very close.

Template conditions allow you to specify a restriction on what types a template parameter can bind to. For example, say we wanted to be able to define a template that could only be used with subclasses of Exception:

    class ExceptionFilter[%T <: Exception] { ... }

The '<:' operator is equivalent to the Python function "issubclass". %T is a type variable (the percent sign introduces a new type variable.)

Tart allows you to have multiple definitions of the same template, as long as it is unambiguous which one to use:

    class LogFormatter[%T <: Exception] { ... }
    class LogFormatter[String] { ... }

Note that the second declaration of LogFormatter did not declare a type variable. In this case, the type argument 'String' is a type literal, not a variable - in other words, the template pattern will only match if the type argument is type 'String' exactly.

The template conditions can also be given outside of the template parameter list, using the 'where' clause:

    class LogFormatter[%T]
      where T <: Exception {
      ...
    }

There can be an arbitrary number of 'where' clauses on a template. (Note: Where clauses are not implemented, but they are not hard to do now that I have the basic condition architecture in place.)

One particularly useful template condition is whether a class has a particular method or not. Rather than defining a "hasmethod" operator, Tart allows you to declare a pseudo-interface called a "protocol". For example, if we wanted to test whether a class has a "toString()" method, we would start by defining a protocol containing the toString() method:

    protocol HasToString {
      def toString() -> String;
    }

Syntactically, protocols are like interfaces; however, they are pseudo-types, not real types - you can't declare a variable whose type is a protocol. You can inherit from a protocol, but all that does is cause the compiler to emit a warning if any of the protocol's methods are not implemented. It does not change the layout or the behavior of the class in any way.

Another difference with protocols is that any class that meets the requirements of the protocol is considered a subtype of the protocol - regardless of whether the class explicitly declares the protocol as a base type or not.
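
Python's typing.Protocol works the same structural way, so it makes a reasonable sketch of this rule: any class with a matching toString method satisfies the protocol, whether or not it declares it. The Point class below is purely illustrative.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class HasToString(Protocol):
    def toString(self) -> str: ...

class Point:                       # never mentions HasToString
    def toString(self) -> str:
        return "Point"

# Structural check: Point satisfies the protocol by shape alone.
assert isinstance(Point(), HasToString)
```

(One caveat: Python's runtime check only tests for the method's presence, not its signature, whereas Tart can check the full method type.)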

Here's a concrete example: Suppose we want to make a "universal stringifier" that will convert any value to a string:

    def str(obj:Object) -> String { return obj.toString(); }
    def str[%T <: HasToString](v:T) -> String { return v.toString(); }
    def str[%T](v:T) -> String { return "???"; }

The first overload handles all of the types which are subclasses of Object. Since we know "Object" has a toString() method, and since that method is dynamically dispatched (i.e. the equivalent of C++ "virtual"), the toString() call will get routed to the right place.

The next line handles all types that have a 'toString' method, whether or not they explicitly inherit from "HasToString". This includes things like integers and floats, which have a toString() method (in Tart, you can say things like "true.toString()"). Since this is a template, it will create a copy of the function for each different type of T. (That's why we handled 'Object' as a separate case, to limit the number of such copies generated.)

The third overload handles classes which don't have a 'toString' method. (If you are wondering what kinds of types don't have a default toString() method, the answer includes structs, unions, tuples, native C data types, and so on.) In the example, it just returns the string "???" but in fact it could be much smarter - at minimum printing the string name of type T.

The overload resolver will attempt to match the most specific template whose constraints are met. Thus, the first method is preferred over the second since the second requires binding a pattern variable, and the second is preferred over the third because it has a condition.
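
That three-way preference order can be sketched as a runtime dispatch in Python. This is an approximation of what Tart's overload resolver decides at compile time, and TartObject is a hypothetical stand-in for Tart's Object root class:

```python
class TartObject:                # stand-in for Tart's Object root class
    def toString(self):
        return "<object>"

def tart_str(v):
    # Preference order mirrors the overload resolver:
    # 1. subclasses of Object (one shared, dynamically dispatched copy),
    # 2. any type with a toString method (one template copy per type),
    # 3. everything else (the "???" fallback).
    if isinstance(v, TartObject):
        return v.toString()
    if hasattr(v, "toString"):
        return v.toString()
    return "???"

class Num:                       # value type with its own toString
    def toString(self):
        return "42"

assert tart_str(TartObject()) == "<object>"
assert tart_str(Num()) == "42"
assert tart_str(3.14) == "???"
```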

-- 
-- Talin

Sunday, November 15, 2009

Tart Status

Last week, my new laptop arrived, a ThinkPad T400s, which is turning out to be a pretty nice laptop. I initially had some problems with Wubi, the Ubuntu Windows installer. Wubi creates a Linux filesystem as a file within the Windows NTFS filesystem, avoiding the need to repartition the drive. It then modifies the Windows boot loader to allow dual booting into either Windows or Linux.

All of this worked great until the first apt package update, which installed a new version of GRUB, the boot manager. This apparently messed up the Windows 7 bootloader so that it could no longer boot the Linux filesystem, and my reading of the Ubuntu forums didn't inspire much hope of ever getting it to work. In order to recover my 3 days' worth of work, I had to create a rescue CD, mount NTFS, and then mount the virtual Linux filesystem within it. Then I had to partition the drive to add a new Linux partition, rsync the data over to the new partition, and configure a boot loader for a more conventional dual-boot system. This took about a day to figure out.

In any case, I now have Ubuntu working well, although I haven't yet learned how to set up the boot table needed to boot back into Windows. That can wait for the moment. I have Google Chrome and Eclipse humming along nicely, with sufficient tweaking and customization that I can be decently productive. The fact that I am able to type this relatively fast, with no illumination on the keyboard, is an indicator that from a form-factor standpoint this laptop was the right choice. And it's nice and light - the case is carbon fiber.

(BTW, with regards to Chrome - at first I had planned to just get by with Firefox, and I spent a couple of days using that. But I'd gotten used to using Chrome at work - in particular, the "control-T + start typing" gesture to open a new tab and search is surprisingly addictive. So I broke down and installed Chrome and I find that I am a lot happier.)

In any case, I was able to spend the rest of the week focused on the reason that I got the laptop in the first place: Getting Tart to compile and run under an operating system other than Darwin/OS X. That took a while, but I was finally able to check in a version that runs all tests on Linux as well as OS X. One issue that bit me a number of times is that the order of link libraries is significant under Linux, but not under OS X - although I think that's probably more a function of the version of GCC that's available on those systems.

In addition to all that, I worked on a number of other minor Tart issues - during those periods where I was stumped on getting it to compile and run on Linux. One thing that I have been working on is a rough sketch of a "pure Tart" garbage collector. Recent changes in the Tart language (or more precisely, in the "unsafe" mode language extensions) have led me to the belief that the performance penalty for writing the collector in Tart might not be as significant as I originally estimated.

Of course, some parts, such as the mutex code, will have to be written in C unless I can figure out a way to get Tart to call POSIX and Win32 functions directly. But the collector algorithms might not have to be. At least, when I look at the Tart classes and their C++ equivalents, I can imagine how the code would be generated in either case, and I'm thinking that it might not be that different. The biggest hurdle will be dealing with pointers that have flags in the low bits - I'll probably have to add some special compiler intrinsics to deal with that case. There are plenty of other potential bottlenecks to performance that I can perceive, but those bottlenecks would be present in a C/C++ implementation as well as a Tart one.
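
The low-bit flag trick is easy to show in isolation: assuming heap allocations are at least 4-byte aligned, the bottom two bits of every pointer are always zero and are free to hold GC flags. The address below is made up for illustration:

```python
TAG_MASK = 0b11              # low bits free when allocations are 4-byte aligned

def set_mark(ptr):
    return ptr | 0b01        # set a GC mark flag in the low bit

def untag(ptr):
    return ptr & ~TAG_MASK   # recover the real, aligned address

addr = 0x7f3a1000            # hypothetical aligned heap address
marked = set_mark(addr)
assert marked & 0b01             # flag is visible
assert untag(marked) == addr     # original address is recoverable
```

The compiler intrinsics mentioned above would exist precisely so that code like untag() can be expressed without pretending a pointer is an integer.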

The advantage of writing a Tart GC is that it will stress the language design in a way that I would like it to be stressed - that is, the challenge of writing efficient low-level code. The disadvantage is that it's less likely that someone will come along and re-use my GC for another language (and hopefully improve it in the process) - in other words, there might be fewer eyeballs on the code. However, at this point it's by no means certain that such additional eyeballs will ever manifest.

Another thing I've done (which is not checked in yet) is created an Eclipse plugin for Tart source code. It handles syntax highlighting and a few other minor editing tweaks like auto-indentation. It doesn't handle reformatting, refactoring, or any of the advanced stuff. Sadly, it appears that the Eclipse framework doesn't handle this in a generalized way - meaning that each different language has to define its own AST, parser, and refactoring engine. Thus, JDT, CDT and PyDev - for example - are all separate plugins that don't share any code other than the basic Eclipse platform.

For now, the plugin is good enough to edit Tart code with, and I'm hoping that some Eclipse enthusiast will come along and finish what I've started :)

--
-- Talin

Saturday, October 31, 2009

So my ideas for const won't work

I'm starting to realize that C-style 'const' and reflection do not play well together.
 
A quick review: C's semantics for const are distinctive in that they are transitive - that is, you can declare a variable as const in such a way that it applies to the members of the object being referred to. Of course, in Java and other languages you can declare a variable as 'final', but that attribute applies to the variable only, it doesn't affect the object being referred to.
 
Another way to think of this is that in Java, the const-ness of an object is self-contained - all of the information about what is mutable and what is not is represented in the object's type, and cannot be changed by an external entity that happens to reference the object.
 
The problem with reflection is that function arguments must be converted into generic Objects in order to be handled uniformly, which means that any information outside of the object is lost. There's no way to represent the const-ness of the object reference when passing it as a generic Object. This means that either you can never pass a 'const' object to a reflected function, or passing a const object as an argument to a reflected function silently makes it non-const, which violates the contract. The third alternative is to make the reflection interface more complex and less efficient by wrapping each const object in a special const wrapper object.

One question that should be asked is: what makes C-style const useful?
 
My answer to this starts with the observation that the use-cases for being able to control object mutability fall into two large groups: Group 1 is about object permissions, controlling what parts of the code are allowed to mutate an object. Group 2 is about managing shared state, especially between threads, where being able to declare an object as immutable is a handy shortcut to the general shared state problem.
 
In Java, one generally deals with these problems by explicitly exposing a separate interface that offers a reduced set of methods -- removing all of the methods that could cause a mutation of the object, but leaving the methods which don't cause state changes. This involves a lot of extra typing. It also doesn't address the Group 2 use cases, since the removal of mutating methods does not guarantee that the data won't be mutated in some other way.
 
The value of transitive const is that it automatically creates such an interface with a single keyword: That is, given a set of methods, it can 'filter out' the ones we don't want, not really removing them but making it an error to call them.
 
Of course, there are many possible such filterings of an interface, but this particular filtering - removal of non-const methods - is so common that it makes sense to support it in the language as some sort of shortcut.
 
One possible solution to the problem would be some way in the language to "auto-create" an interface - given an interface, return a transformed version of that interface with the mutating methods removed. That doesn't solve the Group 2 use case, but at least it avoids the extra typing. Also, my ideas about this are completely nebulous and fuzzy.
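
One way to picture such an auto-created interface is a wrapper that forwards only the non-mutating methods. The hard part - which a language feature would have to solve - is knowing which methods count as mutating; in this Python sketch that set is simply hand-written:

```python
class ReadOnly:
    """Forward only non-mutating methods; make the rest an error to call."""
    MUTATORS = {"set", "add", "remove", "clear"}   # hand-labeled for this sketch

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        if name in self.MUTATORS:
            raise AttributeError(f"{name} is not part of the const view")
        return getattr(self._target, name)

class Counter:
    def __init__(self): self.n = 0
    def add(self): self.n += 1     # mutating
    def get(self): return self.n   # non-mutating

c = ReadOnly(Counter())
assert c.get() == 0
try:
    c.add()                        # filtered out, just as const would do
    assert False
except AttributeError:
    pass
```

As the text says, the methods aren't really removed - they're still there on the underlying object - it just becomes an error to call them through this view.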

--
-- Talin

Friday, October 30, 2009

Factoring reflection

Now that boxing and unboxing are out of the way, that clears the way for me to start working on function reflection. The reason I needed the boxing support is due to the way that reflected methods are called. For example, suppose we have a method foo of some class Bar:

   class Bar {
     def foo(a:int, b:int) -> int { }
   }

Reflecting 'foo' yields an object of type tart.reflect.Method, which among other things has a "call" method:

   class Method {
     def call(obj:Object, args:Object[]) -> Object;
   }

What 'call' needs to do is typecast the arguments from type Object into the correct argument types. For arguments whose type is a subclass of Object, we need to downcast the argument to that type. For arguments that are primitive types, the input argument must be a boxed value, and 'call' must therefore unbox it. In either case, if the input argument is not the right type, then a TypecastException is thrown. Note that all of this code is generated automatically by the compiler - for each function, there is an instance of 'Method' which is a static immutable global containing the information for that function.

One problem here is that the amount of code and data generated for a single function is large - thus, a simple 'getter' that returns a value might only require a handful of instructions, but the 'call' method needed to typecheck and convert its arguments may be an order of magnitude larger. Generating a 'call' function for every user-defined function would lead to major code bloat.

One obvious optimization is to recognize that the argument type checking logic is the same for functions that have the same type. Thus, if I have two methods a(int, int) and b(int, int), they can share the same type-casting code. We can factor out this code into a function, which I will call "invoke", which takes the arguments of "call" plus a reference to the function itself:

  var invoke:static fn (func:fn (), obj:Object, args:Object[]) -> Object;

The syntax is a bit unfamiliar - the sequence 'static fn' is used to declare a reference to a function that does not have a 'self' argument. What the above declaration represents is a pointer to a function which accepts three arguments, the first of which is a function.

The compiler-generated 'Method' object will now contain two pointers: one to the function itself, and one to the 'invoke' function for that function type. The 'invoke' function is shared by all functions that have the same type. This sharing even works across module boundaries, because each 'invoke' function has a unique name which is derived from the argument types, such as ".invoke(int, int)->int". (LLVM allows function names to contain any printable character, a feature which I rely on heavily.)
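
The sharing scheme can be sketched in Python: one invoke function per function type, stored in a table keyed by the argument-type tuple, so that every method with the same signature reuses the same typechecking code. Names here are illustrative, not Tart's actual runtime API:

```python
INVOKES = {}   # one entry per distinct function type, shared across methods

def invoke_for(sig):
    """Return the shared invoke function for a signature like (int, int)."""
    if sig not in INVOKES:
        def invoke(func, obj, args, _sig=sig):
            if len(args) != len(_sig):
                raise TypeError("wrong argument count")
            # Typecheck each generic argument, as 'call' must do.
            for a, t in zip(args, _sig):
                if not isinstance(a, t):
                    raise TypeError(f"expected {t.__name__}")
            return func(obj, *args)
        INVOKES[sig] = invoke
    return INVOKES[sig]

def add(self, a, b): return a + b
def mul(self, a, b): return a * b

# a(int, int) and b(int, int) share a single invoke:
assert invoke_for((int, int)) is invoke_for((int, int))
assert invoke_for((int, int))(add, None, [2, 3]) == 5
assert invoke_for((int, int))(mul, None, [2, 3]) == 6
```

The dictionary key plays the role of the mangled name like ".invoke(int, int)->int": identical signatures collapse to one entry, even across modules.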

Next we need to consider how to handle the 'self' argument. The first approach is to treat it like any other argument to 'invoke'. However, this greatly increases the number of permutations of 'invoke' that get generated. Since the 'self' argument is different for every class, it means that no class can share their 'invoke' methods with other classes.

A better way is to hoist the type-checking of self out of the invoke method and put it in a separate function. So now we have 3 pointers in the Method class: the function pointer, the pointer to invoke, and the pointer to the self type check. Actually the last one doesn't need to be a function pointer, it only needs to be a pointer to data needed to perform an 'isa' test.

As an aside, this same organization can be used for the other data stored in the Method object, such as the list of function parameters. For example, if we store the parameter names in a separate array from the parameter types, then the parameter type descriptor objects can be shared with all other methods that have the same parameter types. Similarly, the array of parameter names can be shared with any function that has the same param names, even if the parameter types are different.

One final issue to address: What if the type of 'self' is not an object? You see, in Tart, there are two major branches on the tree of types: Reference types, which are always passed by reference, and Value types, which are always passed by value. Primitive types are value types; so are structs. Reference types are allocated on the heap and are garbage-collectable; Value types can only live inside of other objects, or on the stack.

Normally any value type can be converted into an Object by boxing, so if you need to pass an int to a function that expects an Object, you can just box it. However, boxing creates a *copy* of the object. That's fine for most parameters, but for 'self' we don't want to create a copy, since that would make it impossible to write a method that mutates the fields of the struct. We need a pointer to the original object.

Tart deals with this by handling the 'self' parameter of value types specially: It is always a reference, even if the type is not a reference type. This has to be handled carefully, however. With Reference types, the 'self' pointer always points to the beginning of a heap allocation, but the 'self' pointer of a value type might point into the middle of an object. The garbage collector doesn't know how to handle such pointers. To avoid memory leaks or data corruption, we can't allow the 'self' argument of a method of a value type to leave the function. We can do this by making 'self' a special type that can be member-dereferenced only - that is, you can access any member, but you can't assign the value to anything. This is for value types only, of course.

But now we want to be able to call methods on value types via reflection, which means we *have* to have some way to pass a struct reference around. For structs which live in the middle of an object, the safe thing to do is to create a wrapper that contains both the struct pointer and a 'base' pointer to the object that contains the struct. That way, the garbage collector can trace the 'base' pointer normally, ignoring the struct pointer.
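
A rough Python analogue of that wrapper: since we can't take interior pointers, the pair (base object, field name) stands in for the (base pointer, struct pointer) pair, and all reads and writes go through the base so that mutation reaches the original embedded struct:

```python
class StructRef:
    """Stand-in for a (base pointer, struct pointer) pair: the GC-visible
    containing object plus a path to the struct embedded inside it."""
    def __init__(self, base, field):
        self.base = base         # what the collector would trace
        self.field = field       # locates the struct within the base

    def get(self):
        return getattr(self.base, self.field)

    def set(self, value):
        setattr(self.base, self.field, value)   # mutates the original

class Holder:
    def __init__(self):
        self.point = (0, 0)      # embedded value, by analogy a struct

h = Holder()
r = StructRef(h, "point")
r.set((3, 4))
assert h.point == (3, 4)         # mutation reached the containing object
```

The key property is the same as in the text: the collector only ever sees the base reference, and the interior location is derived from it rather than stored as a raw pointer.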

For structs which live on the stack, there's only one remedy, which is to box them ahead of time, and then pass the boxed reference around. Essentially we will have to have the compiler notice if anyone is using the struct in a way that requires a persistent reference to the object, and convert the stack-based object into a heap-based one at compile time. Note that this is not that different from what happens when we create a closure cell - we notice that someone is using the variable from within a closure, in which case we wrap the variable and put it on the heap.

--
-- Talin

Thursday, October 29, 2009

Progress: Unboxing and explicit specialization

I've made a bunch of progress this last week. Unboxing now works, and what's even better is that it is implemented in pure Tart - although I did need to fix several bugs in the compiler, I did not need to add any special support in the compiler for either boxing or unboxing.

Auto-boxing is done via a coercion function in the Object class:

/** Implicitly convert non-object values to Boxed types. */
static def coerce[%T] (value:T) -> Object { return ValueRef[T](value); }
static def coerce(value:Object) -> Object { return value; }

If the input argument derives from Object then it is returned unchanged; otherwise, if it's a value type, such as an int, it's wrapped in a ValueRef. This happens automatically whenever you attempt to pass a primitive type to a function that takes an Object as an argument.
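
The coercion rule reads naturally as a Python sketch; box() is a hypothetical helper mirroring the two coerce overloads, with ValueRef reduced to a one-field wrapper and Obj standing in for reference types:

```python
class ValueRef:
    """Minimal analogue of Tart's ValueRef[T]: wraps a value type."""
    def __init__(self, value):
        self.value = value
    def toString(self):
        return str(self.value)

class Obj:                       # stand-in for Object-derived reference types
    pass

def box(v):
    # Mirror of the two coerce overloads: reference types pass through
    # unchanged, value types get wrapped.
    if isinstance(v, Obj):
        return v
    return ValueRef(v)

o = Obj()
assert box(o) is o               # already an object: returned unchanged
b = box(42)
assert isinstance(b, ValueRef) and b.value == 42
assert b.toString() == "42"
```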

The ValueRef class itself is fairly simple:

/** A reference type used to contain a value type. */
class ValueRef[%T] : Ref {
  let value:T;
  
  def construct(value:T) {
    self.value = value;
  }

  /** Return a string representation of the contained value. */
  def toString() -> String {
    return self.value.toString();
  }
}

To unbox a boxed value, use the static "valueOf" method in class "Ref". It requires an explicit type parameter:

var v = Ref.valueOf[int](obj);

The 'valueOf' method is defined for each different primitive type and uses the 'classify' statement to downcast to the correct wrapper type. Here's the entire 'Ref' class:

/** Abstract class used to represent a reference to some value. */
abstract class Ref {
  private static {
    /// Convert signed integers to their unsigned equivalents. Suppresses conversion warnings.

    def asUnsigned(v:int8) -> uint8 { return uint8(v); }
    def asUnsigned(v:int16) -> uint16 { return uint16(v); }
    def asUnsigned(v:int32) -> uint32 { return uint32(v); }
    def asUnsigned(v:int64) -> uint64 { return uint64(v); }

    /// Convenience functions to check whether the given input is within the numeric range
    /// of the type specified by the type parameter.

    def rangeCheck[%T](value:int32) {
      if value < T.minVal or value > T.maxVal {
        throw TypecastException();
      }
    }

    def rangeCheck[%T](value:int64) {
      if value < T.minVal or value > T.maxVal {
        throw TypecastException();
      }
    }

    def rangeCheck[%T](value:uint32) {
      if value > asUnsigned(T.maxVal) {
        throw TypecastException();
      }
    }

    def rangeCheck[%T](value:uint64) {
      if value > asUnsigned(T.maxVal) {
        throw TypecastException();
      }
    }

    def rangeCheck[%T](value:char) {
      if uint32(value) > asUnsigned(T.maxVal) {
        throw TypecastException();
      }
    }
  
    def check(value:bool) {
      if not value {
        throw TypecastException();
      }
    }

    /// Check to ensure that the input value is not negative.

    def signCheck(value:int32) {
      if value < 0 {
        throw TypecastException();
      }
    }

    def signCheck(value:int64) {
      if value < 0 {
        throw TypecastException();
      }
    }
  }

  /// Convert an object containing a number to the requested number type.

  static def valueOf[bool](ref:Object) -> bool {
    classify ref {
      as v:ValueRef[bool]   { return v.value; }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[char](ref:Object) -> char {
    classify ref {
      as v:ValueRef[int8]   { signCheck(v.value); return char(v.value); }
      as v:ValueRef[int16]  { signCheck(v.value); return char(v.value); }
      as v:ValueRef[int32]  { 
        check(v.value >= 0 and uint32(v.value) <= uint32(char.maxVal));
        return char(v.value);
      }
      as v:ValueRef[int64]  {
        check(v.value >= 0 and uint64(v.value) <= uint64(char.maxVal));
        return char(v.value);
      }
      as v:ValueRef[uint8]  { return char(v.value); }
      as v:ValueRef[uint16] { return char(v.value); }
      as v:ValueRef[uint32] { check(v.value <= uint32(char.maxVal)); return char(v.value); }
      as v:ValueRef[uint64] { check(v.value <= uint64(char.maxVal)); return char(v.value); }
      as v:ValueRef[char]   { return v.value; }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[int8](ref:Object) -> int8 {
    classify ref {
      as v:ValueRef[int8]   { return v.value; }
      as v:ValueRef[int16]  { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[int32]  { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[int64]  { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[uint8]  { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[uint16] { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[uint32] { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[uint64] { rangeCheck[int8](v.value); return int8(v.value); }
      as v:ValueRef[char]   { rangeCheck[int8](v.value); return int8(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[int16](ref:Object) -> int16 {
    classify ref {
      as v:ValueRef[int8]   { return int16(v.value); }
      as v:ValueRef[int16]  { return v.value; }
      as v:ValueRef[int32]  { rangeCheck[int16](v.value); return int16(v.value); }
      as v:ValueRef[int64]  { rangeCheck[int16](v.value); return int16(v.value); }
      as v:ValueRef[uint8]  { return int16(v.value); }
      as v:ValueRef[uint16] { rangeCheck[int16](v.value); return int16(v.value); }
      as v:ValueRef[uint32] { rangeCheck[int16](v.value); return int16(v.value); }
      as v:ValueRef[uint64] { rangeCheck[int16](v.value); return int16(v.value); }
      as v:ValueRef[char]   { rangeCheck[int16](v.value); return int16(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[int32](ref:Object) -> int32 {
    classify ref {
      as v:ValueRef[int8]   { return int32(v.value); }
      as v:ValueRef[int16]  { return int32(v.value); }
      as v:ValueRef[int32]  { return v.value; }
      as v:ValueRef[int64]  { rangeCheck[int32](v.value); return int32(v.value); }
      as v:ValueRef[uint8]  { return int32(v.value); }
      as v:ValueRef[uint16] { return int32(v.value); }
      as v:ValueRef[uint32] { rangeCheck[int32](v.value); return int32(v.value); }
      as v:ValueRef[uint64] { rangeCheck[int32](v.value); return int32(v.value); }
      as v:ValueRef[char]   { rangeCheck[int32](v.value); return int32(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[int64](ref:Object) -> int64 {
    classify ref {
      as v:ValueRef[int8]   { return int64(v.value); }
      as v:ValueRef[int16]  { return int64(v.value); }
      as v:ValueRef[int32]  { return int64(v.value); }
      as v:ValueRef[int64]  { return v.value; }
      as v:ValueRef[uint8]  { return int64(v.value); }
      as v:ValueRef[uint16] { return int64(v.value); }
      as v:ValueRef[uint32] { return int64(v.value); }
      as v:ValueRef[uint64] { rangeCheck[int64](v.value); return int64(v.value); }
      as v:ValueRef[char]   { rangeCheck[int64](v.value); return int64(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[uint8](ref:Object) -> uint8 {
    classify ref {
      as v:ValueRef[int8]   { signCheck(v.value); return uint8(v.value); }
      as v:ValueRef[int16]  {
        check(v.value >= 0 and v.value <= int16(uint8.maxVal));
        return uint8(v.value);
      }
      as v:ValueRef[int32]  {
        check(v.value >= 0 and v.value <= int32(uint8.maxVal));
        return uint8(v.value);
      }
      as v:ValueRef[int64]  {
        check(v.value >= 0 and v.value <= int64(uint8.maxVal));
        return uint8(v.value);
      }
      as v:ValueRef[uint8]  { return v.value; }
      as v:ValueRef[uint16] { rangeCheck[uint8](v.value); return uint8(v.value); }
      as v:ValueRef[uint32] { rangeCheck[uint8](v.value); return uint8(v.value); }
      as v:ValueRef[uint64] { rangeCheck[uint8](v.value); return uint8(v.value); }
      as v:ValueRef[char]   { rangeCheck[uint8](v.value); return uint8(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[uint16](ref:Object) -> uint16 {
    classify ref {
      as v:ValueRef[int8]   { signCheck(v.value); return uint16(v.value); }
      as v:ValueRef[int16]  { signCheck(v.value); return uint16(v.value); }
      as v:ValueRef[int32]  {
        check(v.value >= 0 and v.value <= int32(uint16.maxVal));
        return uint16(v.value);
      }
      as v:ValueRef[int64]  {
        check(v.value >= 0 and v.value <= int64(uint16.maxVal));
        return uint16(v.value);
      }
      as v:ValueRef[uint8]  { return v.value; }
      as v:ValueRef[uint16] { return v.value; }
      as v:ValueRef[uint32] { rangeCheck[uint16](v.value); return uint16(v.value); }
      as v:ValueRef[uint64] { rangeCheck[uint16](v.value); return uint16(v.value); }
      as v:ValueRef[char]   { rangeCheck[uint16](v.value); return uint16(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[uint32](ref:Object) -> uint32 {
    classify ref {
      as v:ValueRef[int8]   { signCheck(v.value); return uint32(v.value); }
      as v:ValueRef[int16]  { signCheck(v.value); return uint32(v.value); }
      as v:ValueRef[int32]  { signCheck(v.value); return uint32(v.value); }
      as v:ValueRef[int64]  {
        check(v.value >= 0 and v.value <= int64(uint32.maxVal));
        return uint32(v.value);
      }
      as v:ValueRef[uint8]  { return uint32(v.value); }
      as v:ValueRef[uint16] { return uint32(v.value); }
      as v:ValueRef[uint32] { return v.value; }
      as v:ValueRef[uint64] { rangeCheck[uint32](v.value); return uint32(v.value); }
      as v:ValueRef[char]   { return uint32(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[uint64](ref:Object) -> uint64 {
    classify ref {
      as v:ValueRef[int8]   { signCheck(v.value); return uint64(v.value); }
      as v:ValueRef[int16]  { signCheck(v.value); return uint64(v.value); }
      as v:ValueRef[int32]  { signCheck(v.value); return uint64(v.value); }
      as v:ValueRef[int64]  { signCheck(v.value); return uint64(v.value); }
      as v:ValueRef[uint8]  { return uint64(v.value); }
      as v:ValueRef[uint16] { return uint64(v.value); }
      as v:ValueRef[uint32] { return uint64(v.value); }
      as v:ValueRef[uint64] { return v.value; }
      as v:ValueRef[char]   { return uint64(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[float](ref:Object) -> float {
    classify ref {
      as v:ValueRef[int8]   { return float(v.value); }
      as v:ValueRef[int16]  { return float(v.value); }
      as v:ValueRef[int32]  { return float(v.value); }
      as v:ValueRef[int64]  { return float(v.value); }
      as v:ValueRef[uint8]  { return float(v.value); }
      as v:ValueRef[uint16] { return float(v.value); }
      as v:ValueRef[uint32] { return float(v.value); }
      as v:ValueRef[uint64] { return float(v.value); }
      as v:ValueRef[float]  { return v.value; }
      as v:ValueRef[double] { return float(v.value); }
      else {
        throw TypecastException();
      }
    }
  }

  static def valueOf[double](ref:Object) -> double {
    classify ref {
      as v:ValueRef[int8]   { return double(v.value); }
      as v:ValueRef[int16]  { return double(v.value); }
      as v:ValueRef[int32]  { return double(v.value); }
      as v:ValueRef[int64]  { return double(v.value); }
      as v:ValueRef[uint8]  { return double(v.value); }
      as v:ValueRef[uint16] { return double(v.value); }
      as v:ValueRef[uint32] { return double(v.value); }
      as v:ValueRef[uint64] { return double(v.value); }
      as v:ValueRef[float]  { return double(v.value); }
      as v:ValueRef[double] { return v.value; }
      else {
        throw TypecastException();
      }
    }
  }
}

Although this class looks complex, that's mainly because there are a lot of possible combinations of conversions. In terms of usage, it's actually quite simple. It also demonstrates a number of features of Tart, including partial specialization, conversion constructors, type objects (T.minVal) and of course 'classify'.

BTW, I went ahead with Guido's suggestions about the integer type names as you can see.