Monday, January 31, 2011

Finding Static Roots

I'm getting nearer to having a working garbage collector implementation. The part that I worked on this weekend was static roots - that is, pointers embedded in static data structures that are emitted by the compiler. It turns out that I had a lot of cleaning up to do in order to prepare for this.

The general plan is to have the compiler identify the data structures that are static roots, and then have the linker use that information to generate a global table of all roots. The address of this table is then passed to the garbage collector at initialization time. At the beginning of each collection, the garbage collector will trace all of the static roots, so that any object referred to will be considered live.

Each static root contains two pieces of information: a pointer to the static data structure to be traced, and a pointer to the trace table for that data structure. The reason we need the trace table is that not all data types in Tart have embedded type information; for some data types, the information about pointer offsets has to be supplied externally.
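To make this concrete, here's a minimal C++ sketch of what a root entry and the per-collection trace loop might look like. The names StaticRoot, TraceTable, and traceObject are hypothetical, not the actual runtime's:

struct TraceTable;  // pointer-offset descriptors emitted by the compiler

// Hypothetical sketch: one entry per static root - the data structure to
// trace, plus the trace table describing where its embedded pointers live.
struct StaticRoot {
  void *object;
  const TraceTable *trace;
};

void traceObject(void *obj, const TraceTable *trace);  // collector's tracer

// At the start of each collection, walk the (NULL-terminated) global table
// and trace every root.
void traceStaticRoots(const StaticRoot *roots) {
  for (const StaticRoot *r = roots; r->object != nullptr; ++r) {
    traceObject(r->object, r->trace);
  }
}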

Tart's code generator has a getTraceTable(Type * type) function which takes a type, and returns the trace table for that type - or NULL if that type has no traceable references embedded in it. Thus, each time the compiler emits a static data structure, it gets the trace table, and if non-NULL, adds it to the list of roots for that module.

Actually, there's an additional optimization that can be applied: If a static data structure is immutable, then there's no need to trace it, since the only things it can point to are other static data structures, which will themselves be static roots.

The next question is how to communicate the information about static roots from the compiler to the linker. The obvious solution would be to create an array of pointers to those roots; however, this has a problem: by referring to global variables directly, you prevent the linker from removing dead globals. I asked about this problem on the llvm-dev mailing list, and the suggestion I got was to instead create a list containing the names of the global variables. The idea is that you would run the dead global elimination pass, and then use the list of names to see which roots remained.
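As a rough illustration of that suggestion (the helper is hypothetical, but the LLVM calls are real): after the dead global elimination pass runs, any root whose name no longer resolves in the module was dead, and can be dropped from the final table.

#include "llvm/IR/Module.h"
#include <string>
#include <vector>

// Given the names of the globals the compiler marked as static roots,
// return only those that survived dead global elimination.
std::vector<std::string> collectLiveRoots(
    llvm::Module &module, const std::vector<std::string> &rootNames) {
  std::vector<std::string> live;
  for (const std::string &name : rootNames) {
    if (module.getGlobalVariable(name) != nullptr)
      live.push_back(name);
  }
  return live;
}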

Of course, for this to work, every static root has to have a name, and that name has to be globally unique after the modules have all been combined into one large module. Normally LLVM deals with name collisions either by merging or renaming the symbol, depending on what that symbol's linkage type is, but we can't allow these particular symbols to be renamed, since the name would no longer match the string. So much of the work that I did over the weekend was making sure that each static root had a proper name.

One issue that I haven't quite worked out yet: How to deal with the linker's dead global elimination pass eliminating trace tables. Since the globals themselves don't point to their trace tables, we need some way to keep the tables from being removed prematurely; yet once the pass has run, we only want to keep the tables for globals that survived.

One other thing I need to figure out is how to handle roots that are in read-only memory. In LLVM, when you create a global variable, there's a boolean parameter that you can pass in that specifies whether this global is a constant. If set to true, it puts the global in read-only memory, so that any attempt to mutate it will generate a fault. I normally set this flag to true for any static global that has no mutable fields.
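For reference, here's roughly what that looks like with the LLVM C++ API (the linkage and name here are just placeholders):

#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Module.h"

// Emit an immutable static global; with isConstant=true, LLVM places it in a
// read-only section, so any store through it will fault at runtime.
llvm::GlobalVariable *emitConstantGlobal(llvm::Module &module, llvm::Type *type,
                                         llvm::Constant *initializer) {
  return new llvm::GlobalVariable(module, type,
                                  /*isConstant=*/true,
                                  llvm::GlobalValue::InternalLinkage,
                                  initializer, "static.global");
}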

However, each object has a hidden integer field called __gcstate which is intended for use by the garbage collector. The meaning of this field is unspecified; it's up to the various garbage collector implementations to decide what data they need to store on a per-object basis. A typical example might be a "marked" bit to indicate that the object has been traced. However, garbage collectors will need some way to detect objects in read-only memory and refrain from writing to the __gcstate field of those objects. Probably the simplest solution is to assume that all objects whose address is outside of the collector's memory pool should not be modified. That means that we'll need to keep a record of which objects have been traced in some sort of auxiliary data structure.
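Here's a minimal sketch of that idea in C++. The Heap layout and markObject are hypothetical, just to show the address-range test and the auxiliary mark set:

#include <cstdint>
#include <unordered_set>

// Hypothetical collector state: the bounds of the managed pool, plus an
// auxiliary set recording marks for objects we must not write to.
struct Heap {
  uintptr_t poolBegin, poolEnd;
  std::unordered_set<void *> externalMarks;
};

// Returns true if the object was newly marked.
bool markObject(Heap &heap, void *obj, uint32_t *gcState) {
  uintptr_t addr = reinterpret_cast<uintptr_t>(obj);
  if (addr >= heap.poolBegin && addr < heap.poolEnd) {
    if (*gcState & 1)
      return false;      // already marked
    *gcState |= 1;       // set the mark bit in __gcstate
    return true;
  }
  // Outside the pool (possibly read-only): never touch __gcstate.
  return heap.externalMarks.insert(obj).second;
}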

(This makes me wonder if it's even worth having a __gcstate field, since the collector can easily create any auxiliary data fields it wants by prefixing them to the beginning of the object -- although there are some alignment benefits to having the field be within the object, especially on 32-bit architectures.)

Another possibility is to have the __gcstate of all readonly objects always be zero, and the __gcstate of any objects allocated by the garbage collector be non-zero.

Tuesday, January 25, 2011

Executable size measurements

I've been working on improving my Python script that takes the output of 'nm' and parses it to get a sense of how much space in the executable is being taken up by various data structures output by the compiler. Here's what the output looks like:

Total size:  155824

Category                          Count   Size Average Percent
================================ ====== ====== ======= =======
Reflection                           --  39088      --    25.1
  Composite types                   128  22528     176    14.5
  Enum types                          1     64      64     0.0
  Other types                         9    576      64     0.4
  Type references                     2     80      40     0.1
  Type lists                         45   1616      35     1.0
  Invokers                            8   6240     780     4.0
  Name tables (head)                 42   2880      68     1.8
  Name tables (simple)               41   3591      87     2.3
  Name tables (compound)             29    793      27     0.5
  Field offsets                      39    720      18     0.5

TIB                                  --  12384      --     7.9
  TypeInfoBlocks                    128   7040      55     4.5
  Base class lists                   56   1856      33     1.2
  IDispatch functions                91   3488      38     2.2

Standard Library                     --  94492      --    60.6
  tart.core                         209  15944      76    10.2
  tart.collections                  214  40016     186    25.7
  tart.gc                            16   1012      63     0.6
  tart.io                             0      0       0     0.0
  tart.reflect                      165  30096     182    19.3
  tart.testing                       10   2896     289     1.9
  tart.text                          22   4528     205     2.9

Classes                              --  57976      --    37.2
  tart.core.Array                   177  13928      78     8.9
  tart.core.String                   21   5152     245     3.3
  tart.collections.ArrayList         86  22944     266    14.7
  tart.collections.ImmutableList     69  14672     212     9.4
  tart.collections.List              21   1280      60     0.8

Static strings                      151  10728      71     6.9
Trace tables                         39    736      18     0.5

You'll note that in some cases the percentages add up to more than 100%. That's because not all of the categories are exclusive - for example, a given symbol can be in both the "Reflection" and "Standard Library" subheadings.

Also note that the actual size of the binary file being analyzed was 332049 bytes, not 155824 as is claimed above. The 155824 number only includes symbols that were generated by the Tart compiler.

A few surprising things:
  • CompositeType structures, which serve the function of metaclasses, take up 14% of the space. I have an idea how to make that smaller, however.
  • Invoker functions, which are responsible for boxing and unboxing of parameters when calling a method via reflection, take up 4% of the space. I'm not sure these can be shrunk significantly.
  • ArrayList is much bigger than Array. Note that for templated classes such as Array and ArrayList, the size includes all template expansions for different type parameters.

Type inference and type constraints

Some notes about the way Tart's type inference solver works.

The type solver takes as its input an expression tree, and returns an updated expression tree. The input tree has ambiguous or unspecified types for the expression nodes in the tree, while the output tree has fully-specified types for every expression node. If the input tree is already fully specified, or the solver fails to find a solution, then the original tree is returned unchanged.

The first phase is the "gather constraints" pass. The solver walks the entire tree looking for "constraint sites", that is, nodes in the tree that implicitly constrain the types of their child or parent nodes. The simplest example is a call to a non-overloaded function: Each sub-tree representing an argument is constrained to be the same type as the corresponding function parameter.

In more complex cases, a function may have multiple overloads, or may be a template where the type of the argument is a template parameter. For cases like this, we create a "type constraint object", which is a subclass of Type. A type constraint represents all of the possible types that could exist at that location.

For example, suppose we call "foo(a, b)", and there are two overloads of foo: "foo(int, int)" and "foo(float, float)". Since we don't know yet whether parameter 'a' is an int or a float, we instead assign to 'a' a type constraint representing the set {int, float}.

However, a type constraint is not just a passive collection of types. The constraint remains connected to the original site - so for example, if we decide later to discard one of the two overloads of 'foo', the constraint will automatically update itself to reflect the new set of types that could exist at that point.

Type constraints have many of the same methods that types do. For example, types have a method called canConvert(), which attempts to convert an expression to that type, and returns an integer indicating the compatibility level for that conversion - whether it is possible, whether it would cause a compiler warning, etc. For type constraints, calling canConvert() simply calls canConvert() on each of the types in the set, and returns the worst compatibility level found. Similarly, type constraints have isSubclass(), includes(), isEqual() and other type comparison methods.
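A sketch of the canConvert() forwarding, using made-up C++ stand-ins rather than Tart's actual classes:

#include <algorithm>
#include <vector>

struct Expr;

// Compatibility levels, ordered from worst to best.
enum ConversionRank { Incompatible = 0, ConversionWarning = 1, ExactConversion = 2 };

struct Type {
  virtual ConversionRank canConvert(const Expr &e) const = 0;
  virtual ~Type() = default;
};

// A type constraint is itself a Type; it forwards canConvert() to each
// candidate and reports the worst level found.
struct TypeSetConstraint : Type {
  std::vector<const Type *> candidates;  // stays linked to the call site
  ConversionRank canConvert(const Expr &e) const override {
    ConversionRank worst = ExactConversion;
    for (const Type *t : candidates)
      worst = std::min(worst, t->canConvert(e));
    return worst;
  }
};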

There are several different kinds of type constraints. The three most important are:
  • "parameter-of": Given a call site, return all of the possible types for the Nth parameter of the function being called.
  • "result-of": Given a call site, return all of the possible types for the return value of the function being called.
  • "type-parameter-of": Given a set of types, some of which may be parameterized types, return the set of their type parameters. So for the input set {Address[int], Address[float]}, it returns the set {int, float}.
Once the type constraints are in place, the solver starts the next phase, which is unification. This looks for any unbound template parameters in the expression, and binds them to the type of the corresponding sub-node. So if the template function foo(%T) is called with 'int', it binds %T=int. In many cases, the template parameter will be bound to a type constraint.

After unification comes pruning. This is essentially an optimization problem: For each call site, we want to reduce the number of choices available, but only in a way that increases the total compatibility ranking for the entire expression.

There are three pruning stages. Each stage makes use of a backtracking "cull" function. The "cull" function eliminates from a call site one or more choices (i.e. overloaded methods) for that site, but does it in a way that can be backtracked. (It basically just saves an integer backtracking level with each choice.)
  • Cull by ranking: We cull all methods that have a conversion compatibility level less than some threshold. So for example, we remove from consideration all choices that would produce a compiler warning.
  • Cull by specificity: For any given site, we compare the choices, and if one choice is strictly more general than another, we eliminate the more general one. So given foo(String) and foo(Object), both of which have the same compatibility ranking, we eliminate foo(Object) from consideration.
  • Cull by elimination: If neither of the first two culling stages produces a solution, we do a brute force search over all possible permutations.
In each culling phase, we keep track of whether the particular combination of cullings is overconstrained or underconstrained. If it's overconstrained (meaning that there are one or more sites with no choices left), then we backtrack. If the solution is underconstrained, meaning that there are some sites with more than one choice remaining, then we proceed to the next phase and tighten the constraints even further.
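Here's the basic shape of that backtrackable bookkeeping, as a hypothetical C++ sketch (the real solver's structures are more involved):

#include <vector>

// Each choice records the level at which it was culled; -1 means still viable.
struct Choice { int culledAtLevel = -1; };
struct Site { std::vector<Choice> choices; };

void cull(Choice &c, int level) {
  if (c.culledAtLevel < 0)
    c.culledAtLevel = level;
}

// Backtracking just un-culls everything at or above the given level.
void backtrackTo(Site &site, int level) {
  for (Choice &c : site.choices)
    if (c.culledAtLevel >= level)
      c.culledAtLevel = -1;
}

// 0 => overconstrained (backtrack); >1 => underconstrained (keep tightening).
int viableCount(const Site &site) {
  int n = 0;
  for (const Choice &c : site.choices)
    if (c.culledAtLevel < 0)
      ++n;
  return n;
}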

The solver finishes when there is only one choice remaining at each constraint site.

Monday, January 24, 2011

Tart status update

I got a lot of Tart-related work done over the weekend, and am very close to having a simple garbage collector working.

One new feature is the ability to create objects using a custom allocator. The garbage collector needs this, because it requires some objects to be created before it has finished initializing. These objects are allocated in "permanent memory" outside of the region of memory that the collector manages.

Since Tart doesn't have a 'new' keyword, we can't use the C++ syntax "new(allocator) Type(args...)". Instead, the type's Type object has a method, 'create', which takes an allocator as an optional argument. There are currently two allocators, the "default" allocator, which just calls GC_alloc(), and the "malloc" allocator which calls malloc(). Allocating an object via malloc() looks like this:

let obj = typecast[MyClass](CompositeType.of(MyClass).create(MallocAllocator.INSTANCE));

A bit wordy, but that's fine, because this should only be used rarely. It's also slower than the normal object creation pattern, but again that should be fine.
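Under the hood, the allocator hook amounts to something like this C++ sketch (hypothetical names; GC_alloc is the collector entry point mentioned above):

#include <cstddef>
#include <cstdlib>

void *GC_alloc(std::size_t size);  // provided by the collector runtime

struct Allocator {
  virtual void *alloc(std::size_t size) = 0;
  virtual ~Allocator() = default;
};

// The "default" allocator hands allocation to the garbage collector.
struct DefaultAllocator : Allocator {
  void *alloc(std::size_t size) override { return GC_alloc(size); }
};

// The "malloc" allocator puts objects in permanent memory that the
// collector never reclaims.
struct MallocAllocator : Allocator {
  void *alloc(std::size_t size) override { return std::malloc(size); }
};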

One question that I've been mulling over is Tart's support of pointers. In the ideal case, a user of Tart should never have to deal with pointers directly; everything should be done in terms of object references (which are essentially pointers underneath, but have a lot of special semantics known to the compiler). However, because a lot of Tart's runtime is written in Tart, there is a large body of Tart code that does in fact use pointers directly.

Because I don't want to encourage people to use pointers, the syntax for declaring and using a pointer is deliberately somewhat verbose:

import tart.core.Memory.Address;

var p:Address[ubyte];

Similarly, there's no special syntax for dereferencing a pointer - instead you say something like "Memory.deref(ptr) = value" or "ptr[0] = value".

However, because I've been working with pointers so much recently, I've slowly been giving way and adding operator support for pointers - so now you can say "ptr += 3" instead of "ptr = Memory.addressOf(ptr[3])".

Yesterday I added comparison operators for pointers - <, <=, >, >=, and so forth. Unfortunately, this confused the type solver, and I'm still tracking that one down. Aside from that, adding operator support for pointers is fairly trivial. It would take me all of 10 minutes to add "type^" or "type*" as a synonym for "Address[type]". The question is, should I do it?

Sunday, January 16, 2011

Anonymous classes

In my day-to-day work using Java, one of the most frequently used and useful features is anonymous inner classes. Here's a code snippet showing a typical use case, in this case an asynchronous load of a resource:

final FormattingLogger logger = getLogger();
resources.loadImageAsync("texture.png", new AsyncCallback<Image>() {
  @Override
  public void onSuccess(Image img) {
    // do something with img
  }

  @Override
  public void onFailure(Throwable error) {
    logger.logError("Resource load failed: " + error);
  }
});

What this does is create an unnamed subclass of AsyncCallback<Image>. AsyncCallback in this case is a generic interface that has two methods: onSuccess and onFailure.

Since Java doesn't support closure functions (although apparently the next version of Java will), anonymous inner classes are used instead. They also have one obvious advantage over a function pointer, which is the ability to support multiple methods, such as the onSuccess & onFailure case in the example above.

Like a closure, the anonymous inner class has access to the environment in which it is defined, as seen in the use of 'logger' above. Note that Java doesn't support true capturing of variables, however - any variable referenced by the inner class must be declared 'final'. (The reason is that Java makes a copy of the variable, rather than giving shared access - so making it immutable reduces the confusion that would result if you tried to modify the variable.)

Although the example doesn't show it, you can also pass constructor parameters to the anonymous class in the parentheses following the class name. Of course, you would also need to declare a constructor with the appropriate type signature.

Now, there's no reason why you couldn't create a named subclass of AsyncCallback to do exactly the same thing. It's just more typing, and it also requires the subclass to be defined elsewhere, outside of the place where it is used.

It seems logical that we would want a similar feature in Tart. However, the above syntax won't work, because Tart doesn't have the "new" keyword. Simply typing "classname() { ... }" won't work either, because the syntax is ambiguous - as in the case "if classname() { ... }". So we'll need to come up with some other syntax.

Also, I suspect that anonymous classes, while still useful, will be used somewhat less frequently in Tart than in Java, due to the existence of closures. Assuming that both closures and anonymous classes are available, a programmer who is designing a callback API would need to consider the following factors in deciding which to use:
  • If there are multiple methods in the callback, then an anonymous class would seem a better fit. That being said, it is possible to either use multiple closures (one for each callback method), or to define a single closure whose parameters allow the passing of different meanings (such as a status code + result). However, this rapidly becomes unwieldy as the number of methods increases.
  • Some would argue that both the class name and the method name of the callback are an important part of the self-documentation of the callback code. If I define an interface that is named "AsyncCallback" with a method "onSuccess", that tells me more than if I use a generic "Function" type.
Still, even with the above caveats, programmers are going to choose closures a large part of the time because of their convenience, especially for interfaces where there's only a single method. So the need to provide anonymous classes isn't quite as crucial as it would be in Java.

Closures are now regular objects

Well, that didn't take long.

Previously, closures were implemented as a special data type consisting of a pair (function-pointer, environment). Now closures are regular objects that implement the Function interface:

/** Interface that represents a callable object. */
interface Function[%ReturnType, %ParamTypes...] {
  def (params:*ParamTypes) -> ReturnType;
}

This means that from the caller's point of view, there's no difference between a closure and any other object that is callable.
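A rough C++ analogy (not Tart) of what this lowering looks like - a closure becomes just another object implementing the call interface, with its captured environment stored in fields:

#include <string>
#include <utility>

// The Function interface: one abstract call operator.
template <typename ReturnType, typename... ParamTypes>
struct Function {
  virtual ReturnType operator()(ParamTypes... params) = 0;
  virtual ~Function() = default;
};

// What a compiler might generate for a closure that captures 'prefix'.
struct GreetClosure : Function<std::string, std::string> {
  std::string prefix;  // captured environment
  explicit GreetClosure(std::string p) : prefix(std::move(p)) {}
  std::string operator()(std::string name) override { return prefix + name; }
};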

There are several language features worth highlighting in the above code snippet:
  • Templates with variadic parameters.
  • *ParamTypes to expand a list of types as a function signature. Works with any tuple type, not just template params. (Although it's currently limited to methods that don't have a body, since I have not yet added code to access the parameter values.)
  • def () syntax for defining a callable object (similar to Python's __call__.)
Unfortunately, bound methods are broken by this change; I'll fix them up later.

Thursday, January 13, 2011

More on statements as expressions

I've been playing around with this idea some more, and I'm starting to like some of the results.

Here's the specific changes that would be made to the language syntax:

  • The following statements will be able to be used as expressions:
    • if
    • switch
    • classify, which will be renamed 'match'.
    • with
  • Statement blocks can be used as expressions.
    • The last statement in the block is the result.
    • The semicolon after the last statement is now optional - meaning that semicolon now becomes a statement separator rather than a statement terminator.
    • Braces are still always required around statement blocks.
    • Return values can be type 'void' if there's no actual value to return.
    • One particularly challenging issue will be making the type inference system smart about statement blocks - that is, the type of the last statement in a block will in some cases be inferrable from the type of expression that block is assigned to.
  • The 'return if' syntax is no longer allowed ("return value if condition") since it's ambiguous.
    • We can still keep "break if" and "continue if" if desired, but it may be best to remove those as well. I need to think about it.
  • 'cond' goes away as it is no longer needed.
    • Macros can stay for the moment, as they are needed for assertions (they make it possible to print the string form of the assertion expression.)
Here's some example syntax to show how the code would look:

// All in one line
var x = if n != 0 { result1 } else { result2 }
var x = if n == 0 { result1 } else if n == 1 { result2 } else { result2 }

// Multi-line if
var x = if n == 0 {
  result1
} else if n == 1 {
  result2
} else {
  doSomethingBeforeReturning();
  result2
}

// if-statement as a function argument:
writer.printLn(

  if n == 0 { result1 }
  else if n == 1 { result2 }
  else { doSomethingBeforeReturning(); result2 });

// Match statement
var x = match n {
  as s:String { "String" }
  as t:Type { "Type" }
  as * { "Other" }
}

// Switch statement
var x = switch n {
  case 1 { "one" }
  case 2 { "two" }
  case * { "none" }
}

// With statement
var str = with f:open(file) {
  ",".join(f.lines())
}

// With statement with multiple values
var str = with (f1:open(file1), f2:open(file2)) {
  ",".join(f1.lines()) + ",".join(f2.lines())
}


In my earlier post, I suggested that try/catch also be usable as an expression, however there's a bit of a problem there, which is what to do with 'finally'? When used without a 'finally' statement, the flow of control in a try/catch always reaches the end of one and only one block - so you know exactly which value is returned. With 'finally', however, it's not clear whether the value returned is the one at the end of the finally block, or one of the other blocks.

Tuesday, January 11, 2011

Tart status update

I've started working on Tart again, after a break of about a month. The reason I stopped was that I was feeling a bit burned out, and I felt like I had hit a wall with respect to some unsolved problems. I figured that if I deliberately kept away from working on it for a while, I'd not only recharge my batteries, but perhaps I could get some new inspiration as well.

And in fact I did have some minor brainstorms during this period, which made me want to work on it again. Here's a summary of the direction my thoughts have been going lately:

Reflection

If you remember from earlier, I had been kind of obsessing about the large amount of reflection metadata that was being generated. Well, I've made a decision, which is that reflection will now be opt-in rather than opt-out for classes. That means the compiler will not generate reflection information for class methods and fields unless there is an annotation telling it to do so.

The @Reflect annotation will cause a class, and all of its subclasses, to be fully reflected. In addition, any class member which has an annotation will be reflected, if that annotation is retained in the final executable (the reflection structure is what holds the list of annotations, so it needs to be present in order for the annotations to be found).

Initially, I assumed that all classes would want to have detailed reflection information. My rationale was that reflection is about discovery - it's about taking a data type that your code has never seen before, and extracting information from it. I reasoned that for any given class, there's no way to know what possible future users of that class might want to discover, so the policy would have to be conservative and inclusive.

However, after thinking about this some, I realized that in practice, the use of reflection generally begins from some known starting point, such as a particular interface, or classes annotated with a particular annotation. Take for example JUnit - it uses reflection to identify all of the test methods within a class (any method whose name begins with the word 'test'). However, it only does this for classes that are derived from TestCase. So that's a known starting point. Similarly, another classic use case for reflection is to be able to serialize or edit a class by getting a list of its properties. However, the author of the class to be serialized or edited generally knows in advance that this is going to happen, and designs the properties and fields of the class to facilitate such access - so requiring an extra attribute on the class to enable this is not a big hardship.

Another common pattern is to scan the class members looking for ones that bear a particular annotation. An example is Guice's dependency injection logic - it scans a class for any constructors that are annotated with @Inject or @Provides. It's not interested in class members that don't have annotations, so there's no need to reflect those.

One possible pitfall of this scheme is that a reflected class may have members whose types are not reflected. For example, a class might have a member whose type is List[String]. Neither List nor String are reflected by default. That means that if you were trying to do a recursive traversal of some tree of objects using reflection, you wouldn't be able to traverse fields with these types. That doesn't mean that you would have no information at all - non-reflected classes still have information about their name, module path, base classes, type parameters, and so on. They just lack information about their members.

For the case of List[String], your traversal algorithm could just have a special case for fields of type List, and iterate through the list normally instead of using reflection. You still wouldn't be able to handle fields whose types were both non-reflected and unfamiliar to your code. The question is, how important is this use case? At the moment, I can't think of a situation where this would be important, but I could be wrong.

The advantage of the opt-in strategy is three-fold: First, it means that we don't generate a pile of unneeded metadata each time a template gets instantiated. Secondly, because we no longer have to worry about size, I can rip out a whole bunch of complicated compression / decompression logic from the Tart runtime (although I still keep method descriptors and field descriptors compressed, but not types). And finally, since I'm no longer creating types lazily, it means that reflection metadata can go back to being static constants - which means that every Tart type once again has a unique, canonical, immutable pointer for its runtime information.

Functional Style

After watching the Scala presentation at Google last month, I realized that there might be some advantage to making some statements more functional in style than they currently are. This doesn't mean that I've drunk the functional programming kool-aid, and I certainly don't think that functional programming is the panacea that some of its adherents claim. However, I was pretty impressed with the brevity of the code examples and the clarity of the abstractions involved.

Thinking about those code samples made me realize that there's a very common coding pattern in imperative languages that kind of annoys me, although my level of annoyance was so minor that I wasn't aware of it before. Here are some examples:

Example 1:

Record record;
if (some_condition) {
  record = dataSource.getRecord(FIRST_ID);
} else if (other_condition) {
  record = dataSource.getRecord(SECOND_ID);
} else {
  record = dataSource.getRecord(OTHER_ID);
}

doSomethingWithRecord(record);

Example 2:

Record record;
try {
  record = dataSource.readRecord(RECORD_ID);
} catch (IOException e) {
  logger.log(e);
  return;
}

doSomethingWithRecord(record);

In both cases, we're assigning to a temporary variable because in Java/Python/C++, expressions are subservient to statements - that is, statements may contain expressions, but expressions can never contain statements. If we want to calculate an expression where there is control-flow involved in the calculation, then we have to break up the expression into multiple statements.

(Note that Example 1 can be generalized to switch statements as well as chained if-else statements.)

A language that supported a functional style, in which statements could return values, would allow these statements to be written more concisely:

doSomethingWithRecord(
    if (some_condition) {
      dataSource.getRecord(FIRST_ID);
    } else if (other_condition) {
      dataSource.getRecord(SECOND_ID);
    } else {
      dataSource.getRecord(OTHER_ID);
    });

doSomethingWithRecord(
    try {
      dataSource.readRecord(RECORD_ID);
    } catch (IOException e) {
      logger.log(e);
      return;
    });

Of course, Tart already lets you do the first with the 'cond' macro:

doSomethingWithRecord(
    cond(some_condition, dataSource.getRecord(FIRST_ID),
         other_condition, dataSource.getRecord(SECOND_ID),
         dataSource.getRecord(OTHER_ID)));

There are several problems with 'cond' however:
  • It looks different from a regular if-statement. In other words, the language has two completely different if-statements, one functional, one procedural. Think about this for a moment from the perspective of an IDE with syntax highlighting - the 'cond' macro will be syntax-colored just like a regular function call, but it would be nice if the editor could syntax-color it like a flow-of-control construct.
  • The 'cond' macro only allows a single expression to be returned as the result of each if-test. It would be better if the 'value' argument could contain multiple statements. (You can do this by making a closure, but that's overkill IMHO.)
  • 'cond' relies on Tart's macro system, a feature which so far is not pulling its weight in terms of benefits delivered. I think language features which are only useful in obscure edge cases should be removed, and so far macros have exactly 3 use cases: 1) cond, 2) lazyEval, and 3) assert.
For the second example, with the try-catch statement, we have another interesting issue, which is the behavior of the return statement. When used as an expression, 'return' acts as a non-local exit out of the function - it effectively cancels the function call as a side-effect of evaluating the calling arguments. Pretty freaky but completely logical given the original example that it was derived from.

Unfortunately, there's no way that Tart's macro system can implement try-catch as an expression. If we want to support cases like this, then it's got to have some sugar from the compiler.

If we decided to go down this road, one of the first things we would need to deal with is how to handle expressions containing multiple statements:

doSomethingWithRecord(
    if (some_condition) {
      doSomePreparatoryWork();
      dataSource.getRecord(FIRST_ID);
    } else if (other_condition) {
      doSomeDifferentPreparatoryWork();
      dataSource.getRecord(SECOND_ID);
    } else {
      dataSource.getRecord(OTHER_ID);
    });

In the above example, the first if-block has two statements. What is the value that results when that branch of the if-statement is taken? We can't use the 'return' statement at the end of the block, because as we've seen that returns from the function as a whole. We could come up with some new keyword to indicate the result produced (I haven't been able to think of one), or perhaps some new operator, or we could do as some languages do and say that the last statement in a block is always the result. I don't particularly like the implicit last-statement rule - EIBTI (explicit is better than implicit).

(Pure functional languages don't generally have this problem because they don't have 'statements' per se, only expressions.)

It also quickly becomes obvious that the block statements { } are in fact expressions. Does this mean that one can omit the 'return' statement at the end of a function, and just put an expression? That's annoyingly perl-ish from my point of view.

There's also the question of whether all those braces are necessary. Tart's rule for block statements up to this point has been that the braces are not optional, but the parens around the condition can be omitted. However, I'm pretty open to a certain degree of design churn for an experimental language in development, which is another way of saying that I'm open to changing my mind.

In any case, I haven't come to any conclusions on this matter as yet, so no action will be taken.

Functions as Objects

I'd like to be able to write the following in Tart:

interface Function[%ReturnType, %ParamTypes...] {
  def (params:ParamTypes) -> ReturnType;
}

I currently can't do this because ParamTypes is a variadic template argument. Ideally, the Function class should be able to take a list of 0 or more function parameter types as template arguments, such as Function[String, int, int, int].

Secondly, I'd like closures and bound methods to implement this interface - that is, if you have a closure, you can assign it to a variable of this type, assuming the parameter types match. Also, the type declaration "fn (params) -> ReturnType" should simply be a shortcut for this type. Doing that involves changing the internal representation of closures to be more like objects, so that when a closure is called it goes through the normal vtable dispatch mechanism.

The advantage of doing this is that you can derive new classes from Function, and they can be used in all of the same places that closures and bound methods are used (similar to __call__ in Python.)

There's another cool feature that I'd like to steal from Scala: Let's say you have a function that takes as its argument an interface with a single callable method, such as Factory[String].build(). In Scala, you can pass a closure as that argument instead of creating a class which instantiates that interface. So for example:

interface Factory[T] {
  def build() -> T;
}

def buildLotsOfStuff(factory:Factory[String]) {
  // ...
}

// Convert a closure to a Factory[String].
buildLotsOfStuff(fn { return "Hello"; });

// In Java, we'd have to do this much more verbosely:
buildLotsOfStuff(new Factory<String>() {
  public String build() {
    return "Hello";
  }
});

I'm not quite sure how Scala does this - that is, there are two ways I can think of to do it, but each has disadvantages. The first way is to automatically generate an adapter object when the closure is converted to a reference to type Factory. So for example, the compiler would generate something like this:

class AnonymousFunctionAdapter : Factory[String] {
  var closure:fn -> String;

  def construct(closure:fn -> String) {
    self.closure = closure;
  }

  def build() -> String {
    return closure();
  }
}

This works in all cases, but is less efficient because of the overhead of calling the adapter and because you have to allocate the adapter instance.

The other approach would be for the compiler to generate the closure itself as an instance of Factory[String]. All that is required in this case is to add the additional dispatch table entries to the closure's class - that is, the actual method pointer would be copied into two dispatch slots, one for Function.(), and one for Factory.build(). This means that you avoid allocating the extra object and the code is able to jump right into the method body directly without having to pass the arguments through the adapter.
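In other words (hypothetical layout, not Tart's actual object model), the dispatch table would look something like this:

using BuildFn = const char *(*)(void *env);

// The closure's dispatch table carries the same method pointer under two
// slots: one for Function.(), one for Factory.build().
struct ClosureVTable {
  BuildFn slots[2];
};

const char *helloBody(void * /*env*/) { return "Hello"; }

// Both interfaces jump straight into the same body - no adapter object.
const ClosureVTable helloClosureVT = { { helloBody, helloBody } };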

However, there's a drawback, which is that this technique only works if the compiler knows what type the closure is going to be cast to at the point where the closure is generated. If we pass the closure around as a function pointer, and then later decide to cast it to some other callable type, we won't be able to modify the closure to conform to that type's calling conventions.