Monday, October 25, 2010

Switch statement syntax

Here are a few minor design puzzles. C's "switch" statement needs a "break" statement at the end of each case; without one, execution falls through into the next case. I've long felt that this was a design misfeature, and while it is useful for some clever hacks, it's also a source of much trouble. I decided instead to make each case a first-class block. Here's a sample of what switch statements look like in Tart:

   switch value {
     case 1 {
       // something
     }

     case 2
     case 3 {
       // something
     }

     else {
       // something
     }
   }

Note that cases 2 & 3 go to the same block. Syntactically, the rule is that a case value can be followed by one of two things: another case value, or the start of a statement block.

One alternative syntax I considered was to use a comma to separate case values: "case 2, 3". However, I discarded this for two reasons: first, because it looks ugly if there are a lot of case values and you need to break the line, and second, because I eventually want to be able to have case values that are tuples.
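For comparison, here is how one existing language handles both concerns. This is a small sketch in Rust, not Tart: Rust's `match` separates alternative values with `|`, which leaves the comma free for tuple-valued cases. The function names `describe` and `quadrant` are made up for illustration.

```rust
// Rust, not Tart. `describe` and `quadrant` are hypothetical helper names.

/// Several values share one arm by separating them with `|`.
fn describe(value: i32) -> &'static str {
    match value {
        1 => "one",
        2 | 3 => "two or three",
        _ => "something else", // plays the role of Tart's `else` block
    }
}

/// Tuple-valued cases: the comma belongs to the tuple, `|` to the alternation.
fn quadrant(point: (i32, i32)) -> &'static str {
    match point {
        (0, 0) => "origin",
        (0, _) | (_, 0) => "on an axis",
        _ => "off-axis",
    }
}
```

A dedicated separator such as `|` also survives line breaks reasonably well when there are many case values, though whether it looks better than a bare newline is a matter of taste.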

However, the lack of a delimiter after the case value is a bit troubling. I've discovered at least one bug where I typed a case value and forgot to write the subsequent block, which resulted in execution falling through to the next block. If a delimiter were required, then the compiler would have caught that.

There's also a problem with the use of the keyword 'else' instead of 'default' - sometimes I like to put the default case first instead of last, and that doesn't read as well with the word 'else'.

Finally, there is the issue of the 'functional-switch'. A lot of functional languages have a switch statement which returns a value (let n = switch value { ... }). I've been wondering if Tart should have something like this. However, it's not clear to me how you would indicate which value to return in a block, especially in an imperative language which supports multiple statements. The choices are (a) return the value of the last statement in the block (if there is one), or (b) have a special keyword for returning a value of a block. Both approaches have issues.
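To make the two choices concrete, here is a sketch of both in Rust, which happens to support each of them. It is only an illustration of the trade-off, not a proposal for Tart syntax, and the function names are hypothetical. Option (a) is Rust's ordinary rule that a block evaluates to its final expression; option (b) is approximated with `break 'label value`, which hands a value out of a labeled block (stable since Rust 1.65).

```rust
/// Option (a): each arm's block evaluates to its last expression.
fn label_last_expr(value: i32) -> String {
    let n = match value {
        1 => {
            let base = "one"; // ordinary statements may come first
            base.to_uppercase() // no semicolon: this is the block's value
        }
        2 | 3 => String::from("few"),
        _ => String::from("many"),
    };
    n
}

/// Option (b): an explicit keyword hands a value out of a block.
/// Here `break 'result <value>` plays that role.
fn label_keyword(value: i32) -> String {
    let n = 'result: {
        if value == 1 {
            break 'result String::from("ONE");
        }
        if value == 2 || value == 3 {
            break 'result String::from("few");
        }
        String::from("many") // fall-back value of the block
    };
    n
}
```

Option (a) is terse but makes a stray semicolon significant; option (b) is explicit but needs a keyword (and here a label) that exists only for this purpose.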

Having a functional-switch implies that there should also be a functional-if, which is nothing more than the C/Java/Python conditional operator. Tart doesn't have or need such an operator, since the 'cond' macro does the same job without needing special syntactic sugar. However, doing a switch statement as a macro is a little more difficult, since the number of arguments to the macro can grow arbitrarily large.
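As a rough illustration of what a macro-based switch has to cope with, here is a sketch using Rust's `macro_rules!`, which supports repetition and so can accept any number of arms. This says nothing about how Tart's macro system works; `switch_on!`, `parity`, and `name_of` are hypothetical names invented for the example.

```rust
/// Hypothetical `switch_on!` macro: any number of `value => result` arms,
/// followed by a mandatory `else` arm. Expands to an if / else-if chain.
macro_rules! switch_on {
    ($subject:expr; $($case:expr => $result:expr),+ ; else => $default:expr) => {{
        let subject = $subject; // evaluate the subject exactly once
        $( if subject == $case { $result } else )+ { $default }
    }};
}

/// Functional-if: in Rust, `if` is already an expression, so no
/// separate conditional operator is needed.
fn parity(n: i32) -> &'static str {
    if n % 2 == 0 { "even" } else { "odd" }
}

/// Functional-switch built from the macro above.
fn name_of(n: i32) -> &'static str {
    switch_on!(n; 1 => "one", 2 => "two", 3 => "three"; else => "many")
}
```

Because the macro expands to nested conditionals, only the result expression of the matching arm is evaluated, which is the property that makes a macro (rather than a plain function) the natural tool here.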

Tart status update

I've been working pretty hard on Tart the last few weeks. I've been focused on several areas:

DWARF debugging - There's not much progress in this area, despite my best efforts. Lately I've been attacking the problem from a different direction, which is to produce a reproducible test case that doesn't require the entire Tart infrastructure to be present. Unfortunately, doing this requires fixing a bunch of minor issues which would not otherwise be high priority. For example, the Tart linker fails unless optimization is enabled, due to a bug in the LLVM lowering pass for garbage collection intrinsics. Unfortunately, I need an unoptimized test case if I want to get the LLVM experts to help me with my problem. So I have to fix the second problem in order to even start working on the first.

Garbage Collection - The stack crawler is completely working now. The next step is to write a very simple test collector and a test for it.

One issue that concerns me is that I want Tart to be able to support pluggable collectors like Java does. Java has the ability to specify a collector when you launch the program - in other words, the selection of collection algorithm can be deferred until the program begins execution. In order to do this, I'd need some sort of plugin or loadable module system, something I haven't even begun to think about.

The best I can do for the moment is to make the collector algorithm a link-time decision; that is, to have different static link libraries for different collectors.
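The difference between the two kinds of decision can be sketched as follows. This is Rust rather than Tart, and everything in it is an assumption made for illustration: the `Collector` trait, the `MarkSweep` and `Copying` types, and the flag strings are invented, and the launch-time version sidesteps the loadable-module question by simply compiling every collector into the program.

```rust
/// Hypothetical collector interface; not Tart's actual runtime API.
trait Collector {
    fn name(&self) -> &'static str;
}

struct MarkSweep;
struct Copying;

impl Collector for MarkSweep {
    fn name(&self) -> &'static str { "mark-sweep" }
}

impl Collector for Copying {
    fn name(&self) -> &'static str { "copying" }
}

/// Build-time decision: the program is built against exactly one collector.
/// Switching collectors means rebuilding or relinking.
type BuiltInCollector = MarkSweep;

fn built_in_collector() -> BuiltInCollector {
    MarkSweep
}

/// Launch-time decision: the choice arrives as data (for example a
/// command-line flag) and is resolved when the program starts.
fn collector_from_flag(flag: &str) -> Option<Box<dyn Collector>> {
    match flag {
        "mark-sweep" => Some(Box::new(MarkSweep)),
        "copying" => Some(Box::new(Copying)),
        _ => None, // unknown collector name
    }
}
```

The build-time form costs nothing at run time and needs no registry, which is why it is the easier starting point; the launch-time form needs either every collector present in the binary, as here, or a plugin mechanism to load one on demand.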

Reflection - I've been working hard on the new reflection system, the one that uses compressed metadata to describe all of the various classes and methods in a module. Right now the compiler emits the metadata for both the old and new systems, which makes the resulting object files much larger than I would like. Unfortunately, the new system isn't yet up to the task of replacing the old system. For example, the "execute by name" feature - which is used by the unit test framework - isn't ready yet.

I've also been plagued by a number of strange anomalies. CMake has been acting weird lately - a build rule that was working fine for several months suddenly stopped working. I didn't notice at first since the bug only manifests on a clean build. Although CMake is a great tool, there are certain kinds of build rules which really ought to be simple, but which are nearly impossible to do without truly ugly and non-intuitive work-arounds. For example, it's nearly impossible to define an add_custom_command() rule which has a dependency on a build product which was produced in a different directory.

Sigh...it seems like the last 3-4 months have been spent dealing with low-level details of LLVM, DWARF, and so on - I've spent virtually no time thinking about the Tart language per se, which is what I really want to be working on.