Thursday, September 22, 2011

Some Metaprogrammatic Musings

One of the motivations for creating the Tart language was my interest in metaprogramming. One thing I've noticed is that there's a kind of parity between metaprogramming and reflection - that is, both tend to be used to accomplish similar goals, but the former does it at compile time, whereas the latter does it at runtime. And both techniques have space costs - metaprogramming's cost comes from template bloat, whereas reflection's cost comes from all of the static metaclass information that has to be emitted by the compiler.

An example of this parity would be testing mocks. Libraries like mockito use reflection to generate mock implementations - that is, you feed it an interface, and it returns back an object that implements that interface - but the implementation is merely creating a record of all calls and parameter values. One could imagine a metaprogramming approach to the same problem - given an interface, produce at compile time a recording implementation.

One trick that D has that makes metaprogramming easier is the 'static if' construct - having an if that is guaranteed to be evaluated at compile time makes it easier to write functions that are compile-time recursive. An example would be a template that takes a variable number of template parameters - this expands to a template that does something with the first type parameter, and then recursively expands itself with the tail of the type parameter list.

However, I was thinking today, why not a static 'for each' loop? Assume you have a sequence of types whose length is known at compile time. One example is a variadic template - another is tuple types. One could construct a for loop which would iterate over the types in the sequence. Rather than actually looping, what the compiler would do is unroll the loop - that is, for a sequence of N types, what you get is the inner body of the loop replicated N times, each time with a different type parameter.

An example of where this might be useful is in declaring production rules for a parser. Say we have a template "Sequence[%T...]" which takes a variable number of type arguments, where each type argument represents a matcher for some input. So you could do something like:

class Sequence[%Matchers...] {
  static def match(in:TokenStream) -> bool {
    for M in Matchers {
      if not M.match(in) {
        return false;
      }
    }
    return true;
  }
}

So if I invoke the template with something like "Sequence[Number, Identifier, Literal['if']]", what this turns into is code that calls Number.match(), followed by Identifier.match(), and so on. Of course, you can do something similar with object instances instead of types. Instead of "Sequence" being a type, instead it's just a function, and the matchers being passed into it are matcher objects, not matcher types. Again, it depends on whether you want to process this stuff at compile time or run time. (Notice, for example, that the 'match' function of class Sequence is static - no need to create any objects at all.)

Interestingly, this starts to look a little bit like Haskell Monads, which makes me think of another thing: Tart allows overloading of operators, but one operator that is not even defined is "juxtaposition" - that is, putting two expressions next to each other with just whitespace between them. The Haskell "do" statement takes a series of expressions, where each expression is processed by the monad, and the result of that processing is the context for the next expression in the series. The idea of an imperative sequence - that is, a set of steps done in order - is only one possible way to use this. It could just as easily represent a dependency graph, a decision tree or any other idea where you have some chain of ideas, each idea building on the previous one.

One can easily simulate this in Tart (or C++ for that matter) by operator overloading, with the downside that the syntax is not as compressed - so "number >> ident >> literal" could do exactly what "seq(number, ident, literal)" does. The question is, is it worth thinking about an even more compact syntax such as maybe "Sequence :: number ident literal", where you have some prefix (Sequence) that sets up a context, and that context let's you juxtapose the items, getting rid of the operators between the terms of the expression.

Anyway, this is all just speculation for the moment, I don't have any concrete plans to change the language in this direction.

No comments:

Post a Comment