Machine Words: Tart: try/catch/finally

I just got "finally" working for simple cases. It doesn't yet handle nested finally clauses.

Cleanup handlers like finally are a challenge because the flow of control can go to different places on exit from the finally block. For example, a try block could exit with a return, a throw, a break, a continue, or just fall off the end, and each of these could potentially go to a different place depending on which one cause the exit from the try block.

One approach is to simply clone the try block at each exit point. I decided not to go this approach because of concerns about code bloat. (Although I may later add a heuristic that decides whether inlining the finally block is more efficient.)

The approach that I took was to simulate "local procedures". A local procedure is like a subroutine within a function, which you can call like a function - it resumes execution at the instruction after the call when it finishes. Now, there's no such instruction in LLVM IR (And I know why - the 'return' instruction would play merry hell with the optimizer, since it's effectively a branch to a variable address). So even though most processors have a "jump to subroutine" instruction, there's no way to represent that in LLVM language.

Instead, what I did is to write a pass that takes a list of basic blocks containing local procedure calls, and convert it into a state machine. Each local procedure call is converted into a regular branch, preceded by an instruction that sets a state variable. The end of each local procedure is converted into a conditional branch or switch which jumps back to the call site, depending on the setting of the state variable. The transform is smart enough to remove branches-to-branches, and to merge states which end up branching to the same place. (This is important for catch blocks, since normally all catch blocks end up at the same place after calling finally.)

During CFG analysis, finally blocks are stored on a stack of "cleanup" handlers for the current block. Note that this same stack will be used for the "with" statement, or any other future statement that defines a sequence of instructions that needs to be executed before exiting the block.

Which leads me to the next question, which is "how should 'with' statements work in Tart?" C# has the "using" statement, however it is much simpler than Python's "with". The using statement does one and only one thing, which is to call "dispose" on the object when the block exits (the object must implement the IDisposable interface.)

Python's "with" statement, on the other hand, does a lot more:

it has the enter and exit methods.
the expression passed to the with statement is a factory for the context object, not the context object itself.
if an exception occurs, it can pass it to the exit function.
the exit function can cancel the exception.

I don't know which of these features are essential. I'm particularly leery of the last one, the ability to cancel an exception dynamically - it doesn't fit very well with the model of statically-compiled catch blocks.

Another difference is that a Tart exception has less information in it than a Python exception - this is just a limitation of native exceptions. For example, my ability to produce a stack trace is very limited - I can do it, but only in a specially constructed exception handler that intercepts the exception in the search phase. In a regular function that uses try/catch, by the time the catch block is invoked, it's too late to get a stack trace, the stack is already gone. (The other alternative would be to *always* record a stack trace, but this would make all exceptions very expensive.)

Also, a stack backtrace isn't going to be terribly useful in an executable that has debugging symbols stripped out. This is a non-issue in an interpreted language which always keeps the symbol names around.

Machine Words

Monday, June 1, 2009

Tart: try/catch/finally

No comments:

Post a Comment