Wednesday, August 31, 2011

Mutability

I've been reading up on the D language's idea of immutability. It makes some interesting choices, but I'm not certain they are the right choices for me.

D has two keywords, 'const' and 'immutable'. 'const' says that you can't modify the value, but someone else might. 'immutable' is a guarantee that the item will never be modified, and this guarantee may be backed up by storing the item in a protected memory page.

Moreover, both 'const' and 'immutable' are transitive - that is, any data structure that is reachable by a const type is also const. So if you have a pointer inside a const struct, the thing it points to is automatically const as well.

Another design point is that you can't typecast an immutable value to mutable, since that would violate the guarantee of immutability. (Converting immutable to const is fine, because const only promises that you won't modify the value yourself.) Actually, I think there's a way to cast the immutability away, but attempting to mutate the value afterward may fail at runtime if the value is in protected memory.

I like the distinction between const and immutable; however, I tend to favor the formulation used in the JodaTime libraries: you have a base type which is 'readable', with two subtypes, mutable and immutable. So for example, you have "ReadableDate", which only defines accessors for reading the value but does not guarantee that those values will not change; then there is ImmutableDate, which adds no new methods but does guarantee immutability; and finally MutableDate, which adds the accessors for mutating the value.
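To make the three-type arrangement concrete, here is a rough sketch of it in C++. The class names follow the illustrative ones above rather than any real library's API, and the single 'year' field and the 'yearsBetween' helper are my own assumptions for the sake of a short example.

```cpp
#include <cassert>

// Read-only view: accessors only, with no promise that the value stays put.
class ReadableDate {
public:
    virtual ~ReadableDate() = default;
    virtual int year() const = 0;
};

// Adds no new methods; the value is fixed at construction.
class ImmutableDate final : public ReadableDate {
public:
    explicit ImmutableDate(int year) : year_(year) {}
    int year() const override { return year_; }
private:
    const int year_;
};

// Adds the accessors for mutating the value.
class MutableDate final : public ReadableDate {
public:
    explicit MutableDate(int year) : year_(year) {}
    int year() const override { return year_; }
    void setYear(int year) { year_ = year; }
private:
    int year_;
};

// Accepts either subtype through the readable base (hypothetical helper).
inline int yearsBetween(const ReadableDate& from, const ReadableDate& to) {
    return to.year() - from.year();
}

// Small usage example.
inline void readableDateExample() {
    ImmutableDate start(2011);
    MutableDate end(2011);
    assert(yearsBetween(start, end) == 0);
    end.setYear(2015);  // a holder of the readable view would see this change
    assert(yearsBetween(start, end) == 4);
}
```

Note that code holding only a ReadableDate reference can't tell which of the two subtypes it has, which is exactly the "I can't modify it, but someone else might" meaning of const.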

The only problem with the JodaTime approach is the extra work involved in writing three classes to represent the same concept. This is where C/C++/D get it right, I think: you write the class once, with methods to both read and mutate the fields, and then you use a modifier keyword as a type constructor, which transforms the fully functional type into a more restricted version of that type.
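For comparison, here is roughly what the write-it-once approach looks like in C++, where 'const' plays the role of the type constructor. The 'Counter' class and the 'peek' function are made-up names for illustration only.

```cpp
#include <cassert>

// One class, written once, with both readers and mutators.
class Counter {
public:
    int value() const { return value_; }  // reader: usable through 'const Counter'
    void increment() { ++value_; }         // mutator: not usable through 'const Counter'
private:
    int value_ = 0;
};

// 'const Counter&' is the restricted version of the same type.
inline int peek(const Counter& c) {
    // c.increment();  // would not compile: increment() is not a const method
    return c.value();
}

// Small usage example.
inline void counterExample() {
    Counter c;
    c.increment();
    c.increment();
    assert(peek(c) == 2);
}
```

The trailing 'const' on 'value()' is also how C++ answers the question of telling the compiler which methods mutate: the methods that don't are the ones that get marked.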

If we combine these two ideas, we get something that works like this: 'immutable(Foo)' takes the type 'Foo' and subtracts from its definition all of the mutation methods. 'readable(Foo)' does the same, except that you can cast either a Foo or an immutable(Foo) into a readable(Foo), but not vice versa.

How is the compiler to know which methods are mutation methods? Well, we have to tell it - this is one case where the compiler can't infer things automatically. (As a general rule, Tart does not allow the compiler to infer an interface from its implementation. I realize that there are some languages which do exactly this, but I think that interfaces should be explicit.) Fortunately, we don't need to distinguish between readable and immutable in this case - all the compiler cares about is whether the method can mutate the state of the object or not. The main challenge here is to come up with a syntax that's easy to type, since approximately 25-50% of all instance methods will want to have this property.

You also need a way for some fields to be mutable even on a read-only object. A typical example of this is reference counting: you want the object to appear to be immutable to the outside world, but internally you are changing the reference count. In C++ we accomplish this by declaring the data member as explicitly mutable, or by casting away the constness of 'this' inside the method with const_cast. The latter can be accomplished better, I think, by allowing the method to cast 'self': so we'd say something like "let mself = mutable(self)", and then use mself to access the various fields.
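Here is a small C++ sketch of both techniques side by side: a data member declared 'mutable', and a cast on 'this' inside a const method, which is the closest C++ analogue of the "mutable(self)" idea. The 'Shared' class and its 'touch' counter are invented for this example.

```cpp
#include <cassert>

class Shared {
public:
    explicit Shared(int payload) : payload_(payload) {}
    int payload() const { return payload_; }

    // Technique 1: the field is declared 'mutable', so a const method may change it.
    void addRef() const { ++refCount_; }
    int refCount() const { return refCount_; }

    // Technique 2: cast away constness on 'this', much like "let mself = mutable(self)".
    // Caveat: this is undefined behavior if the object itself was defined 'const'.
    void touch() const {
        Shared* mself = const_cast<Shared*>(this);
        ++mself->touches_;
    }
    int touches() const { return touches_; }

private:
    int payload_;
    mutable int refCount_ = 0;
    int touches_ = 0;
};

// Small usage example: the caller only ever sees a const view.
inline void sharedExample() {
    Shared s(7);
    const Shared& view = s;
    view.addRef();
    view.touch();
    assert(view.payload() == 7);
    assert(view.refCount() == 1);
    assert(view.touches() == 1);
}
```

The caveat on the cast is the same hazard as casting away immutable in D: it only works as long as the underlying object really is writable.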

Finally, there's the question of transitivity. Tart already has a mechanism for defining non-transitive immutable member fields, via the 'let' keyword. At the same time, I can't help feeling that the D rule is too aggressively transitive. I think that having an immutable array of pointers to mutable objects would be a fairly common use case.
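C++ happens to behave the non-transitive way, so it makes a handy illustration of that use case: a const container whose elements are pointers to objects that can still be modified. The 'Node' struct and 'bumpAll' function are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int value;
};

// The container is const, but the constness stops at the pointers it holds.
inline void bumpAll(const std::vector<Node*>& nodes) {
    // nodes.push_back(nullptr);  // would not compile: the container is const
    for (Node* n : nodes) {
        assert(n != nullptr);
        ++n->value;  // allowed: the pointed-to Node is still mutable
    }
}

// Small usage example.
inline void bumpExample() {
    Node a{1};
    Node b{2};
    const std::vector<Node*> nodes{&a, &b};
    bumpAll(nodes);
    assert(a.value == 2);
    assert(b.value == 3);
}
```

Under a transitive rule like D's, the increment inside the loop would be rejected, because everything reachable through the const container would be const too.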
