Tuesday, May 10, 2011

Yak shaving and plans

So I'm trying to fix a compilation bug, and as part of this I'm once again attempting to build the tart executables with clang, since it tends to be much stricter than gcc. Moreover, I'm attempting to do this on my iMac instead of my Ubuntu/ThinkPad, since I want to make sure that everything is working on that platform.

But when I attempted to build clang and run it, it crashed on me. So I'm thinking there's something amiss with my llvm checkout - for one thing, it's supposed to automatically update the clang directory when I svn up on the llvm directory and that wasn't working. So I'll just go ahead and blow away the whole directory and get a fresh checkout.

But then I'm thinking that maybe svn is out of date - it's been like a year since I did 'port update', so I go and do that. Only 'port' tells me I should do a 'port selfupdate' first, which I proceed to do. When that finishes, it recommends that I do a 'port upgrade outdated', so OK fine I'll do that - so now I'm sitting here waiting for 'boost' to compile, after having finished zlib, perl5, expat, libiconv, and a whole bunch of other software which really has nothing to do with the problem I'm trying to solve.

Yak Shaving indeed.

In the meantime, I've been sketching out some of the other tart.io classes that will eventually need to be written. Here's a rough sketch for Path and Directory:

/** Utility functions for operating on file paths. */
namespace Path {
  def absolute(path:String) -> String;
  def exists(path:String) -> bool;
  def normalize(path:String) -> String;
  def isReadable(path:String) -> bool;
  def isWritable(path:String) -> bool;
  def isAbsolute(path:String) -> bool;
  def isDirectory(path:String) -> bool;
  def filename(path:String) -> String;
  def parent(path:String) -> String;
  def extension(path:String) -> String;
  def changeExtension(path:String, ext:String) -> String;
  def combine(basepath:String, newpath:String) -> String;
  def toNative(path:String) -> String;
  def fromNative(path:String) -> String;
}

/** Utility functions for operating on directories. */
namespace Directory {
  def current:String { get; }
  def create(path:String) -> bool;
  def directories(pattern:String = "*") -> Iterator[String];
  def files(pattern:String = "*") -> Iterator[String];
  def entries(pattern:String = "*") -> Iterator[String];
}

First thing to note is all of these methods operate on regular strings. Back about 4 years ago there was a huge argument on python-dev about introducing a "Path" object which would have special methods for concatenation, splitting, and so on - sort of like Java's File object. Although it was cool / cute in some ways, I think there's a lot of value in allowing all of the regular operations on strings to work on pathnames too - and there's likely to be confusion if we overload regular string operators like concatenation to work differently with paths than with strings. So paths are just strings, and the special operations that paths need - like extracting the file extension - are just functions in a namespace instead of being operators.
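To make the "paths are just strings" idea concrete, here is a minimal sketch in Python rather than Tart. The function names loosely mirror the Path sketch above, but the bodies are assumptions about the intended behavior, built on Python's standard posixpath module (so it assumes '/'-separated paths):

```python
import posixpath

# Each helper takes a plain string and returns a plain string;
# there is no special Path type anywhere.

def extension(path: str) -> str:
    """Return the extension including the leading dot, or '' if there is none."""
    return posixpath.splitext(path)[1]

def change_extension(path: str, ext: str) -> str:
    """Replace the extension of path with ext (ext includes the dot)."""
    return posixpath.splitext(path)[0] + ext

def filename(path: str) -> str:
    """Return the last component of the path."""
    return posixpath.basename(path)

def parent(path: str) -> str:
    """Return everything before the last component."""
    return posixpath.dirname(path)

def combine(basepath: str, newpath: str) -> str:
    """Join two path fragments with a single separator."""
    return posixpath.join(basepath, newpath)

src = combine("/home/me/tart", "io/Path.tart")
obj = change_extension(src, ".o")
# Because paths are ordinary strings, ordinary string operations still apply.
print(src.endswith(".tart"), obj, filename(src), parent(src))
```

Nothing here overloads string concatenation: combining paths is an explicit call to combine, and '+' keeps its usual string meaning.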

I'm not completely satisfied with the names. In general, I try to follow certain principles:
  • I like names that are short and succinct.
  • Method names should, in general, be verbs, unless they are merely getters which return a value and don't have side effects, in which case it's OK to use a noun.
  • I tend to avoid names like "getFilename()" in favor of just "filename". I think that having to put the word "get" in front of every method like you often see in Java is needless redundancy - any token or symbol which is repeated too many times eventually becomes mentally tuned out.
    • As a general rule, the expression 'foo(bar)' means either 'do foo to bar' or 'return property foo of bar'.
  • Use namespaces rather than long method names to disambiguate similar names. I prefer "Directory.current" rather than "currentDirectory" or "getCurrentDirectory". Bear in mind, though, that someone may import your symbols into their module as unqualified names, in which case there's a chance of collision with other unqualified names. This is mostly an issue with class names rather than with methods in a namespace. So the rule is don't make class names too generic, even if they are namespaced.

One problem I have is that the word 'absolute' is neither a noun nor a verb - it's an adjective, which confuses the meaning of the method. I suppose I really ought to name it "makeAbsolute".

Another issue to resolve is that there are a bunch of functions that operate equally well on either files or directories. The question I have is where they should go:

def remove(path:String, recursive:bool = false) -> bool;
def move(from:String, to:String);
def lastAccessTime(path:String) -> Time;
def lastModificationTime(path:String) -> Time;
def creationTime(path:String) -> Time;
def setLastAccessTime(path:String, time:Time);
def setLastModificationTime(path:String, time:Time);
def setCreationTime(path:String, time:Time);

These are all functions that actually touch the filesystem. One place to put them is in Path, although it's a little weird to be mixing functions that only do string manipulation with functions that actually hit the disk. I guess that's OK though. The way C# organizes things is to put "GetCreationTime" and similar functions in two places, Directory and File, and they are basically the exact same functions, which work on both files and directories. That's kinda kooky.
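As an illustration of a single function that works on both files and directories, here is a rough Python sketch of 'remove' and 'lastModificationTime'. The signatures follow the list above; the behavior (returning false on failure, refusing to delete a non-empty directory unless recursive is set) is an assumption, not something the Tart design has settled:

```python
import os
import shutil
import tempfile

def remove(path: str, recursive: bool = False) -> bool:
    """Delete a file or a directory. Return True on success, False on failure."""
    try:
        if os.path.isdir(path) and not os.path.islink(path):
            if recursive:
                shutil.rmtree(path)   # delete the directory and everything in it
            else:
                os.rmdir(path)        # raises OSError if the directory is not empty
        else:
            os.remove(path)           # plain file or symlink
        return True
    except OSError:
        return False

def last_modification_time(path: str) -> float:
    """Seconds since the platform epoch; the same call works for files and directories."""
    return os.stat(path).st_mtime

# Small demo in a throwaway directory.
root = tempfile.mkdtemp()
note = os.path.join(root, "note.txt")
with open(note, "w") as f:
    f.write("hello")

print(last_modification_time(note) > 0, last_modification_time(root) > 0)
print(remove(root))                   # non-empty directory, not recursive
print(remove(root, recursive=True))   # now the whole tree goes away
```

One function covers both cases here, which is the argument for keeping these in a single place rather than duplicating them across Directory and File.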

Now, you may notice that some of these functions return a Time object, which is something else that isn't defined yet. I'm thinking that we need three classes:

struct Time {
  let value:int64;
  let calendar:Calendar;
}

struct TimeSpan {
  let value:int64;
}

class Calendar;

I've omitted all of the methods for clarity.

The 'Calendar' is an object that defines the time base - such as what year is year zero, how many nanoseconds a tick is, and so on. All Time objects are defined with respect to some Calendar.

'Time' represents a particular moment in time, relative to a Calendar. It's a struct (which means it's passed by value) and it's immutable.

'TimeSpan' represents a duration. You can add and subtract TimeSpans from Times.

The actual meaning of the 'value' param in Time is probably platform-dependent - that is, it will use whatever representation that platform uses for time values. The Time struct will have lots of methods to convert that into more familiar units such as seconds, milliseconds, and so on.
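Here is one way the three pieces could fit together, sketched in Python with frozen dataclasses standing in for immutable structs. The field names 'epoch_year' and 'nanos_per_tick' and the 'seconds' method are hypothetical, and the sketch assumes a TimeSpan is measured in the same ticks as the Time it is applied to:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Calendar:
    """Defines the time base (hypothetical fields)."""
    epoch_year: int       # which year counts as tick zero
    nanos_per_tick: int   # how many nanoseconds one tick represents

@dataclass(frozen=True)
class TimeSpan:
    """A duration, in ticks."""
    value: int

@dataclass(frozen=True)
class Time:
    """A moment in time: ticks since the calendar's epoch. Immutable."""
    value: int
    calendar: Calendar

    def __add__(self, span: TimeSpan) -> "Time":
        # Adding a span yields a new Time; the original is unchanged.
        return Time(self.value + span.value, self.calendar)

    def __sub__(self, span: TimeSpan) -> "Time":
        return Time(self.value - span.value, self.calendar)

    def seconds(self) -> float:
        """Convert the platform-specific tick count into seconds since the epoch."""
        return self.value * self.calendar.nanos_per_tick / 1_000_000_000

# A calendar that counts milliseconds from 1970.
unix_ms = Calendar(epoch_year=1970, nanos_per_tick=1_000_000)
start = Time(1500, unix_ms)
later = start + TimeSpan(500)
print(later.value, later.seconds(), (later - TimeSpan(500)) == start)
```

The raw tick count stays opaque to callers; anything that needs familiar units goes through a conversion method that consults the Calendar.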

One final thing to note about Time is that I really don't know much about date conversions and other time-related stuff. In other words, I'm totally unqualified to implement these classes.
