static-type-systems


Types

Types are information on the behavior of a value. What does it do? What can it do? Where can I use it?

Let's look at a simple example.

 (define (double x) 
   (+ x x)) 

This function has the type Number -> Number, since it maps a number to another number.

Scheme does not know about this type information. That does not mean that this information is not there: The operation above just does not make sense if not applied to numbers. We, as the programmer, know that

 (double "foo") 

does not make sense. Scheme only finds out about that on run time when it throws a runtime error. That is, the type information is there, but Scheme does not take advantage of it. If there indeed was no type information at all, we, the programmers, wouldn't know what we could do with this value - it has to have a type for us to work with it. But in Scheme, that type only exists in our mind.

Static Typing refers to the availability of such types to the implementation before executing the code ("at compile-time"). When calling the function above in a statically typed system, the implementation knows whether we call it with a correctly typed argument. If we don't, it complains loudly.

Taking advantage of types

Since, in a statically typed environment, the implementation knows beforehand what types are useful in a given operation and which are not, it's only natural to use this to our advantage. Since we also know, in our mind, the types of all values in our program, we could just tell the implementation about them, and let the implementation check for this does not make sense errors.

Adding type information is of course tedious. Most statically type languages take advantage of inference rules - they infer from the use of values what types they have. This works so well that, in languages like SML? or Haskell, you rarely if ever have to use any type annotations.

To take full advantage of the static type checker, we do have to annotate our values somewhat. To understand what that means, let's take a look at a normal calculation:

10 km
----- = 5 km/h
 2 h

Here, we can have three points of view.

  1. We can calculate with numbers, and just forget about the units (just assuming we get them right)
  2. We can tag the units on the numbers and check on runtime what to do
  3. We can stop working with numbers and work with entities: Kilometers, Hours, Speeds

To take full advantage of static type checking, we should look at No.3. We aren't using numbers there (as in 1), nor are we attaching runtime-checkable tags to the numbers (as in 2), but working with new data types. This is of course nothing new. It's called Abstract Data Type. In Scheme, we can use SRFI-9 to do this:

 (define-record-type kilometer 
   (make-kilometer km) 
   kilometer? 
   (km kilometer-quantity)) 
  
 (define-record-type hour 
   (make-hour h) 
   hour? 
   (h hour-quantity)) 
  
 (define-record-type km/h 
   (make-km/h km/h) 
   km/h? 
   (km/h km/h-quantity)) 

Now we can work with abstract values, and define procedures which work only on those:

 (define (divide-km-by-hour km h) 
   (make-km/h (/ (kilometer-quantity km) 
                 (hour-quantity h)))) 

If we pass the wrong type somewhere, the scheme system will throw a runtime error. If scheme was statically typed, the implementation would know what we can pass in and out, and complain on wrong types before we even tried the program. But since scheme is not statically typed, this extra work of defining these records-types above is just that: Extra work. Usually, we won't do it for such simple types.

That's why, usually, a static type checking of scheme programs is less useful then it could be: We, as programmers, provide the system less knowledge about the types then we have.

Since in statically typed languages, this type declaration should be used whenever possible, they usually provide a simpler mechanism. For example, in SML?, the above code would translate to

datatype Km = Km of int
datatype Hour = Hour of int
datatype Kmph = Kmph of int

fun divide_km_by_hour (Km km) (Hour h) =
    Kmph (km div h)

We then could go on and write our calculation above as:

- divide_km_by_hour (Km 10) (Hour 5)
val it = Kmph 2 : Kmph

There. We got a Kmph type back, just as we wanted. If we ever pass anything to this function that is not Km and Hour, we get a compile time error - not a runtime error as in Scheme. And we didn't have to type much there.

The Problem of Static Type Systems

Of course, this does not go without a price. Once we have a static type system, we should take advantage of it. If we don't annotate our types, most of the advantage of the static type system are gone. Most Schemers wouldn't write a divide-km-by-hour procedure, but just use / most of the time - except when they decide that it's appropriate. You can do the same in a statically typed language. Just use numbers (oh, wait, ints, not numbers) and the division operator for ints, div.

A better example of how a static type system can be restricted is how they usually represent lists.

- [1,2,3];
val it = [1,2,3] : int list

So that's a list of ints. What about

- [1,"a",3];
stdIn:1.1-52.2 Error: operator and operand don't agree [literal]
  operator domain: string * string list
  operand:         string * int list
  in expression:
    "a" :: 3 :: nil

Well, that doesn't look too well. In SML, a list is always a list of a single type. Different elements of lists can't have different types. What we, as Schemers, do all the time, SML thinks smells fishy: A list with more then one type in it. The solution is simple: If we want a list with strings and integers in it, there must be some conceptual idea on what we store in there. So we should make a type for it.

This way, we have to think very clearly about what we want - even beforehand. The system enforces a rigid style, in which we have to have thought out a problem almost completely and are fully aware of it before SML even accepts our possible solutions as input. Some people feel too restricted by that.

Looking beyond types

This type checking seriously helps a lot. Your implementation tells you where you are calling a procedure with wrong arguments, or even when a COND on some value is not exhaustive.

But these are not the only types of bugs. Not by far. A static type system is a proof system, yet another tool in a programmer's toolbox. It's a powerful tool, but we shouldn't trust it more then it deserves. To quote a famous person:

Beware of bugs in the above code; I have only proved it correct, not tried it. (Donald Knuth)

Knuth knew very well what he was saying there. The important bugs in software stem from conceptual problems. A static type system will help you great deal, but it's not the end.

Scheme and Static Type Systems

Static type systems seem to be annoying to many schemers. Still, static type systems provide a great deal of help. But, luckily, there's a middle ground between dynamic and static type systems, and it's called Soft Typing. PLT? has a program, MrSpidey, that implements soft typing for Scheme. Very interesting!

Reference:


category-learning-scheme