# Datatypes¶

Formality includes a powerful datatype system. A new datatype can be defined with the `T`

syntax, which is similar to Haskell’s `data`

, and creates global definitions for its type and constructors. To pattern-match against a value of a datatype, you must use `case/T`

.

## Enumerations¶

Enumerations can be defined and used as follows:

```
import [email protected]
T Suit
| clubs
| diamonds
| hearts
| spades
print_suit : {suit : Suit} -> Output
case/Suit suit
| clubs => print("First rule: you do not talk about Fight Club.")
| diamonds => print("Queen shines more than diamond.")
| hearts => print("You always had mine.")
| spades => print("The only card I need is the Ace of Spades! \m/")
: Output
main : Output
print_suit(spades)
```

The program above creates a datatype, `Suit`

, with 4 possible values. In Formality, we call those values **constructors**. It then pattern-matches a suit and outputs a different sentence depending on it. Notice that the `case`

expression requires you to annotate the returned type: that’s called the **motive**, and is very useful when theorem proving. You can omit it by using a `case`

argument instead:

```
print_suit : {case suit : Suit} -> Output
| clubs => print("First rule: you do not talk about Fight Club.")
| diamonds => print("Queen shines more than diamond.")
| hearts => print("You always had mine.")
| spades => print("The only card I need is the Ace of Spades! \m/")
```

Differently from `case`

expressions, `case`

arguments are shorter and simpler, but are less flexible.

## Constructor with fields¶

Datatype constructors can have fields, allowing them to store values:

```
import Base@0
T Person
| person {age : Num, name : String}
get_name : {p : Person} -> String
case/Person p
| person => p.name
: String
main : Output
let john = person(26, "John")
print(get_name(john))
```

As you can see, fields can be accessed inside `case`

expressions and `case`

arguments. Notice that `p.name`

is not a field accessor, but just a single variable: the `.`

is part of its name. When Formality doesn’t know the name of the matched value, you must must explicitly name it using the `as`

keyword:

```
main : {p : Person} -> Num
case/Person person(26, "John") as john
| person => john.age
: Num
```

## Moving resources to branches¶

Since Formality functions are affine, you can’t use an argument more than once. So, for example, the function below isn’t allowed:

```
import Base@0
main : {a : Bool, b : Bool} -> Bool
case/Bool a
| true => b
| false => not(b)
: Bool
```

In this case in particular, you used `b`

in two different branches, so you shouldn’t need to copy it:

```
import Base@0
main : {a : Bool, b : Bool} -> Bool
case/Bool a
+ b : Bool
| true => b
| false => not(b)
: Bool
```

Under the hoods, this is desugared to an extra lambda on each branch:

```
import Base@0
main : {a : Bool, b : Bool} -> Bool
(case/Bool a
| true => {b} b
| false => {b} not(b)
: Bool -> Bool)(b)
```

Using `case`

arguments:

```
import Base@0
main : {case a : Bool, b : Bool} -> Bool
| true => b
| false => not(b)
```

As you can see, the version using `case`

arguments is the shortest.

## Dependent motives¶

In Formality, the type returned by a case expression can depend on the matched value. For example, we can do this:

```
import Base@0
main : Num
case/Bool true as x
| true => 42
| false => "hello"
: case/Bool x
| true => Num
| false => String
: Type
```

While strange-looking, this is perfectly logical and well-typed. The reason this works is that, when checking the type of a `case`

expression, Formality first specializes the motive for *every possible branch* to determine “what it demands”. In this case, it demands a `Num`

on the `true`

branch and a `String`

on the `false`

branch. If you satisfy every demand, then it determines the type of the whole `case`

expression by specializing the motive using actual matched value. In this case, we matched on `true`

, so it returns `Num`

.

This has many interesting effects and applications.

### Functions with different return types¶

We could write a function that returns different types based on its input:

```
import Base@0
foo : {case b : Bool} -> case/Bool b | true => Num | false => String : Type
| true => 42
| false => "hello"
main : Num
foo(true) .+. foo(true)
```

Note this is not the same as returning `Either`

, which is done in Haskell when we want to return different types, needing an extra pattern-match every time you call the function. Here, that’s not necessary, because `foo`

really returns different types based on its input. Thanks to dependent types, Formality can statically determine if you’re doing the right thing, so, replacing `foo(true)`

by `foo(false)`

would be a type error.

### Proving absurds with Empty¶

Another interesting example comes from the `Empty`

datatype:

```
// {- From [email protected] -}
T Empty
```

This code isn’t incomplete, the datatype has zero constructors. This means it is impossible to construct a term `t`

with type `Empty`

, but we can still accept it as a function argument. So, what happens if we pattern match against it?

```
import Base@0
wtf : {x : Empty} -> ???
case/Empty
: ???
```

The answer is: we can replace `???`

by anything, and the program will check. That’s because we technically proved all the demanded cases, so Formality just returns the motive directly as the type of this `case`

expression. That allows us to write a function that returns something absurd:

```
import Base@0
prove_1_is_2 : {e : Empty} -> Equal(Num, %1, %2)
case/Empty e
: Equal(Num, %1, %2)
```

If we managed to call it, we’d have a proof that `1 .==. 2`

, which is absurd, making Formality inconsistent. But this is fine: `prove_1_is_2`

can’t ever be called, because we can’t construct a value of type `Empty`

. This is pretty useful, so we have a base-lib function that, given an element of `Empty`

, returns any absurd type:

```
// {- From [email protected] -}
absurd : {e : Empty, ~P : Type} -> P
case/Empty e
: P
```

The opposite holds too: given an absurd equality, we can make an element of type `Empty`

. Here is an example:

```
// {- From [email protected] -}
true_isnt_false : {e : true == false} -> Empty
unit :: rewrite x
in (case/Bool x
| true => Unit
| false => Empty
: Type)
with e
```

As an exercise, convince yourself why this works.

### Unreachable branches with equality notes¶

Notice the program below:

```
import Base@0
main : Num
case/Bool true
| true => 10
| false => ?
: Num
```

Here, we’re matching against `true`

, so we know the `false`

case is unreachable, but we still need to fill it with something. In this case, we could write any number, but that’s not always possible. In those cases, equality notes can be helpful:

```
import Base@0
main : Num
case/Bool true as x
+ refl(~Bool, ~%true) as e : Equal(Bool, %x, %true)
| true => 10
| false => absurd(false_isnt_true(e), ~Num)
: Num
```

This moves a proof that `Equal(Nat, x, true)`

that will be specialized to the value of `x`

on each branch. On the `false`

branch, we get an `e : Equal(Nat, false, true)`

, which is absurd, so we can fill it by calling `absurd`

function on `e`

, without providing an actual number.

To desugar an equality note, Formality simply adds an extra, erased equality argument:

```
import Base@0
main
(case/Bool true as x
| true => {e} 10
| false => {e} absurd(false_isnt_true(e), ~Num)
: {e : Equal(Bool, %x, %true)} -> Num)(refl(~Bool, ~%true))
```

Notice that the left side of the equality note is allowed to access the matched value (`x`

), while the right side is used to build the `refl`

proof.

## Recursive fields¶

Fields can refer to the datatype being defined:

```
// {- From [email protected] -}
T Nat
| succ {pred : Nat}
| zero
```

Since `Nat`

is so common, there is a syntax-sugar for it: `0n3`

, which expands to `succ(succ(succ(zero)))`

.

Mutual recursion is allowed:

```
T Foo
| foo {bar : Foo}
T Bar
| bar {foo : Bar}
```

And negative occurrences too (soundly):

```
T Loop
| loop {f : Foo -> Foo}
```

While recursion is very liberal on types, the same isn’t true for programs. You can’t write recursive functions directly, such as:

```
mul_by_2 : {case n : Nat} -> Nat
| succ => succ(succ(mul_by_2(n)))
| zero => zero
main : Nat
mul_by_2(0n2)
```

This makes dealing with recursive datatypes more complex than usual, since you must use a boxed definition:

```
#mul_by_2*N : !{case n : Nat} -> Nat
| succ => succ(succ(mul_by_2(n)))
| zero => zero
halt: zero
#main : !Nat
($mul_by_2*)(0n2)
```

This is explained in more details on the Boxes and Recursion sections of the Tutorial.

## Polymorphism¶

Polymorphic datatypes allow us to create multiple instances of the same datatype with different contained types.

```
import Base@0
T Pair {A : Type, B : Type}
| pair {x : A, y : B}
main : Num
let a = pair(~Bool, ~Num, true, 7)
case/Pair a
| pair => a.y
: Num
```

The `{A : Type, B : Type}`

after the datatype declares two polymorphic variables, `A`

and `B`

, allowing us to create different types of pair, such as a pair of Bool, or a pair of Num, without needing to duplicate the definition of `Pair`

. Each polymorphic variable adds an implicit, erased argument to each constructor, so, instead of `pair(true, 7)`

, you need to write `pair(~Bool, ~Num, true, 7)`

. In a future, this verbosity will be prevented with implicit arguments. For now, you can amend it with a `let`

:

```
import Base@0
T Pair {A : Type, B : Type}
| pair {x : A, y : B}
main : [:Pair(Bool, Bool), Pair(Bool, Bool)]
let pairbb = pair(~Bool, ~Bool)
let a = pairbb(true, false)
let b = pairbb(false, true)
[a, b]
```

By combining polymorphism with recursive types, we can create the popular `List`

type (already defined on the base libs above):

```
// {- T List {T : Type}
// | cons {head : T, tail : List(T)}
// | nil
// Returns all but the first element
// tail : {~T : Type, case list : List(T)} -> List(T)
// | cons => list.tail
// | nil => nil(~T) -}
main : List(Num)
tail(~Num, cons(~Num, 1, cons(~Num, 2, cons(~Num, 3, nil(~Num)))))
```

Since `List`

is so common, there is a built-in syntax-sugar for it, the dollar sign:

```
main : List(Num)
tail(~Num, Num$[1, 2, 3])
```

## Indices¶

Indices are like polymorphic variables, except that, rather than constant types, they are **computed values** that can depend on each constructor’s arguments. That gives us a lot of type-level power, and is one of the reasons Formality is a great proof language. For example:

```
T IsEven (x : Num)
| make_even {half : Num} (half .*. 2)
```

This datatype has one index, `x`

, of type `Num`

. Its constructor, `is_even`

, has one field, `half : Num`

. When you write `make_even(3)`

, the number `3`

is multiplied by two and moved to the type-level, resulting in a value of type `IsEven(6)`

. So, for example:

```
even_0 : IsEven(0)
make_even(0)
even_2 : IsEven(2)
make_even(1)
even_4 : IsEven(4)
make_even(2)
even_6 : IsEven(6)
make_even(3)
```

Notice that is **impossible** to create a value with type `IsEven(3)`

, because that would require a `half : Num`

such that `half .*. 2 .==. 3`

, and there is no such number. Because of that, indexed datatypes have many applications. For example, suppose that you want to write a `div2`

function that divides a number by two, but you don’t want it to be called with odd numbers. You could do this:

```
div2 : {x : Num, ~x_is_even : IsEven(x)} -> Num
x ./. 2
main : Num
div2(10, ~make_even(5))
```

Here, the second argument “proves” that the first is even, making it impossible to call `div2`

with an odd input. Since `x_is_even`

is erased from the runtime, this gives us a static, zero-cost guarantee. Alternatively, we could have used `~x_is_even : [k : Num ~ x .==. k .*. 2]`

, but indexed datatypes are more handy in general. For example, you could easily write one for 3 consecutive numbers:

```
T Consecutive (a : Num, b : Num, c : Num)
| consecutive {init : Num} (init, init .+. 1, init .+. 2)
```

So, for example, `Consecutive(a, b, c)`

means `a`

, `b`

, `c`

are 3 consecutive numbers such as `7, 8, 9`

. You could also write a type for sorted lists:

```
import Base@0
T SortedList {A : Type} (xs : List(Num))
| scons {
add : Num,
head : Num,
tail : List(Num),
prof : SortedList(Num, cons(~Num, head, tail))
} (cons(~Num, head .+. add, cons(~Num, head, tail)))
| snil
(nil(~Num))
```

Here, `SortedList(xs)`

would mean that `xs`

is a list of `Num`

s with elements in descending order, such as `[7, 5, 5, 3, 1]`

. In order words, a function that returned `[xs : List(Num) ~ SortedList(xs)]`

could only be written if `xs`

was, indeed, sorted. And so on. Indexed datatypes, in a way, give us a powerful language of specifications which we can use to statically reason about our datatypes and algorithms.

For a more complex example of indices, here is a Vector, which is like a `List`

, except that its type stores its own length:

```
// From [email protected]
// A vector is a list with a statically known length
T Vector {A : Type} (len : Nat)
| vcons {~len : Nat, head : A, tail : Vector(A, len)} (succ(len))
| vnil (zero)
```

Every time we call `vcons`

, the `len`

index on the type of the vector increases by one, allowing us to track its length statically:

```
main : Vector(String, 0n3)
vcons(~String, ~0n2, "ichi",
vcons(~String, ~0n1, "ni",
vcons(~String, ~0n0, "san",
vnil(~String))))
```

This has many applications. For example, we can create a type-safe `vhead`

that returns the first element of a non-empty vector:

```
// {-From [email protected] -}
// {-A type-safe "head" that returns the first element of a non-empty vector
// On the `vcons` case, return the vector's head
// On the `vnil` case, prove it is unreachable, since `xs.len > 0` -}
vhead: {~T : Type, ~n : -Nat, xs : Vector(T, %succ(+n))} -> T
case/Vector xs
+ refl(~Nat, ~%succ(+n)) as e : Equal(Nat, xs.len, %succ(+n))
| vcons => xs.head
| vnil => absurd(zero_isnt_succ(~n, e), ~T)
: T
```

Notice that constructor indices are accessible on the left side of an equality note. That gives us an `e : zero is succ(n)`

(since `zero`

is the length of `vnil`

, and `succ(n)`

is the length of `xs`

). Since `0 != 1`

, this branch is unreachable, so we can fill it with an `absurd`

.

## Self-Encodings¶

Interestingly, none of the features above are part of Formality’s type theory. Instead, they are lightweight syntax-sugars that elaborate to plain-old lambdas. To be specific, a datatype is encoded as is own inductive hypothesis, with “Self Types”. For example, the `Bool`

datatype desugars to:

```
Bool : Type
${self}
{ ~P : {x : Bool} -> Type
, true : P(true)
, false : P(false)
} -> P(self)
true : Bool
new(~Bool){~P, true, false} true
false : Bool
new(~Bool){~P, true, false} false
case_of : {b : Bool, ~P : {x : Bool} -> Type, t : P(true), f : P(false)} -> P(b)
(%b)(~P, t, f)
```

Here, `${self} ...`

, `new(~T) val`

and `%b`

are the type, introduction, and elimination of Self Types, respectively. You can see how any datatype is encoded under the hoods by asking `fm`

to evaluate its type, as in, `fm Data.Bool@0/Bool -W`

(inside the `fm_modules`

directory). The `-W`

flag asks Formality to not evaluate fully since `Bool`

is recursive. While you probably won’t need to deal with self-encodings yourself, knowing how they work is valuable, since it allows you to express types not covered by the built-in syntax.

**TODO**: write a brief explanation of how Self encodings work (although I think it should be self-explanatory from this example!).