CyberSpy

Rantings from a guy with way too much free time

Riding the Camel Deeper into the Desert of Code: Records and Variants

2018-02-11 Programming Rob Baruch

Extending and Encapsulating Data - Records and Variants

Up next on our tour of OCaml are records and variants. If you are familiar with typedef, struct, and union in C/C++, you’ll quickly come to understand how to use records and variants in OCaml. We’ll start off looking at records: how to define them, use them in functions, and automatically generate common functions from our type definitions. There’s more here than meets the eye compared with a language like C, so pay close attention as we move through the descriptions below. We’ll break it down one piece at a time so you’ll have no problems!

Defining Records

In OCaml, defining a type is as simple as

type <identifier> = Atom1 | Atom2 | ... | AtomN

or

type <identifier> = {
    <field_identifier> : <type_def> ;
    ...
}

or, even recursively using the and keyword:

type <rec_identifier> = Atom1 | Atom2 of <identifier>
and <identifier2> = { <field_identifier> : <type_def>; <field_identifier> : <rec_identifier> }

Concrete examples of each form of definition follow below. We won’t focus on recursive types, as recursive types and functions warrant a blog post of their own! For now, let’s just look at an example of each and focus on the second form of type definition.

Take the first one. We can think of this type as the equivalent of a typed enumeration.

type fruits = Apples | Bananas | Oranges | Tomatoes
;;
type fruits = Apples | Bananas | Oranges | Tomatoes
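An enumeration like this is usually consumed with pattern matching. Here is a small sketch of what that might look like; fruit_to_string is a name I made up for illustration, and it uses only the standard library.

```ocaml
type fruits = Apples | Bananas | Oranges | Tomatoes

(* fruit_to_string is a hypothetical helper. The compiler warns if a
   constructor is left out of the match. *)
let fruit_to_string = function
  | Apples -> "apples"
  | Bananas -> "bananas"
  | Oranges -> "oranges"
  | Tomatoes -> "tomatoes"

let () = print_endline (fruit_to_string Bananas)
```

If you later add a new fruit to the type, the compiler will point at every match that forgot to handle it.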

Using the third form, we define a linked-list of int values:

# type list = Empty | Node of node
and node = { value : int; next : list }
;;
type list = Empty | Node of node
and node = { value : int; next : list; }

Or, more polymorphically, we can define a generic linked-list (but we’ll also not discuss this in detail as this too is a subject for a larger blogpost):

# type 'a list = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a list }
;;
type 'a list = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a list; }

In this definition, 'a is a type parameter that stands for any type. We could, for example, make a linked-list of fruits!

# Node {value = Apples; next = (Node {value = Bananas; next = Empty})};; 
- : fruits list =
Node {value = Apples; next = Node {value = Bananas; next = Empty}}
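To give a feel for how a recursive type like this gets used, here is a minimal sketch of a function that walks the list. I've renamed the type to linked (my name, not part of the definitions above) so that it doesn't shadow OCaml's built-in list, and length is likewise just an illustrative helper.

```ocaml
(* Same shape as the generic list above, renamed so it does not shadow
   OCaml's built-in list type. *)
type 'a linked = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a linked }

(* Count the nodes by following next until we reach Empty. *)
let rec length = function
  | Empty -> 0
  | Node { next; _ } -> 1 + length next

let two = Node { value = "a"; next = Node { value = "b"; next = Empty } }

let () = Printf.printf "%d\n" (length two)
```

Because the type is generic, the same length function works for a list of fruits, strings, or anything else.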

Lastly, let’s look at the second form. We can create a record, pet_collection as follows:


# type pet_collection = {
    name: string;
    age: int;
    weight: float;
}
;;
type pet_collection = { name : string; age : int; weight : float; }

And we can create an instance of our newly defined record simply by writing out a value for each field. OCaml infers the record type from the field names.

# let pet1 = {name = "fluffy"; age = 10; weight = 90. } 
;;
val pet1 : pet_collection = {name = "fluffy"; age = 10; weight = 90.}
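Once a record exists, the simplest way to read a field is dot notation. A quick sketch, using only the standard library's Printf:

```ocaml
type pet_collection = { name : string; age : int; weight : float }

let pet1 = { name = "fluffy"; age = 10; weight = 90. }

(* Fields are read with record.field *)
let () = Printf.printf "%s is %d years old\n" pet1.name pet1.age
```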

Pattern Matching and Records

Let’s see how we use records in functions with pattern-matching. Check out the function, describe_pet below. Notice that the argument to the function is a record of type pet_collection. But more than that, we’ve deconstructed the record using pattern matching to assign names to each field. By specifying this pattern for our function parameters, each field is broken out into a variable we can access in our function.

let describe_pet {name = n; age = a; weight = w} =
    if a > 10 then
        Printf.sprintf "Old %s, is a %d year old weighing %f" n a w
    else
        Printf.sprintf "%s, is a %d year old weighing %f" n a w
;;
val describe_pet : pet_collection -> string = <fun>

Punning Intended!

OCaml also has a short-cut for the pattern-matching above. Instead of binding a differently named variable to each field in the record, we can just list the field names, {field1; field2; fieldn}, in the pattern. Each field is then bound to a variable with the same name as the field, which we can use in the body of the function. This is known as punning. Here’s the same function as above, written with punning:

let describe_pet {name; age; weight} =
    if age > 10 then
        Printf.sprintf "Old %s, is a %d year old weighing %f" name age weight
    else
        Printf.sprintf "%s, is a %d year old weighing %f" name age weight
;;
;;
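A function doesn't have to name every field. As a small sketch (greet is a made-up function for illustration), a trailing underscore in the pattern says the remaining fields are being ignored on purpose:

```ocaml
type pet_collection = { name : string; age : int; weight : float }

(* Only name is bound; the trailing _ marks the other fields as
   deliberately ignored. *)
let greet { name; _ } = "Hello, " ^ name

let () = print_endline (greet { name = "fluffy"; age = 10; weight = 90. })
```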

Modifying Fields in Records

OCaml values are immutable unless we explicitly make them mutable. In records, we can mark a field with the mutable keyword and then use the assignment operator, <-, to modify that field’s value in place. Here’s an example:

type pet_log = {
    mutable access: Time.t;
    pet: pet_collection;
}

Let’s create a new pet_log, and then modify the access value in our pet_log:

# let log = {access = Time.now(); pet = pet1 }
;;
val log : pet_log =
  {access = 1518375241.44114208;
   pet = {name = "fluffy"; age = 10; weight = 90.}}

# log.access <- Time.now ();;
- : unit = ()
# log.access
;;
- : Time.t = 1518375381.63572907

You may recall from my previous post that mutable updates evaluate to a value of type unit. We can evaluate the record field after the modification to see the updated value. So why is OCaml so pedantic about mutation? Because mutation must be declared explicitly, side effects are visible in the type definition, and we are less likely to write code that breaks because of unexpected (implicit) side effects.
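If you don't have Time available, the same idea can be tried with nothing but the standard library. This is only a sketch: the visits counter is a hypothetical field standing in for the access timestamp, and record_visit is a name I made up.

```ocaml
type pet_collection = { name : string; age : int; weight : float }

(* visits is a hypothetical mutable field standing in for the access
   timestamp used above. *)
type pet_log = { mutable visits : int; pet : pet_collection }

let entry = { visits = 0; pet = { name = "fluffy"; age = 10; weight = 90. } }

(* <- updates the field in place and evaluates to unit. *)
let record_visit l = l.visits <- l.visits + 1

let () =
  record_visit entry;
  record_visit entry;
  Printf.printf "%d visits\n" entry.visits
```

Trying the same assignment on the pet field is rejected at compile time, because that field was not declared mutable.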

Functional Updating

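You may be wondering how to change just some of the fields in a record without constructing an entire record that specifies every field. Here’s where functional updating comes into play. The {r with field = value} syntax builds a new record that copies r and replaces only the named fields; the original record is left untouched.

Let’s write a function that renames our pet: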

# let rename_pet t name = 
 {t with name = name}
;;
 val rename_pet : pet_collection -> string -> pet_collection = <fun>

Or, more simply using punning:

# let rename_pet t name = 
 {t with name}
;;
val rename_pet : pet_collection -> string -> pet_collection = <fun>
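It's worth convincing yourself that the original record really is unchanged. A short sketch, using only the standard library (pet2 is just an illustrative name):

```ocaml
type pet_collection = { name : string; age : int; weight : float }

let rename_pet t name = { t with name }

let pet1 = { name = "fluffy"; age = 10; weight = 90. }
let pet2 = rename_pet pet1 "rex"

(* pet1 keeps its old name; pet2 is a fresh record with the other
   fields copied over. *)
let () = Printf.printf "%s %s\n" pet1.name pet2.name
```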

What’s up with the letter t in the parameter above? In OCaml it is a convention to name the main type of a module t, and a value of that type is often named t as well when it appears as a function parameter. Let’s take a look at how we might wrap a type into a module.

Types in Modules

One common use of defining a type inside a module is to create an isolated namespace. Take a look at the definition below:

module Pets = struct 
    type t = {
        name: string;
        age: int;
        weight: float;
        mutable access: Time.t;
    }

end
;;
module Pets :
  sig
    type t = {
      name : string;
      age : int;
      weight : float;
      mutable access : Time.t;
    }
  end
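To show what the namespace buys us, here is a rough sketch of using such a module from the outside. I've left out the access field so the example needs only the standard library, and create and is_old are hypothetical helpers I added for illustration.

```ocaml
module Pets = struct
  type t = { name : string; age : int; weight : float }

  (* Hypothetical helpers, added only for illustration. *)
  let create ~name ~age ~weight = { name; age; weight }
  let is_old t = t.age > 10
end

(* Outside the module, fields and functions are qualified with Pets. *)
let p = Pets.create ~name:"fluffy" ~age:12 ~weight:90.

let () = Printf.printf "%s old? %b\n" p.Pets.name (Pets.is_old p)
```

Another module could define its own type t with a name field, and the two would never collide.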

We’re not going to cover modules here, but I wanted to comment on deriving field functions in modules. You might read in some texts that you can add with fields to a type definition inside a module to generate accessor functions for the fields of your type. That syntax has been deprecated. Instead you can use the following opam packages: ppx_jane and fieldslib. Install them using opam. Once they are installed (and loaded with #require), you can add [@@deriving fields] at the end of your type definition to derive functions for each field.

module Pets = struct 
    type t = {
        name: string;
        age: int;
        weight: float;
        mutable access: Time.t;
    }
    [@@deriving fields]
end
;;
module Pets :
  sig
    type t = {
      name : string;
      age : int;
      weight : float;
      mutable access : Time.t;
    }
    val access : t -> Time.t
    val set_access : t -> Time.t -> unit
    val weight : t -> float
    val age : t -> int
    val name : t -> string
    module Fields :
      sig
        val names : string list
        val access :
          ([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm
        val weight : ([< `Read | `Set_and_create ], t, float) Field.t_with_perm
        val age : ([< `Read | `Set_and_create ], t, int) Field.t_with_perm
        val name : ([< `Read | `Set_and_create ], t, string) Field.t_with_perm
        val make_creator :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                'a -> ('b -> string) * 'c) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
               'c -> ('b -> int) * 'd) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  'd -> ('b -> float) * 'e) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  'e -> ('b -> Time.t) * 'f) ->
          'a -> ('b -> t) * 'f
        val create :
          name:string -> age:int -> weight:float -> access:Time.t -> t
        val map :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                string) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> int) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  float) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  Time.t) ->
          t
        val iter :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                unit) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> unit) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  unit) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  unit) ->
          unit
        val fold :
          init:'a ->
          name:('a ->
                ([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                'b) ->
          age:('b ->
               ([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> 'c) ->
          weight:('c ->
                  ([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  'd) ->
          access:('d ->
                  ([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  'e) ->
          'e
        val map_poly :
          ([< `Read | `Set_and_create ], t, 'a) Field.user -> 'a list
        val for_all :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                bool) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> bool) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  bool) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  bool) ->
          bool
        val exists :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                bool) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> bool) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  bool) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  bool) ->
          bool
        val to_list :
          name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                'a) ->
          age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> 'a) ->
          weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
                  'a) ->
          access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
                  'a) ->
          'a list
        module Direct :
          sig
            val iter :
              t ->
              name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> 'a) ->
              age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> 'b) ->
              weight:(([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> 'c) ->
              access:(([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> 'd) ->
              'd
            val fold :
              t ->
              init:'a ->
              name:('a ->
                    ([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> 'b) ->
              age:('b ->
                   ([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> 'c) ->
              weight:('c ->
                      ([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> 'd) ->
              access:('d ->
                      ([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> 'e) ->
              'e
            val for_all :
              t ->
              name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> bool) ->
              age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> bool) ->
              weight:(([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> bool) ->
              access:(([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> bool) ->
              bool
            val exists :
              t ->
              name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> bool) ->
              age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> bool) ->
              weight:(([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> bool) ->
              access:(([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> bool) ->
              bool
            val to_list :
              t ->
              name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> 'a) ->
              age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> 'a) ->
              weight:(([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> 'a) ->
              access:(([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> 'a) ->
              'a list
            val map :
              t ->
              name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
                    t -> string -> string) ->
              age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
                   t -> int -> int) ->
              weight:(([< `Read | `Set_and_create ], t, float)
                      Field.t_with_perm -> t -> float -> float) ->
              access:(([< `Read | `Set_and_create ], t, Time.t)
                      Field.t_with_perm -> t -> Time.t -> Time.t) ->
              t
            val set_all_mutable_fields : t -> access:Time.t -> unit
          end
      end
  end

For now, I’m not going to elaborate on each of these generated functions. We’ll go into greater detail in a future post. I wanted to give a flavor of how we can define a simple type, wrap it in a module, and automatically generate accessors and other helper functions to manipulate our types.
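For a rough sense of what the simplest derived values correspond to, here is a hand-written sketch of a few accessors with the same shapes as the signatures above. This is not what the ppx actually emits, just an approximation; I've used float in place of Time.t so it runs with only the standard library.

```ocaml
module Pets = struct
  type t = {
    name : string;
    age : int;
    mutable access : float; (* float stands in for Time.t here *)
  }

  (* Hand-written approximations of a few derived values. *)
  let name t = t.name
  let age t = t.age
  let access t = t.access
  let set_access t v = t.access <- v
end

let p = { Pets.name = "fluffy"; age = 10; access = 0. }

let () =
  Pets.set_access p 42.;
  Printf.printf "%s %d %.0f\n" (Pets.name p) (Pets.age p) (Pets.access p)
```

Note that a setter only makes sense for the mutable field, which matches the derived signature: there is a set_access but no set_name.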

Up Next, Variants

Following up on records, let’s examine variants. Variants are similar to unions in languages like C/C++, but with a twist. In C/C++ the members of a union share overlapping storage, and nothing records which member is currently in use, which makes unions ambiguous and error-prone. In OCaml, each case of a variant carries an unambiguous tag, so every element of the variant definition is strictly typed.

Let’s look at how we formally define variants. Okay, so it turns out that the first and third forms of type definition above are actually variant definitions (the third pairs a variant with a record)! Moreover, when we combine variants, tuples, and records, we are actually defining algebraic data types. Using all three allows for complex and rich (and recursive) data type declarations for many things such as trees, lists, and BNF languages, just to name a few.

A variant is strictly defined as:

type <variant> =
  | <Tag> [ of <type> [* <type>]... ]
  | <Tag> [ of <type> [* <type>]... ]
  | ...

Notice that the [ of <type> ... ] and the [* <type>] parts are optional. We can define a simple variant:

# type answer  = Yes | No
;;
type answer = Yes | No
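Tags get more interesting once they carry data. As a small sketch of the optional of part (shape and area are hypothetical names, not from anything above), each tag can hold nothing, a single value, or a tuple, and pattern matching pulls the payload back out:

```ocaml
(* shape is a hypothetical type used only for illustration. *)
type shape =
  | Point                  (* no payload *)
  | Circle of float        (* radius *)
  | Rect of float * float  (* width and height *)

let area = function
  | Point -> 0.
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

let () = Printf.printf "%.1f\n" (area (Rect (2., 3.)))
```

Unlike a C union, there is no way to read a Circle's radius out of a Rect: the tag decides which payload is available.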

Or we can create a more complex variant type:

# type id = string
type baseType = Int | Real | String | Bool | Char
type mlType = Base of baseType
    | Arrow of mlType * mlType
    | Product of mlType list
    | Record of (id * mlType) list
    | DatatypeName of id
;;
type id = string
type baseType = Int | Real | String | Bool | Char
type mlType =
    Base of baseType
  | Arrow of mlType * mlType
  | Product of mlType list
  | Record of (id * mlType) list
  | DatatypeName of id

In fact, the variant definition above could represent the abstract syntax of an ML type expression. Notice further that this variant is recursively defined. An mlType has a constructor, Record, whose argument is a list of tuples; each tuple contains an id and an mlType. For example, the type expression int*bool->{name: string} could be represented using the variant definition above with the following expression:

# Arrow ( Product [Base Int; Base Bool],  Record [("name", Base String)]);;
- : mlType =
Arrow (Product [Base Int; Base Bool], Record [("name", Base String)])
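Going the other way, a recursive function can turn such a value back into text. The sketch below is one possible printer, not a canonical one: string_of_base and to_string are names I made up, and it inserts no parentheses, so nested arrows or products may print ambiguously.

```ocaml
type id = string
type baseType = Int | Real | String | Bool | Char
type mlType =
  | Base of baseType
  | Arrow of mlType * mlType
  | Product of mlType list
  | Record of (id * mlType) list
  | DatatypeName of id

(* Hypothetical helper: the printed name of each base type. *)
let string_of_base = function
  | Int -> "int"
  | Real -> "real"
  | String -> "string"
  | Bool -> "bool"
  | Char -> "char"

(* Recursively render a type, one case per constructor. *)
let rec to_string = function
  | Base b -> string_of_base b
  | Arrow (a, r) -> to_string a ^ " -> " ^ to_string r
  | Product ts -> String.concat " * " (List.map to_string ts)
  | Record fs ->
    let field (name, t) = name ^ ": " ^ to_string t in
    "{" ^ String.concat "; " (List.map field fs) ^ "}"
  | DatatypeName n -> n

let example =
  Arrow (Product [Base Int; Base Bool], Record [("name", Base String)])

let () = print_endline (to_string example)
(* prints: int * bool -> {name: string} *)
```

The shape of the function mirrors the shape of the type: one match case per tag, with a recursive call wherever the type refers to itself.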

There’s a lot more to variants than we’ve discussed here, polymorphic variants in particular. We briefly touched on polymorphism with the generic 'a list above, but there’s more to say on the topic. I’ll save that for another post where we can dedicate the entire post to covering more complex and real-world examples.

Next Steps

We’ve covered a lot of ground in this post, learning more about records and variants. I strongly suggest that you fire up utop and experiment with defining, instantiating, and writing functions over records and variants to get a better feel for how OCaml works with them, as they are integral to the language and are used everywhere.

– rob
