Riding the Camel Deeper into the Desert of Code: Records and Variants
Extending and Encapsulating Data - Records and Variants
Up next on our tour of OCaml are records and variants. If you are familiar with typedef, struct, and union in C/C++, you’ll quickly come to understand how to use records and variants in OCaml. We’ll start off by looking at records: how to define them, how to use them in functions, and how to automatically generate common functions from our type definitions. There’s more here than meets the eye compared with a language like C, so pay close attention as we move through the descriptions below. We’ll take it one step at a time so you’ll have no problems!
Defining Records
In OCaml, defining a type is as simple as
type fruits = Atom1 | Atom2 | ... | AtomN
or
type <identifier> = {
<field_identifier> : <type_def> ;
...
}
or, even recursively using the and keyword:
type <rec_identifier> = Atom1 | Atom2 of <identifier>
and <identifier2> = { <field_identifier> : <type_def>; <field_identifier> : <rec_identifier> }
Concrete examples of each form of definition follow below. We won’t dwell on recursive types, as recursive types and functions warrant a blog post all by themselves! For now, let’s just look at an example of each and focus on the second form of type definition.
Take the first one. We can think of this type as the equivalent of a typed enumeration.
# type fruits = Apples | Bananas | Oranges | Tomatoes
;;
type fruits = Apples | Bananas | Oranges | Tomatoes
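To make the enumeration idea concrete, here is a minimal sketch of a function over fruits written with pattern matching. The function name to_string and the strings it returns are assumptions made for this illustration; only the type itself comes from the definition above.

```ocaml
type fruits = Apples | Bananas | Oranges | Tomatoes

(* One case per constructor. If a case were missing, the compiler would
   warn that the match is not exhaustive. *)
let to_string fruit =
  match fruit with
  | Apples -> "apples"
  | Bananas -> "bananas"
  | Oranges -> "oranges"
  | Tomatoes -> "tomatoes"

let () = print_endline (to_string Bananas)
```

Unlike an integer-backed enumeration, a value of type fruits can only ever be one of these four constructors.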
Using the third form, we define a linked list of int values:
# type list = Empty | Node of node
and node = { value : int; next : list }
;;
type list = Empty | Node of node
and node = { value : int; next : list; }
Or, more polymorphically, we can define a generic linked-list (but we’ll also not discuss this in detail as this too is a subject for a larger blogpost):
# type 'a list = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a list }
;;
type 'a list = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a list; }
In this definition, 'a is a type variable that stands for any element type. We could, for example, make a linked list of fruits!
# Node {value = Apples; next = (Node {value = Bananas; next = Empty})};;
- : fruits list =
Node {value = Apples; next = Node {value = Bananas; next = Empty}}
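As a rough sketch of how such a list might be consumed, the functions below walk the structure with pattern matching. I have renamed the type to linked so the example does not shadow OCaml’s built-in list; that name, along with length and to_list, is hypothetical. The block uses only the standard library.

```ocaml
(* Same shape as the generic list above, renamed to avoid shadowing
   the built-in list type. *)
type 'a linked = Empty | Node of 'a node
and 'a node = { value : 'a; next : 'a linked }

(* Count the nodes by following [next] until we reach Empty. *)
let rec length = function
  | Empty -> 0
  | Node { next; _ } -> 1 + length next

(* Collect the values, in order, into a built-in list. *)
let rec to_list = function
  | Empty -> []
  | Node { value; next } -> value :: to_list next

let sample = Node { value = 1; next = Node { value = 2; next = Empty } }
let sample_length = length sample
let sample_values = to_list sample
```

Because both functions are written against 'a linked, they work for a list of fruits just as well as for a list of int.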
Lastly, let’s look at the second form. We can create a record type, pet_collection, as follows:
# type pet_collection = {
name: string;
age: int;
weight: float;
}
;;
type pet_collection = { name : string; age : int; weight : float; }
And we can create an instance of our newly defined record simply by giving a value for every field; OCaml infers the record type from the field names.
# let pet1 = {name = "fluffy"; age = 10; weight = 90. }
;;
val pet1 : pet_collection = {name = "fluffy"; age = 10; weight = 90.}
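Once a record exists, individual fields can be read with dot notation. The sketch below assumes the pet_collection type from above; pet2 and pet1_name are hypothetical names used only for illustration.

```ocaml
type pet_collection = { name : string; age : int; weight : float }

let pet1 = { name = "fluffy"; age = 10; weight = 90. }

(* Dot notation reads a single field. *)
let pet1_name = pet1.name

(* A type annotation is optional here, but it helps the reader (and the
   compiler) when several record types share the same field names. *)
let pet2 : pet_collection = { name = "rex"; age = 3; weight = 42.5 }

let () = Printf.printf "%s is %d years old\n" pet1_name pet1.age
```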
Pattern Matching and Records
Let’s see how we use records in functions with pattern matching. Check out the function describe_pet below. Notice that the argument to the function is a record of type pet_collection. But more than that, we’ve deconstructed the record using pattern matching to assign names to each field. By specifying this pattern for our function parameter, each field is broken out into a variable we can access in our function.
# (* Printf.sprintf works with or without Core opened. *)
let describe_pet {name = n; age = a; weight = w} =
  if a > 10 then
    Printf.sprintf "Old %s, is a %d year old weighing %f" n a w
  else
    Printf.sprintf "%s, is a %d year old weighing %f" n a w
;;
val describe_pet : pet_collection -> string = <fun>
Punning Intended!
OCaml also has a shortcut for the pattern matching above. Instead of binding a differently named variable to each field in the record, we can just list the field names, {field1; field2; fieldn}, in the pattern. Each field is then bound to a variable of the same name, which we can use in the function body. This is known as punning. Here’s the same function written with punning:
let describe_pet {name; age; weight} =
  if age > 10 then
    Printf.sprintf "Old %s, is a %d year old weighing %f" name age weight
  else
    Printf.sprintf "%s, is a %d year old weighing %f" name age weight
;;
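A record pattern does not have to mention every field. As a small sketch (the functions is_old and greet are hypothetical, and only the standard library is assumed), the wildcard _ ignores the fields we don’t need, and record patterns can also appear in a match:

```ocaml
type pet_collection = { name : string; age : int; weight : float }

(* Only [age] is needed, so the remaining fields are ignored with [_]. *)
let is_old { age; _ } = age > 10

(* Record patterns in a match can mix literals, punned names and guards. *)
let greet pet =
  match pet with
  | { name = "fluffy"; _ } -> "Hello again, fluffy!"
  | { name; age; _ } when age < 1 -> Printf.sprintf "Welcome, little %s" name
  | { name; _ } -> Printf.sprintf "Hello, %s" name
```

Writing the trailing _ explicitly signals that the other fields were left out on purpose.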
Modifying Fields in Records
OCaml values are immutable unless we explicitly make them mutable. In a record, we mark a field with the mutable keyword, and then use the assignment operator, <-, to modify its value. Here’s an example:
type pet_log = {
mutable access: Time.t;
pet: pet_collection;
}
Let’s create a new pet_log, and then modify the access value in it:
# let log = {access = Time.now(); pet = pet1 }
;;
val log : pet_log =
{access = 1518375241.44114208;
pet = {name = "fluffy"; age = 10; weight = 90.}}
# log.access <- Time.now()
;;
- : unit = ()
# log.access
;;
- : Time.t = 1518375381.63572907
You may recall from my previous post that mutable updates evaluate to type unit. We can evaluate the record field after the modification to see the updated value. So why is OCaml so pedantic about mutation? Because every mutable field must be declared explicitly, side effects are easy to spot, and we are less likely to write code that breaks through unexpected (implicit) side effects.
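Here is a minimal sketch of the same idea that runs without Core. It swaps the Time.t field for a plain counter; the visit_log type and the record_visit function are hypothetical names introduced for illustration.

```ocaml
type pet_collection = { name : string; age : int; weight : float }

(* Only [visits] may change after the record is created. *)
type visit_log = { mutable visits : int; pet : pet_collection }

(* The update evaluates to unit, so this function has type visit_log -> unit. *)
let record_visit log = log.visits <- log.visits + 1

let log = { visits = 0; pet = { name = "fluffy"; age = 10; weight = 90. } }

(* [alias] refers to the same record, so it observes the mutation too. *)
let alias = log

let () =
  record_visit log;
  record_visit log
```

After the two calls, both log.visits and alias.visits are 2, which is exactly the kind of shared side effect the mutable keyword forces us to opt into.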
Functional Updating
You may be wondering how to change just a few fields of a record without writing out a whole new record that specifies every field. Here’s where functional updating comes into play: the with syntax builds a new record that copies the original and replaces only the fields we list, leaving the original untouched.
Let’s write a function that renames our pet:
# let rename_pet t name =
{t with name = name}
;;
val rename_pet : pet_collection -> string -> pet_collection = <fun>
Or, more simply using punning:
# let rename_pet t name =
{t with name}
;;
val rename_pet : pet_collection -> string -> pet_collection = <fun>
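A functional update can replace several fields at once, and it never modifies the record it starts from. The sketch below assumes the pet_collection type from earlier; the birthday function and the half-unit weight gain are made up for illustration.

```ocaml
type pet_collection = { name : string; age : int; weight : float }

let pet1 = { name = "fluffy"; age = 10; weight = 90. }

(* Build a new record: [age] and [weight] are replaced, [name] is copied. *)
let birthday t = { t with age = t.age + 1; weight = t.weight +. 0.5 }

let older = birthday pet1

let () =
  Printf.printf "%s: %d -> %d\n" pet1.name pet1.age older.age
```

Here pet1 still has age 10 after the call; only older carries the new values.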
What’s up with the letter t in the parameter above? In OCaml it is a convention to name the primary type of a module t, and to use t as the name of a function parameter that holds a value of that type. Let’s take a look at how we might wrap a type into a module.
Types in Modules
One common use of defining a type inside a module is to create an isolated namespace. Take a look at the definition below:
module Pets = struct
type t = {
name: string;
age: int;
weight: float;
mutable access: Time.t;
}
end
;;
module Pets :
sig
type t = {
name : string;
age : int;
weight : float;
mutable access : Time.t;
}
end
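The namespace point can be sketched as follows. This version drops the access field so that it runs with only the standard library, and the describe function is a hypothetical helper added for illustration.

```ocaml
module Pets = struct
  type t = { name : string; age : int; weight : float }

  (* Inside the module, the field names are directly in scope. *)
  let describe t = Printf.sprintf "%s (%d)" t.name t.age
end

(* Outside the module, qualifying the first field with the module name
   tells the compiler which record type we mean. *)
let fluffy = { Pets.name = "fluffy"; age = 10; weight = 90. }

(* Field access can be qualified in the same way. *)
let fluffy_name = fluffy.Pets.name

let summary = Pets.describe fluffy
```

Another module could define its own type t with a name field, and the two would not collide.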
We’re not going to cover modules here, but I wanted to comment on the use of derived fields in modules. You might read in some texts that you can add with fields to a type definition inside a module to generate accessor functions for the fields of that type. That syntax has been deprecated. Instead you can use the following opam packages: ppx_jane and fieldslib. Install them using opam. Once they are installed (and loaded with #require), you can add [@@deriving fields] at the end of your type definition to derive functions for working with its fields.
module Pets = struct
type t = {
name: string;
age: int;
weight: float;
mutable access: Time.t;
}
[@@deriving fields]
end
;;
module Pets :
sig
type t = {
name : string;
age : int;
weight : float;
mutable access : Time.t;
}
val access : t -> Time.t
val set_access : t -> Time.t -> unit
val weight : t -> float
val age : t -> int
val name : t -> string
module Fields :
sig
val names : string list
val access :
([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm
val weight : ([< `Read | `Set_and_create ], t, float) Field.t_with_perm
val age : ([< `Read | `Set_and_create ], t, int) Field.t_with_perm
val name : ([< `Read | `Set_and_create ], t, string) Field.t_with_perm
val make_creator :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
'a -> ('b -> string) * 'c) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
'c -> ('b -> int) * 'd) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
'd -> ('b -> float) * 'e) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
'e -> ('b -> Time.t) * 'f) ->
'a -> ('b -> t) * 'f
val create :
name:string -> age:int -> weight:float -> access:Time.t -> t
val map :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
string) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> int) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
float) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
Time.t) ->
t
val iter :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
unit) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> unit) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
unit) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
unit) ->
unit
val fold :
init:'a ->
name:('a ->
([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
'b) ->
age:('b ->
([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> 'c) ->
weight:('c ->
([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
'd) ->
access:('d ->
([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
'e) ->
'e
val map_poly :
([< `Read | `Set_and_create ], t, 'a) Field.user -> 'a list
val for_all :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
bool) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> bool) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
bool) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
bool) ->
bool
val exists :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
bool) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> bool) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
bool) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
bool) ->
bool
val to_list :
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
'a) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm -> 'a) ->
weight:(([< `Read | `Set_and_create ], t, float) Field.t_with_perm ->
'a) ->
access:(([< `Read | `Set_and_create ], t, Time.t) Field.t_with_perm ->
'a) ->
'a list
module Direct :
sig
val iter :
t ->
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> 'a) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> 'b) ->
weight:(([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> 'c) ->
access:(([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> 'd) ->
'd
val fold :
t ->
init:'a ->
name:('a ->
([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> 'b) ->
age:('b ->
([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> 'c) ->
weight:('c ->
([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> 'd) ->
access:('d ->
([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> 'e) ->
'e
val for_all :
t ->
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> bool) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> bool) ->
weight:(([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> bool) ->
access:(([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> bool) ->
bool
val exists :
t ->
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> bool) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> bool) ->
weight:(([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> bool) ->
access:(([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> bool) ->
bool
val to_list :
t ->
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> 'a) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> 'a) ->
weight:(([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> 'a) ->
access:(([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> 'a) ->
'a list
val map :
t ->
name:(([< `Read | `Set_and_create ], t, string) Field.t_with_perm ->
t -> string -> string) ->
age:(([< `Read | `Set_and_create ], t, int) Field.t_with_perm ->
t -> int -> int) ->
weight:(([< `Read | `Set_and_create ], t, float)
Field.t_with_perm -> t -> float -> float) ->
access:(([< `Read | `Set_and_create ], t, Time.t)
Field.t_with_perm -> t -> Time.t -> Time.t) ->
t
val set_all_mutable_fields : t -> access:Time.t -> unit
end
end
end
For now, I’m not going to elaborate on each of these generated functions. We’ll go into greater detail in a future post. I wanted to give a flavor for how we can define a simple type, wrap it into a module, and automatically generate additional functions to manipulate our types.
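To get a feel for the simplest entries in that signature, here is a hand-written approximation of the accessors, the setter, and Fields.create. This is only a sketch of what those functions do, not the code the ppx actually generates, and it uses a float in place of Time.t so that it runs without Core.

```ocaml
module Pets = struct
  type t = {
    name : string;
    age : int;
    weight : float;
    mutable access : float; (* stand-in for Time.t in this sketch *)
  }

  (* One accessor per field, named after the field. *)
  let name t = t.name
  let age t = t.age
  let weight t = t.weight
  let access t = t.access

  (* Setters exist only for mutable fields. *)
  let set_access t v = t.access <- v

  module Fields = struct
    (* Labelled arguments combined with field punning. *)
    let create ~name ~age ~weight ~access = { name; age; weight; access }
  end
end

let pets =
  [ Pets.Fields.create ~name:"fluffy" ~age:10 ~weight:90. ~access:0.;
    Pets.Fields.create ~name:"rex" ~age:3 ~weight:42.5 ~access:0. ]

(* Accessors are ordinary functions, so they can be passed to List.map. *)
let names = List.map Pets.name pets
```

Having the accessors as first-class functions is the main convenience: Pets.name can be handed to List.map directly instead of writing fun p -> p.Pets.name.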
Up Next, Variants
Following up on the use of records, next, let’s examine variants. Variants are similar to unions in languages like C/C++, but with a twist. In C/C++, the members of a union share the same storage and nothing records which member is currently valid, which makes unions ambiguous and error-prone. In OCaml, every case of a variant carries an unambiguous tag, so each element of the variant definition is strictly typed.
Let’s look at how we formally define variants. Okay, so it turns out that the first and third forms of type definition above are actually variant definitions! Moreover, when we combine variants, tuples, and records, we are actually defining algebraic data types. Using all three allows for complex and rich (and recursive) data type declarations for many things such as trees, lists, and BNF grammars, just to name a few.
A variant is strictly defined as:
type <variant> =
| <Tag> [ of <type> [* <type>]... ]
| <Tag> [ of <type> [* <type>]... ]
| ...
Notice that the [of <type> ... ] and the [* <type>] parts are optional. We can define a simple variant:
# type answer = Yes | No
;;
type answer = Yes | No
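To see what the optional of part adds, here is a small sketch in which each tag carries its own data. The shape type and the area function are hypothetical and are not used elsewhere in this post.

```ocaml
type shape =
  | Circle of float          (* radius *)
  | Rect of float * float    (* width and height *)

(* The tag decides which payload is available, so there is no way to
   read a radius out of a Rect by mistake. *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

let () = Printf.printf "%.1f\n" (area (Rect (2., 3.)))
```

This is the safety that a bare C union lacks: the payload can only be reached by matching on the tag first.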
Or we can create a more complex variant type:
# type id = string
type baseType = Int | Real | String | Bool | Char
type mlType = Base of baseType
| Arrow of mlType * mlType
| Product of mlType list
| Record of (id * mlType) list
| DatatypeName of id
;;
type id = string
type baseType = Int | Real | String | Bool | Char
type mlType =
Base of baseType
| Arrow of mlType * mlType
| Product of mlType list
| Record of (id * mlType) list
| DatatypeName of id
In fact, the variant definition above could represent the abstract syntax of ML type expressions. Notice further that this variant is recursively defined. For instance, mlType has a constructor, Record, whose argument is a list of tuples; each tuple contains an id and an mlType. For example, the type expression int*bool->{name: string} could be represented using the variant definition above with the following expression:
# Arrow ( Product [Base Int; Base Bool], Record [("name", Base String)]);;
- : mlType =
Arrow (Product [Base Int; Base Bool], Record [("name", Base String)])
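Going the other way, a recursive function can turn an mlType value back into text. The printer below is a rough sketch: the names base_to_string and to_string are my own, the spacing is an arbitrary choice, and it makes no attempt to add parentheses for nested arrows or products.

```ocaml
type id = string
type baseType = Int | Real | String | Bool | Char
type mlType =
  | Base of baseType
  | Arrow of mlType * mlType
  | Product of mlType list
  | Record of (id * mlType) list
  | DatatypeName of id

let base_to_string = function
  | Int -> "int"
  | Real -> "real"
  | String -> "string"
  | Bool -> "bool"
  | Char -> "char"

(* One case per constructor; the recursive cases call to_string on
   their components. *)
let rec to_string = function
  | Base b -> base_to_string b
  | Arrow (arg, result) -> to_string arg ^ " -> " ^ to_string result
  | Product ts -> String.concat " * " (List.map to_string ts)
  | Record fields ->
    let field (name, t) = name ^ " : " ^ to_string t in
    "{" ^ String.concat "; " (List.map field fields) ^ "}"
  | DatatypeName name -> name

let example =
  Arrow (Product [Base Int; Base Bool], Record [("name", Base String)])

let () = print_endline (to_string example)
```

The structure of the function mirrors the structure of the type, which is the usual pattern for working with recursive variants.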
There’s a lot more to variants than we’ve discussed here, specifically polymorphic variants. We briefly touched on polymorphism and variants above, but there’s more to say on the topic. I’ll save that for another post, which we can dedicate entirely to more complex and real-world examples.
Next Steps
We’ve covered a lot of ground in this post, learning more about records and variants. I strongly suggest that you fire up utop and experiment with defining, instantiating, and writing functions using records and variants to get a better feel for how OCaml works with them, as they are integral to the language’s programming paradigms and are used everywhere.
– rob