This is a follow-up to Type-Driven Development with TypeScript.
The shape of data defines a program. There are important benefits to writing out types for your data.
Let's consider a Hacker News client, which consumes stories and other items from the Hacker News API. This is a TypeScript type that describes the format for stories:
type Story = {
type: "story"
by: string // username
dead?: boolean
deleted?: boolean
descendants: number
id: number
kids?: number[] // numeric IDs of comments
score: number
text?: string // HTML content if the story is a text post
time: number // seconds since Unix epoch
title: string
url?: string // URL of linked article if the story is not a text post
}
In JavaScript and other dynamically-typed languages, it is common to write a program without any explicit description of a data structure like Story. The shape of the data is implied in the code that manipulates the data. But that means anyone reading the code has to mentally reconstruct that shape from context, or refer to documentation outside of the program itself.
Catching mistakes
TypeScript provides the option of documenting data structures in the form
of types.
An obvious advantage is that the type checker can identify mistakes like typos
in property names when accessing data.
The Hacker News API uses the property name descendants; for some reason every time I try to type descendants I end up typing descendents by mistake. If I did not have a type checker to point out that Story does not have a property named descendents, I could end up wasting a lot of time debugging!
List all changes
But this just scratches the surface. Types for data structures help keep programmers oriented. When a data structure is suddenly required to change, all you need to do is to update that particular type and the type checker will list all of the changes that need to be made to work with this new type.
Reducing cognitive load
When you come back to a program after you have been away from it long enough to forget how everything works, having descriptions of data structures right there in the code makes it much easier to understand what is going on. The same is true if more than one person is working on the project. Every detail that can be captured in types is one less detail that programmers have to carry in their heads. Reduced cognitive load leaves programmers with more energy for writing important business logic.
Bridging the gap with validators
But what if you make a mistake when you write the type?
I mentioned that I had problems mixing up descendants and descendents. I actually made the same mistake the first time I wrote the Story type. The type checker cannot help me if I give it bad information from the start!
Unfortunately, a static type checker cannot check types against data from an
external API.
But what you can do is to write a validator that will check at runtime that
incoming data has the shape that you expect.
Then you can extract a static type from the validator that is guaranteed to
match any values that pass validation.
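Before reaching for a library, the runtime half of that idea can be sketched by hand. The example below is a minimal sketch, not part of the accompanying code: MiniStory and isMiniStory are hypothetical names, and only three of the story properties are checked. Note that the type is still written separately from the check here, so the two can drift apart; deriving the type from the validator is the part a library adds.

```typescript
// Hypothetical, simplified shape: only three of the story properties.
type MiniStory = {
  type: "story"
  title: string
  descendants: number
}

// A runtime check that doubles as a TypeScript type guard.
function isMiniStory(value: unknown): value is MiniStory {
  if (typeof value !== "object" || value === null) {
    return false
  }
  const record = value as Record<string, unknown>
  return (
    record["type"] === "story" &&
    typeof record["title"] === "string" &&
    typeof record["descendants"] === "number"
  )
}

// Stand-ins for parsed API responses.
const good: unknown = JSON.parse(
  '{"type":"story","title":"Show HN","descendants":3}',
)
const misspelled: unknown = JSON.parse(
  '{"type":"story","title":"Show HN","descendents":3}',
)

if (isMiniStory(good)) {
  // Inside this branch TypeScript treats `good` as a `MiniStory`.
  console.log(good.descendants)
}

// The misspelled property name is caught at runtime.
console.log(isMiniStory(misspelled))
```

A check like this catches a misspelled property in incoming data, but every property has to be listed twice: once in the type and once in the guard.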
There is a nifty library called io-ts that works like magic.
Instead of the Story type above, we can define a validator using io-ts combinators:
import * as t from "io-ts"
// The `V` in `StoryV` is short for `Validator`
const StoryV = t.type({
type: t.literal("story"), // value of property called `type` is the exact string `"story"`
by: t.string, // username
dead: optional(t.boolean),
deleted: optional(t.boolean),
descendants: t.number, // number of comments
id: t.number,
kids: optional(t.array(t.number)), // IDs of comments on an item
score: t.number,
text: optional(t.string), // HTML content if story is a text post
time: t.number, // seconds since Unix epoch
title: t.string,
url: optional(t.string), // URL of linked article if the story is not text post
})
// The `optional()` combinator is defined later in the article
This looks similar to the Story type from the beginning of the post. StoryV expresses the properties of objects coming from the Hacker News API with a type for each property. (The t.type() combinator produces a validator that expects an object with the given property names and types.) But this time the "types" for each property are actually values supplied by io-ts: t.number, t.string, t.boolean, etc.
Values can be referenced at runtime; types cannot. With StoryV we can validate any arbitrary JavaScript value by calling StoryV.decode(whateverValue). If the given value is not an object with the expected properties then decode will return an error value.
From validator to type
What makes io-ts uniquely valuable is that it simultaneously defines a runtime validator and a static type.
If StoryV.decode() returns a success result, then TypeScript knows that the resulting value has a descendants property and does not have a descendents property.
If a value passes validation, then it is guaranteed to match that static type, and we can use it to check the correctness of the rest of the program. If a value does not pass, then you will get a failure with a clear point in the program where it should be handled.
For example:
import fetch from "node-fetch"
async function fetchTitle(storyId: number): Promise<string> {
const res = await fetch(
`https://hacker-news.firebaseio.com/v0/item/${storyId}.json`,
)
const data = await res.json()
// If the data that is fetched does not match the `StoryV` validator then this
// line will result in a rejected promise.
const story = await decodeToPromise(StoryV, data)
// This line does not type-check because TypeScript can infer from the
// definition of `StoryV` that `story` does not have a property called
// `descendents`.
const ds = story.descendents
// TypeScript infers that `story` does have a `title` property with a value of
// type `string`, so this line passes type-checking.
return story.title
}
// `decodeToPromise` is defined later in the article
Validating incoming data at runtime allows the program to fail fast if there is a mismatch between the data and the program's expectations. In development, that makes it easy to catch bugs early: any mismatch is identified immediately at the point where you call decodeToPromise. You don't need fixtures or unit tests to check data ingress. Yes, the validation step could lead to failures in production that you would not have seen otherwise if data comes in some unexpected shape under some condition, but the alternative is for the program to limp along with unknown data, leading to possibly-undefined behavior. Failing fast is better!
To minimize unnecessary validation errors it is a good idea to make your validators permissive in what they accept. For example, err on the side of marking properties as optional if there is any possibility that those properties will be absent in some cases. And you can exclude properties from the validator that you are not going to use in your program.
Referencing types produced using io-ts
StoryV replaces the hand-written Story type, so we no longer have a way to refer to the type of story objects. But we can get that type back! io-ts provides a type operator called t.TypeOf that extracts a static type from a validator. We can define a new Story type like this:
type Story = t.TypeOf<typeof StoryV>
Every TypeScript value has a type.
You can reference and manipulate the value at runtime.
Likewise, you can reference and manipulate the type at type check time.
The expression typeof StoryV uses TypeScript's built-in typeof operator to get the typecheck-time representation of StoryV, which conveniently holds a complete description of the shape of story objects. That description is wrapped in a validator type; t.TypeOf pulls the shape description out into an independent type.
You can use the computed Story type in annotations in the rest of your program:
function formatStory(story: Story): string {
return `"${story.title}" submitted by ${story.by}`
}
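To make the idea of pulling a static type out of a validator less mysterious, here is a toy version of the mechanism. This is a sketch under stated assumptions, not how io-ts is implemented: the Validator interface, the TypeOf alias, objectV, stringV, numberV, and TinyStory are all hypothetical names invented for this example. It relies on TypeScript's conditional types with infer.

```typescript
// A hypothetical, minimal validator interface (not the io-ts one).
interface Validator<A> {
  is(value: unknown): value is A
}

// Recover the validated type `A` from a validator type.
type TypeOf<V> = V extends Validator<infer A> ? A : never

const stringV: Validator<string> = {
  is: (value: unknown): value is string => typeof value === "string",
}
const numberV: Validator<number> = {
  is: (value: unknown): value is number => typeof value === "number",
}

// Build an object validator from a record of property validators.
function objectV<P extends Record<string, Validator<any>>>(
  props: P,
): Validator<{ [K in keyof P]: TypeOf<P[K]> }> {
  return {
    is(value: unknown): value is { [K in keyof P]: TypeOf<P[K]> } {
      if (typeof value !== "object" || value === null) {
        return false
      }
      const record = value as Record<string, unknown>
      for (const key in props) {
        if (!props[key].is(record[key])) {
          return false
        }
      }
      return true
    },
  }
}

const tinyStoryV = objectV({ title: stringV, descendants: numberV })

// Computed from the validator: { title: string; descendants: number }
type TinyStory = TypeOf<typeof tinyStoryV>

function summarize(story: TinyStory): string {
  return `${story.title} (${story.descendants} comments)`
}

const incoming: unknown = { title: "Show HN", descendants: 3 }
if (tinyStoryV.is(incoming)) {
  console.log(summarize(incoming))
}
```

The shape is written once, as a value, and the static type is derived from it, so the two cannot disagree.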
When data comes in different shapes
The Hacker News API publishes more than just stories.
The /v0/item/ endpoint alone also provides comments, job postings, polls, and poll options, which all have different shapes. We want to be able to fetch an item from that endpoint and use a runtime check on the type property in the returned object to determine what type of item it is. And we want the type checker to verify the correctness of the whole process.
Let's use io-ts to create some more item definitions. These will be similar to the definition of StoryV. Here are abbreviated definitions (see the accompanying code for complete definitions):
const CommentV = t.type(
{
type: t.literal("comment"),
parent: t.number,
text: t.string, // HTML content
/* ... */
},
"Comment",
) // The second argument is a label that makes validation messages nicer.
type Comment = t.TypeOf<typeof CommentV>
const JobV = t.type(
{
type: t.literal("job"),
text: optional(t.string), // HTML content if job is a text post
url: optional(t.string), // URL of linked page if the job is not text post
/* ... */
},
"Job",
)
type Job = t.TypeOf<typeof JobV>
const PollV = t.type(
{
type: t.literal("poll"),
descendants: t.number, // number of comments
parts: t.array(t.number),
/* ... */
},
"Poll",
)
type Poll = t.TypeOf<typeof PollV>
const PollOptV = t.type(
{
type: t.literal("pollopt"),
poll: t.number, // ID of poll that includes this option
score: t.number,
text: t.string, // HTML content
/* ... */
},
"PollOpt",
)
type PollOpt = t.TypeOf<typeof PollOptV>
The Hacker News item API could return a story or any of these types, which means that the type of values from the item API is a union of all five types. More specifically, the type is a tagged union: the type property in API responses is a tag that we can use to distinguish between types within the union. A tagged union validator looks like this:
const ItemV = t.taggedUnion(
"type", // the name of the tag property
[CommentV, JobV, PollV, PollOptV, StoryV],
"Item", // a label to make validation messages nicer
)
type Item = t.TypeOf<typeof ItemV>
This is why it was important to use the t.literal() combinator instead of t.string for the type of the type property in each item validator: using t.literal() with a literal string makes the exact string value available to the type checker. With that information, TypeScript can use type guards to narrow the type of an item to a specific item type based on the value of item.type. For example:
function formatItem(item: Item): string {
switch (item.type) {
case "story":
// Stories have titles, so this is ok.
return `"${item.title}" submitted by ${item.by}`
case "job":
return `job posting: ${item.title}`
case "poll":
// Only polls have a `parts` property - this would not pass type checking
// without the type guard.
const numOpts = item.parts.length
return `poll: "${item.title}" - choose one of ${numOpts} options`
case "pollopt":
// In some item types `text` can be undefined, but not in poll options.
return `poll option: ${item.text}`
case "comment":
const excerpt =
item.text.length > 60 ? item.text.slice(0, 60) + "..." : item.text
return `${item.by} commented: ${excerpt}`
// There is no `default` case. TypeScript would report that the function can
// end without returning a string if any item type were left unhandled. But in
// this function TypeScript infers that all possible item types have been handled.
}
}
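If you prefer the exhaustiveness check to be explicit rather than inferred from the return type, one common pattern is a helper that accepts only the never type. The sketch below uses a hand-written two-member union to stay short; MiniItem, label, and assertNever are hypothetical names for this example and do not appear in the accompanying code.

```typescript
// Hypothetical two-member tagged union, hand-written for brevity.
type MiniItem =
  | { type: "story"; title: string }
  | { type: "comment"; text: string }

// Type-checks only where the argument has been narrowed to `never`,
// which means every member of the union was handled earlier.
function assertNever(value: never): never {
  throw new Error(`unhandled item: ${JSON.stringify(value)}`)
}

function label(item: MiniItem): string {
  switch (item.type) {
    case "story":
      return `story: ${item.title}`
    case "comment":
      return `comment: ${item.text}`
    default:
      // If a new member is added to `MiniItem` without a matching case,
      // `item` is no longer `never` here and this line fails to type-check.
      return assertNever(item)
  }
}

console.log(label({ type: "story", title: "Show HN" }))
console.log(label({ type: "comment", text: "Nice work" }))
```

The helper also gives a runtime error if data that bypassed validation reaches the switch with an unexpected tag.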
By the way, io-ts also supports intersections, untagged unions, and other fun combinators. Oh, and io-ts supports Flow too - not just TypeScript!
Next steps
This was just a quick introduction to what io-ts is capable of, and techniques for applying type-checking to external data. The concepts here are not limited to consuming API data: I recommend similar use of io-ts validators when working with data loaded from a database, serialized messages between micro-services, user input, or any other case where data can come in from outside the program.
The best way to cement your understanding of a pattern is to experiment with it. I encourage you to check out the accompanying code and try adding some features. One idea is to display ID numbers with story titles and add an option so that if the user passes an ID as a command-line argument when running the script it displays a link and some comments on the corresponding story.
Appendix A: definition for optional
In a hand-written definition for an object type you can use a question mark to indicate that a property might be absent:
type Story = {
text?: string
url?: string
}
There is no easy way to do that with io-ts because the argument to t.type() is an actual object, and object properties are either present or not present. There is another combinator, t.partial(), that describes an object where all properties are optional. The idiomatic way to represent an object where some properties are optional is to use an intersection of t.type() for required properties, and t.partial() for optional properties:
const StoryV = t.intersection(
[
t.type({
// required properties
type: t.literal("story"),
descendants: t.number, // number of comments
}),
t.partial({
// optional properties
text: t.string, // HTML content if story is a text post
url: t.string, // URL of linked article if the story is not text post
}),
],
"Story",
)
I used a different pattern in this article. I didn't want to introduce too many concepts all at once, so I didn't introduce intersections and nested definitions right away.
My optional() combinator is a union of the given type with undefined. Technically this implies that we expect the given property to be present in every case, but that the value might be undefined. In practice, that distinction often does not matter, and io-ts will validate an object that is missing a required property if the type of that property is allowed to be undefined. But note that io-ts might make object validation more strict in the future!
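The difference between a missing property and a property whose value is undefined can be seen in plain TypeScript, with no io-ts involved. The two objects below are made up for illustration.

```typescript
// Two hypothetical objects: one without a `text` property, and one with
// `text` explicitly set to `undefined`.
const missing: Record<string, unknown> = {}
const explicitlyUndefined: Record<string, unknown> = { text: undefined }

// The `in` operator distinguishes the two cases...
console.log("text" in missing) // false
console.log("text" in explicitlyUndefined) // true

// ...but reading the property gives `undefined` either way.
console.log(missing["text"] === explicitlyUndefined["text"]) // true

// JSON has no `undefined`, so the property is dropped on serialization.
console.log(JSON.stringify(explicitlyUndefined)) // {}
```

Since JSON cannot carry an explicit undefined, data parsed from an API response will only ever show up in the "missing" form.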
This is the definition of optional:
function optional<RT extends t.Any>(
type: RT,
name: string = `${type.name} | undefined`,
): t.UnionType<
[RT, t.UndefinedType],
t.TypeOf<RT> | undefined,
t.OutputOf<RT> | undefined,
t.InputOf<RT> | undefined
> {
return t.union<[RT, t.UndefinedType]>([type, t.undefined], name)
}
That is adapted from the maybe combinator given in the io-ts README. It is pretty dense for readers who do not have much experience with advanced TypeScript use cases. This is the sort of function that should be put into a library, and I might do that in the future.
Appendix B: definition for decodeToPromise
The built-in io-ts method StoryV.decode() returns an Either value, which is a type from the package fp-ts that can hold either an error or a successful result. It is similar to a promise except that it represents an immediate result, not an asynchronous one. The examples in this article use promises, so I wrote a function, decodeToPromise, to put validation results into the more familiar Promise type. Here is the definition:
import { reporter } from "io-ts-reporters"
// Apply a validator and get the result in a `Promise`
function decodeToPromise<T, O, I>(
validator: t.Type<T, O, I>,
input: I,
): Promise<T> {
const result = validator.decode(input)
return result.fold(
(errors) => {
const messages = reporter(result)
return Promise.reject(new Error(messages.join("\n")))
},
(value) => Promise.resolve(value),
)
}
fold() is a method on the Either type. It is used to collapse a possibility of success and a possibility of error into one definite value. TypeScript checks that the error-case callback and the value-case callback have compatible return types. One callback or the other will run depending on whether the result is an error or a success value. decodeToPromise also invokes an io-ts reporter to translate a set of validation errors into a readable message.
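For readers who have not met Either before, the folding idea can be sketched without fp-ts. Everything below is a hypothetical stand-in written for this example (the Either type, the free-standing fold function, parsePositive, and toPromise); it is not the fp-ts implementation, where fold has a different signature.

```typescript
// Hypothetical stand-in for an Either type: an error or a value, never both.
type Either<E, A> =
  | { tag: "left"; error: E }
  | { tag: "right"; value: A }

// Collapse both possibilities into a single result of type `R`.
// Both callbacks must return the same type.
function fold<E, A, R>(
  result: Either<E, A>,
  onError: (error: E) => R,
  onValue: (value: A) => R,
): R {
  return result.tag === "left" ? onError(result.error) : onValue(result.value)
}

// A made-up check that produces an Either.
function parsePositive(input: number): Either<string, number> {
  return input > 0
    ? { tag: "right", value: input }
    : { tag: "left", error: `${input} is not positive` }
}

// The same move as decodeToPromise: fold an immediate result into a promise.
function toPromise<A>(result: Either<string, A>): Promise<A> {
  return fold<string, A, Promise<A>>(
    result,
    (error) => Promise.reject(new Error(error)),
    (value) => Promise.resolve(value),
  )
}

toPromise(parsePositive(3)).then((n) => console.log(`resolved with ${n}`))
toPromise(parsePositive(-1)).catch((e: Error) => console.log(e.message))
```

Exactly one of the two callbacks runs for any given result, which is why the caller always ends up with one definite value.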