# Optparse-Applicative's parser structure

I confessed in my first post about Scalpel that I have a hard time getting a good intuition about `Applicative`. To improve my understanding of the concept, I decided to investigate a great library, optparse-applicative, which I will nickname “the OA” in the rest of this post, for lulz and keyboard-related laziness reasons.

The OA can be compared, for instance, to Python’s argparse or Rust’s clap: a command-line option parser. There are several libraries in Haskell to perform this common task, conveniently listed here.

As the name suggests, the OA uses, emphatically, `Applicative`; the README explains this choice: building your API around `Applicative` is a common practice when it comes to parsing, established by libraries like parsec, attoparsec and aeson.

The same README gives this simple usage example. This will be my main reference in this post, and I’ll try to detail every brick these few lines are built upon.

```
import Options.Applicative
import Data.Semigroup ((<>))

data Sample = Sample
  { hello      :: String
  , quiet      :: Bool
  , enthusiasm :: Int }

sample :: Parser Sample
sample = Sample
      <$> strOption
          ( long "hello"
         <> metavar "TARGET"
         <> help "Target for the greeting" )
      <*> switch
          ( long "quiet"
         <> short 'q'
         <> help "Whether to be quiet" )
      <*> option auto
          ( long "enthusiasm"
         <> help "How enthusiastically to greet"
         <> showDefault
         <> value 1
         <> metavar "INT" )
```

Without even having to read any library code, if you know a bit of Haskell, you can already see that the OA provides a parametric `Parser` abstraction, shown in the signature of `sample`; it also comes with a set of utility functions (here, `strOption`, `switch` and `option`) that we can compose inside a bigger applicative builder. These functions make heavy use of the shorter version of `mappend`, `(<>)`.

Having used various CLI-parsing libraries in several languages, I have to say this is the most intuitive one I’ve found. For most simple use cases, you will:

1. Create a datatype that represents what you want to parse (here, `Sample`).
2. For each field of your data structure, use the relevant utility function (though you might have to roll your own sometimes).
3. Compose the various options through the `Parser` `Applicative`.

Once again, we can see that sitting on top of Haskell’s common abstractions allows for powerful, simple and easy-to-use interfaces. With this fairly basic API, the OA will give us an automatic help message, facilities to build autocompletion, and ways to plug in custom error messages.

## How we will read the OA’s code

Unfortunately, the code required to produce such a clear and easy-to-read API is often not, itself, very clear and easy to read. It is fairly abstract, handles many potential cases, and sometimes relies on language extensions that beginners like me tend to avoid. And, most of all, and we will often see this when trying to read libraries, we encounter a common problem. I don’t know the canonical word for this particular problem, so I’ll call it *nested abstractions*.

Taken individually, most abstractions are not scary. Monoids are types whose values can be combined to produce a new value of the same type, and that provide an identity value; that’s it. Functors are structures over a value that can be modified without modifying the structure itself. Even monads, to a point, are not *that* difficult to grasp conceptually. However, when we start mixing various abstractions, it gets tricky quickly.

This problem is not specific to Haskell, though. Every potent piece of software will make heavy use of the abstractions provided by the language, or of a set of custom abstractions, to the point where it gets harder to get a clear picture. And yet, I find Haskell *helps* here, precisely because most Haskell code will *nest* its abstractions, the same way that monad transformers tend to be nested: you have “something over something over something”. Admittedly, this is scary at first, but at the same time, it gives you an *order* in which to read. I would even go further and say that Haskell lets us easily identify the two entry points for abstraction stacks.

- For the big picture, we have types and common abstractions. Though we don’t know how a `Parser` is implemented, or even how `strOption`, `switch` and `option` work exactly, we have a general idea, thanks to our knowledge of `Applicative` and `Semigroup`. This lets us *use the library without having to care about its implementation*, or even *study bits and pieces of the library while already having a general idea of how these pieces will fit in the general architecture*. (To a point, it’s reminiscent of the way that Design Patterns were supposed to save time, by giving common names to frequently encountered structures. Only, Haskell abstractions do it much better.)
- For studying things in depth, it’s generally not too difficult to get the nesting order, and we can build our knowledge of the library “from the bottom to the top”. Here, we know that the “smallest pieces” will be the stuff composed through the Semigroup’s `<>`; these will be composed by the Applicative bits, finally getting us to our `Sample`.

That said, I find there is no clear rule of thumb to decide what’s best between the “top-down” or “bottom-up” approach.

Sometimes, the definition for the broader abstraction is enough.

Sometimes, they are a bit too abstract and it’s better to start from the bottom.

It might be because I’m not yet used to the whole set of common abstractions that Haskell offers, though. In this particular instance, I’ll try to “dive” as deep into the implementation as I can, till I can gradually come back to the highest level.

A final note; as during our exploration of Scalpel, you typically tackle Haskell libraries through their *types* first, and how these are *built* and *composed*. Only then can you have a fine grasp of the functions that will “do stuff out of these types”. Today, we will only study OA’s *types*, not the parsing process itself.

## So Many Options

A first look at the cabal file shows another great thing about the OA: it doesn’t need much. Look at the dependencies: transformers, a library to access the OS’ processes, a pretty printer, and that’s all. In the list of exposed modules, we notice an `Options.Applicative.Types` that sounds promising, and we immediately open a 390-line long file.

This file contains the definition for the `Parser` type; I’m sure an experienced Haskeller can read it with ease, but we’re not there yet, and when we look at it, we immediately regret having opened this file, we feel our heart failing us, and we’d like to see simpler things and fewer language extensions, thank you very much. This is perfectly normal.

I won’t copy the definition of `Parser` right now, because it’s scary. The type contains several recursive constructors; there is a *simple* one, thankfully, called `OptP`, and this one mostly contains a type named `Option`. Since *a lot* of `Parser`s will actually be `OptP`, we will enter the rest of the types defined by the OA through this door.

### The Option type

`Option` is also defined in Types.hs:

```
-- | A single option of a parser.
data Option a = Option
  { optMain :: OptReader a               -- ^ reader for this option
  , optProps :: OptProperties            -- ^ properties of this option
  }
```

An option is but a reader and some properties. Two other types, then: `OptReader` and `OptProperties`. The second one is simpler, so let’s start here.

### The OptProperties type

In the same module :

```
-- | Specification for an individual parser option.
data OptProperties = OptProperties
  { propVisibility :: OptVisibility       -- ^ whether this flag is shown is the brief description
  , propHelp :: Chunk Doc                 -- ^ help text for this option
  , propMetaVar :: String                 -- ^ metavariable for this option
  , propShowDefault :: Maybe String       -- ^ what to show in the help text as the default
  , propDescMod :: Maybe ( Doc -> Doc )   -- ^ a function to run over the brief description
  }
```

This is rather self-explanatory and boring. We’ll notice the odd `Chunk` part; `Chunk` is a “free monoid” defined in the Help directory, which handles stuff that happens when the user types `--help`. Actually, this whole type is mostly here to know how to display help messages. We won’t comment on it any further.

### The OptReader type

Once again, in the Types.hs file:

```
-- | An 'OptReader' defines whether an option matches an command line argument.
data OptReader a
  = OptReader [OptName] (CReader a) (String -> ParseError)
    -- ^ option reader
  | FlagReader [OptName] !a
    -- ^ flag reader
  | ArgReader (CReader a)
    -- ^ argument reader
  | CmdReader (Maybe String) [String] (String -> Maybe (ParserInfo a))
    -- ^ command reader
```

The various constructors help us grasp the various kinds of options the OA will let you use.

- `OptReader` is for proper optional arguments: stuff that you can add or omit after the executable, but that needs further information if you specify it. E.g., the `tail` command can take an argument `-c` or `--bytes`, followed by a number (the number of bytes to output).
- `FlagReader` is for flags; e.g., the `-a` in `ls -a`.
- `ArgReader` is for mandatory arguments, often called “positional” arguments. E.g., `mv` typically takes two positional arguments, a source and a destination.
- `CmdReader` is for subcommands, something that adds a lot of complexity and that I’ll probably handle in another post, but that lets us manage many different behaviours, like when we type `git status` and `git log`.

Reading an option or a flag demands a list of `OptName`s, defined as:

```
data OptName = OptShort !Char
             | OptLong !String
  deriving (Eq, Ord, Show)
```

Once again, a classical pattern; you can type `ls --human-readable`. I’m yet to see someone actually type that, though; people tend to use the shorter version, `ls -h`. This is why we have a *list* of `OptName`s. For the `ArgReader`, this is not necessary; positional arguments are parsed in their order of appearance, so they don’t need names.
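To make the role of this list of names a bit more concrete, here is a minimal sketch of how a raw command-line token could be matched against it. The `OptName` type has the same shape as the library’s (minus the strictness annotations); `parseName` and `matchesAny` are hypothetical helpers of mine, not functions from the OA, and the real matching logic handles more cases.

```haskell
-- Same shape as the library's OptName, without the strictness annotations.
data OptName
  = OptShort Char
  | OptLong String
  deriving (Eq, Ord, Show)

-- Hypothetical helper: turn a raw command-line token into an OptName.
-- "--name" gives a long name, "-c" gives a short one, anything else is rejected.
parseName :: String -> Maybe OptName
parseName token = case token of
  ('-' : '-' : rest@(_ : _)) -> Just (OptLong rest)
  ['-', c] | c /= '-'        -> Just (OptShort c)
  _                          -> Nothing

-- Hypothetical helper: does the token match one of the names of an option?
matchesAny :: [OptName] -> String -> Bool
matchesAny names token = maybe False (`elem` names) (parseName token)

-- The two spellings of the same `ls` flag.
humanReadable :: [OptName]
humanReadable = [OptShort 'h', OptLong "human-readable"]
```

With these definitions, both `matchesAny humanReadable "-h"` and `matchesAny humanReadable "--human-readable"` hold, while a positional token such as `"file.txt"` matches nothing.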

Finally, we note that an `OptReader` can potentially read any `a`; this is why we typically need a `CReader` that will, probably, be able to “translate” the string from the command line into an `a`. This is the next type we need to read.

### The CReader type

Again, more nesting! A `CReader` (short for Command line Reader, I suppose) is:

```
data CReader a = CReader
  { crCompleter :: Completer
  , crReader :: ReadM a }
```

We will ignore `Completer`, used for autocompletion, and focus on the `ReadM`. As proper Haskellers, we are of course excited by this final M, for we are attracted by the delicious smell of Monads. And this time, we are going to touch a type that stands at the core of the library.

### The `ReadM` type

Types.hs contains the whole definition for the `ReadM` type:

```
-- | A newtype over 'ReaderT String Except', used by option readers.
newtype ReadM a = ReadM
  { unReadM :: ReaderT String (Except ParseError) a }

instance Functor ReadM where
  fmap f (ReadM r) = ReadM (fmap f r)

instance Applicative ReadM where
  pure = ReadM . pure
  ReadM x <*> ReadM y = ReadM $ x <*> y

instance Alternative ReadM where
  empty = mzero
  (<|>) = mplus

instance Monad ReadM where
  return = pure
  ReadM r >>= f = ReadM $ r >>= unReadM . f
  fail = readerError

instance MonadPlus ReadM where
  mzero = ReadM mzero
  mplus (ReadM x) (ReadM y) = ReadM $ mplus x y

-- | Return the value being read.
readerAsk :: ReadM String
readerAsk = ReadM ask
```

This `newtype` is nothing but:

- a `Reader` monad over a `String`…
- … on top of the `Except` monad over a `ParseError`.

Though I suspect the `Functor`, `Applicative` and `Monad` instances could have been automatically derived using the `GeneralizedNewtypeDeriving` extension, the author wrote the implementations manually. If you know *nothing* about Haskell, `Reader` is a way to combine functions that should all have read-only access to a value - here, a simple `String`. `ReaderT` is a variation provided, in this case, by the transformers library (the one we saw in the dependencies), that lets us use `Reader` plus another monad; in this instance, the `Except` monad. This monad lets us combine functions that could throw errors at us rather than providing a nice value. Parsing from the command line involves reading something that a user typed; users have been proven to be unreliable; it makes sense that we want to be able to throw exceptions.

The function `readerAsk` is simply `ask` from `ReaderT` (“getting the value we have read access over”) rewritten to stay in our `newtype`. Though we can’t really be sure of it yet, it’s pretty obvious that “the value we have read access over” will be “what the user typed”. Note that, thanks to the genericity of this `a`, though we will always get a `String` as our shared value, we can transform it into any other type. So we can theoretically build a `ReadM Int`, or a `ReadM` for any type we created ourselves, actually.
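To get a feeling for what “a reader over a `String` on top of an error monad” means, here is a minimal sketch that replaces the transformer stack with a plain function. Everything here is a simplification of mine: `SimpleReadM`, `simpleAsk`, `simpleError` and `intReader` are hypothetical names, and a `String` error message stands in for the library’s `ParseError`.

```haskell
import Text.Read (readMaybe)

-- Simplified stand-in for `ReaderT String (Except ParseError)`:
-- a function from the raw argument to either an error message or a value.
newtype SimpleReadM a = SimpleReadM { runSimpleReadM :: String -> Either String a }

instance Functor SimpleReadM where
  fmap f (SimpleReadM r) = SimpleReadM (fmap f . r)

instance Applicative SimpleReadM where
  pure x = SimpleReadM (const (Right x))
  SimpleReadM rf <*> SimpleReadM rx = SimpleReadM (\s -> rf s <*> rx s)

instance Monad SimpleReadM where
  SimpleReadM r >>= f = SimpleReadM (\s -> r s >>= \x -> runSimpleReadM (f x) s)

-- Counterpart of `readerAsk`: return the string being read.
simpleAsk :: SimpleReadM String
simpleAsk = SimpleReadM Right

-- Counterpart of `readerError`: fail with a message.
simpleError :: String -> SimpleReadM a
simpleError msg = SimpleReadM (const (Left msg))

-- A reader for Int: ask for the raw string, then try to convert it.
intReader :: SimpleReadM Int
intReader = do
  raw <- simpleAsk
  case readMaybe raw of
    Just n  -> pure n
    Nothing -> simpleError ("cannot parse value `" ++ raw ++ "'")
```

Running `runSimpleReadM intReader "42"` gives `Right 42`, while `runSimpleReadM intReader "abc"` gives a `Left` carrying the error message: the shared value is always a `String`, but the result can be of any type.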

### Too. Many. Types.

We now have a clearer picture of what `Option`s are. We could probably build some by hand, without using the nice interface that optparse-applicative provides. It would be rather verbose, though, since there are a lot of types involved.

Just to build a simple, dumb `ArgReader` for a `String`, we would need to type something like:

`ArgReader (CReader (mkCompleter (return . const [])) readerAsk)`

And that’s not even a full `Option`: we would still need to provide the `OptProperties`.

The issue with type safety is that it can become verbose rather quickly. Fortunately, Haskell provides a wealth of strategies and solutions to circumvent this verbosity while keeping our safe and sound type system. Many common abstractions are really syntactic sugar. Here, `Monoid` is going to help us tremendously, as we’re going to see.

## Constructors “à la `Mod`”

Before going on, I’ll add a few words on a common pattern in Haskell source-code architecture and a few rules of thumb:

- The type model behind a library is typically exposed in a Types.hs module, as we’ve just seen.
- When these types get cumbersome to write directly, it is fairly common to see modules with names like Builder.hs; these will typically contain shortcuts offered to the user.
- These API modules are often separated from the innards and utilities that come with them; a frequent idiom is to put these in a file named Internal.hs in the same directory (sometimes, as in the OA, Internal.hs will be in a directory named after the main module, the main module itself being in the top directory). You typically won’t need to read Internal.hs, unless you’re trying, like we do right here, to understand the exact implementation.

With that in mind, we want to see how the OA manages to get to its pleasant, Applicative plus Monoid (well, Semigroup really) notation rather than the direct use of constructors from its datatypes. Part of the answer will be in the shortcut functions defined in Builder.hs, but to understand these, we will need to read the Builder/Internal.hs file first, for there lives a powerful, hidden type.

### The `Mod` datatype

We need to investigate the Builder.Internal module, and go directly to the definitions given for a new datatype: `Mod`. Let’s get a quick overview:

```
data Mod f a = Mod (f a -> f a)
                   (DefaultProp a)
                   (OptProperties -> OptProperties)

optionMod :: (OptProperties -> OptProperties) -> Mod f a
optionMod = Mod id mempty

fieldMod :: (f a -> f a) -> Mod f a
fieldMod f = Mod f mempty id

instance Monoid (Mod f a) where
  mempty = Mod id mempty id
  mappend = (<>)

instance Semigroup (Mod f a) where
  Mod f1 d1 g1 <> Mod f2 d2 g2
    = Mod (f2 . f1) (d2 <> d1) (g2 . g1)
```

`Mod` is parametric over `f` and `a`. To build a `Mod`, we need a rather simple function (the `(f a -> f a)`), something called a `DefaultProp`, and another simple function that takes an `OptProperties` and returns another `OptProperties` (we will do like the cool kids and call these “simple functions” endomorphisms).

The two functions `optionMod` and `fieldMod` are smart constructors that let us build a `Mod` from a single endomorphism, filling the other two fields with neutral values (`id` for the other endomorphism, `mempty` for the `DefaultProp`).

`Mod` is a `Monoid`, and, as such, a `Semigroup` (a `Monoid` is a `Semigroup` with `mempty`, or, in more mathematical terms, an identity element; something that won’t change anything when `mappend`ed, like “adding zero” or “multiplying by 1”).

The `mempty` definition for `Mod` tells us a bit more about what `DefaultProp`s are: they are monoids themselves, since we use their own `mempty` when defining the identity element of `Mod`. And empty endomorphisms are simply… `id` (see for yourself in the canonical `Endo` monoid definition).

It makes sense: an endomorphism transforms a value of type `a` into another value of type `a`; the identity element is the element that doesn’t make any change; so an “empty endomorphism” simply returns its parameter without changing it.

The `Semigroup` instance for `Mod` tells us how we can combine two `Mod`s. Once again, the definition is easy to read: combining endomorphisms will be, in this instance, function composition (so it’s actually the “monoid of endomorphisms under composition”). As for the `DefaultProp`s, since we know they’re monoids themselves, we can use their own definition of `mappend`.

So, what we have here is a bit of a pumped-up `Endo`. If you don’t know `Endo`, here’s a dumb example of how you could use it:

```
> import Data.Monoid
> let endos = Endo (+1) <> Endo (+2) <> Endo (+3)
> :t endos
endos :: Num a => Endo a
> endos `appEndo` 0
6
```

### How `Mod` is used

Let us go back to the first example of the way to define a parser with the OA. At this point, we should have the intuition that this bit:

```
strOption
  ( long "hello"
 <> metavar "TARGET"
 <> help "Target for the greeting" )
```

heavily uses, under the hood, the `Mod` datatype - hence the `(<>)` between `long`, `metavar` and `help`. But since we want to be sure, the best thing now would be to check the definition for these functions, starting with `long`:

```
-- | Specify a long name for an option.
long :: HasName f => String -> Mod f a
long = fieldMod . name . OptLong
```

We’ve seen this `fieldMod`. It’s a smart constructor for `Mod`.

```
fieldMod :: (f a -> f a) -> Mod f a
fieldMod f = Mod f mempty id
```

And we know that `OptLong` is one of the constructors of `OptName`. `HasName` is a class defined in the internals of Builder. We know that we’re not always going to build stuff with names; positional arguments, for instance, don’t have names, whereas flags and optional arguments do. This typeclass is there to restrict `long` to the options that have names.

```
class HasName f where
  name :: OptName -> f a -> f a
```

So! `long` will build an `OptLong` from its eta-reduced `String` parameter; it will then give this as the first parameter of `name`, giving us a function of type `(f a -> f a)`. Exactly what `fieldMod` requires. We get a `Mod` with our mystery, partially-applied `name` function; an empty `DefaultProp`; and finally, the `id` function. This `Mod` will be `mappend`ed to the result of `metavar`.

```
metavar :: HasMetavar f => String -> Mod f a
metavar var = optionMod $ \p -> p { propMetaVar = var }
```

We know that `optionMod` is the other smart constructor we’ve seen before. It uses `id` as its first parameter when constructing `Mod`; yet another `mempty` for the `DefaultProp`; and the `(OptProperties -> OptProperties)` function will be the one we have to provide. Here, it’s a basic lambda that updates the `OptProperties` record, setting `propMetaVar`. Note that, once again, we need to satisfy a typeclass (`HasMetavar`).

So, how will the result of `long` and the result of `metavar` combine? It’s rather easy to understand: the `f a -> f a` from `long` will be composed with the `id` from `metavar` (which is akin to only applying the `f a -> f a`). As for the `DefaultProp`, being empty in both cases, it will stay this way. And the `OptProperties` endomorphism will be the composition of `id` and the lambda from `metavar`; so, only this lambda.
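This combination can be replayed on a toy version of `Mod`. The sketch below is mine, not the library’s code: `Fields` and `Props` are hypothetical, trimmed-down stand-ins for `OptionFields` and `OptProperties`, `MiniMod` drops the `DefaultProp`, and `miniLong` / `miniMetavar` only mimic `long` and `metavar`.

```haskell
-- Hypothetical, trimmed-down records standing in for OptionFields and OptProperties.
newtype Fields = Fields { fieldNames :: [String] } deriving (Eq, Show)
newtype Props = Props { propMeta :: String } deriving (Eq, Show)

-- A Mod-like type that keeps only the two endomorphisms (no DefaultProp).
data MiniMod = MiniMod (Fields -> Fields) (Props -> Props)

-- Same composition order as Mod: the right-hand operand runs last.
instance Semigroup MiniMod where
  MiniMod f1 g1 <> MiniMod f2 g2 = MiniMod (f2 . f1) (g2 . g1)

instance Monoid MiniMod where
  mempty = MiniMod id id

-- Counterpart of `long`: touches the fields, leaves the properties alone.
miniLong :: String -> MiniMod
miniLong n = MiniMod (\fs -> fs { fieldNames = n : fieldNames fs }) id

-- Counterpart of `metavar`: touches the properties, leaves the fields alone.
miniMetavar :: String -> MiniMod
miniMetavar v = MiniMod id (\p -> p { propMeta = v })

-- Apply both accumulated functions to empty starting values.
runMiniMod :: MiniMod -> (Fields, Props)
runMiniMod (MiniMod f g) = (f (Fields []), g (Props ""))
```

Here `runMiniMod (miniLong "hello" <> miniMetavar "TARGET")` yields `(Fields ["hello"], Props "TARGET")`: each modifier only contributed the function it cared about, and the `id` on the other side disappeared in the composition.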

After `long` and `metavar`, there will be a call to another function, `help`, that behaves like `metavar`, only it modifies another field of the `OptProperties` record. This one will get composed with the previous lambda.

So ! Back to the small piece we were analyzing :

```
strOption
  ( long "hello"
 <> metavar "TARGET"
 <> help "Target for the greeting" )
```

We now know that the last three lines create a `Mod`. But what’s the deal with the `strOption` that takes this `Mod` as its parameter?

### A first look at the Parser

`strOption` is a shortcut and a specialization at the same time. Let’s see the signature and definition.

```
strOption :: IsString s => Mod OptionFields s -> Parser s
strOption = option str
```

It will take a `Mod` parametrized over `OptionFields` as its input. We haven’t had the pleasure of meeting `OptionFields` yet; it lives inside the Builder.Internal module and it’s pretty simple:

```
data OptionFields a = OptionFields
  { optNames :: [OptName]
  , optCompleter :: Completer
  , optNoArgError :: String -> ParseError }
```

So `OptionFields` is parametric over `a` (an `a` that became an `s` in the signature for `strOption`), and it contains various data, first and foremost the names of the option. Allow me to quote, once again, the definition for `Mod`:

```
data Mod f a = Mod (f a -> f a)
                   (DefaultProp a)
                   (OptProperties -> OptProperties)
```

`Mod` is parametric over `f` and `a`; when using `strOption`, `f` is going to be this `OptionFields`. Actually, there is a similar datatype for each of the other `OptReader` constructors (`FlagFields`, `ArgumentFields` and `CommandFields`, for flags, arguments and subcommands). We still don’t really know what `a` is, though. While we’re at it, we know that `strOption` will use a `Mod OptionFields`, but we’re yet to understand what it will do with it. It’s calling a function named `option`, with a basic `ReadM` over `String` aptly named `str`: it will return the string that the user typed in the command line.

We might want to read `option` too:

```
option :: ReadM a -> Mod OptionFields a -> Parser a
option r m = mkParser d g rdr
  where
    Mod f d g = metavar "ARG" `mappend` m
    fields = f (OptionFields [] mempty ExpectsArgError)
    crdr = CReader (optCompleter fields) r
    rdr = OptReader (optNames fields) crdr (optNoArgError fields)
```

So… `option` is a utility function that takes a `ReadM`; remember, a `ReadM` is mostly a `ReaderT` able to throw exceptions. It also takes a `Mod`. And it will return this famous `Parser` abstraction that is more or less the end product of the OA.

Once again, this function mostly delegates the handiwork to another one, `mkParser`. `d` and `g`, its first two parameters, are the end result of the `Mod` monoid, as we can see from the first line of the `where` block. So `d` is a `DefaultProp` and `g` is an endomorphism on `OptProperties`.

This first line of the `where` block, by the way, is a thing of beauty. We’re creating a dumb `Mod` to initialize the `metavar` property with a default “ARG” value, and we `mappend` this to our `Mod` to get the final result of accumulated values; we immediately deconstruct it using pattern matching, and these deconstructed elements are the ones we’re going to use for our call to `mkParser`. This `Mod` actually represents the accumulated result of all the smaller helper functions.

What about the three other lines of the `where` block? Well, they mostly encapsulate precisely the boring bits we don’t want to have to write every time. Let us do a quick recap on the types here…

A typical `Parser` for `a` contains:

- an `Option` over `a`…
  - … that is, something that combines `OptProperties` and an `OptReader` (over `a`).
  - An `OptReader` for optional arguments takes a list of `OptName`s, a `CReader` (which is mostly a `ReadM a` and some stuff for autocompletion) and a function that builds a `ParseError`.

`fields`, `crdr` and `rdr` are there to build the `OptReader` part of this hierarchy.

We know that the first type parameter for our `Mod` is `OptionFields`, and that `f` is going to be a function `OptionFields a -> OptionFields a`. We apply our `f` (which, at this point, might be a long series of functions composed by each `mappend` on `Mod`) to a default `OptionFields` (without names, with an empty `Completer` and a default error handler).

This intermediary value will be used to build the rest: our `CReader` uses the `ReadM` given as a parameter to `option`, plus the `Completer` (if any) extracted from `fields`; and `rdr` takes the `crdr` we’ve just built, plus the option names and error handler extracted from `fields`.

In other words: `option` is a smart constructor built atop the `Mod` monoid.
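One consequence of this construction deserves a small experiment: because the default modifier is placed on the *left* of the user’s modifiers, and because the right-hand operand runs last, anything the user sets overrides the default. The sketch below only models the `OptProperties` side of things; `Props`, `PropMod`, `metavarMod`, `helpMod` and `buildProps` are hypothetical names of mine, not the library’s.

```haskell
-- Hypothetical stand-in for OptProperties, with just two fields.
data Props = Props { propMeta :: String, propHelp :: String }
  deriving (Eq, Show)

-- An endomorphism on Props, combined in the same order as Mod's third field.
newtype PropMod = PropMod (Props -> Props)

instance Semigroup PropMod where
  PropMod g1 <> PropMod g2 = PropMod (g2 . g1)  -- right-hand operand runs last

instance Monoid PropMod where
  mempty = PropMod id

metavarMod, helpMod :: String -> PropMod
metavarMod v = PropMod (\p -> p { propMeta = v })
helpMod h = PropMod (\p -> p { propHelp = h })

-- Hypothetical counterpart of the `where` block of `option`:
-- prepend a default metavar, deconstruct, apply to a base value.
buildProps :: PropMod -> Props
buildProps m = g (Props "" "")
  where
    PropMod g = metavarMod "ARG" <> m
```

With this, `buildProps (helpMod "A string")` keeps the default and gives `Props "ARG" "A string"`, whereas `buildProps (metavarMod "STR" <> helpMod "A string")` gives `Props "STR" "A string"`: the user’s metavar ran after the default one and replaced it.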

On the topic of monoids for smart, incremental construction and pleasant APIs, there’s a good, short and clever post from OCharles. The OA is not that different from the solution suggested by OCharles; the implementation is just a bit harder to follow.

So, at the end of the day, we have a `DefaultProp` (`d`) (still mysterious to us since we didn’t study its definition), an endomorphism on `OptProperties` (`g`) and a fully fledged `OptReader` (`rdr`). Following the type signatures, this is everything that `mkParser` needs to make a… `Parser`. All this for *any type a*. Time to see how this `mkParser` operates.

## The Parser Applicative

We’ve delved into the depths of the type stack; it’s time to go up to the `Parser` type itself. Since our entry point was one of its simplest constructors (`OptP`), we shall start with this one.

### The Option constructor

In Builder/Internal.hs, we find the definition for `mkParser`:

```
mkParser :: DefaultProp a
         -> (OptProperties -> OptProperties)
         -> OptReader a
         -> Parser a
mkParser d@(DefaultProp def _) g rdr = liftOpt opt <|> maybe empty pure def
  where
    opt = mkOption d g rdr
```

This should be pretty easy to read if you’re familiar with `Alternative` and its wonderful `(<|>)` operator. Basically, this code could be translated as: “try to do the thing at the left of `(<|>)` and if it doesn’t work, well, I’ll try to output the default.” Without even knowing the way `DefaultProp` is implemented, we now have a solid idea of how it’s used.

The rest is mostly a cascade of smart constructors. `mkParser` uses `mkOption`:

```
mkOption :: DefaultProp a
         -> (OptProperties -> OptProperties)
         -> OptReader a
         -> Option a
mkOption d g rdr = Option rdr (mkProps d g)
```

Which uses `mkProps`:

```
mkProps :: DefaultProp a
        -> (OptProperties -> OptProperties)
        -> OptProperties
mkProps (DefaultProp def sdef) g = props
  where
    props = (g baseProps)
      { propShowDefault = sdef <*> def }
```

So. Remember this line from `option` and how it was used?

`fields = f (OptionFields [] mempty ExpectsArgError)`

It’s the same trick at play here. We are using an `OptProperties` function that, per the magic of the `Mod` monoid, could be a composition of many functions, and we’ll apply it to the constant `baseProps`:

```
baseProps :: OptProperties
baseProps = OptProperties
  { propMetaVar = ""
  , propVisibility = Visible
  , propHelp = mempty
  , propShowDefault = Nothing
  , propDescMod = Nothing
  }
```

So if we had:

```
strOption (metavar "STR"
        <> help "A string")
```

… we would automatically compose these two lambdas:

```
\p -> p { propMetaVar = "STR" }
\p -> p { propHelp = paragraph "A string" }
```

However, the `propShowDefault` field (here to indicate if the `--help` message should display the default value or not) *is not* accessible this way; it’s stored in the `DefaultProp` (which handles everything default-related), hence the need for the `DefaultProp` parameter in `mkProps`, and the final record update on `props` in the `where` block.

You might be surprised by the `Applicative` syntax `sdef <*> def`, particularly because we still don’t know how `DefaultProp` is implemented. If you’re interested, the next subsection is for you. If not, skip it.

### A word on `DefaultProp`

A `DefaultProp` is parametric over a type `a`. It contains a `Maybe a` and a `Maybe (a -> String)`. This should of course tip you off as to its Applicative nature if you remember the signatures associated with the typeclass. In other words, it’s “maybe an `a`” and “maybe a way to turn any `a` into a `String`”; a potential value and a potential function (in the deconstruction in `mkProps`, `def` is the value and `sdef` is the function).

We can apply this potential function to this potential value using `Applicative`. If the function or the value is `Nothing`, we’ll get `Nothing`. If we have both `Just` a function and `Just` a value, we’ll get `Just` the result.

```
> import Data.Maybe
> let def = Just 2
> let sdef = Just show
> sdef <*> def
Just "2"
> let def' = Nothing
> sdef <*> def'
Nothing
```

That’s an interesting “real world usage” for Applicatives. Most examples for Applicatives typically use constructors over applicative values, for instance:

```
> data Test = Test String String deriving (Show)
> Test <$> Just "hello" <*> Just "world"
Just (Test "hello" "world")
> Test <$> ["hello", "salutations"] <*> ["world", "universe"]
[ Test "hello" "world"
, Test "hello" "universe"
, Test "salutations" "world"
, Test "salutations" "universe"]
```

But you can use `Applicative` in any situation where you have “maybe a function”. You could, for example, use a datatype modeling various potential transformations using `Applicative`.

Feature-wise, you probably understood by now that `DefaultProp` lets us define the default value of an argument, if any, and the way it should be displayed, if it should be displayed at all.
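We said earlier that `DefaultProp` has to be a monoid, since `Mod` uses its `mempty`. Here is a minimal sketch of how such a type could accumulate a default value and a display function coming from two different modifiers. The data declaration follows the description above; the `Semigroup` instance (keep the first available value on each side) is an assumption of mine, and `defaultValue`, `defaultShow` and `shownDefault` are hypothetical names that only mimic what `value`, `showDefault` and `mkProps` do.

```haskell
import Control.Applicative ((<|>))

-- Maybe a default value, and maybe a way to display it.
data DefaultProp a = DefaultProp (Maybe a) (Maybe (a -> String))

-- Assumption: on each side, keep the first value that is available.
instance Semigroup (DefaultProp a) where
  DefaultProp d1 s1 <> DefaultProp d2 s2 = DefaultProp (d1 <|> d2) (s1 <|> s2)

instance Monoid (DefaultProp a) where
  mempty = DefaultProp Nothing Nothing

-- Hypothetical counterpart of `value`: only sets the default.
defaultValue :: a -> DefaultProp a
defaultValue x = DefaultProp (Just x) Nothing

-- Hypothetical counterpart of `showDefault`: only sets the display function.
defaultShow :: (a -> String) -> DefaultProp a
defaultShow f = DefaultProp Nothing (Just f)

-- What ends up in propShowDefault: the potential function applied to the potential value.
shownDefault :: DefaultProp a -> Maybe String
shownDefault (DefaultProp def sdef) = sdef <*> def
```

With both pieces present, `shownDefault (defaultValue (1 :: Int) <> defaultShow show)` is `Just "1"`; with only one of them, it is `Nothing`, which matches the idea that a default is only displayed when we have both a value and a way to show it.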

### Back to mkParser

Now that we know we’re going to get an `Option` and we broadly know how it’s built, time to go back to `mkParser`.

```
mkParser :: DefaultProp a
         -> (OptProperties -> OptProperties)
         -> OptReader a
         -> Parser a
mkParser d@(DefaultProp def _) g rdr = liftOpt opt <|> maybe empty pure def
  where
    opt = mkOption d g rdr
```

`liftOpt` is nothing but a synonym for the `OptP` constructor of `Parser` (it’s defined in the Common.hs module). I think we should now be ready to face the actual implementation of the `Parser` type (in the Types.hs module), the very one I didn’t want to start with:

```
data Parser a
  = NilP (Maybe a)
  | OptP (Option a)
  | forall x . MultP (Parser (x -> a)) (Parser x)
  | AltP (Parser a) (Parser a)
  | forall x . BindP (Parser x) (x -> Parser a)
```

If you’re like me and not quite familiar yet with every language extension, you might be surprised by the mix of rather classic constructors and `forall` quantifiers. This is allowed by the `ExistentialQuantification` extension, which lets us introduce a type variable at the constructor level, something vanilla Haskell won’t allow. Picture the `MultP` constructor without the extension:

`| MultP (Parser (x -> a)) (Parser x)`

The parameters for `MultP` do look a bit like the parameters for `(<*>)`, don’t they? Replace the `f` in `f (a -> b)` and `f a` by `Parser`.

However, we’re not in an `instance` block, we’re defining the constructor of a datatype. The way it’s written in this definition, we don’t know what `x` will be; we only know that `x` *can be something other* than our `a`. This would not compile: GHC would complain that `x` is not in scope. So we need the `forall` quantifier to get the desired flexibility; if we don’t want to make `Parser` parametric over `a` AND `x`, we need `ExistentialQuantification`.

Granted, this datatype is rather abstract and we’re having a hard time getting a good intuition for everything it can contain. Still, without any context, we can already see that there are more or less two types (in the colloquial sense of the word!) of constructors for `Parser`: we have `NilP` and `OptP`, “simple constructors”, on the one hand, and `MultP`, `AltP` and `BindP`, three recursive constructors, on the other.

Though we are not going to study in this post (it’s already fairly long !) how parsers are run, we can peek inside a utility function from Common.hs to get a better idea of what the constructors represent:

```
evalParser :: Parser a -> Maybe a
evalParser (NilP r) = r
evalParser (OptP _) = Nothing
evalParser (MultP p1 p2) = evalParser p1 <*> evalParser p2
evalParser (AltP p1 p2) = evalParser p1 <|> evalParser p2
evalParser (BindP p k) = evalParser p >>= evalParser . k
```

Let’s ignore the first two cases. As our intuition suggested, `MultP` models an `Applicative`, where we apply the first parser (well, actually, the result of the evaluation of the first parser) to the second one (once again, to the result of the evaluation of the second parser). `AltP` does the same for `Alternative`, and `BindP` for `Monad`. So we could say that the `Parser` type is inhabited by “simple parsing” and “parsing combination patterns”.
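To play with these constructors without the whole library, here is a stripped-down, self-contained sketch. It is my own simplification: `Option` is reduced to a hypothetical placeholder holding a name, `BindP` is left out, and the `Functor` instance is a shortcut that goes through `MultP` (the real library does it differently). The `Applicative` instance and `evalParser` follow the code quoted in this post.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Control.Applicative (Alternative (..))

-- Placeholder for the real Option type: just a name, with a phantom result type.
newtype Option a = Option String

data Parser a
  = NilP (Maybe a)
  | OptP (Option a)
  | forall x. MultP (Parser (x -> a)) (Parser x)
  | AltP (Parser a) (Parser a)

-- Simplification: map a function by lifting it and applying it through MultP.
instance Functor Parser where
  fmap f p = MultP (NilP (Just f)) p

instance Applicative Parser where
  pure = NilP . Just
  (<*>) = MultP

instance Alternative Parser where
  empty = NilP Nothing
  (<|>) = AltP

-- What a parser yields when no input at all is consumed.
evalParser :: Parser a -> Maybe a
evalParser (NilP r) = r
evalParser (OptP _) = Nothing
evalParser (MultP p1 p2) = evalParser p1 <*> evalParser p2
evalParser (AltP p1 p2) = evalParser p1 <|> evalParser p2

-- Same shape as mkParser: an option, or else its default value.
withDefault :: Parser Int
withDefault = OptP (Option "enthusiasm") <|> pure 1

-- Applicative combination builds nested MultP nodes.
applied :: Parser Int
applied = (+) <$> pure 2 <*> withDefault
```

Evaluating `withDefault` gives `Just 1` (the `OptP` branch yields nothing without input, so the default wins), and `applied` gives `Just 3`. A bare `OptP` evaluates to `Nothing`, which is a reminder that this function only tells us what a parser is worth when nothing is parsed.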

### Building parsers

Let’s go back to the initial example !

```
sample :: Parser Sample
sample = Sample
      <$> strOption
          ( long "hello"
         <> metavar "TARGET"
         <> help "Target for the greeting" )
      <*> switch
          ( long "quiet"
         <> short 'q'
         <> help "Whether to be quiet" )
      <*> option auto
          ( long "enthusiasm"
         <> help "How enthusiastically to greet"
         <> showDefault
         <> value 1
         <> metavar "INT" )
```

If we follow the definitions for `strOption`, `switch` and `option`, they will all get us an `OptP` (combined, as we saw in `mkParser`, with a possible default value through `<|>`). Actually, almost every primitive from Builder.hs will get you a single `Parser` built around an `OptP`.

But `Parser` is, itself, an applicative!

```
instance Applicative Parser where
  pure = NilP . Just
  (<*>) = MultP
```

So all our `OptP` are going to be combined through `<*>` to become `MultP`. In other words, a desugared version of the basic example could be:

```
ps = NilP (Just Sample)  -- i.e. pure Sample
p1 = strOption (long "hello" <> metavar "TARGET" <> help "Target for the greeting")
p2 = switch (long "quiet" <> short 'q' <> help "Whether to be quiet")
p3 = option auto (long "enthusiasm" <> help "How enthusiastically to greet" <> showDefault <> value 1 <> metavar "INT")
sample = MultP (MultP (MultP ps p1) p2) p3
```

Which is admittedly not as pleasant to read or maintain.

## This is only the structure

It is important to point out that, right now, the only thing we have is a *structure*. `Parser`s contain information (names, expected type, potential combinations…) and they even encapsulate a “real parsing function” (the `ReadM`), but right now, they’re not doing much. To actually parse stuff, we will need to use the `execParser` primitive defined in the Extra.hs module. But that’s for another post.

But we’ve seen a lot of things already: common library organization patterns, smart constructors using monoids, and a pretty extended example of how datatypes can be combined to form a beautiful interface. Note how extensible this design is; we didn’t study in depth the way it can be leveraged to provide the autocompletion feature or the wealth of configuration for help or error message generation, but all the conditions to implement these are there. After all, beautiful, clear, safe structures are the foundations of any good piece of software.