# What I Wish I Knew When Learning Haskell

Version 2.5

## Version

This is the fifth major draft of this document since 2009.

Pull requests are always accepted for changes and additional content. This is a living document. The only way this document will stay up to date is through the kindness of readers like you and community patches and pull requests on Github.

If you’d like a physical copy of the text you can either print it for yourself (see Printable PDF) or purchase one online:

## Author

This text is authored by Stephen Diehl.

- Web: www.stephendiehl.com
- Twitter: https://twitter.com/smdiehl
- Github: https://github.com/sdiehl

Special thanks for Erik Aker for copyediting assitance.

## License

Copyright © 2009-2020 Stephen Diehl

This code included in the text is dedicated to the public domain. You can copy, modify, distribute and perform the code, even for commercial purposes, all without asking permission.

You may distribute this text in its full form freely, but may not reauthor or sublicense this work. Any reproductions of major portions of the text must include attribution.

The software is provided “as is”, without warranty of any kind, express or implied, including But not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, Arising from, out of or in connection with the software or the use or other dealings in the software.

# Basics

## What is Haskell?

At its heart Haskell is a lazy, functional, statically-typed programming language with advanced type system features such as higher-rank, higher-kinded parametric polymorphism, monadic effects, generalized algebraic data types, ad-hoc polymorphism through type classes, associated type families, and more. As a programming language, Haskell pushes the frontiers of programming language design more so than any other general purpose language while still remaining practical for everyday use.

Beyond language features, Haskell remains an organic, community-driven effort, run by its userbase instead of by corporate influences. While there are some Haskell companies and consultancies, most are fairly small and none have an outsized influence on the development of the language. This is in stark contrast to ecosystems like Java and Go where Oracle and Google dominate all development. In fact, the Haskell community is a synthesis between multiple disciplines of academic computer science and industrial users from large and small firms, all of whom contribute back to the language ecosystem.

Originally, Haskell was borne out of academic research. Designed as an ML dialect, it was initially inspired by an older language called Miranda. In the early 90s, a group of academics formed the GHC committee to pursue building a research vehicle for lazy programming languages as a replacement for Miranda. This was a particularly in-vogue research topic at the time and as a result the committee attracted various talented individuals who initiated the language and ultimately laid the foundation for modern Haskell.

Over the last 30 years Haskell has evolved into a mature ecosystem, with an equally mature compiler. Even so, the language is frequently reimagined by passionate contributors who may be furthering academic research goals or merely contributing out of personal interest. Although laziness was originally the major research goal, this has largely become a quirky artifact that most users of the language are generally uninterested in. In modern times the major themes of Haskell community are:

- A vehicle for type system research
- Experimentation in the design space of typed effect systems
- Algebraic structures as a method of program synthesis
- Referential transparency as a core language feature
- Embedded domain specific languages
- Experimentation toward practical dependent types
- Stronger encoding of invariants through type-level programming
- Efficient functional compiler design
- Alternative models of parallel and concurrent programming

Although these are the major research goals, Haskell is still a fully general purpose language, and it has been applied in wildly diverse settings from garbage trucks to cryptanalysis for the defense sector and everything in-between. With a thriving ecosystem of industrial applications in web development, compiler design, machine learning, financial services, FPGA development, algorithmic trading, numerical computing, cryptography research, and cybersecurity, the language has a lot to offer to any field or software practitioner.

Haskell as an ecosystem is one that is purely organic, it takes decades to evolve, makes mistakes and is not driven by any one ideology or belief about the purpose of functional programming. This makes Haskell programming simultaneously frustrating and exciting; and therein lies the fun that has been the intellectual siren song that has drawn many talented programmers to dabble in this beautiful language at some point in their lives.

See:

## How to Read

This is a guide for working software engineers who have an interest in Haskell but don’t know Haskell yet. I presume you know some basics about how your operating system works, the shell, and some fundamentals of other imperative programming languages. If you are a Python or Java software engineer with no Haskell experience, this is the executive summary of Haskell theory and practice for you. We’ll delve into a little theory as needed to explain concepts but no more than necessary. If you’re looking for a purely introductory tutorial, this probably isn’t the right start for you, however this can be read as a companion to other introductory texts.

There is no particular order to this guide, other than the first chapter which describes how to get set up with Haskell and use the foundational compiler and editor tooling. After that you are free to browse the chapters in any order. Most are divided into several sections which outline different concepts, language features or libraries. However, the general arc of this guide bends toward more complex topics in later chapters.

As there is no ordering after the first chapter I will refer to concepts globally without introducing them first. If something doesn’t make sense just skip it and move on. I strongly encourage you to play around with the source code modules for yourself. Haskell cannot be learned from an armchair; instead it can only be mastered by writing a ton of code for yourself. GHC may initially seem like a cruel instructor, but in time most people grow to see it as their friend.

## GHC

GHC is the Glorious Glasgow Haskell Compiler. Originally written in 1989, GHC is now the de facto standard for Haskell compilers. A few other compilers have existed along the way, but they either are quite limited or have bit rotted over the years. At this point, GHC is a massive compiler and it supports a wide variety of extensions. It’s also the only reference implementation for the Haskell language and as such, it defines what Haskell the language is by its implementation.

GHC is run at the command line with the command `ghc`

.

GHC’s runtime is written in C and uses machinery from GCC infrastructure for its native code generator and can also use LLVM for its native code generation. GHC is supported on the following architectures:

- Linux x86
- Linux x86_64
- Linux PowerPC
- NetBSD x86
- OpenBSD x86
- FreeBSD x86
- MacOS X Intel
- MacOS X PowerPC
- Windows x86_64

GHC itself depends on the following Linux packages.

- build-essential
- libgmp-dev
- libffi-dev
- libncurses-dev
- libtinfo5

## ghcup

There are two major packages that need to be installed to use Haskell:

- ghc
- cabal-install

GHC can be installed on Linux and Mac with ghcup by running the following command:

This can be used to manage multiple versions of GHC installed locally.

To select which version of GHC is available on the PATH use the `set`

command.

This can also be used to install cabal.

To modify your shell to include ghc and cabal.

Or you can permanently add the following to your `.bashrc`

or `.zshrc`

file:

## Package Managers

There are two major Haskell packaging tools: **Cabal** and **Stack**. Both take differing views on versioning schemes but can more or less interoperate at the package level. So, why are there two different package managers?

The simplest explanation is that Haskell is an organic ecosystem with no central authority, and as such different groups of people with different ideas and different economic interests about optimal packaging built their own solutions around two different models. The interests of an organic community don’t always result in open source convergence; however, the ecosystem has seen both package managers reach much greater levels of stability as a result of collaboration. In this article, I won’t offer a preference for which system to use: it is left up to the reader to experiment and use the system which best suits your or your company’s needs.

## Project Structure

A typical Haskell project hosted on Github or Gitlab will have several **executable**, **test** and **library** components across several subdirectories. Each of these files will correspond to an entry in the Cabal file.

```
.
├── app # Executable entry-point
│ └── Main.hs # main-is file
├── src # Library entry-point
│ └── Lib.hs # Exposed module
├── test # Test entry-point
│ └── Spec.hs # Main-is file
├── ChangeLog.md # extra-source-files
├── LICENSE # extra-source-files
├── README.md # extra-source-files
├── package.yaml # hpack configuration
├── Setup.hs
├── simple.cabal # cabal configuration generated from hpack
├── stack.yaml # stack configuration
├── .hlint.yaml # hlint configuration
└── .ghci # ghci configuration
```

More complex projects consisting of multiple modules will include multiple project directories like those above, but these will be nested in subfolders with a `cabal.project`

or `stack.yaml`

in the root of the repository.

```
.
├── lib-one # component1
├── lib-two # component2
├── lib-three # component3
├── stack.yaml # stack project configuration
└── cabal.project # cabal project configuration
```

An example Cabal project file, named `cabal.project`

above, this multi-component library repository would include these lines.

By contrast, an example Stack project `stack.yaml`

for the above multi-component library repository would be:

## Cabal

Cabal is the build system for Haskell. Cabal is also the standard build tool for Haskell source supported by GHC. Cabal can be used simultaneously with Stack or standalone with cabal new-build.

To update the package index from Hackage, run:

To start a new Haskell project, run:

This will result in a `.cabal`

file being created with the configuration options for our new project.

Cabal can also build dependencies in parallel by passing `-j<n>`

where `n`

is the number of concurrent builds.

Let’s look at an example `.cabal`

file. There are two main entry points that any package may provide: a `library`

and an `executable`

. Multiple executables can be defined, but only one library. In addition, there is a special form of executable entry point `Test-Suite`

, which defines an interface for invoking unit tests from `cabal`

.

For a **library**, the `exposed-modules`

field in the `.cabal`

file indicates which modules within the package structure will be publicly visible when the package is installed. These modules are the user-facing APIs that we wish to expose to downstream consumers.

For an **executable**, the `main-is`

field indicates the module that exports the `main`

function responsible for running the executable logic of the application. Every module in the package must be listed in one of `other-modules`

, `exposed-modules`

or `main-is`

fields.

```
name: mylibrary
version: 0.1
cabal-version: >= 1.10
author: Paul Atreides
license: MIT
license-file: LICENSE
synopsis: My example library.
category: Math
tested-with: GHC
build-type: Simple
library
exposed-modules:
Library.ExampleModule1
Library.ExampleModule2
build-depends:
base >= 4 && < 5
default-language: Haskell2010
ghc-options: -O2 -Wall -fwarn-tabs
executable "example"
build-depends:
base >= 4 && < 5,
mylibrary == 0.1
default-language: Haskell2010
main-is: Main.hs
Test-Suite test
type: exitcode-stdio-1.0
main-is: Test.hs
default-language: Haskell2010
build-depends:
base >= 4 && < 5,
mylibrary == 0.1
```

To run an “executable” under `cabal`

execute the command:

To load the “library” into a GHCi shell under `cabal`

execute the command:

The `<name>`

metavariable is either one of the executable or library declarations in the `.cabal`

file and can optionally be disambiguated by the prefix `exe:<name>`

or `lib:<name>`

respectively.

To build the package locally into the `./dist/build`

folder, execute the `build`

command:

To run the tests, our package must itself be reconfigured with the `--enable-tests`

flag and the `build-depends`

options. The `Test-Suite`

must be installed manually, if not already present.

```
$ cabal install --only-dependencies --enable-tests
$ cabal configure --enable-tests
$ cabal test
$ cabal test <name>
```

Moreover, arbitrary shell commands can be invoked with the GHC environmental variables. It is quite common is to run a new bash shell with this command such that the `ghc`

and `ghci`

commands use the package environment. This can also run any system executable with the `GHC_PACKAGE_PATH`

variable set to the libraries package database.

The haddock documentation can be generated for the local project by executing the `haddock`

command. The documentation will be built to the `./dist`

folder.

When we’re finally ready to upload to Hackage ( presuming we have a Hackage account set up ), then we can build the tarball and upload with the following commands:

The current state of a local build can be frozen with all current package constraints enumerated:

This will create a file `cabal.config`

with the constraint set.

The `cabal`

configuration is stored in `$HOME/.cabal/config`

and contains various options including credential information for Hackage upload.

A library can also be compiled with runtime profiling information enabled. More on this is discussed in the section on Concurrency and Profiling.

Another common flag to enable is `documentation`

which forces the local build of Haddock documentation, which can be useful for offline reference. On a Linux filesystem these are built to the `/usr/share/doc/ghc-doc/html/libraries/`

directory.

Cabal can also be used to install packages globally to the system PATH. For example the parsec package to your system from Hackage, the upstream source of Haskell packages, invoke the `install`

command:

To download the source for a package, we can use the `get`

command to retrieve the source from Hackage.

## Cabal New-Build

The interface for Cabal has seen an overhaul in the last few years and has moved more closely towards the Nix-style of local builds. Under the new system packages are separated into categories:

**Local Packages**- Packages are built from a configuration file which specifies a path to a directory with a cabal file. These can be working projects as well as all of its local transitive dependencies.**External Packages**- External packages are packages retrieved from either the public Hackage or a private Hackage repository. These packages are hashed and stored locally in`~/.cabal/store`

to be reused across builds.

As of Cabal 3.0 the new-build commands are the default operations for build operations. So if you type `cabal build`

using Cabal 3.0 you are already using the new-build system.

Historically these commands were separated into two different command namespaces with prefixes `v1-`

and `v2-`

, with v1 indicating the old sandbox build system and the v2 indicating the new-build system.

The new build commands are listed below:

```
new-build Compile targets within the project.
new-configure Add extra project configuration
new-repl Open an interactive session for the given component.
new-run Run an executable.
new-test Run test-suites
new-bench Run benchmarks
new-freeze Freeze dependencies.
new-haddock Build Haddock documentation
new-exec Give a command access to the store.
new-update Updates list of known packages.
new-install Install packages.
new-clean Clean the package store and remove temporary files.
new-sdist Generate a source distribution file (.tar.gz).
```

Cabal also stores all of its build artifacts inside of a `dist-newstyle`

folder stored in the project working directory. The compilation artifacts are of several categories.

`.hi`

- Haskell interface modules which describe the type information, public exports, symbol table, and other module guts of compiled Haskell modules.`.hie`

- An extended interface file containing module symbol data.`.hspp`

- A Haskell preprocessor file.`.o`

- Compiled object files for each module. These are emitted by the native code generator assembler.`.s`

- Assembly language source file.`.bc`

- Compiled LLVM bytecode file.`.ll`

- An LLVM source file.`cabal_macros.h`

- Contains all of the preprocessor definitions that are accessible when using the CPP extension.`cache`

- Contains all artifacts generated by solving the constraints of packages to set up a build plan. This also contains the hash repository of all external packages.`packagedb`

- Database of all of the cabal metadata of all external and local packages needed to build the project. See Package Databases.

These all get stored under the `dist-newstyle`

folder structure which is set up hierarchically under the specific CPU architecture, GHC compiler version and finally the package version.

```
dist-newstyle
├── build
│ └── x86_64-linux
│ └── ghc-8.6.5
│ └── mypackage-0.1.0
│ ├── build
│ │ ├── autogen
│ │ │ ├── cabal_macros.h
│ │ │ └── Paths_mypackage.hs
│ │ ├── libHSmypackage-0.1.0-inplace.a
│ │ ├── libHSmypackage-0.1.0-inplace-ghc8.6.5.so
│ │ ├── MyPackage
│ │ │ ├── Example.dyn_hi
│ │ │ ├── Example.dyn_o
│ │ │ ├── Example.hi
│ │ │ ├── Example.o
│ │ ├── MyPackage.dyn_hi
│ │ ├── MyPackage.dyn_o
│ │ ├── MyPackage.hi
│ │ └── MyPackage.o
│ ├── cache
│ │ ├── build
│ │ ├── config
│ │ └── registration
│ ├── package.conf.inplace
│ │ ├── package.cache
│ │ └── package.cache.lock
│ └── setup-config
├── cache
│ ├── compiler
│ ├── config
│ ├── elaborated-plan
│ ├── improved-plan
│ ├── plan.json
│ ├── solver-plan
│ ├── source-hashes
│ └── up-to-date
├── packagedb
│ └── ghc-8.6.5
│ ├── package.cache
│ ├── package.cache.lock
│ └── mypackage-0.1.0-inplace.conf
└── tmp
```

## Local Packages

Both Stack and Cabal can handle local packages built from the local filesystem, from remote tarballs, or from remote Git repositories.

Inside of the `stack.yaml`

simply specify the git repository remote and the hash to pull.

```
resolver: lts-14.20
packages:
# From Git
- git: https://github.com/sdiehl/protolude.git
commit: f5c2bf64b147716472b039d30652846069f2fc70
```

In Cabal to add a remote create a `cabal.project`

file and add your remote in the `source-repository-package`

section.

```
packages: .
source-repository-package
type: git
location: https://github.com/hvr/HsYAML.git
tag: e70cf0c171c9a586b62b3f75d72f1591e4e6aaa1
```

## Version Bounds

All Haskell packages are versioned and the numerical quantities in the version are supposed to follow the Package Versioning Policy.

As packages evolve over time there are three numbers which monotonically increase depending on what has changed in the package.

- Major version number
- Minor version number
- Patch version number

```
-- PVP summary: +-+------- breaking API changes
-- | | +----- non-breaking API additions
-- | | | +--- code changes with no API change
version: 0.1.0.0
```

Every library’s cabal file will have a packages dependencies section which will specify the external packages which the library depends on. It will also contain the allowed versions that it is known to build against in the `build-depends`

section. The convention is to put the upper bound to the next major unreleased version and the lower bound at the currently used version.

```
build-depends:
base >= 4.6 && <4.14,
async >= 2.0 && <2.3,
deepseq >= 1.3 && <1.5,
containers >= 0.5 && <0.7,
hashable >= 1.2 && <1.4,
transformers >= 0.2 && <0.6,
text >= 1.2 && <1.3,
bytestring >= 0.10 && <0.11,
mtl >= 2.1 && <2.3
```

Individual lines in the version specification can be dependent on other variables in the cabal file.

Version bounds in cabal files can be managed automatically with a tool `cabal-bounds`

which can automatically generate, update and format cabal files.

See:

## Stack

Stack is an alternative approach to Haskell’s package structure that emerged in 2015. Instead of using a rolling build like Cabal, Stack breaks up sets of packages into release blocks that guarantee internal compatibility between sets of packages. The package solver for Stack uses a different strategy for resolving dependencies than cabal-install has historically used and Stack combines this with a centralised build server called Stackage which continuously tests the set of packages in a resolver to ensure they build against each other.

#### Install

To install `stack`

on Linux or Mac, run:

For other operating systems, see the official install directions.

#### Usage

Once `stack`

is installed, it is possible to setup a build environment on top of your existing project’s `cabal`

file by running:

An example `stack.yaml`

file for GHC 8.8.1 would look like this:

Most of the common libraries used in everyday development are already in the Stackage repository. The `extra-deps`

field can be used to add Hackage dependencies that are not in the Stackage repository. They are specified by the package and the version key. For instance, the `zenc`

package could be added to `stack build`

in the following way:

The `stack`

command can be used to install packages and executables into either the current build environment or the global environment. For example, the following command installs the executable for `hlint`

, a popular linting tool for Haskell, and places it in the PATH:

To check the set of dependencies, run:

Just as with `cabal`

, the build and debug process can be orchestrated using `stack`

commands:

```
$ stack build # Build a cabal target
$ stack repl # Launch ghci
$ stack ghc # Invoke the standalone compiler in stack environment
$ stack exec bash # Execute a shell command with the stack GHC environment variables
$ stack build --file-watch # Build on every filesystem change
```

To visualize the dependency graph, use the dot command piped first into graphviz, then piped again into your favorite image viewer:

## Hpack

Hpack is an alternative package description language that uses a structured YAML format to generate Cabal files. Hpack succeeds in DRYing (Don’t Repeat Yourself) several sections of cabal files that are often quite repetitive across large projects. Hpack uses a `package.yaml`

file which is consumed by the command line tool `hpack`

. Hpack can be integrated into Stack and will generate resulting cabal files whenever `stack build`

is invoked on a project using a `package.yaml`

file. The output cabal file contains a hash of the input yaml file for consistency checking.

A small `package.yaml`

file might look something like the following:

```
name : example
version : 0.1.0
synopsis : My fabulous library
description : My fabulous library
maintainer : John Doe
github : john/example
category : Development
ghc-options: -Wall
dependencies:
- base >= 4.9 && < 5
- protolude
- deepseq
- directory
- filepath
- text
- containers
- unordered-containers
- aeson
- pretty-simple
library:
source-dirs: src
exposed-modules:
- Example
executable:
main: Main.hs
source-dirs: exe
dependencies:
- example
tests:
spec:
main: Test.hs
source-dirs:
- test
- src
dependencies:
- example
- tasty
- tasty-hunit
```

## Base

GHC itself ships with a variety of core libraries that are loaded into all Haskell projects. The most foundational of these is `base`

which forms the foundation for all Haskell code. The base library is split across several modules.

*Prelude*- The default namespace imported in every module.*Data*- The simple data structures wired into the language*Control*- Control flow functions*Foreign*- Foreign function interface*Numeric*- Numerical tower and arithmetic operations*System*- System operations for Linux/Mac/Windows*Text*- Basic String types.*Type*- Typelevel operations*GHC*- GHC Internals*Debug*- Debug functions*Unsafe*- Unsafe “backdoor” operations

There have been several large changes to Base over the years which have resulted in breaking changes that means older versions of base are not compatible with newer versions.

- Monad Applicative Proposal (AMP)
- MonadFail Proposal (MFP)
- Semigroup Monoid Proposal (SMP)

## Prelude

The Prelude is the default standard module. The Prelude is imported by default into all Haskell modules unless either there is an explicit import statement for it, or the NoImplicitPrelude extension is enabled.

The Prelude exports several hundred symbols that are the default datatypes and functions for libraries that use the GHC-issued prelude. Although the Prelude is the default import, many libraries these days do not use the standard prelude instead choosing to roll a custom one on a per-project basis or to use an off-the shelf prelude from Hackage.

The Prelude contains common datatype and classes such as List, Monad, Maybe and most associated functions for manipulating these structures. These are the most foundational programming constructs in Haskell.

## Modern Haskell

There are two official language standards:

- Haskell98
- Haskell2010

And then there is what is colloquially referred to as Modern Haskell which is not an official language standard, but an ambiguous term to denote the emerging way most Haskellers program with new versions of GHC. The language features typically included in modern Haskell are not well-defined and will vary between programmers. For instance, some programmers prefer to stay quite close to the Haskell2010 standard and only include a few extensions while some go all out and attempt to do full dependent types in Haskell.

By contrast, the type of programming described by the phrase Modern Haskell typically utilizes some type-level programming, as well as flexible typeclasses, and a handful of Language Extensions.

## Flags

GHC has a wide variety of flags that can be passed to configure different behavior in the compiler. Enabling GHC compiler flags grants the user more control in detecting common code errors. The most frequently used flags are:

Flag | Description |
---|---|

`-fwarn-tabs` |
Emit warnings of tabs instead of spaces in the source code |

`-fwarn-unused-imports` |
Warn about libraries imported without being used |

`-fwarn-name-shadowing` |
Warn on duplicate names in nested bindings |

`-fwarn-incomplete-uni-patterns` |
Emit warnings for incomplete patterns in lambdas or pattern bindings |

`-fwarn-incomplete-patterns` |
Warn on non-exhaustive patterns |

`-fwarn-overlapping-patterns` |
Warn on pattern matching branches that overlap |

`-fwarn-incomplete-record-updates` |
Warn when records are not instantiated with all fields |

`-fdefer-type-errors` |
Turn type errors into warnings |

`-fwarn-missing-signatures` |
Warn about toplevel missing type signatures |

`-fwarn-monomorphism-restriction` |
Warn when the monomorphism restriction is applied implicitly |

`-fwarn-orphans` |
Warn on orphan typeclass instances |

`-fforce-recomp` |
Force recompilation regardless of timestamp |

`-fno-code` |
Omit code generation, just parse and typecheck |

`-fobject-code` |
Generate object code |

Like most compilers, GHC takes the `-Wall`

flag to enable all warnings. However, a few of the enabled warnings are highly verbose. For example, `-fwarn-unused-do-bind`

and `-fwarn-unused-matches`

typically would not correspond to errors or failures.

Any of these flags can be added to the `ghc-options`

section of a project’s `.cabal`

file. For example:

```
ghc-options:
-fwarn-tabs
-fwarn-unused-imports
-fwarn-missing-signatures
-fwarn-name-shadowing
-fwarn-incomplete-patterns
```

The flags described above are simply the most useful. See the official reference for the complete set of GHC’s supported flags.

For information on debugging GHC internals, see the commentary on GHC internals.

## Hackage

Hackage is the upstream source of open source Haskell packages. With Haskell’s continuing evolution, Hackage has become many things to developers, but there seem to be two dominant philosophies of uploaded libraries.

**A Repository for Production Libraries**

In the first philosophy, libraries exist as reliable, community-supported building blocks for constructing higher level functionality on top of a common, stable edifice. In development communities where this method is the dominant philosophy, the authors of libraries have written them as a means of packaging up their understanding of a problem domain so that others can build on their understanding and expertise.

**An Experimental Playground**

In contrast to the previous method of packaging, a common philosophy in the Haskell community is that Hackage is a place to upload experimental libraries as a means of getting community feedback and making the code publicly available. Library authors often rationalize putting these kinds of libraries up without documentation, often without indication of what the library actually does or how it works. This unfortunately means a lot of Hackage namespace has become polluted with dead-end, bit-rotting code. Sometimes packages are also uploaded purely for internal use within an organisation, or to accompany an academic paper. These packages are often left undocumented as well.

For developers coming to Haskell from other language ecosystems that favor the former philosophy (e.g., Python, JavaScript, Ruby), seeing *thousands of libraries without the slightest hint of documentation or description of purpose* can be unnerving. It is an open question whether the current cultural state of Hackage is sustainable in light of these philosophical differences.

Needless to say, there is a lot of very low-quality Haskell code and documentation out there today, so being conservative in library assessment is a necessary skill. That said, there are also quite a few phenomenal libraries on Hackage that are highly curated by many people.

As a general rule, if the Haddock documentation for the library does not have a **minimal working example**, it is usually safe to assume that it is an RFC-style library and probably should be avoided for production code.

There are several heuristics you can use to answer the question **Should I Use this Hackage Library**:

- Check the
**Uploaded**to see if the author has updated it in the last five years. - Check the
**Maintainer**email address, if the author has an academic email address and has not uploaded a package in two or more years, it is safe to assume that this is a*thesis project*and probably should not be used industrially. - Check the
**Modules**to see if the author has included toplevel Haddock docstrings. If the author has not included any documentation then the library is likely of low-quality and should not be used industrially. - Check the
**Dependencies**for the bound on`base`

package. If it doesn’t include the latest base included with the latest version of GHC then the code is likely not actively maintained. - Check the reverse Hackage search to see if the package is used by other libraries in the ecosystem. For example: https://packdeps.haskellers.com/reverse/QuickCheck

An example of a bitrotted package:

**https://hackage.haskell.org/package/numeric-quest**

An example of a well maintained package:

**https://hackage.haskell.org/package/QuickCheck**

## Stackage

Stackage is an alternative opt-in packaging repository which mirrors a subset of Hackage. Packages that are included in Stackage are built in a massive continuous integration process that checks to see that given versions link successfully against each other. This can give a higher degree of assurance that the bounds of a given resolver ensure compatibility.

Stackage releases are built nightly and there are also long-term stable (LTS) releases. Nightly resolvers have a date convention while LTS releases have a major and minor version. For example:

`lts-14.22`

`nightly-2020-01-30`

See:

## GHCi

GHCi is the interactive shell for the GHC compiler. GHCi is where we will spend most of our time in everyday development. Following is a table of useful GHCi commands.

Command | Shortcut | Action |
---|---|---|

`:reload` |
`:r` |
Code reload |

`:type` |
`:t` |
Type inspection |

`:kind` |
`:k` |
Kind inspection |

`:info` |
`:i` |
Information |

`:print` |
`:p` |
Print the expression |

`:edit` |
`:e` |
Load file in system editor |

`:load` |
`:l` |
Set the active Main module in the REPL |

`:module` |
`:m` |
Add modules to imports |

`:add` |
`:ad` |
Load a file into the REPL namespace |

`:instances` |
`:in` |
Show instances of a typeclass |

`:browse` |
`:bro` |
Browse all available symbols in the REPL namespace |

The introspection commands are an essential part of debugging and interacting with Haskell code:

```
λ: :info Functor
class Functor f where
fmap :: (a -> b) -> f a -> f b
(<$) :: a -> f b -> f a
-- Defined in `GHC.Base'
...
```

Querying the current state of the global environment in the shell is also possible. For example, to view module-level bindings and types in GHCi, run:

To examine module-level imports, execute:

Language extensions can be set at the repl.

To see compiler-level flags and pragmas, use:

```
λ: :set
options currently set: none.
base language is: Haskell2010
with the following modifiers:
-XNoDatatypeContexts
-XNondecreasingIndentation
GHCi-specific dynamic flag settings:
other dynamic, non-language, flag settings:
-fimplicit-import-qualified
warning settings:
λ: :showi language
base language is: Haskell2010
with the following modifiers:
-XNoDatatypeContexts
-XNondecreasingIndentation
-XExtendedDefaultRules
```

Language extensions and compiler pragmas can be set at the prompt. See the Flag Reference for the vast collection of compiler flag options.

Several commands for the interactive shell have shortcuts:

Function | |
---|---|

`+t` |
Show types of evaluated expressions |

`+s` |
Show timing and memory usage |

`+m` |
Enable multi-line expression delimited by `:{` and `:}` . |

## .ghci.conf

The GHCi shell can be customized globally by defining a configure file `ghci.conf`

in `$HOME/.ghc/`

or in the current working directory as `./.ghci.conf`

.

For example, we can add a command to use the Hoogle type search from within GHCi. First, install `hoogle`

:

Then, we can enable the search functionality by adding a command to our `ghci.conf`

:

```
:set prompt "λ: "
:def hlint const . return $ ":! hlint \"src\""
:def hoogle \s -> return $ ":! hoogle --count=15 \"" ++ s ++ "\""
```

```
λ: :hoogle (a -> b) -> f a -> f b
Data.Traversable fmapDefault :: Traversable t => (a -> b) -> t a -> t b
Prelude fmap :: Functor f => (a -> b) -> f a -> f b
```

It is common community tradition to set the prompt to a colored `λ`

:

GHC can also be coerced into giving slightly better error messages:

GHCi can also use a pretty printing library to format all output, which is often much easier to read. For example if your project is already using the amazing `pretty-simple`

library simply include the following line in your ghci configuration.

```
:set -ignore-package pretty-simple -package pretty-simple
:def! pretty \ _ -> pure ":set -interactive-print Text.Pretty.Simple.pPrint"
:pretty
```

And the default prelude can also be disabled and swapped for something more sensible:

```
:seti -XNoImplicitPrelude
:seti -XFlexibleContexts
:seti -XFlexibleInstances
:seti -XOverloadedStrings
import Protolude -- or any other preferred prelude
```

#### GHCi Performance

For large projects, GHCi with the default flags can use quite a bit of memory and take a long time to compile. To speed compilation by keeping artifacts for compiled modules around, we can enable object code compilation instead of bytecode.

Enabling object code compilation may complicate type inference, since type information provided to the shell can sometimes be less informative than source-loaded code. This underspecificity can result in breakage with some language extensions. In that case, you can temporarily reenable bytecode compilation on a per module basis with the `-fbyte-code`

flag.

If all you need is to typecheck your code in the interactive shell, then disabling code generation entirely makes reloading code almost instantaneous:

## Editor Integration

Haskell has a variety of editor tools that can be used to provide interactive development feedback and functionality such as querying types of subexpressions, linting, type checking, and code completion. These are largely provided by the haskell-ide-engine which serves as an editor agnostic backend that interfaces with GHC and Cabal to query code.

**Vim**

**Emacs**

**VSCode**

- haskell-ide-engine - Tab completion plugin
- language-haskell - Syntax highlighting plugin
- ghcid - Interactive error reporting plugin
- hie-server - Jump to definition and tag handling plugin
- hlint - Linting and style-checking plugin
- ghcide - Interactive completion plugin
- ormolu-vscode - Code formatting plugin

## Linux Packages

There are several upstream packages for Linux packages which are released by GHC development. The key ones of note for Linux are:

For scripts and operations tools, it is common to include commands to add the following apt repositories, and then use these to install the signed GHC and cabal-install binaries (if using Cabal as the primary build system).

```
$ sudo add-apt-repository -y ppa:hvr/ghc
$ sudo apt-get update
$ sudo apt-get install -y cabal-install-3.0 ghc-8.8.1
```

It is not advisable to use a Linux system package manager to manage Haskell dependencies. Although this can be done, in practice it is better to use Cabal or Stack to create locally isolated builds to avoid incompatibilities.

## Names

Names in Haskell exist within a specific namespace. Names are either unqualified of the form:

Or qualified by the module where they come from, such as:

The major namespaces are described below with their naming conventions

Namespace | Convention |
---|---|

Modules | Uppercase |

Functions | Lowercase |

Variables | Lowercase |

Type Variables | Lowercase |

Datatypes | Uppercase |

Constructors | Uppercase |

Typeclasses | Uppercase |

Synonyms | Uppercase |

Type Families | Uppercase |

## Modules

A module consists of a set of imports and exports and when compiled generates an interface which is linked against other Haskell modules. A module may reexport symbols from other modules.

```
-- A module starts with its export declarations of symbols declared in this file.
module MyModule (myExport1, myExport2) where
-- Followed by a set of imports of symbols from other files
import OtherModule (myImport1, myImport2)
-- Rest of the logic and definitions in the module follow
-- ...
```

Modules’ dependency graphs optionally may be cyclic (i.e. they import symbols from each other) through the use of a boot file, but this is often best avoided if at all possible.

Various module import strategies exist. For instance, we may:

Import all symbols into the local namespace.

Import select symbols into the local namespace:

Import into the global namespace masking a symbol:

Import symbols qualified under `Data.Map`

namespace into the local namespace.

Import symbols qualified and reassigned to a custom namespace (`M`

, in the example below):

You may also dump multiple modules into the same namespace so long as the symbols do not clash:

A main module is a special module which reserves the name `Main`

and has a mandatory export of type `IO ()`

which is invoked when the executable is run.. This is the entry point from the system into a Haskell program.

## Functions

Functions are the central construction in Haskell. A function `f`

of two arguments `x`

and `y`

can be defined in a single line as the left-hand and right-hand side of an equation:

This line defines a function named `f`

of two arguments, which on the right-hand side adds and yields the result. Central to the idea of functional programming is that computational functions should behave like mathematical functions. For instance, consider this mathematical definition of the above Haskell function, which, aside from the parentheses, looks the same:

*f*(*x*, *y*) = *x* + *y*

In Haskell, a function of two arguments need not necessarily be applied to two arguments. The result of applying only the first argument is to yield another function to which later the second argument can be applied. For example, we can define an `add`

function and subsequently a single-argument `inc`

function, by merely pre-applying `1`

to `add`

:

In addition to named functions Haskell also has anonymous lambda functions denoted with a backslash. For example the identity function:

Is identical to:

Functions may call themselves or other functions as arguments; a feature known as *higher-order functions*. For example the following function applies a given argument `f`

, which is itself a function, to a value `x`

twice.

## Types

Typed functional programming is essential to the modern Haskell paradigm. But what are types precisely?

The *syntax* of a programming language is described by the constructs that define its types, and its *semantics* are described by the interactions among those constructs. A type system overlays additional structure on top of the syntax that imposes constraints on the formation of expressions based on the context in which they occur.

Dynamic programming languages associate types with values *at evaluation*, whereas statically typed languages associate types to expressions *before evaluation*. Dynamic languages are in a sense as statically typed as static languages, however they have a degenerate type system with only one type.

The dominant philosophy in functional programming is to “make invalid states unrepresentable” at compile-time, rather than performing massive amounts of runtime checks. To this end Haskell has developed a rich type system that is based on typed lambda calculus known as Girard’s System-F (See Rank-N Types) and has incrementally added extensions to support more type-level programming over the years.

The following *ground types* are quite common:

`()`

- The unit type`Char`

- A single unicode character (“code point”)`Text`

- Unicode strings`Bool`

- Boolean values`Int`

- Machine integers`Integer`

- GMP arbitrary precision integers`Float`

- Machine floating point values`Double`

- Machine double floating point values

*Parameterised types* consist of a type and several *type parameters* indicated as lower case *type variables*. These are associated with common data structures such as lists and tuples.

`[a]`

– Homogeneous lists with elements of type`a`

`(a,b)`

– Tuple with two elements of types`a`

and`b`

`(a,b,c)`

– Tuple with three elements of types`a`

,`b`

, and`c`

The type system grows quite a bit from here, but these are the foundational types you’ll first encounter. See the later chapters for all types off advanced features that can be optionally turned on.

*This tutorial will only cover a small amount of the theory of type systems. For a more thorough treatment of the subject there are two canonical texts:*

- Pierce, B. C., & Benjamin, C. (2002).
**Types and Programming Languages**. MIT Press. - Harper, R. (2016).
**Practical Foundations for Programming Languages**. Cambridge University Press.

## Type Signatures

A toplevel Haskell function consists of two lines. The *value-level* definition which is a function name, followed by its arguments on the left-hand side of the equals sign, and then the function body which computes the value it yields on the right-hand side:

```
myFunction x y = x ^ 2 + y ^ 2
-- ^ ^ ^ ^^^^^^^^^^^^^
-- | | | |
-- | | | +-- function body
-- | | +------ second argument
-- | +-------- first argument
-- +-------------- function
```

The *type-level* definition is the function name followed by the type of its arguments separated by arrows, and the final term is the type of the entire function body, meaning the type of value yielded by the function itself.

```
myFunction :: Int -> Int -> Int
-- ^ ^ ^ ^^^^^
-- | | | |
-- | | | +- return type
-- | | +------ second argument
-- | +------------ first argument
-- +----------------------- function
```

Here is a simple example of a function which adds two integers.

Functions are also capable of invoking other functions inside of their function bodies:

The simplest function, called the *identity function*, is a function which takes a single value and simply returns it back. This is an example of a polymorphic function since it can handle values of *any type*. The identity function works just as well over strings as over integers.

This can alternatively be written in terms of an anonymous *lambda function* which is a backslash followed by a space-separated list of arguments, followed by a function body.

One of the big ideas in functional programming is that functions are themselves first class values which can be passed to other functions as arguments themselves. For example the `applyTwice`

function takes an argument `f`

which is of type (`a -> a`

) and it applies that function over a given value `x`

twice and yields the result. `applyTwice`

is a higher-order function which will transform one function into another function.

Often to the left of a type signature you will see a big arrow `=>`

which denotes a set of **constraints** over the type signature. Each of these constraints will be in uppercase and will normally mention at least one of the type variables on the right hand side of the arrow. These constraints can mean many things but in the simplest form they denote that a type variable must have an implementation of a type class. The `add`

function below operates over any two similar values `x`

and `y`

, but these values must have a numerical interface for adding them together.

Type signatures can also appear at the value level in the form of *explicit type signatures* which are denoted in parentheses.

These are sometimes needed to provide additional hints to the typechecker when specific terms are ambiguous to the typechecker, or when additional language extensions have been enabled which don’t have precise inference methods for deducing all type variables.

## Currying

In other languages functions normally have an *arity* which prescribes the number of arguments a function can take. Some languages have fixed arity (like Fortran) others have flexible arity (like Python) where a variable of number of arguments can be passed. Haskell follows a very simple rule: all functions in Haskell take a single argument. For multi-argument functions (some of which we’ve already seen), arguments will be individually applied until the function is *saturated* and the function body is evaluated.

For example, the add function from above can be partially applied to produce an add1 function:

Uncurrying is the process of taking a function which takes two arguments and transforming it into a function which takes a tuple of arguments. The Haskell prelude includes both a curry and an uncurry function for transforming functions into those that take multiple arguments from those that take a tuple of arguments and vice versa:

For example, uncurry applied to the add function creates a function that takes a tuple of integers:

## Algebraic Datatypes

Custom datatypes in Haskell are defined with the `data`

keyword followed by the the type name, its parameters, and then a set of *constructors*. The possible constructors are either *sum types* or of *product types*. All datatypes in Haskell can expressed as sums of products. A sum type is a set of options that is delimited by a pipe.

A datatype can only ever be inhabited by only single value from a sum type and intuitively models a set of “options” a value may take. While a product type is a combination of a set of typed values, potentially named by record fields. For example the following are two definitions of a Point product type, the latter with two fields `x`

and `y`

.

As another example: A deck of common playing cards could be modeled by the following set of product and sum types:

```
data Suit = Clubs | Diamonds | Hearts | Spades
data Color = Red | Black
data Value
= Two
| Three
| Four
| Five
| Six
| Seven
| Eight
| Nine
| Ten
| Jack
| Queen
| King
| Ace
deriving (Eq, Ord)
```

A record type can use these custom datatypes to define all the parameters that define an individual playing card.

Some example values:

```
queenDiamonds :: Card
queenDiamonds = Card Diamonds Red Queen
-- Alternatively
queenDiamonds :: Card
queenDiamonds = Card { suit = Diamonds, color = Red, value = Queen }
```

The problem with the definition of this datatype is that it admits several values which are malformed. For instance it is possible to instantiate a `Card`

with a suit `Hearts`

but with color `Black`

which is an invalid value. The convention for preventing these kind of values in Haskell is to limit the export of constructors in a module and only provide a limited set of functions which the module exports, which can enforce these constraints. These are **smart constructors** and an extremely common pattern in Haskell library design. For example we can export functions for building up specific suit cards that enforce the color invariant.

```
module Cards (Card, diamond, spade, heart, club) where
diamond :: Value -> Card
diamond = Card Diamonds Red
spade :: Value -> Card
spade = Card Spades Black
heart :: Value -> Card
heart = Card Hearts Red
club :: Value -> Card
club = Card Clubs Black
```

Datatypes may also be **recursive**, in the sense that they can contain themselves as fields. The most common example is a linked list which can be defined recursively as either an empty list or a value linked to a potentially nested version of itself.

An example value would look like:

Constructors for datatypes can also be defined as infix symbols. This is somewhat rare, but is sometimes used in more math heavy libraries. For example the constructor for our list type could be defined as the infix operator `:+:`

. When the value is printed using a Show instance, the operator will be printed in infix form.

## Lists

Linked lists or *cons lists* are a fundamental data structure in functional programming. GHC has builtin syntactic sugar in the form of list syntax which allows us to write lists that expand into explicit invocations of the *cons* operator `(:)`

. The operator is right associative and an example is shown below:

This syntax also extends to the typelevel where lists are represented as brackets around the type of values they contain.

The cons operator itself has the type signature which takes a *head element* as its first argument and a *tail argument* as its second.

The `Data.List`

module from the standard Prelude defines a variety of utility functions for operations over linked lists. For example the `length`

function returns the integral length of the number of elements in the linked list.

While the `take`

function extracts a fixed number of elements from the list.

Both of these functions are *pure* and return a new list without modifying the underlying list passed as an argument.

Another function `iterate`

is an example of a function which returns an *infinite list*. It takes as its first argument a function and then repeatedly applies that function to produce a new element of the linked list.

Consuming these infinite lists can be used as a control flow construct to construct loops. For example instead of writing an explicit loop, as we would in other programming languages, we instead construct a function which generates list elements. For example producing a list which produces subsequent powers of two:

We can then use the `take`

function to evaluate this *lazy* stream to a desired depth.

An equivalent loop in an imperative language would look like the following.

```
def powersOfTwo(n):
square_list = [1]
for i in range(1,n+1):
square_list.append(2 ** i)
return square_list
print(powersOfTwo(15))
```

## Pattern Matching

To unpack an algebraic datatype and extract its fields we’ll use a built in language construction known as *pattern matching*. This is denoted by the `case`

syntax and *scrutinizes* a specific value. A case expression will then be followed by a sequence of *matches* which consist of a *pattern* on the left and an arbitrary expression on the right. The left patterns will all consist of constructors for the type of the scrutinized value and should enumerate all possible constructors. For product type patterns that are scrutinized a sequence of variables will bind the fields associated with its positional location in the constructor. The types of all expressions on the right hand side of the matches must be identical.

Pattern matches can be written in explicit case statements or in toplevel functional declarations. The latter simply expands the former in the desugaring phase of the compiler.

```
data Example = Example Int Int Int
example1 :: Example -> Int
example1 x = case x of
Example a b c -> a + b + c
example2 :: Example -> Int
example2 (Example a b c) = a + b +c
```

Following on the playing card example in the previous section, we could use a pattern to produce a function which scores the face value of a playing card.

```
value :: Value -> Integer
value card = case card of
Two -> 2
Three -> 3
Four -> 4
Five -> 5
Six -> 6
Seven -> 7
Eight -> 8
Nine -> 9
Ten -> 10
Jack -> 10
Queen -> 10
King -> 10
Ace -> 1
```

And we can use a double pattern match to produce a function which determines which suit trumps another suit. For example we can introduce an order of suits of cards where the ranking of cards proceeds (Clubs, Diamonds, Hearts, Spaces). A `_`

underscore used inside a pattern indicates a wildcard pattern and matches against any constructor given. This should be the last pattern used a in match list.

```
suitBeats :: Suit -> Suit -> Bool
suitBeats Clubs Diamonds = True
suitBeats Clubs Hearts = True
suitBeats Clubs Spaces = True
suitBeats Diamonds Hearts = True
suitBeats Diamonds Spades = True
suitBeats Hearts Spades = True
suitBeats _ _ = False
```

And finally we can write a function which determines if another card beats another card in terms of the two pattern matching functions above. The following pattern match brings the values of the record into the scope of the function body assigning to names specified in the pattern syntax.

```
beats :: Card -> Card -> Bool
beats (Card suit1 color1 value1) (Card suit2 color2 value2) =
(suitBeats suit1 suit2) && (value1 > value2)
```

Functions may also invoke themselves. This is known as *recursion*. This is quite common in pattern matching definitions which recursively tear down or build up data structures. This kind of pattern is one of the defining modes of programming functionally.

The following two recursive pattern matches are desugared forms of each other:

Pattern matching on lists is also an extremely common pattern. It has special pattern syntax and the tail variable is typically pluralized. In the following `x`

denotes the head variable and `xs`

denotes the tail. For example the following function traverses a list of integers and adds `(+1)`

to each value.

## Guards

Guard statements are expressions that evaluate to boolean values that can be used to restrict pattern matches. These occur in a pattern match statements at the toplevel with the pipe syntax on the left indicating the guard condition. The special `otherwise`

condition is just a renaming of the boolean value `True`

exported from Prelude.

Guards can also occur in pattern case expressions.

```
absoluteJust :: Maybe Int -> Maybe Int
absoluteJust n = case n of
Nothing -> Nothing
Just n
| n < 0 -> Just (-n)
| otherwise -> Just n
```

## Operators and Sections

An operator is a function that can be applied using infix syntax or partially applied using a section. Operators can be defined to use any combination of the special ASCII symbols or any unicode symbol.

`!`

`#`

`%`

`&`

`*`

`+`

`.`

`/`

`<`

`=`

`>`

`?`

`@`

`\`

`^`

`|`

`-`

`~`

`:`

The following are reserved syntax and cannot be overloaded:

`..`

`:`

`::`

`=`

`\`

`|`

`<-`

`->`

`@`

`~`

`=>`

Operators are of one of three fixity classes.

- Infix - Place between expressions
- Prefix - Placed before expressions
- Postfix - Placed after expressions. See Postfix Operators.

Expressions involving infix operators are disambiguated by the operator’s fixity and precedence. Infix operators are either left or right associative. Associativity determines how operators of the same precedence are grouped in the absence of parentheses.

```
a + b + c + d = ((a + b) + c) + d -- left associative
a + b + c + d = a + (b + (c + d)) -- right associative
```

Precedence and associativity are denoted by fixity declarations for the operator using either `infixr`

`infixl`

and `infix`

. The standard operators defined in the Prelude have the following precedence table.

```
infixr 9 .
infixr 8 ^, ^^, **
infixl 7 *, /, `quot`, `rem`, `div`, `mod`
infixl 6 +, -
infixr 5 ++
infix 4 ==, /=, <, <=, >=, >
infixr 3 &&
infixr 2 ||
infixr 1 >>, >>=
infixr 0 $, `seq`
```

Sections are written as `( op e )`

or `( e op )`

. For example:

Operators written within enclosed parens are applied like traditional functions. For example the following are equivalent:

## Tuples

Tuples are heterogeneous structures which contain a fixed number of values. Some simple examples are shown below:

```
-- 2-tuple
tuple2 :: (Integer, String)
tuple2 = (1, "foo")
-- 3-tuple
tuple3 :: (Integer, Integer, Integer)
tuple3 = (10, 20, 30)
```

For two-tuples there are two functions `fst`

and `snd`

which extract the left and right values respectively.

GHC supports tuples to size 62.

## Where & Let Clauses

Haskell syntax contains two different types of declaration syntax: `let`

and `where`

. A let binding is an expression and binds anywhere in its body. For example the following let binding declares `x`

and `y`

in the expression `x+y`

.

A where binding is a toplevel syntax construct (i.e. not an expression) that binds variables at the end of a function. For example the following binds `x`

and `y`

in the function body of `f`

which is `x+y`

.

Where clauses following the Haskell *layout rule* where definitions can be listed on newlines so long as the definitions have greater indentation than the first toplevel definition they are bound to.

## Conditionals

Haskell has builtin syntax for scrutinizing boolean expressions. These are first class expressions known as `if`

statements. An if statement is of the form `if cond then trueCond else falseCond`

. Both the `True`

and `False`

statements must be present.

If statements are just syntactic sugar for `case`

expressions over boolean values. The following example is equivalent to the above example.

## Function Composition

Functions are obviously at the heart of functional programming. In mathematics function composition is an operation which takes two functions and produces another function with the result of the first argument function applied to the result of the second function. This is written in mathematical notation as:

*g* ∘ *f*

The two functions operate over a domain. For example *X*, *Y* and *Z*.

*f* : *X* → *Y* *g* : *Y* → *Z*

Or in Haskell notation:

Composition operation results in a new function:

*g* ∘ *f* : *X* → *Z*

In Haskell this operator is given special infix operator to appear similar to the mathematical notation. Intuitively it takes two functions of types `b -> c`

and `a -> b`

and composes them together to produce a new function. This is the canonical example of a higher-order function.

Haskell code will liberally use this operator to compose chains of functions. For example the following composes a chain of list processing functions `sort`

, `filter`

and `map`

:

Another common higher-order function is the `flip`

function which takes as its first argument a function of two arguments, and reverses the order of these two arguments returning a new function.

The most common operator in all of Haskell is the function application operator `$`

. This function is right associative and takes the entire expression on the right hand side of the operator and applies it to a function on the left.

This is quite often used in the pattern where the left hand side is a composition of other functions applied to a single argument. This is common in *point-free* style of programming which attempts to minimize the number of input arguments in favour of pure higher order function composition. The flipped form of this function does the opposite and is left associative, and applies the entire left hand side expression to a function given in the second argument to the function.

For comparison consider the use of `$`

, `&`

and explicit parentheses.

```
ex1 = f1 . f2 . f3 . f4 $ input -- with ($)
ex1 = input & f1 . f2 . f3 . f4 -- with (&)
ex1 = (f1 . f2 . f3 . f4) input -- with explicit parens
```

The `on`

function takes a function `b`

and yields the result of applying unary function `u`

to two arguments `x`

and `y`

. This is a higher order function that transforms two inputs and combines the outputs.

This is used quite often in sort functions. For example we can write a custom sort function which sorts a list of lists based on length.

```
λ: import Data.List
λ: sortSize = sortBy (compare `on` length)
λ: sortSize [[1,2], [1,2,3], [1]]
[[1],[1,2],[1,2,3]]
```

## List Comprehensions

List comprehensions are a syntactic construct that first originated in the Haskell language and has now spread to other programming languages. List comprehensions provide a simple way of working with lists and sequences of values that follow patterns. List comprehension syntax consists of three components:

**Generators**- Expressions which evaluate a list of values which are iteratively added to the result.**Let bindings**- Expressions which generate a constant value which is scoped on each iteration.**Guards**- Expressions which generate a boolean expression which determine whether an iteration is added to the result.

The simplest generator is simply a list itself. The following example produces a list of integral values, each element multiplied by two.

We can extend this by adding a let statement which generalizes the multiplier on each step and binds it to a variable `n`

.

And we can also restrict the set of resulting values to only the subset of values of `x`

that meet a condition. In this case we restrict to only values of `x`

which are odd.

Comprehensions with multiple generators will combine each generator pairwise to produce the *cartesian product* of all results.

```
λ: [(x,y) | x <- [1,2,3], y <- [10,20,30]]
[(1,10),(1,20),(1,30),(2,10),(2,20),(2,30),(3,10),(3,20),(3,30)]
λ: [(x,y,z) | x <- [1,2], y <- [10,20], z <- [100,200]]
[(1,10,100),(1,10,200),(1,20,100),(1,20,200),(2,10,100),(2,10,200),(2,20,100),(2,20,200)]
```

Haskell has builtin comprehension syntax which is syntactic sugar for specific methods of the `Enum`

typeclass.

Syntax Sugar | Enum Class Method |
---|---|

`[ e1.. ]` |
`enumFrom e1` |

`[ e1,e2.. ]` |
`enumFromThen e1 e2` |

`[ e1..e3 ]` |
`enumFromTo e1 e3` |

`[ e1,e2..e3 ]` |
`enumFromThenTo e1 e2 e3` |

There is an `Enum`

instance for `Integer`

and `Char`

types and so we can write list comprehensions for both, which generate ranges of values.

```
λ: [1 .. 15]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
λ: ['a' .. 'z']
"abcdefghijklmnopqrstuvwxyz"
λ: [1,3 .. 15]
[1,3,5,7,9,11,13,15]
λ: [0,50..500]
[0,50,100,150,200,250,300,350,400,450,500]
```

These comprehensions can be used inside of function definitions and reference locally bound variables. For example the `factorial`

function (written as *n*!) is defined as the product of all positive integers up to a given value.

As a more complex example consider a naive prime number sieve:

```
primes :: [Integer]
primes = sieve [2..]
where
sieve (p:xs) = p : sieve [ n | n <- xs, n `mod` p > 0 ]
```

And a more complex example, consider the classic FizzBuzz interview question. This makes use of iteration and guard statements.

```
fizzbuzz :: [String]
fizzbuzz = [fb x| x <- [1..100]]
where fb y
| y `mod` 15 == 0 = "FizzBuzz"
| y `mod` 3 == 0 = "Fizz"
| y `mod` 5 == 0 = "Buzz"
| otherwise = show y
```

## Comments

Single line comments begin with double dashes `--`

:

Multiline comments begin with `{-`

and end with `-}`

.

```
{-
The goal of computation is the emulation of our synthetic abilities, not the
understanding of our analytic ones.
-}
```

Comments may also add additional structure in the form of Haddock docstrings. These comments will begin with a pipe.

Modules may also have a comment convention which describes the individual authors, copyright and stability information in the following form:

```
{-|
Module : MyEnterpriseModule
Description : Make it so.
Copyright : (c) Jean Luc Picard
License : MIT
Maintainer : jl@enterprise.com
Stability : experimental
Portability : POSIX
Description of module structure in Haddock markup style.
-}
```

## Typeclasses

Typeclasses are one of the core abstractions in Haskell. Just as we wrote polymorphic functions above which operate over all given types (the `id`

function is one example), we can use typeclasses to provide a form of bounded polymorphism which constrains type variables to a subset of those types that implement a given class.

For example we can define an equality class which allows us to define an overloaded notion of equality depending on the data structure provided.

Then we can define this typeclass over several different types. These definitions are called **typeclass instances**. For example for the `Bool`

type the equality typeclass would be defined as:

```
instance Equal Bool where
equal True True = True
equal False False = True
equal True False = False
equal False True = False
```

Over the unit type, where only a single value exists, the instance is trivial:

For the Ordering type, defined as:

We would have the following Equal instance:

```
instance Equal Ordering where
equal LT LT = True
equal EQ EQ = True
equal GT GT = True
equal _ _ = False
```

An Equal instance for a more complex data structure like the list type relies upon the fact that the type of the elements in the list must also have a notion of equality, so we include this as a constraint in the typeclass context, which is written to the left of the fat arrow `=>`

. With this constraint in place, we can write this instance recursively by pattern matching on the list elements and checking for equality all the way down the spine of the list:

```
instance (Equal a) => Equal [a] where
equal [] [] = True -- Empty lists are equal
equal [] ys = False -- Lists of unequal size are not equal
equal xs [] = False
-- equal x y is only allowed here due to the constraint (Equal a)
equal (x:xs) (y:ys) = equal x y && equal xs ys
```

In the above definition, we know that we can check for equality between individual list elements if those list elements satisfy the Equal constraint. Knowing that they do, we can then check for equality between two complete lists.

For tuples, we will also include the Equal constraint for their elements, and we can then check each element for equality respectively. Note that this instance includes two constraints in the context of the typeclass, requiring that both type variables `a`

and `b`

must also have an Equal instance.

```
instance (Equal a, Equal b) => Equal (a,b) where
equal (x0, x1) (y0, y1) = equal x0 y0 && equal x1 y1
```

The default prelude comes with a variety of typeclasses that are used frequently and defined over many prelude types:

**Num**- Provides a basic numerical interface for values with addition, multiplication, subtraction, and negation.**Eq**- Provides an interface for values that can be tested for equality.**Ord**- Provides an interface for values that have a total ordering.**Read**- Provides an interface for values that can be read from a string.**Show**- Provides an interface for values that can be printed to a string.**Enum**- Provides an interface for values that are enumerable to integers.**Semigroup**- Provides an algebraic semigroup interface.**Functor**- Provides an algebraic functor interface. See Functors.**Monad**- Provides an algebraic monad interface. See Monads.**Category**- Provides an algebraic category interface. See Categories.**Bounded**- Provides an interface for enumerable values with bounds.**Integral**- Provides an interface for integral-like quantities.**Real**- Provides an interface for real-like quantities.**Fractional**- Provides an interface for rational-like quantities.**Floating**- Provides an interface for defining transcendental functions over real values.**RealFrac**- Provides an interface for rounding real values.**RealFloat**- Provides an interface for working with IEE754 operations.

To see the implementation for any of these typeclasses you can run the GHCi info command to see the methods and all instances in scope. For example:

```
λ: :info Num
class (Eq a, Show a) => Num a where
(+) :: a -> a -> a
(*) :: a -> a -> a
(-) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
-- Imported from GHC.Num
instance Num Float -- Imported from GHC.Float
instance Num Double -- Imported from GHC.Float
instance Num Integer -- Imported from GHC.Num
instance Num Int -- Imported from GHC.Num
```

Many of the default classes have instances that can be derived automatically. After the definition of a datatype you can add a `deriving`

clause which will generate the instances for this datatype automatically. This does not work universally but for many instances which have boilerplate definitions, GHC is quite clever and can save you from writing quite a bit of code by hand.

For example for a custom list type.

## Side Effects

Contrary to a common misconception, side effects are an integral part of Haskell programming. Probably the most interesting thing about Haskell’s approach to side effects is that they are encoded in the type system. This is certainly a different approach to effectful programming, and the language has various models for modeling these effects within the type system. These models range from using Monads to building algebraic models of effects that draw clear lines between effectful code and pure code. The idea of reasoning about where effects can and cannot exist is one of the key ideas of Haskell, but this certainly does not mean trying to avoid side effects altogether!

Indeed, a Hello World program in Haskell is quite simple:

Other side effects can include reading from the terminal and prompting the user for input, such as in the complete program below:

## Records

Records in Haskell are fundamentally broken for several reasons:

**The syntax is unconventional.**

Most programming languages use dot or arrow syntax for field accessors like the following:

Haskell however uses function application syntax since record accessors are simply just functions. Instead or creating a privileged class of names and syntax for field accessors, Haskell instead choose to implement the simplest model and expands accessors to function during compilation.

**Incomplete pattern matches are implicitly generated for sums of products.**

The functions generated for `a`

or `b`

in both of these cases are partial. See Exhaustiveness checking.

**Lack of Namespacing**

Given two records defined in the same module (or imported) GHC is unable to (by default) disambiguate which field accessor to assign at a callsite that uses `a`

.

This can be routed around with the language extension `DisambiguateRecordFields`

but only to a certain extent. If we want to write maximally polymorphic functions which operate over arbitrary records which have a field `a`

, then the GHC typesystem is not able to express this without some much higher-level magic.

## Pragmas

At the beginning of a module there is special syntax for pragmas which direct the compiler to compile the current module in a specific way. The most common is a language extension pragma denoted like the following:

These flags alter the semantics and syntax of the module in a variety of ways. See Language Extensions for more details on all of these options.

Additionally we can pass specific GHC flags which alter the compilation behavior, enabling or disabling specific bespoke features based on our needs. These include compiler warnings, optimisation flags and extension flags.

Warning flags allow you to inform users at compile-time with a custom error message. Additionally you can mark a module as deprecated with a specific replacement message.

```
module Widget {-# DEPRECATED "This module is deprecated." #-}
module Widget {-# WARNING "This module is dangerous." #-}
```

## Newtypes

Newtypes are a form of zero-cost abstraction that allows developers to specify compile-time names for types for which the developer wishes to expose a more restrictive interface. They’re zero-cost because these newtypes end up with the same underlying representation as the things they differentiate. This allows the compiler to distinguish between different types which are representationally identical but semantically different.

For instance velocity can be represented as a scalar quantity represented as a double but the user may not want to mix doubles with other vector quantities. Newtypes allow us to distinguish between scalars and vectors at compile time so that no accidental calculations can occur.

Most importantly these newtypes disappear during compilation and the velocity type will be represented as simply just a machine double with no overhead.

See also the section on Newtype Deriving for a further discussion of tricks involved with handling newtypes.

## Bottoms

The bottom is a singular value that inhabits every type. When this value is evaluated, the semantics of Haskell no longer yield a meaningful value. In other words, further operations on the value cannot be defined in Haskell. A bottom value is usually written as the symbol ⊥, ( i.e. the compiler flipping you off ). Several ways exist to express bottoms in Haskell code.

For instance, `undefined`

is an easily called example of a bottom value. This function has type `a`

but lacks any type constraints in its type signature. Thus, `undefined`

is able to stand in for any type in a function body, allowing type checking to succeed, even if the function is incomplete or lacking a definition entirely. The `undefined`

function is extremely practical for debugging or to accommodate writing incomplete programs.

```
undefined :: a
mean :: Num a => Vector a -> a
mean nums = (total / count) where -- Partially defined function
total = undefined
count = undefined
addThreeNums :: Num a => a -> a -> a -> a
addThreeNums n m j = undefined -- No function body declared at all
f :: a -> Complicated Type
f = undefined -- Write tomorrow, typecheck today!
-- Arbitrarily complicated types
-- welcome!
```

Another example of a bottom value comes from the evaluation of the `error`

function, which takes a `String`

and returns something that can be of any type. This property is quite similar to `undefined`

, which also can also stand in for any type.

Calling `error`

in a function causes the compiler to throw an exception, halt the program, and print the specified error message.

```
error :: String -> a -- Takes an error message of type
-- String and returns whatever type
-- is needed
```

In the `divByY`

function below, passing the function `0`

as the divisor results in this function returning such an exception.

```
-- Annotated code that features use of the error function.
divByY:: (Num a, Eq a, Fractional a) => a -> a -> a
divByY _ 0 = error "Divide by zero error" -- Dividing by 0 causes an error
divByY dividend divisor = dividend / divisor -- Handles defined division
```

A third type way to express a bottom is with an infinitely looping term:

Examples of actual Haskell code that use this looping syntax lives in the source code of the GHC.Prim module. These bottoms exist because the operations cannot be defined in native Haskell. Such operations are baked into the compiler at a very low level. However, this module exists so that Haddock can generate documentation for these primitive operations, while the looping syntax serves as a placeholder for the actual implementation of the primops.

Perhaps the most common introduction to bottoms is writing a partial function that does not have exhaustive pattern matching defined. For example, the following code has non-exhaustive pattern matching because the `case`

expression, lacks a definition of what to do with a `B`

:

The code snippet above is translated into the following GHC Core output where the compiler will insert an exception to account for the non-exhaustive patterns:

GHC can be made more vocal about incomplete patterns using the `-fwarn-incomplete-patterns`

and `-fwarn-incomplete-uni-patterns`

flags.

A similar situation can arise with records. Although constructing a record with missing fields is rarely useful, it is still possible.

When the developer omits a field’s definition, the compiler inserts an exception in the GHC Core representation:

Fortunately, GHC will warn us by default about missing record fields.

Bottoms are used extensively throughout the Prelude, although this fact may not be immediately apparent. The reasons for including bottoms are either practical or historical.

The canonical example is the `head`

function which has type `[a] -> a`

. This function could not be well-typed without the bottom.

```
-- | Extract the first element of a list, which must be non-empty.
head :: [a] -> a
head (x:_) = x
head [] = error "Prelude.head: empty list"
```

Some further examples of bottoms:

```
import GHC.Err
import Prelude hiding (head, (!!), undefined)
-- degenerate functions
undefined :: a
undefined = error "Prelude.undefined"
head :: [a] -> a
head (x:_) = x
head [] = error "Prelude.head: empty list"
(!!) :: [a] -> Int -> a
xs !! n | n < 0 = error "Prelude.!!: negative index"
[] !! _ = error "Prelude.!!: index too large"
(x:_) !! 0 = x
(_:xs) !! n = xs !! (n-1)
```

It is rare to see these partial functions thrown around carelessly in production code because they cause the program to halt. The preferred method for handling exceptions is to combine the use of safe variants provided in `Data.Maybe`

with the functions `maybe`

and `either`

.

Another method is to use pattern matching, as shown in `listToMaybe`

, a safer version of `head`

described below:

```
listToMaybe :: [a] -> Maybe a
listToMaybe [] = Nothing -- An empty list returns Nothing
listToMaybe (a:_) = Just a -- A non-empty list returns the first element
-- wrapped in the Just context.
```

Invoking a bottom defined in terms of `error`

typically will not generate any position information. However, `assert`

, which is used to provide assertions, can be short-circuited to generate position information in place of either `undefined`

or `error`

calls.

```
import GHC.Base
foo :: a
foo = undefined
-- *** Exception: Prelude.undefined
bar :: a
bar = assert False undefined
-- *** Exception: src/fail.hs:8:7-12: Assertion failed
```

See: Avoiding Partial Functions

## Exhaustiveness

Pattern matching in Haskell allows for the possibility of non-exhaustive patterns. For example, passing Nothing to `unsafe`

will cause the program to crash at runtime. However, this function is an otherwise valid, type-checked program.

Since `unsafe`

takes a `Maybe a`

value as its argument, two possible values are valid input: `Nothing`

and `Just a`

. Since the case of a `Nothing`

was not defined in `unsafe`

, we say that the pattern matching within that function is *non-exhaustive*. In other words, the function does not implement appropriate handling of all valid inputs. Instead of yielding a value, such a function will halt from an incomplete match.

Partial functions from non-exhaustivity are a controversial subject, and frequent use of non-exhaustive patterns is considered a dangerous code smell. However, the complete removal of non-exhaustive patterns from the language would itself be too restrictive and forbid too many valid programs.

Several flags exist that we can pass to the compiler to warn us about such patterns or forbid them entirely, either locally or globally.

```
$ ghc -c -Wall -Werror A.hs
A.hs:3:1:
Warning: Pattern match(es) are non-exhaustive
In an equation for `unsafe': Patterns not matched: Nothing
```

The `-Wall`

or `-fwarn-incomplete-patterns`

flag can also be added on a per-module basis by using the `OPTIONS_GHC`

pragma.

A more subtle case of non-exhaustivity is the use of implicit pattern matching with a single *uni-pattern* in a lambda expression. In a manner similar to the `unsafe`

function above, a uni-pattern cannot handle all types of valid input. For instance, the function `boom`

will fail when given a Nothing, even though the type of the lambda expression’s argument is a `Maybe a`

.

Non-exhaustivity arising from uni-patterns in lambda expressions occurs frequently in `let`

or `do`

-blocks after desugaring, because such code is translated into lambda expressions similar to `boom`

.

GHC can warn about these cases of non-exhaustivity with the `-fwarn-incomplete-uni-patterns`

flag.

Generally speaking, any non-trivial program will use some measure of partial functions. It is simply a fact. Thus, there exist obligations for the programmer that cannot be manifested in the Haskell type system.

## Debugger

Since GHC version 6.8.1, a built-in debugger has been available, although its use is somewhat rare. Debugging uncaught exceptions is in a similar style to debugging segfaults with gdb. Breakpoints can be set with `:break`

and the call stack stepped through with `:forward`

and `:back`

.

```
λ: :set -fbreak-on-exception -- Sets option for evaluation to stop on exception
λ: :break 2 15 -- Sets a break point at line 2, column 15
λ: :trace main -- Run a function to generate a sequence of evaluation steps
λ: :hist -- Step back from a breakpoint through previous evaluation steps
λ: :back -- Step backwards a single step at a time through the history
λ: :forward -- Step forward a single step at a time through the history
```

## Stack Traces

With runtime profiling enabled, GHC can also print a stack trace when a diverging bottom term (error, undefined) is hit. This action, though, requires a special flag and profiling to be enabled, both of which are disabled by default. So, for example:

```
import Control.Exception
f x = g x
g x = error (show x)
main = try (evaluate (f ())) :: IO (Either SomeException ())
```

And indeed, the runtime tells us that the exception occurred in the function `g`

and enumerates the call stack.

```
*** Exception (reporting due to +RTS -xc): (THUNK_2_0), stack trace:
Main.g,
called from Main.f,
called from Main.main,
called from Main.CAF
--> evaluated by: Main.main,
called from Main.CAF
```

It is best to run this code without optimizations applied `-O0`

so as to preserve the original call stack as represented in the source. With optimizations applied, GHC will rearrange the program in rather drastic ways, resulting in what may be an entirely different call stack.

## Printf Tracing

Since Haskell is a pure language it has the unique property that most code is introspectable on its own. As such, using printf to display the state of the program at critical times throughout execution is often unnecessary because we can simply open GHCi and test the function. Nevertheless, Haskell does come with an unsafe `trace`

function which can be used to perform arbitrary print statements outside of the IO monad. You can place these statements wherever you like in your code without without IO restrictions.

```
import Debug.Trace
example1 :: Int
example1 = trace "impure print" 1
example2 :: Int
example2 = traceShow "tracing" 2
example3 :: [Int]
example3 = [trace "will not be called" 3]
main :: IO ()
main = do
print example1
print example2
print $ length example3
-- impure print
-- 1
-- "tracing"
-- 2
-- 1
```

Trace uses `unsafePerformIO`

under the hood and should **not** be used in production code.

In addition to the `trace`

function, several monadic `trace`

variants are quite common.

```
import Text.Printf
import Debug.Trace
traceM :: (Monad m) => String -> m ()
traceM string = trace string $ return ()
traceShowM :: (Show a, Monad m) => a -> m ()
traceShowM = traceM . show
tracePrintfM :: (Monad m, PrintfArg a) => String -> a -> m ()
tracePrintfM s = traceM . printf s
```

## Type Inference

While inference in Haskell is usually complete, there are cases where the principal type cannot be inferred. Three common cases are:

- Reduced polymorphism due to
*mutually recursive binding groups* - Undecidability due to
*polymorphic recursion* - Reduced polymorphism due to the
*monomorphism restriction*

In each of these cases, Haskell needs a hint from the programmer, which may be provided by adding explicit type signatures.

#### Mutually Recursive Binding Groups

In this case, the inferred type signatures are correct in their usage, but they don’t represent the most general signatures. When GHC analyzes the module it analyzes the dependencies of expressions on each other, groups them together, and applies substitutions from unification across mutually defined groups. As such the inferred types may not be the most general types possible, and an explicit signature may be desired.

#### Polymorphic recursion

In the second case, recursion is polymorphic because the inferred type variable `a`

in `size`

spans two possible types (`a`

and `(a,a)`

). These two types won’t pass the occurs-check of the typechecker and it yields an incorrect inferred type:

```
Occurs check: cannot construct the infinite type: t0 = (t0, t0)
Expected type: Tree t0
Actual type: Tree (t0, t0)
In the first argument of `size', namely `t'
In the second argument of `(*)', namely `size t'
In the second argument of `(+)', namely `2 * size t'
```

Simply adding an explicit type signature corrects this. Type inference using polymorphic recursion is undecidable in the general case.

See: Static Semantics of Function and Pattern Bindings

#### Monomorphism Restriction

Finally *Monomorphism restriction* is a builtin typing rule. By default, it is turned on when compiling and off in GHCi. The practical effect of this rule is that types inferred for functions without explicit type signatures may be more specific than expected. This is because GHC will sometimes reduce a general type, such as `Num`

to a default type, such as `Double`

. This can be seen in the following example in GHCi:

This rule may be deactivated with the `NoMonomorphicRestriction`

extension, see below.

See:

## Type Holes

Since the release of GHC 7.8, type holes allow underscores as stand-ins for actual values. They may be used either in declarations or in type signatures.

Type holes are useful in debugging incomplete programs. By placing an underscore on any value on the right hand-side of a declaration, GHC will throw an error during type-checking. The error message describes which values may legally fill the type hole.

```
typedhole.hs:3:14: error:
• Found hole: _ :: [a]
Where: ‘a’ is a rigid type variable bound by
the inferred type of head' :: a at typedhole.hs:3:1
• In the first argument of ‘head’, namely ‘_’
In the expression: head _
In an equation for ‘head'’: head' = head _
• Relevant bindings include head' :: a (bound at typedhole.hs:3:1)
```

GHC has rightly suggested that the expression needed to finish the program is `xs :: [a]`

.

The same hole technique can be applied at the toplevel for signatures:

```
typedhole.hs:5:11: error:
• Found type wildcard ‘_’ standing for ‘t -> t1 -> t’
Where: ‘t1’ is a rigid type variable bound by
the inferred type of const' :: t -> t1 -> t at typedhole.hs:6:1
‘t’ is a rigid type variable bound by
the inferred type of const' :: t -> t1 -> t at typedhole.hs:6:1
To use the inferred type, enable PartialTypeSignatures
• In the type signature:
const' :: _
• Relevant bindings include
const' :: t -> t1 -> t (bound at typedhole.hs:6:1)
```

Pattern wildcards can also be given explicit names so that GHC will use the names when reporting the inferred type in the resulting message.

```
typedhole.hs:9:9: error:
• Couldn't match expected type ‘_a’ with actual type ‘Bool’
‘_a’ is a rigid type variable bound by
the type signature for:
foo :: forall _a. _a -> _a
at typedhole.hs:8:8
• In the expression: False
In an equation for ‘foo’: foo _ = False
• Relevant bindings include
foo :: _a -> _a (bound at typedhole.hs:9:1)
```

The same wildcards can be used in type contexts to dump out inferred type class constraints:

```
typedhole.hs:11:10: error:
Found constraint wildcard ‘_’ standing for ‘Num a’
To use the inferred type, enable PartialTypeSignatures
In the type signature:
succ' :: _ => a -> a
```

When the flag `-XPartialTypeSignatures`

is passed to GHC and the inferred type is unambiguous, GHC will let us leave the holes in place and the compilation will proceed with a warning instead of an error.

```
typedhole.hs:3:10: Warning:
Found hole ‘_’ with type: w_
Where: ‘w_’ is a rigid type variable bound by
the inferred type of succ' :: w_ -> w_1 -> w_ at foo.hs:4:1
In the type signature for ‘succ'’: _ -> _ -> _
```

## Deferred Type Errors

Since the release of version 7.8, GHC supports the option of treating type errors as runtime errors. With this option enabled, programs will run, but they will fail when a mistyped expression is evaluated. This feature is enabled with the `-fdefer-type-errors`

flag in three ways: at the module level, when compiled from the command line, or inside of a GHCi interactive session.

For instance, the program below will compile:

```
{-# OPTIONS_GHC -fdefer-type-errors #-} -- Enable deferred type
-- errors at module level
x :: ()
x = print 3
y :: Char
y = 0
z :: Int
z = 0 + "foo"
main :: IO ()
main = do
print x
```

However, when a pathological term is evaluated at runtime, we’ll see a message like this:

```
defer: defer.hs:4:5:
Couldn't match expected type ‘()’ with actual type ‘IO ()’
In the expression: print 3
In an equation for ‘x’: x = print 3
(deferred type error)
```

This error tells us that while `x`

has a declared type of `()`

, the body of the function `print 3`

has a type of `IO ()`

. However, if the term is never evaluated, GHC will not throw an exception.

## Name Conventions

Haskell uses short variable names as a convention. This is offputting at first but after you read enough Haskell, it ceases to be a problem. In addition there are several ad-hoc conventions that are typically adopted

Variable | Convention |
---|---|

`a,b,c..` |
Type level variable |

`x,y,z..` |
Value variables |

`f,g,h..` |
Higher order function values |

`x,y` |
List head values |

`xs,ys` |
List tail values |

`m` |
Monadic type variable |

`t` |
Monad transformer variable |

`e` |
Exception value |

`s` |
Monad state value |

`r` |
Monad reader value |

`t` |
Foldable or Traversable type variable |

`f` |
Functor or applicative type variable |

`mX` |
Maybe variable |

Functions that end with a tick (like `fold'`

) are typically strict variants of a default lazy function.

Functions that end with a _ (like `map_`

) are typically variants of a function which discards the output and returns void.

Variables that are pluralized `xs`

, `ys`

typically refer to list tails.

Records that do not export their accessors will sometimes prefix them with underscores. These are sometimes interpreted by Template Haskell logic to produce derived field accessors.

Predicates will often prefix their function names with `is`

, as in `isPositive`

.

Functions which result in an Applicative or Monad type will often suffix their name with a A for Applicative or M for Monad. For example:

Functions which have *chirality* in which they traverse a data structure (i.e. left-to-right or right-to-left) will often suffix the name with L or R for their iteration pattern. This is useful because often times these type signatures are identical.

```
mapAccumL :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
mapAccumR :: Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c)
```

Functions working with mutable structures or monadic state will often adopt the following naming conventions:

```
newX -- Create a new mutable X structure
writeX -- Write to an existing mutable X structure
setX -- Set the value of an existing mutable X structure
modifyX -- Apply a function over existing mutable X structure
```

Functions that are prefixed with `with`

typically take a value as their first argument and a function as their second argument returning the value with the function applied over some substructure as the result.

## ghcid

ghcid is a lightweight IDE hook that allows continuous feedback whenever code is updated. It can be run from the command line in the root of the `cabal`

project directory by specifying a command to run (e.g. `ghci`

, `cabal repl`

, or `stack repl`

).

```
ghcid --command="cabal repl" # Run cabal repl under ghcid
ghcid --command="stack repl" # Run stack repl under ghcid
ghcid --command="ghci baz.hs" # Open baz.hs under ghcid
```

When a Haskell module is loaded into `ghcid`

, the code is evaluated in order to provide the user with any errors or warnings that would happen at compile time. When the developer edits and saves code loaded into `ghcid`

, the program automatically reloads and evaluates the code for errors and warnings.

## HLint

HLint is a source linter for Haskell that provides a variety of hints on code improvements. It can be customised and configured with custom rules, on a per-project basis. HLint is configured through a `hlint.yaml`

file placed in the root of a project. To generate the default configuration run:

Custom errors can be added to this file in order to match and suggest custom changes of code from the left hand side match to the right hand side replacement:

HLint’s default is to warn on all possible failures. These can be disabled globally by adding ignore pragmas.

Or within specific modules by specifying the `within`

option.

See:

## Docker Images

Haskell has stable Docker images that are widely used for deployments across Kubernetes and Docker environments. The two Dockerhub repositories of note are:

To import the official Haskell images with `ghc`

and `cabal-install`

include the following preamble in your Dockerfile with your desired GHC version.

`FROM haskell:8.8.1`

To import the stack images include the following preamble in your Dockerfile with your desired Stack resolver replaced.

`FROM fpco/stack-build:lts-14.0`

## Continuous Integration

These days it is quite common to use cloud hosted continuous integration systems to test code from version control systems. There are many community contributed build scripts for different service providers, including the following:

- Travis CI for Cabal
- Travis CI for Stack
- Circle CI for Cabal & Stack
- Github Actions for Cabal & Stack

See also the official CI repository:

## Ormolu

Ormolu is an opinionated Haskell source formatter that produces a canonical way of rendering the Haskell abstract syntax tree to text. This ensures that code shared amongst teams and checked into version control conforms to a single universal standard for whitespace and lexeme placing. This is similar to tools in other languages such as `go fmt`

.

For example running `ormolu example.hs --inplace`

on the following module:

Will rerender the file as:

Ormolu can be installed via a variety of mechanisms.

```
$ stack install ormolu --resolver=lts-14.14 # via stack
$ cabal new-install ormolu --installdir=/home/user/.local/bin # via cabal
$ nix-build -A ormolu # via nix
```

See:

## Haddock

Haddock is the automatic documentation generation tool for Haskell source code, and it integrates with the usual `cabal`

toolchain. In this section, we will explore how to document code so that Haddock can generate documentation successfully.

Several frequent comment patterns are used to document code for Haddock. The first of these methods uses `-- |`

to delineate the beginning of a comment:

Multiline comments are also possible:

```
-- | Multiline documentation for the function
-- f with multiple arguments.
fmap :: Functor f
=> (a -> b) -- ^ function
-> f a -- ^ input
-> f b -- ^ output
```

`-- ^`

is used to comment Constructors or Record fields:

```
data T a b
= A a -- ^ Documentation for A
| B b -- ^ Documentation for B
data R a b = R
{ f1 :: a -- ^ Documentation for the field f1
, f2 :: b -- ^ Documentation for the field f2
}
```

Elements within a module (i.e. values, types, classes) can be hyperlinked by enclosing the identifier in single quotes:

Modules themselves can be referenced by enclosing them in double quotes:

`haddock`

also allows the user to include blocks of code within the generated documentation. Two methods of demarcating the code blocks exist in `haddock`

. For example, enclosing a code snippet in `@`

symbols marks it as a code block:

Similarly, it is possible to use bird tracks (`>`

) in a comment line to set off a code block.

Snippets of interactive shell sessions can also be included in `haddock`

documentation. In order to denote the beginning of code intended to be run in a REPL, the `>>>`

symbol is used:

```
-- | Example of an interactive shell session embedded within documentation
--
-- >>> factorial 5
-- 120
```

Headers for specific blocks can be added by prefacing the comment in the module block with a `*`

:

Sections can also be delineated by `$`

blocks that pertain to references in the body of the module:

```
module Foo (
-- $section1
example1,
example2
)
-- $section1
-- Here is the documentation section that describes the symbols
-- 'example1' and 'example2'.
```

Links can be added with the following syntax:

Images can also be included, so long as the path is either absolute or relative to the directory in which `haddock`

is run.

`haddock`

options can also be specified with pragmas in the source, either at the module or project level.

Option | Description |
---|---|

ignore-exports | Ignores the export list and includes all signatures in scope. |

not-home | Module will not be considered in the root documentation. |

show-extensions | Annotates the documentation with the language extensions used. |

hide | Forces the module to be hidden from Haddock. |

prune | Omits definitions with no annotations. |

## Unsafe Functions

As everyone eventually finds out there are several functions within the implementation of GHC (not the Haskell language) that can be used to subvert the type-system; these functions are marked with the prefix `unsafe`

. Unsafe functions exist only for when one can manually prove the soundness of an expression but can’t express this property in the type-system, or externalities to Haskell.

```
unsafeCoerce :: a -> b -- Unsafely coerce anything into anything
unsafePerformIO :: IO a -> a -- Unsafely run IO action outside of IO
```

Using these functions to subvert the Haskell typesystem will cause all measure of undefined behavior with unimaginable pain and suffering, and so they are strongly discouraged. When initially starting out with Haskell there are no legitimate reasons to use these functions at all.

# Monads

Monads form one of the core components for constructing Haskell programs. In their most general form monads are an algebraic building block that can give rise to ways of structuring control flow, handling data structures and orchestrating logic. Monads are a very general algebraic way of structuring code and have a certain reputation for being confusing. However their power and flexibility have become foundational to the way modern Haskell programs are structured.

There is a singular truth to keep in mind when learning monads.

A monad is just its algebraic laws. Nothing more, nothing less.

## Eightfold Path to Monad Satori

Much ink has been spilled waxing lyrical about the supposed mystique of monads. Instead, I suggest a path to enlightenment:

- Don’t read the monad tutorials.
- No really, don’t read the monad tutorials.
- Learn about the Haskell typesystem.
- Learn what a typeclass is.
- Read the Typeclassopedia.
- Read the monad definitions.
- Use monads in real code.
- Don’t write monad-analogy tutorials.

In other words, the only path to understanding monads is to read the fine source, fire up GHC, and write some code. Analogies and metaphors will not lead to understanding.

## Monad Myths

The following are all **false**:

- Monads are impure.
- Monads are about effects.
- Monads are about state.
- Monads are about imperative sequencing.
- Monads are about IO.
- Monads are dependent on laziness.
- Monads are a “back-door” in the language to perform side-effects.
- Monads are an embedded imperative language inside Haskell.
- Monads require knowing abstract mathematics.
- Monads are unique to Haskell.

## Monad Methods

Monads are not complicated. They are implemented as a typeclass with two methods, `return`

and `(>>=)`

(pronounced “bind”). In order to implement a Monad instance, these two functions must be defined:

```
class Monad m where
return :: a -> m a -- N.B. 'm' refers to a type constructor
-- (e.g., Maybe, Either, etc.) that
-- implements the Monad typeclass
(>>=) :: m a -> (a -> m b) -> m b
```

The first type signature in the Monad class definition is for `return`

. Any preconceptions one might have for the word “return” should be discarded. It has an entirely different meaning in the context of Haskell and acts very differently than in languages such as C, Python, or Java. Instead of being the final arbiter of what value a function produces, `return`

in Haskell injects a value of type `a`

into a monadic context (e.g., Maybe, Either, etc.), which is denoted as `m a`

.

The other function essential to implementing a Monad instance is `(>>=)`

. This infix function takes two arguments. On its left side is a value with type `m a`

, while on the right side is a function with type `(a -> m b)`

. The bind operation results in a final value of type `m b`

.

A third, auxiliary function (`(>>)`

) is defined in terms of the bind operation that discards its argument.

This definition says that (>>) has a left and right argument which are monadic with types `m a`

and `m b`

respectively, while the infix function yields a value of type `m b`

. The actual implementation of (>>) says that when `m`

is passed to `(>>)`

with `k`

on the right, the value `k`

will always be yielded.

## Monad Laws

In addition to specific implementations of `(>>=)`

and `return`

, all monad instances must satisfy three laws.

**Law 1**

The first law says that when `return a`

is passed through `(>>=)`

into a function `f`

, this expression is exactly equivalent to `f a`

.

In discussing the next two laws, we’ll refer to a value `m`

. This notation is shorthand for a value wrapped in a monadic context. Such a value has type `m a`

, and could be represented more concretely by values like `Nothing`

, `Just x`

, or `Right x`

. It is important to note that some of these concrete instantiations of the value `m`

have multiple components. In discussing the second and third monad laws, we’ll see some examples of how this plays out.

**Law 2**

The second law states that a monadic value `m`

passed through `(>>=)`

into `return`

is exactly equivalent to itself. In other words, using bind to pass a monadic value to `return`

does not change the initial value.

A more explicit way to write the second Monad law exists. In this following example code, the first expression shows how the second law applies to values represented by non-nullary type constructors. The second snippet shows how a value represented by a nullary type constructor works within the context of the second law.

```
(SomeMonad val) >>= return ≡ SomeMonad val -- 'SomeMonad val' has type 'm a' just
-- like 'm' from the first example of the
-- second law
NullaryMonadType >>= return ≡ NullaryMonadType
```

**Law 3**

While the first two laws are relatively clear, the third law may be more difficult to understand. This law states that when a monadic value `m`

is passed through `(>>=)`

to the function `f`

and then the result of that expression is passed to `>>= g`

, the entire expression is exactly equivalent to passing `m`

to a lambda expression that takes one parameter `x`

and outputs the function `f`

applied to `x`

. By the definition of bind, `f x`

*must* return a value wrapped in the *same* monad. Because of this property, the resultant value of that expression can be passed through `(>>=)`

to the function `g`

, which also returns a monadic value.

```
(m >>= f) >>= g ≡ m >>= (\x -> f x >>= g) -- Like in the last law, 'm' has
-- has type 'm a'. The functions 'f'
-- and 'g' have types '(a -> m b)'
-- and '(b -> m c)' respectively
```

Again, it is possible to write this law with more explicit code. Like in the explicit examples for law 2, `m`

has been replaced by `SomeMonad val`

in order to be make it clear that there can be multiple components to a monadic value. Although little has changed in the code, it is easier to see that value –namely, `val`

– corresponds to the `x`

in the lambda expression. After `SomeMonad val`

is passed through `(>>=)`

to `f`

, the function `f`

operates on `val`

and returns a result still wrapped in the `SomeMonad`

type constructor. We can call this new value `SomeMonad newVal`

. Since it is still wrapped in the monadic context, `SomeMonad newVal`

can thus be passed through the bind operation into the function `g`

.

Monad law summary: Law 1 and 2 are identity laws (left and right identity respectively) and law 3 is the associativity law. Together they ensure that Monads can be composed and ‘do the right thing’.

See:

## Do Notation

Monadic syntax in Haskell is written in a sugared form, known as `do`

notation. The advantages of this special syntax are that it is easier to write and often easier to read, and it is entirely equivalent to simply applying the monad operations. The desugaring is defined recursively by the rules:

```
do { a <- f ; m } ≡ f >>= \a -> do { m } -- bind 'f' to a, proceed to desugar
-- 'm'
do { f ; m } ≡ f >> do { m } -- evaluate 'f', then proceed to
-- desugar m
do { m } ≡ m
```

Thus, through the application of the desugaring rules, the following expressions are equivalent:

```
do
a <- f -- f, g, and h are bound to the names a,
b <- g -- b, and c. These names are then passed
c <- h -- to 'return' to ensure that all values
return (a, b, c) -- are wrapped in the appropriate monadic
-- context
do { -- N.B. '{}' and ';' characters are
a <- f; -- rarely used in do-notation
b <- g;
c <- h;
return (a, b, c)
}
f >>= \a ->
g >>= \b ->
h >>= \c ->
return (a, b, c)
```

If one were to write the bind operator as an uncurried function (which is not how Haskell uses it) the same desugaring might look something like the following chain of nested binds with lambdas.

In the do-notation, the monad laws from above are equivalently written:

**Law 1**

**Law 2**

**Law 3**

See:

## Maybe Monad

The *Maybe* monad is the simplest first example of a monad instance. The Maybe monad models a computation which may fail to yield a value at any point during computation.

The Maybe type has two value constructors. The first, `Just`

, is a unary constructor representing a successful computation, while the second, `Nothing`

, is a nullary constructor that represents failure.

The monad instance describes the implementation of `(>>=)`

for `Maybe`

by pattern matching on the possible inputs that could be passed to the bind operation (i.e., `Nothing`

or `Just x`

). The instance declaration also provides an implementation of `return`

, which in this case is simply `Just`

.

```
instance Monad Maybe where
(Just x) >>= k = k x -- 'k' is a function with type (a -> Maybe a)
Nothing >>= k = Nothing
return = Just -- Just's type signature is (a -> Maybe a), in
-- other words, extremely similar to the
-- type of 'return' in the typeclass
-- declaration above.
```

The following code shows some simple operations to do within the Maybe monad.

In the above example, the value `Just 3`

is passed via `(>>=)`

to the lambda function `\x -> return (x + 1)`

. `x`

refers to the `Int`

portion of `Just 3`

, and we can use `x`

in the second half of the lambda expression, `return (x + 1)`

which evaluates to `Just 4`

, indicating a successful computation.

In the second example, the value `Nothing`

is passed via `(>>=)`

to the same lambda function as in the previous example. However, according to the `Maybe`

Monad instance, whenever `Nothing`

is bound to a function, the expression’s result will be `Nothing`

.

Here, `return`

is applied to `4`

and results in `Just 4`

.

The next code examples show the use of `do`

notation within the Maybe monad to do addition that might fail. Desugared examples are provided as well.

```
example1 :: Maybe Int
example1 = do
a <- Just 3 -- Bind 3 to name a
b <- Just 4 -- Bind 4 to name b
return $ a + b -- Evaluate (a + b), then use 'return' to ensure
-- the result is in the Maybe monad in order to
-- satisfy the type signature
-- Just 7
desugared1 :: Maybe Int
desugared1 = Just 3 >>= \a -> -- This example is the desugared
Just 4 >>= \b -> -- equivalent to example1
return $ a + b
-- Just 7
example2 :: Maybe Int
example2 = do
a <- Just 3 -- Bind 3 to name a
b <- Nothing -- Bind Nothing to name b
return $ a + b
-- Nothing
-- This result might be somewhat surprising, since
-- addition within the Maybe monad can actually
-- return 'Nothing'. This result occurs because one
-- of the values, Nothing, indicates computational
-- failure. Since the computation failed at one
-- step within the process, the whole computation
-- fails, leaving us with 'Nothing' as the final
-- result.
desugared2 :: Maybe Int
desugared2 = Just 3 >>= \a -> -- This example is the desugared
Nothing >>= \b -> -- equivalent to example2
return $ a + b
-- Nothing
```

## List Monad

The *List* monad is the second simplest example of a monad instance. As always, this monad implements both `(>>=)`

and `return`

.

The definition of bind says that when the list `m`

is bound to a function `f`

, the result is a concatenation of `map f`

over the list `m`

. The `return`

method simply takes a single value `x`

and injects into a singleton list `[x]`

.

In order to demonstrate the `List`

monad’s methods, we will define two values: `m`

and `f`

. `m`

is a simple list, while `f`

is a function that takes a single `Int`

and returns a two element list `[1, 0]`

.

When applied to bind, evaluation proceeds as follows:

```
m >>= f
==> [1,2,3,4] >>= \x -> [1,0]
==> concat (map (\x -> [1,0]) [1,2,3,4])
==> concat ([[1,0],[1,0],[1,0],[1,0]])
==> [1,0,1,0,1,0,1,0]
```

The list comprehension syntax in Haskell can be implemented in terms of the list monad. List comprehensions can be considered syntactic sugar for more obviously monadic implementations. Examples `a`

and `b`

illustrate these use cases.

The first example (`a`

) illustrates how to write a list comprehension. Although the syntax looks strange at first, there are elements of it that may look familiar. For instance, the use of `<-`

is just like bind in a `do`

notation: It binds an element of a list to a name. However, one major difference is apparent: `a`

seems to lack a call to `return`

. Not to worry, though, the `[]`

fills this role. This syntax can be easily desugared by the compiler to an explicit invocation of `return`

. Furthermore, it serves to remind the user that the computation takes place in the List monad.

```
a = [
f x y | -- Corresponds to 'f x y' in example b
x <- xs,
y <- ys,
x == y -- Corresponds to 'guard $ x == y' in example b
]
```

The second example (`b`

) shows the list comprehension above rewritten with `do`

notation:

```
-- Identical to `a`
b = do
x <- xs
y <- ys
guard $ x == y -- Corresponds to 'x == y' in example a
return $ f x y -- Corresponds to the '[]' and 'f x y' in example a
```

The final examples are further illustrations of the List monad. The functions below each return a list of 3-tuples which contain the possible combinations of the three lists that get bound the names `a`

, `b`

, and `c`

. N.B.: Only values in the list bound to `a`

can be used in `a`

position of the tuple; the same fact holds true for the lists bound to `b`

and `c`

.

```
example :: [(Int, Int, Int)]
example = do
a <- [1,2]
b <- [10,20]
c <- [100,200]
return (a,b,c)
-- [(1,10,100),(1,10,200),(1,20,100),(1,20,200),(2,10,100),(2,10,200),(2,20,100),(2,20,200)]
desugared :: [(Int, Int, Int)]
desugared = [1, 2] >>= \a ->
[10, 20] >>= \b ->
[100, 200] >>= \c ->
return (a, b, c)
-- [(1,10,100),(1,10,200),(1,20,100),(1,20,200),(2,10,100),(2,10,200),(2,20,100),(2,20,200)]
```

## IO Monad

Perhaps the most (in)famous example in Haskell of a type that forms a monad is `IO`

. A value of type `IO a`

is a computation which, when performed, does some I/O before returning a value of type `a`

. These computations are called actions. IO actions executed in `main`

are the means by which a program can operate on or access information from the external world. IO actions allow the program to do many things, including, but not limited to:

- Print a
`String`

to the terminal - Read and parse input from the terminal
- Read from or write to a file on the system
- Establish an
`ssh`

connection to a remote computer - Take input from a radio antenna for signal processing
- Launch the missiles.

Conceptualizing I/O as a monad enables the developer to access information from outside the program, but also to use pure functions to operate on that information as data. The following examples will show how we can use IO actions and `IO`

values to receive input from stdin and print to stdout.

Perhaps the most immediately useful function for doing I/O in Haskell is `putStrLn`

. This function takes a `String`

and returns an `IO ()`

. Calling it from `main`

will result in the `String`

being printed to stdout followed by a newline character.

Here is some code that prints a couple of lines to the terminal. The first invocation of `putStrLn`

is executed, causing the `String`

to be printed to stdout. The result is bound to a lambda expression that discards its argument, and then the next `putStrLn`

is executed.

```
main :: IO ()
main = putStrLn "Vesihiisi sihisi hississäään." >>=
\_ -> putStrLn "Or in English: 'The water devil was hissing in her elevator'."
-- Sugared code, written with do notation
main :: IO ()
main = do putStrLn "Vesihiisi sihisi hississäään."
putStrLn "Or in English: 'The water devil was hissing in her elevator'."
```

Another useful function is `getLine`

which has type `IO String`

. This function gets a line of input from stdin. The developer can then bind this line to a name in order to operate on the value within the program.

The code below demonstrates a simple combination of these two functions as well as desugaring `IO`

code. First, `putStrLn`

prints a `String`

to stdout to ask the user to supply their name, with the result being bound to a lambda that discards it argument. Then, `getLine`

is executed, supplying a prompt to the user for entering their name. Next, the resultant `IO String`

is bound to `name`

and passed to `putStrLn`

. Finally, the program prints the name to the terminal.

The next code block is the *desugared equivalent* of the previous example where the uses of `(>>=)`

are made explicit.

Our final example executes in the same way as the previous two examples. This example, though, uses the special `(>>)`

operator to take the place of binding a result to the lambda that discards its argument.

See:

## What’s the point?

Although it is difficult, if not impossible, to touch, see, or otherwise physically interact with a monad, this construct has some very interesting implications for programmers. For instance, consider the non-intuitive fact that we now have a uniform interface for talking about three very different, but foundational ideas for programming: *Failure*, *Collections* and *Effects*.

Let’s write down a new function called `sequence`

which folds a function `mcons`

over a list of monadic computations. We can think of `mcons`

as analogous to the list constructor (i.e. `(a : b : [])`

) except it pulls the two list elements out of two monadic values (`p`

,`q`

) by means of bind. The bound values are then joined with the list constructor `:`

, before finally being rewrapped in the appropriate monadic context with `return`

.

```
sequence :: Monad m => [m a] -> m [a]
sequence = foldr mcons (return [])
mcons :: Monad m => m t -> m [t] -> m [t]
mcons p q = do
x <- p -- 'x' refers to a singleton value
y <- q -- 'y' refers to a list. Because of this fact, 'x' can be
return (x:y) -- prepended to it
```

What does this function mean in terms of each of the monads discussed above?

**Maybe**

For the Maybe monad, sequencing a list of values within the `Maybe`

context allows us to collect the results of a series of computations which can possibly fail. However, `sequence`

yields the aggregated values only if each computation succeeds. In other words, if even one of the `Maybe`

values in the initial list passed to `sequence`

is a `Nothing`

, the result of evaluating `sequence`

for the whole list will also be `Nothing`

.

```
sequence [Just 3, Just 4]
-- Just [3,4]
sequence [Just 3, Just 4, Nothing] -- Since one of the results is Nothing,
-- Nothing -- the whole computation fails
```

**List**

The bind operation for the list monad forms the pairwise list of elements from the two operands. Thus, folding the binds contained in `mcons`

over a list of lists with `sequence`

implements the general Cartesian product for an arbitrary number of lists.

**IO**

Applying `sequence`

within the IO context results in still a different result. The function takes a list of IO actions, performs them sequentially, and then gives back the list of resulting values in the order sequenced.

```
sequence [getLine, getLine, getLine]
-- a -- a, b, and 9 are the inputs given by the
-- b -- user at the prompt
-- 9
-- ["a", "b", "9"] -- All inputs are returned in a list as
-- an IO [String].
```

So there we have it, three fundamental concepts of computation that are normally defined independently of each other actually all share this similar structure. This unifying pattern can be abstracted out and reused to build higher abstractions that work for all current and future implementations. If you want a motivating reason for understanding monads, this is it! These insights are the essence of what I wish I knew about monads looking back.

See:

## Reader Monad

The reader monad lets us access shared immutable state within a monadic context.

```
ask :: Reader r r
asks :: (r -> a) -> Reader r a
local :: (r -> r) -> Reader r a -> Reader r a
runReader :: Reader r a -> r -> a
```

```
import Control.Monad.Reader
data MyContext = MyContext
{ foo :: String
, bar :: Int
} deriving (Show)
computation :: Reader MyContext (Maybe String)
computation = do
n <- asks bar
x <- asks foo
if n > 0
then return (Just x)
else return Nothing
ex1 :: Maybe String
ex1 = runReader computation $ MyContext "hello" 1
ex2 :: Maybe String
ex2 = runReader computation $ MyContext "haskell" 0
```

A simple implementation of the Reader monad:

```
newtype Reader r a = Reader { runReader :: r -> a }
instance Monad (Reader r) where
return a = Reader $ \_ -> a
m >>= k = Reader $ \r -> runReader (k (runReader m r)) r
ask :: Reader a a
ask = Reader id
asks :: (r -> a) -> Reader r a
asks f = Reader f
local :: (r -> r) -> Reader r a -> Reader r a
local f m = Reader $ runReader m . f
```

## Writer Monad

The writer monad lets us emit a lazy stream of values from within a monadic context.

```
import Control.Monad.Writer
type MyWriter = Writer [Int] String
example :: MyWriter
example = do
tell [1..3]
tell [3..5]
return "foo"
output :: (String, [Int])
output = runWriter example
-- ("foo", [1, 2, 3, 3, 4, 5])
```

A simple implementation of the Writer monad:

```
import Data.Monoid
newtype Writer w a = Writer { runWriter :: (a, w) }
instance Monoid w => Monad (Writer w) where
return a = Writer (a, mempty)
m >>= k = Writer $ let
(a, w) = runWriter m
(b, w') = runWriter (k a)
in (b, w `mappend` w')
execWriter :: Writer w a -> w
execWriter m = snd (runWriter m)
tell :: w -> Writer w ()
tell w = Writer ((), w)
```

This implementation is lazy, so some care must be taken that one actually wants to only generate a stream of thunks. Most often the lazy writer is not suitable for use, instead implement the equivalent structure by embedding some monomial object inside a StateT monad, or using the strict version.

## State Monad

The state monad allows functions within a stateful monadic context to access and modify shared state.

```
runState :: State s a -> s -> (a, s)
evalState :: State s a -> s -> a
execState :: State s a -> s -> s
```

```
import Control.Monad.State
test :: State Int Int
test = do
put 3
modify (+1)
get
main :: IO ()
main = print $ execState test 0
```

The state monad is often mistakenly described as being impure, but it is in fact entirely pure and the same effect could be achieved by explicitly passing state. A simple implementation of the State monad takes only a few lines:

```
newtype State s a = State { runState :: s -> (a,s) }
instance Monad (State s) where
return a = State $ \s -> (a, s)
State act >>= k = State $ \s ->
let (a, s') = act s
in runState (k a) s'
get :: State s s
get = State $ \s -> (s, s)
put :: s -> State s ()
put s = State $ \_ -> ((), s)
modify :: (s -> s) -> State s ()
modify f = get >>= \x -> put (f x)
evalState :: State s a -> s -> a
evalState act = fst . runState act
execState :: State s a -> s -> s
execState act = snd . runState act
```

## Why are monads confusing?

So many monad tutorials have been written that it begs the question: what makes monads so difficult when first learning Haskell? I hypothesize there are three aspects to why this is so:

*There are several levels of indirection with desugaring.*

A lot of the Haskell we write is radically rearranged and transformed into an entirely new form under the hood.

Most monad tutorials will not manually expand out the do-sugar. This leaves the beginner thinking that monads are a way of dropping into a pseudo-imperative language inside of pure code and further fuels the misconception that specific instances like IO describe monads in their *full generality*. When in fact the IO monad is only one among many instances.

Being able to manually desugar is crucial to understanding.

*Infix operators for higher order functions are not common in other languages.*

On the left hand side of the operator we have an `m a`

and on the right we have `a -> m b`

. Thus, this operator is asymmetric, utilizing a monadic value on the left and a higher order function on the right. Although some languages do have infix operators that are themselves higher order functions, it is still a rather rare occurrence.

Thus, with a function desugared, it can be confusing that `(>>=)`

operator is in fact building up a much larger function by composing functions together.

Written in prefix form, it becomes a little bit more digestible.

Perhaps even removing the operator entirely might be more intuitive coming from other languages.

*Ad-hoc polymorphism is not commonplace in other languages.*

Haskell’s implementation of overloading can be unintuitive if one is not familiar with type inference. Indeed, newcomers to Haskell often believe they can gain an intuition for monads in a way that will unify their understanding of *all monads*. This is a fallacy, however, because any particular monad instance is merely an **instantiation** of the monad typeclass functions implemented *for that particular type*.

This is all abstracted away from the user, but the `(>>=)`

or `bind`

function is really a function of 3 arguments with the extra typeclass dictionary argument (`$dMonad`

) implicitly threaded around.

In general, this is true for all typeclasses in Haskell and it’s true here as well, except in the case where the parameter of the monad class is unified (through inference) with a concrete class instance.

Now, all of these transformations are trivial once we understand them, they’re just typically not discussed. In my opinion the fundamental fallacy of monad tutorials is not that intuition for monads is hard to convey (nor are metaphors required!), but that novices often come to monads with an incomplete understanding of points (1), (2), and (3) and then trip on the simple fact that monads are the first example of a Haskell construct that is the confluence of all three.

Thus we make monads more difficult than they need to be. At the end of the day they are simple algebraic critters.

# Monad Transformers

## mtl / transformers

The descriptions of Monads in the previous chapter are a bit of a white lie. Modern Haskell monad libraries typically use a more general form of these, written in terms of monad transformers which allow us to compose monads together to form **composite monads**.

Imagine if you had an application that wanted to deal with a Maybe monad wrapped inside a State Monad, all wrapped inside the IO monad. This is the problem that monad transformers solve, a problem of composing different monads. At their core, monad transformers allow us to nest monadic computations in a stack with an interface to exchange values between the levels, called lift:

In production code, the monads mentioned previously may actually be their more general transformer form composed with the `Identity`

monad.

```
type State s = StateT s Identity
type Writer w = WriterT w Identity
type Reader r = ReaderT r Identity
```

The following table shows the relationships between these forms:

Monad | Transformer | Type | Transformed Type |
---|---|---|---|

Maybe | MaybeT | `Maybe a` |
`m (Maybe a)` |

Reader | ReaderT | `r -> a` |
`r -> m a` |

Writer | WriterT | `(a,w)` |
`m (a,w)` |

State | StateT | `s -> (a,s)` |
`s -> m (a,s)` |

Just as the base monad class has laws, monad transformers also have several laws:

**Law #1**

**Law #2**

Or equivalently:

**Law #1**

**Law #2**

It’s useful to remember that transformers compose *outside-in* but are *unrolled inside out*.

## Transformers

The lift definition provided above comes from the `transformers`

library along with an IO-specialized form called `liftIO`

:

These definitions rely on the following typeclass definitions, which describe composing one monad with another monad (the “t” is the transformed second monad):

```
class MonadTrans t where
lift :: Monad m => m a -> t m a
class (Monad m) => MonadIO m where
liftIO :: IO a -> m a
instance MonadIO IO where
liftIO = id
```

## Basics

The most basic use requires us to use the T-variants for each of the monad transformers in the outer layers and to explicitly `lift`

and `return`

values between the layers. Monads have kind `(* -> *)`

, so monad transformers which take monads to monads have `((* -> *) -> * -> *)`

:

For example, if we wanted to form a composite computation using both the Reader and Maybe monads, using `MonadTrans`

we could use Maybe inside of a `ReaderT`

to form `ReaderT t Maybe a`

.

```
import Control.Monad.Reader
type Env = [(String, Int)]
type Eval a = ReaderT Env Maybe a
data Expr
= Val Int
| Add Expr Expr
| Var String
deriving (Show)
eval :: Expr -> Eval Int
eval ex = case ex of
Val n -> return n
Add x y -> do
a <- eval x
b <- eval y
return (a+b)
Var x -> do
env <- ask
val <- lift (lookup x env)
return val
env :: Env
env = [("x", 2), ("y", 5)]
ex1 :: Eval Int
ex1 = eval (Add (Val 2) (Add (Val 1) (Var "x")))
example1, example2 :: Maybe Int
example1 = runReaderT ex1 env
example2 = runReaderT ex1 []
```

The fundamental limitation of this approach is that we find ourselves `lift.lift.lift`

ing and `return.return.return`

ing a lot.

## mtl

The mtl library is the most commonly used interface for these monad tranformers, but mtl depends on the transformers library from which it generalizes the “basic” monads described above into more general transformers, such as the following:

```
instance Monad m => MonadState s (StateT s m)
instance Monad m => MonadReader r (ReaderT r m)
instance (Monoid w, Monad m) => MonadWriter w (WriterT w m)
```

This solves the “lift.lift.lifting” problem introduced by transformers.

## ReaderT

By way of an example there exist three possible forms of the Reader monad. The first is the primitive version which no longer exists, but which is useful for understanding the underlying ideas. The other two are the *transformers* and *mtl* variants.

*Reader*

```
newtype Reader r a = Reader { runReader :: r -> a }
instance MonadReader r (Reader r) where
ask = Reader id
local f m = Reader (runReader m . f)
```

*ReaderT*

```
newtype ReaderT r m a = ReaderT { runReaderT :: r -> m a }
instance (Monad m) => Monad (ReaderT r m) where
return a = ReaderT $ \_ -> return a
m >>= k = ReaderT $ \r -> do
a <- runReaderT m r
runReaderT (k a) r
instance MonadTrans (ReaderT r) where
lift m = ReaderT $ \_ -> m
```

*MonadReader*

```
class (Monad m) => MonadReader r m | m -> r where
ask :: m r
local :: (r -> r) -> m a -> m a
instance (Monad m) => MonadReader r (ReaderT r m) where
ask = ReaderT return
local f m = ReaderT $ \r -> runReaderT m (f r)
```

So, hypothetically the three variants of ask would be:

In practice the `mtl`

variant is the one commonly used in Modern Haskell.

## Newtype Deriving

Newtype deriving is a common technique used in combination with the `mtl`

library and as such we will discuss its use for transformers in this section.

As discussed in the newtypes section, newtypes let us reference a data type with a single constructor as a new distinct type, with no runtime overhead from boxing, unlike an algebraic datatype with a single constructor. Newtype wrappers around strings and numeric types can often drastically reduce accidental errors.

Consider the case of using a newtype to distinguish between two different text blobs with different semantics. Both have the same runtime representation as a text object, but are distinguished statically, so that plaintext can not be accidentally interchanged with encrypted text.

```
newtype Plaintext = Plaintext Text
newtype Crytpotext = Cryptotext Text
encrypt :: Key -> Plaintext -> Cryptotext
decrypt :: Key -> Cryptotext -> Plaintext
```

This is a surprisingly powerful tool as the Haskell compiler will refuse to compile any function which treats Cryptotext as Plaintext or vice versa!

The other common use case is using newtypes to derive logic for deriving custom monad transformers in our business logic. Using `-XGeneralizedNewtypeDeriving`

we can recover the functionality of instances of the underlying types composed in our transformer stack.

```
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
newtype Quantity v a = Quantity a
deriving (Eq, Ord, Num, Show)
data Haskeller
type Haskellers = Quantity Haskeller Int
a = Quantity 2 :: Haskellers
b = Quantity 6 :: Haskellers
totalHaskellers :: Haskellers
totalHaskellers = a + b
newtype Velocity = Velocity { unVelocity :: Double }
deriving (Eq, Ord)
v :: Velocity
v = Velocity 2.718
x :: Double
x = 2.718
-- Type error is caught at compile time even though
-- they are the same value at runtime!
err = v + x
```

```
Couldn't match type `Double' with `Velocity'
Expected type: Velocity
Actual type: Double
In the second argument of `(+)', namely `x'
In the expression: v + x
```

Using newtype deriving with the mtl library typeclasses we can produce flattened transformer types that don’t require explicit lifting in the transform stack. For example, here is a little stack machine involving the Reader, Writer and State monads.

```
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.Reader
import Control.Monad.Writer
import Control.Monad.State
type Stack = [Int]
type Output = [Int]
type Program = [Instr]
type VM a = ReaderT Program (WriterT Output (State Stack)) a
newtype Comp a = Comp { unComp :: VM a }
deriving (Functor, Applicative, Monad, MonadReader Program, MonadWriter Output, MonadState Stack)
data Instr = Push Int | Pop | Puts
evalInstr :: Instr -> Comp ()
evalInstr instr = case instr of
Pop -> modify tail
Push n -> modify (n:)
Puts -> do
tos <- gets head
tell [tos]
eval :: Comp ()
eval = do
instr <- ask
case instr of
[] -> return ()
(i:is) -> evalInstr i >> local (const is) eval
execVM :: Program -> Output
execVM = flip evalState [] . execWriterT . runReaderT (unComp eval)
program :: Program
program = [
Push 42,
Push 27,
Puts,
Pop,
Puts,
Pop
]
main :: IO ()
main = mapM_ print $ execVM program
```

Pattern matching on a newtype constructor compiles into nothing. For example the`extractB`

function below does not scrutinize the `MkB`

constructor like `extractA`

does, because `MkB`

does not exist at runtime; it is purely a compile-time construct.

```
data A = MkA Int
newtype B = MkB Int
extractA :: A -> Int
extractA (MkA x) = x
extractB :: B -> Int
extractB (MkB x) = x
```

## Efficiency

The second monad transformer law guarantees that sequencing consecutive lift operations is semantically equivalent to lifting the results into the outer monad.

Although they are guaranteed to yield the same result, the operation of lifting the results between the monad levels is not without cost and crops up frequently when working with the monad traversal and looping functions. For example, all three of the functions on the left below are less efficient than the right hand side which performs the bind in the base monad instead of lifting on each iteration.

```
-- Less Efficient More Efficient
forever (lift m) == lift (forever m)
mapM_ (lift . f) xs == lift (mapM_ f xs)
forM_ xs (lift . f) == lift (forM_ xs f)
```

## Monad Morphisms

Although the base monad transformer package provides a `MonadTrans`

class for lifting to another monad:

But oftentimes we need to work with and manipulate our monad transformer stack to either produce new transformers, modify existing ones or extend an upstream library with new layers. The `mmorph`

library provides the capacity to compose monad morphism transformation directly on transformer stacks. This is achieved primarily by use of the `hoist`

function which maps a function from a base monad into a function over a transformed monad.

Hoist takes a *monad morphism* (a mapping from a `m a`

to a `n a`

) and applies in on the inner value monad of a transformer stack, transforming the value under the outer layer.

The monad morphism `generalize`

takes an Identity monad into any another monad `m`

.

For example, it generalizes `State s a`

(which is `StateT s Identity a`

) to `StateT s m a`

.

So we can generalize an existing transformer to lift an IO layer onto it.

```
import Control.Monad.State
import Control.Monad.Morph
type Eval a = State [Int] a
runEval :: [Int] -> Eval a -> a
runEval = flip evalState
pop :: Eval Int
pop = do
top <- gets head
modify tail
return top
push :: Int -> Eval ()
push x = modify (x:)
ev1 :: Eval Int
ev1 = do
push 3
push 4
pop
pop
ev2 :: StateT [Int] IO ()
ev2 = do
result <- hoist generalize ev1
liftIO $ putStrLn $ "Result: " ++ show result
```

See:

## Effect Systems

The mtl model has several properties which make it suboptimal from a theoretical perspective. Although it is used widely in production Haskell we will discuss its shortcomings and some future models called *effect systems*.

**Extensibility**

When you add a new custom transformer inside of our business logic we’ll typically have to derive a large number of boilerplate instances to compose it inside of existing mtl transformer stack. For example adding `MonadReader`

instance for *n* number of undecidable instances that do nothing but mostly lifts. You can see this massive boilerplate all over the design of the `mtl`

library and its transitive dependencies.

```
instance MonadReader r m => MonadReader r (ExceptT e m) where
ask = lift ask
local = mapExceptT . local
reader = lift . reader
instance MonadReader r m => MonadReader r (IdentityT m) where
ask = lift ask
local = mapIdentityT . local
reader = lift . reader
-- Same for ListT, MaybeT, ...
...
```

This is called the * n^{2} instance problem* or the

*instance boilerplate problem*and remains an open problem of mtl.

**Composing Transformers**

Effects don’t generally commute from a theoretical perspective and as such monad transformer composition is not in general commutative. For example stacking `State`

and `Except`

is not commutative:

```
stateExcept :: StateT s (Except e) a -> s -> Either e (a, s)
stateExcept m s = runExcept (runStateT m s)
exceptState :: ExceptT e (State s) a -> s -> (Either e a, s)
exceptState m s = runState (runExceptT m) s
```

In addition, the standard method of deriving mtl classes for a transformer stack breaks down when using transformer stacks with the same monad at different layers of the stack. For example stacking multiple `State`

transformers is a pattern that shows up quite frequently.

In order to get around this you would have to handwrite the instances for this transformer stack and manually lift anytime you perform a State action. This is a suboptimal design and difficult to route around without massive boilerplate.

While these problems exist, most users of mtl don’t implement new transformers at all and can get by. However in recent years there have been written many other libraries that have explored the design space of alternative effect modeling systems. These systems are still quite early compared to the `mtl`

but some are able to avoid some of the shortcomings of `mtl`

in favour of newer algebraic models of effects. The two most commonly used libraries are:

`polysemy`

`fused-effects`

## Polysemy

Polysemy is a new effect system library based on the free-monad approach to modeling effects. The library uses modern type system features to model effects on top of a `Sem`

monad. The monad will have a members constraint type which constraints a parameter `r`

by a type-level list of effects in the given unit of computation.

For example we seamlessly mix and match error handling, tracing, and stateful updates inside of one computation without the new to create a layered monad. This would look something like the following:

These effects can then be evaluated using an interpreter function which unrolls and potentially evaluates the effects of the `Sem`

free monad. Some of these interpreters for tracing, state and error are similar to the evaluations for monad transformers but evaluate one layer of type-level list of the *effect stack*.

```
runError :: Sem (Error e ': r) a -> Sem r (Either e a)
runState :: s -> Sem (State s ': r) a -> Sem r (s, a)
runTraceList :: Sem (Trace ': r) a -> Sem r ([String], a)
```

The resulting `Sem`

monad with a single field can then be lowered into a single resulting monad such as IO or Either.

```
runFinal :: Monad m => Sem '[Final m] a -> m a
embedToFinal :: (Member (Final m) r, Functor m) => Sem (Embed m ': r) a -> Sem r a
```

The library provides rich set of of effects that can replace many uses of monad transformers.

`Polysemy.Async`

- Asynchronous computations`Polysemy.AtomicState`

- Atomic operations`Polysemy.Error`

- Error handling`Polysemy.Fail`

- Computations that fail`Polysemy.IO`

- Monadic IO`Polysemy.Input`

- Input effects`Polysemy.Output`

- Output effects`Polysemy.NonDet`

- Non-determinism effect`Polysemy.Reader`

- Contextual state a la Reader monad`Polysemy.Resource`

- Resources with finalizers`Polysemy.State`

- Stateful effects`Polysemy.Trace`

- Tracing effect`Polysemy.Writer`

- Accumulation effect a la Writer monad

For example for a simple stateful computation with only a single effect.

```
data Example = Example { x :: Int, y :: Int }
deriving (Show)
-- Stateful update to Example datastructure.
example1 :: Member (State Example) r => Sem r ()
example1 = do
modify $ \s -> s {x = 1}
pure ()
runExample1 :: IO ()
runExample1 = do
(result, _) <-
runFinal
$ embedToFinal @IO
$ runState (Example 0 0) example1
print result
```

And a more complex example which combines multiple effects:

```
import Polysemy
import Polysemy.Error
import Polysemy.State
import Polysemy.Trace
data MyError = MyError
deriving (Show)
-- Stateful update to Example datastructure, with errors and tracing.
example2 :: Members '[Trace, State Example, Error MyError] r => Sem r ()
example2 = do
modify $ \s -> s {x = 1, y = 2}
trace "foo"
throw MyError
pure ()
runExample2 :: IO ()
runExample2 = do
result <-
runFinal
$ embedToFinal @IO
$ errorToIOFinal @MyError
$ runState (Example 0 0)
$ traceToIO example2
print result
```

Polysemy will require the following language extensions to operate:

```
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
```

The use of free-monads is not entirely without cost, and there are experimental GHC plugins which can abstract away some of the overhead from the effect stack. Code thats makes use of polysemy should enable the following GHC flags to enable aggressive typeclass specialisation:

`-flate-specialise`

`-fspecialise-aggressively`

## Fused Effects

Fused-effects is an alternative approach to effect systems based on an algebraic effects model. Unlike polysemy, fused-effects does not use a free monad as an intermediate form. Fused-effects has competative performance compared with mtl and doesn’t require additional GHC plugins or extension compiler fusion rules to optimise away the abstraction overhead.

The `fused-effects`

library exposes a constraint kind called `Has`

which annotates a type signature that contains effectful logic. In this signature `m`

is called the **carrier** for the `sig`

**effect signature** containing the `eff`

effect.

For example the traditional State effect is modeled by the following datatype with three parameters. The `s`

parameter is the state object, the `m`

is the effect parameter. This exposes the same interface as `Control.Monad.State`

except for the `Has`

constraint instead.

```
data State s m k
= Get (s -> m k)
| Put s (m k)
deriving (Functor)
get :: Has (State s) sig m => m s
put :: Has (State s) sig m => s -> m ()
```

The `Carrier`

for the State effect is defined as `StateC`

and the evaluators for the state carrier are defined in the same interface as `mtl`

except they evaluate into a result containing the effect parameter `m`

.

```
newtype StateC s m a = StateC (s -> m (s, a))
deriving (Functor)
runState :: s -> StateC s m a -> m (s, a)
```

The evaluators for the effect lift monadic actions from an effectful computation.

Fused-effects requires the following language extensions to operate.

```
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
```

**Minimal Example**

A minimal example using the `State`

effect to track stateful updates to a single integral value.

The evaluation of this monadic state block results in a `m Integer`

with the Algebra and Effect context. This can then be evaluated into either `Identity`

or `IO`

using `run`

.

```
ex1 :: (Algebra sig m, Effect sig) => m Integer
ex1 = evalState (1 :: Integer) example1
run1 :: Identity Integer
run1 = runM ex1
run2 :: IO Integer
run2 = runM ex1
```

**Composite Effects**

Consider a more complex example which combines exceptions with `Throw`

effect with `State`

. Importantly note that functions `runThrow`

and `evalState`

cannot infer the state type from the signature alone and thus require additional annotations. This differs from `mtl`

which typically has more optimal inference.

```
example2 ::
( Has (State (Double, Double)) sig m,
Has (Throw ArithException) sig m
) =>
m Double
example2 = do
(a, b) <- get
if b == 0
then throwError DivideByZero
else pure (a / b)
ex2 :: (Algebra sig m, Effect sig) => m (Either ArithException Double)
ex2 = runThrow $ evalState (1 :: Double, 2 :: Double) example2
ex3 :: (Algebra sig m, Effect sig) => m (Either ArithException Double)
ex3 = evalState (1 :: Double, 0 :: Double) (runThrow example2)
```

# Language Extensions

## Philosophy

Haskell takes a drastically different approach to language design than most other languages as a result of being the synthesis of input from industrial and academic users. GHC allows the core language itself to be extended with a vast range of opt-in flags which change the semantics of the language on a per-module or per-project basis. While this does add a lot of complexity at first, it also adds a level of power and flexibility for the language to evolve at a pace that is unrivaled in the broader space of programming language design.

## Classes

It’s important to distinguish between different classes of GHC language extensions: *general* and *specialized*.

The inherent problem with classifying extensions into general and specialized categories is that it is a subjective classification. Haskellers who do theorem proving research will have a very different interpretation of Haskell than people who do web programming. Thus, we will use the following classifications:

*Benign*implies both that importing the extension won’t change the semantics of the module if not used and that enabling it makes it no easier to shoot yourself in the foot.*Historical*implies that one shouldn’t use this extension, it is in GHC purely for backwards compatibility. Sometimes these are dangerous to enable.*Steals syntax*means that enabling this extension causes certain code, that is valid in vanilla Haskell, to be no longer be accepted. For example,`f $(a)`

is the same as`f $ (a)`

in Haskell98, but`TemplateHaskell`

will interpret`$(a)`

as a splice.

Benign | Historical | Steals Syntax | Use | Use | GHC Reference | Reference | |

AllowAmbiguousTypes | Specialized | Typelevel Programming | Ref | ||||

Arrows | ✓ | Specialized | Syntax Extension | Ref | Arrows | ||

AutoDeriveTypeable | ✓ | Specialized | Deriving | Ref | |||

BangPatterns | ✓ | ✓ | General | Strictness Annotation | Ref | Strictness Annotations | |

ApplicativeDo | Specialized | FFI | Ref | Applicative Do | |||

CApiFFI | Specialized | FFI | Ref | ||||

ConstrainedClassMethods | ✓ | Specialized | Typelevel Programming | Ref | |||

ConstraintKinds | ✓ | Specialized | Typelevel Programming | Ref | Constraint Kinds | ||

CPP | ✓ | General | Preprocessor | Ref | Cpp | ||

DataKinds | ✓ | Specialized | Typelevel Programming | Ref | Data Kinds | ||

DatatypeContexts | ✓ | Deprecated | Deprecated | Ref | |||

DefaultSignatures | ✓ | Specialized | Generic Programming | Ref | Generic | ||

DeriveAnyClass | ✓ | General | Deriving | Ref | |||

DeriveDataTypeable | ✓ | General | Deriving | Ref | Typeable | ||

DeriveFoldable | ✓ | General | Deriving | Ref | Foldable / Traversable | ||

DeriveFunctor | ✓ | General | Deriving | Ref | |||

DeriveGeneric | ✓ | General | Deriving | Ref | Generic | ||

DeriveLift | ✓ | General | Deriving | Ref | Template Haskell | ||

DeriveTraversable | ✓ | General | Deriving | Ref | |||

DisambiguateRecordFields | ✓ | ✓ | Specialized | Syntax Extension | Ref | ||

DuplicateRecordFields | ✓ | ✓ | Specialized | Syntax Extension | Ref | DuplicateRecordFields | |

DoRec | ✓ | ✓ | Specialized | Syntax Extension | Ref | Recursive Do | |

EmptyCase | ✓ | Specialized | Syntax Extension | Ref | EmptyCase | ||

EmptyDataDecls | ✓ | General | Syntax Extension | Ref | Void | ||

ExistentialQuantification | ✓ | Specialized | Typelevel Programming | Ref | Existential Quantification | ||

ExplicitForAll | Specialized | Typelevel Programming | Ref | Universal Quantification | |||

ExplicitNamespaces | ✓ | ✓ | Specialized | Syntax Disambiguation | Ref | ||

ExtendedDefaultRules | Specialized | Type Disambiguation | Ref | ||||

FlexibleContexts | ✓ | General | Typeclass Extension | Ref | Flexible Contexts | ||

FlexibleInstances | ✓ | General | Typeclass Extension | Ref | Flexible Instances | ||

ForeignFunctionInterface | ✓ | General | FFI | Ref | FFI | ||

FunctionalDependencies | ✓ | General | Typeclass Extension | Ref | Multiparam Typeclasses | ||

GADTs | ✓ | General | Typelevel Programming | Ref | GADTs | ||

GADTSyntax | ✓ | General | Syntax Extension | Ref | GADTs | ||

GeneralizedNewtypeDeriving | ✓ | General | Typeclass Extension | Ref | Newtype Deriving | ||

GHCForeignImportPrim | Specialized | FFI | Ref | Cmm | |||

ImplicitParams | ✓ | Specialized | Typelevel Programming | Ref | |||

ImpredicativeTypes | ✓ | Specialized | Typelevel Programming | Ref | Impredicative Types | ||

IncoherentInstances | Specialized | Typelevel Programming | Ref | Incoherent Instances | |||

InstanceSigs | ✓ | Specialized | Typelevel Programming | Ref | |||

InterruptibleFFI | Specialized | FFI | Ref | FFI | |||

KindSignatures | ✓ | Specialized | Typelevel Programming | Ref | Kind Signatures | ||

LambdaCase | ✓ | General | Syntax Extension | Ref | Lambda Case | ||

LiberalTypeSynonyms | ✓ | Specialized | Typeclass Extension | Ref | |||

MagicHash | ✓ | Specialized | GHC Internals | Ref | Unboxed Types | ||

MonadComprehensions | ✓ | Specialized | Syntax Extension | Ref | |||

MonoLocalBinds | General | Type Disambiguation | Ref | ||||

MonoPatBinds | Specialized | Type Disambiguation | Ref | ||||

MultiParamTypeClasses | ✓ | General | Typeclass Extension | Ref | Multiparam Typeclasses | ||

MultiWayIf | ✓ | Specialized | Syntax Extension | Ref | MultiWawyIf | ||

NamedFieldPuns | ✓ | Specialized | Syntax Extension | Ref | Named Field Puns | ||

NegativeLiterals | General | Type Disambiguation | Ref | ||||

NoImplicitPrelude | Specialized | Import Disambiguation | Ref | Custom Prelude | |||

NoMonomorphismRestriction | General | Type Disambiguation | Ref | Monomorphism Restriction | |||

NPlusKPatterns | ✓ | ✓ | Deprecated | Deprecated | Ref | ||

NullaryTypeClasses | Specialized | Typeclass Extension | Ref | Multiparam Typeclasses | |||

NumDecimals | General | Type Disambiguation | Ref | NumDecimals | |||

OverlappingInstances | Specialized | Typeclass Extension | Ref | Overlapping Instances | |||

OverloadedLabels | ✓ | General | Type Disambiguation | Ref | Overloaded Labels | ||

OverloadedRecordFields | ✓ | General | Syntax Extension | Ref | Overloaded Labels | ||

OverloadedLists | General | Syntax Extension | Ref | Overloaded Lists | |||

OverloadedStrings | General | Syntax Extension | Ref | Overloaded Strings | |||

PackageImports | ✓ | General | Import Disambiguation | Ref | Package Imports | ||

ParallelArrays | Specialized | Data Parallel Haskell | Ref | ||||

ParallelListComp | ✓ | ✓ | General | Syntax Extension | Ref | ||

PartialTypeSignatures | ✓ | General | Interactive Typing | Ref | Partial Type Signatures | ||

PatternGuards | ✓ | ✓ | General | Syntax Extension | Ref | Pattern Guards | |

PatternSynonyms | ✓ | ✓ | General | Syntax Extension | Ref | Pattern Synonyms | |

PolyKinds | Specialized | Typelevel Programming | Ref | Kind Polymorphism | |||

PolymorphicComponents | ✓ | Specialized | Deprecated | Ref | |||

PostfixOperators | ✓ | ✓ | Specialized | Syntax Extension | Ref | ||

QuasiQuotes | ✓ | Specialized | Metaprogramming | Ref | QuasiQuotation | ||

Rank2Types | ✓ | Specialized | Historical Artifact | Ref | Rank N Types | ||

RankNTypes | ✓ | Specialized | Typelevel Programming | Ref | Rank N Types | ||

RebindableSyntax | ✓ | Specialized | Metaprogramming | Ref | Indexed Monads | ||

RecordWildCards | ✓ | ✓ | General | Syntax Extension | Ref | Record Wildcards | |

RecursiveDo | ✓ | Specialized | Syntax Extension | Ref | MonadFix | ||

RelaxedPolyRec | Specialized | Type Disambiguation | Ref | ||||

RoleAnnotations | ✓ | Specialized | Type Disambiguation | Ref | Roles | ||

Safe | Specialized | Security Auditing | Ref | Safe Haskell | |||

SafeImports | Specialized | Security Auditing | Ref | Safe Haskell | |||

ScopedTypeVariables | ✓ | Specialized | Typelevel Programming | Ref | Scoped Type Variables | ||

StandaloneDeriving | ✓ | ✓ | General | Typeclass Extension | Ref | ||

StaticPointers | ✓ | ✓ | General | Distributed Programming | Ref | ||

Strict | General | Strictness Annotations | Ref | Strict Haskell | |||

StrictData | General | Strictness Annotations | Ref | Strict Haskell | |||

TemplateHaskell | ✓ | ✓ | Specialized | Metaprogramming | Ref | Template Haskell | |

TraditionalRecordSyntax | ✓ | ✓ | Specialized | Historical Artifact | Ref | Historical Extensions | |

TransformListComp | ✓ | Specialized | Syntax Extension | Ref | |||

Trustworthy | Specialized | Security Auditing | Ref | Safe Haskell | |||

TupleSections | ✓ | General | Syntax Extension | Ref | Tuple Sections | ||

TypeApplications | ✓ | ✓ | Specialized | Typelevel Programming | Ref | ||

TypeFamilies | ✓ | Specialized | Typelevel Programming | Ref | Type Families | ||

TypeHoles | ✓ | General | Interactive Typing | Ref | Type Holes | ||

TypeInType | Specialized | Typelevel Programming | Ref | ||||

TypeOperators | ✓ | Specialized | Typelevel Programming | Ref | Manual Proofs | ||

TypeSynonymInstances | ✓ | General | Typeclass Extension | Ref | Type Synonym Instances | ||

UnboxedTuples | ✓ | ✓ | Specialized | FFI | Ref | ||

UndecidableInstances | Specialized | Typelevel Programming | Ref | Multiparam Typeclasses | |||

UnicodeSyntax | ✓ | Specialized | Syntax Extension | Ref | |||

UnliftedFFITypes | Specialized | FFI | Ref | Cmm | |||

Unsafe | Specialized | Security Auditing | Ref | Safe Haskell | |||

ViewPatterns | ✓ | ✓ | General | Syntax Extension | Ref | View Patterns |

The golden source of truth for language extensions is the official GHC user’s guide which contains a plethora of information on the details of these extensions.

## Extension Dependencies

Some language extensions will implicitly enable other language extensions for their operation. The table below shows the dependencies between various extensions and which sets are implied.

Extension | Implies |
---|---|

TypeFamilyDependencies | TypeFamilies |

TypeInType | PolyKinds, DataKinds, KindSignatures |

PolyKinds | KindSignatures |

ScopedTypeVariables | ExplicitForAll |

RankNTypes | ExplicitForAll |

ImpredicativeTypes | RankNTypes |

TemplateHaskell | TemplateHaskellQuotes |

Strict | StrictData |

RebindableSyntax | NoImplicitPrelude |

TypeOperators | ExplicitNamespaces |

LiberalTypeSynonyms | ExplicitForAll |

ExistentialQuantification | ExplicitForAll |

GADTs | MonoLocalBinds, GADTSyntax |

DuplicateRecordFields | DisambiguateRecordFields |

RecordWildCards | DisambiguateRecordFields |

DeriveTraversable | DeriveFoldable, DeriveFunctor |

MultiParamTypeClasses | ConstrainedClassMethods |

DerivingVia | DerivingStrategies |

FunctionalDependencies | MultiParamTypeClasses |

FlexibleInstances | TypeSynonymInstances |

TypeFamilies | MonoLocalBinds, KindSignatures, ExplicitNamespaces |

IncoherentInstances | OverlappingInstances |

## The Benign

It’s not obvious which extensions are the most common but it’s fairly safe to say that these extensions are benign and are safely used extensively:

- NoImplicitPrelude
- OverloadedStrings
- LambdaCase
- FlexibleContexts
- FlexibleInstances
- GeneralizedNewtypeDeriving
- TypeSynonymInstances
- MultiParamTypeClasses
- FunctionalDependencies
- NoMonomorphismRestriction
- GADTs
- BangPatterns
- DeriveGeneric
- DeriveAnyClass
- DerivingStrategies
- ScopedTypeVariables

## The Advanced

These extensions are typically used by advanced projects that push the limits of what is possible with Haskell to enforce complex invariants and very type-safe APIs.

- PolyKinds
- DataKinds
- DerivingVia
- GADTs
- RankNTypes
- ExistentialQuantification
- TypeFamilies
- TypeOperators
- TypeApplications
- UndecidableInstances

## The Lowlevel

These extensions are typically used by low-level libraries that are striving for optimal performance or need to integrate with foreign functions and native code. Most of these are used to manipulate base machine types and interface directly with the low-level byte representations of data structures.

- CPP
- BangPatterns
- CApiFFI
- Strict
- StrictData
- RoleAnnotations
- ForeignFunctionInterface
- InterruptibleFFI
- UnliftedFFITypes
- MagicHash
- UnboxedSums
- UnboxedTuples

## The Dangerous

GHC’s typechecker sometimes casually tells us to enable language extensions when it can’t solve certain problems. Unless you know what you’re doing, these extensions almost always indicate a design flaw and shouldn’t be turned on to remedy the error at hand, as much as GHC might suggest otherwise!

- AllowAmbiguousTypes
- DatatypeContexts
- OverlappingInstances
- IncoherentInstances
- ImpredicativeTypes

## NoMonomorphismRestriction

The NoMonomorphismRestriction allows us to disable the monomorphism restriction typing rule GHC uses by default. See monomorphism restriction.

For example, if we load the following module into GHCi

And then we attempt to call the function `bar`

with a Double, we get a type error:

```
λ: bar 1.1
<interactive>:2:5: error:
• No instance for (Fractional Integer)
arising from the literal ‘1.1’
• In the first argument of ‘bar’, namely ‘1.1’
In the expression: bar 1.1
In an equation for ‘it’: it = bar 1.1
```

The problem is that GHC has inferred an overly specific type:

We can prevent GHC from specializing the type with this extension:

Now everything will work as expected:

## ExtendedDefaultRules

In the absence of explicit type signatures, Haskell normally resolves ambiguous literals using several defaulting rules. When an ambiguous literal is typechecked, if at least one of its typeclass constraints is numeric and all of its classes are standard library classes, the module’s default list is consulted, and the first type from the list that will satisfy the context of the type variable is instantiated. For instance, given the following default rules

The following set of heuristics is used to determine what to instantiate the ambiguous type variable to.

- The type variable
`a`

appears in no other constraints - All the classes
`Ci`

are standard. - At least one of the classes
`Ci`

is numerical.

The standard `default`

definition is implicitly defined as `(Integer, Double)`

This is normally fine, but sometimes we’d like more granular control over defaulting. The `-XExtendedDefaultRules`

loosens the restriction that we’re constrained with working on Numerical typeclasses and the constraint that we can only work with standard library classes. For example, if we’d like to have our string literals (using `-XOverloadedStrings`

) automatically default to the more efficient `Text`

implementation instead of `String`

we can twiddle the flag and GHC will perform the right substitution without the need for an explicit annotation on every string literal.

```
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ExtendedDefaultRules #-}
import qualified Data.Text as T
default (T.Text)
example = "foo"
```

For code typed at the GHCi prompt, the `-XExtendedDefaultRules`

flag is always on, and cannot be switched off.

## Safe Haskell

The Safe Haskell language extensions allow us to restrict the use of unsafe language features using `-XSafe`

which restricts the import of modules which are themselves marked as Safe. It also forbids the use of certain language extensions (`-XTemplateHaskell`

) which can be used to produce unsafe code. The primary use case of these extensions is security auditing of codebases for compliance purposes.

```
{-# LANGUAGE Safe #-}
import Unsafe.Coerce
import System.IO.Unsafe
bad1 :: String
bad1 = unsafePerformIO getLine
bad2 :: a
bad2 = unsafeCoerce 3.14 ()
```

See: Safe Haskell

## PartialTypeSignatures

Normally a function is either given a full explicit type signature or none at all. The partial type signature extension allows something in between.

Partial types may be used to avoid writing uninteresting pieces of the signature, which can be convenient in development:

If the `-Wpartial-type-signatures`

GHC option is set, partial types will still trigger warnings.

See:

## RecursiveDo

Recursive do notation allows for the use of self-reference expressions on both sides of a monadic bind. For instance the following example uses lazy evaluation to generate an infinite list. This is sometimes used to instantiate a cyclic datatype inside a monadic context where the datatype needs to hold a reference to itself.

```
{-# LANGUAGE RecursiveDo #-}
justOnes :: Maybe [Int]
justOnes = do
rec xs <- Just (1:xs)
return (map negate xs)
```

## ApplicativeDo

By default GHC desugars do-notation to use implicit invocations of bind and return. With normal monad sugar the following…

… desugars into:

With `ApplicativeDo`

this instead desugars into use of applicative combinators and a laxer Applicative constraint.

Which is equivalent to the traditional notation.

## PatternGuards

Pattern guards are an extension to the pattern matching syntax. Given a `<-`

pattern qualifier, the right hand side is evaluated and matched against the pattern on the left. If the match fails then the whole guard fails and the next equation is tried. If it succeeds, then the appropriate binding takes place, and the next qualifier is matched.

```
{-# LANGUAGE PatternGuards #-}
combine env x y
| Just a <- lookup x env
, Just b <- lookup y env
= Just $ a + b
| otherwise = Nothing
```

## ViewPatterns

View patterns are like pattern guards that can be nested inside of other patterns. They are a convenient way of pattern-matching against values of algebraic data types.

```
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import Safe
lookupDefault :: Eq a => a -> b -> [(a,b)] -> b
lookupDefault k _ (lookup k -> Just s) = s
lookupDefault _ d _ = d
headTup :: (a, [t]) -> [t]
headTup (headMay . snd -> Just n) = [n]
headTup _ = []
headNil :: [a] -> [a]
headNil (headMay -> Just x) = [x]
headNil _ = []
```

## TupleSections

The TupleSections syntax extension allows tuples to be constructed similar to how operator sections. With this extension enabled, tuples of arbitrary size can be “partially” specified with commas and values given for specific positions in the tuple. For example for a 2-tuple:

```
{-# LANGUAGE TupleSections #-}
first :: a -> (a, Bool)
first = (,True)
second :: a -> (Bool, a)
second = (True,)
```

An example for a 7-tuple where three values are specified in the section.

## Postfix Operators

The postfix operators extensions allows user-defined operators that are placed after expressions. For example, using this extension, we could define a postfix factorial function.

```
{-# LANGUAGE PostfixOperators #-}
(!) :: Integer -> Integer
(!) n = product [1..n]
example :: Integer
example = (52!)
```

## MultiWayIf

Multi-way if expands traditional if statements to allow pattern match conditions that are equivalent to a chain of if-then-else statements. This allows us to write “pattern matching predicates” on a value. This alters the syntax of Haskell language.

```
{-# LANGUAGE MultiWayIf #-}
bmiTell :: Float -> Text
bmiTell bmi = if
| bmi <= 18.5 -> "Underweight."
| bmi <= 25.0 -> "Average weight."
| bmi <= 30.0 -> "Overweight."
| otherwise -> "Clinically overweight."
```

## EmptyCase

GHC normally requires at least one pattern branch in a case statement; this restriction can be relaxed with the `EmptyCase`

language extension. The case statement then immediately yields a `Non-exhaustive patterns in case`

if evaluated. For example, the following will compile using this language pragma:

## LambdaCase

For case statements, the language extension `LambdaCase`

allows the elimination of redundant free variables introduced purely for the case of pattern matching on.

Without *LambdaCase*:

With *LambdaCase*:

```
{-# LANGUAGE LambdaCase #-}
data Exp a
= Lam a (Exp a)
| Var a
| App (Exp a) (Exp a)
example :: Exp a -> a
example = \case
Lam a b -> a
Var a -> a
App a b -> example a
```

## NumDecimals

The extension `NumDecimals`

allows the use of exponential notation for integral literals that are not necessarily floats. Without it, any use of exponential notation induces a Fractional class constraint.

## PackageImports

The syntax language extension `PackageImports`

allows us to disambiguate hierarchical package names by their respective package key. This is useful in the case where you have two imported packages that expose the same module. In practice most of the common libraries have taken care to avoid conflicts in the namespace and this is not usually a problem in most modern Haskell.

For example we could explicitly ask GHC to resolve that `Control.Monad.Error`

package be drawn from the `mtl`

library.

```
import qualified "mtl" Control.Monad.Error as Error
import qualified "mtl" Control.Monad.State as State
import qualified "mtl" Control.Monad.Reader as Reader
```

## RecordWildCards

Record wild cards allow us to expand out the names of a record as variables scoped as the labels of the record implicitly. The extension can be used to extract variables names into a scope and/or to assign to variables in a record drawing(**?**), aligning the record’s labels with the variables in scope for the assignment. The syntax introduced is the `{..}`

pattern selector as in the following example:

```
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
data Example = Example
{ e1 :: Int
, e2 :: Text
, e3 :: Text
} deriving (Show)
-- Extracting from a record using wildcards.
scope :: Example -> (Int, Text, Text)
scope Example {..} = (e1, e2, e3)
-- Assign to a record using wildcards.
assign :: Example
assign = Example {..}
where
(e1, e2, e3) = (1, "Kirk", "Picard")
```

## NamedFieldPuns

`NamedFieldPuns`

provides alternative syntax for accessing record fields in a pattern match.

```
data D = D {a :: Int, b :: Int}
f :: D -> Int
f D {a, b} = a - b
-- Order doesn't matter
g :: D -> Int
g D {b, a} = a - b
```

## PatternSynonyms

Suppose we were writing a typechecker, and we needed to parse type signatures. One common solution would to include a `TArr`

to pattern match on type function signatures. Even though, technically it could be written in terms of more basic application of the `(->)`

constructor.

With pattern synonyms we can eliminate the extraneous constructor without losing the convenience of pattern matching on arrow types. We introduce a new pattern using the `pattern`

keyword.

So now we can write a deconstructor and constructor for the arrow type very naturally.

```
{-# LANGUAGE PatternSynonyms #-}
import Data.List (foldl1')
type Name = String
type TVar = String
type TyCon = String
data Type
= TVar TVar
| TCon TyCon
| TApp Type Type
deriving (Show, Eq, Ord)
pattern TArr t1 t2 = TApp (TApp (TCon "(->)") t1) t2
tapp :: TyCon -> [Type] -> Type
tapp tcon args = foldl TApp (TCon tcon) args
arr :: [Type] -> Type
arr ts = foldl1' (\t1 t2 -> tapp "(->)" [t1, t2]) ts
elimTArr :: Type -> [Type]
elimTArr (TArr (TArr t1 t2) t3) = t1 : t2 : elimTArr t3
elimTArr (TArr t1 t2) = t1 : elimTArr t2
elimTArr t = [t]
-- (->) a ((->) b a)
-- a -> b -> a
to :: Type
to = arr [TVar "a", TVar "b", TVar "a"]
from :: [Type]
from = elimTArr to
```

Pattern synonyms can be exported from a module like any other definition by prefixing them with the prefix `pattern`

.

## DeriveFunctor

Many instances of functors over datatypes with parameters and trivial constructors are the result of trivially applying a function over the single constructor’s argument. GHC can derive this boilerplate automatically in deriving clauses if `DeriveFunctor`

is enabled.

```
{-# LANGUAGE DeriveFunctor #-}
data Tree a = Node a [Tree a]
deriving (Show, Functor)
tree :: Tree Int
tree = fmap (+1) (Node 1 [Node 2 [], Node 3 []])
```

## DeriveFoldable

Similar to how Functors can be automatically derived, many instances of Foldable for types of kind `* -> *`

have instances that derive the functions:

`foldMap`

`foldr`

`null`

For instance if we have a custom rose tree and binary tree implementation we can automatically derive the fold functions for these datatypes automatically for us.

```
{-# LANGUAGE DeriveFoldable #-}
data RoseTree a
= RoseTree a [RoseTree a]
deriving (Foldable)
data Tree a
= Leaf a
| Branch (Tree a) (Tree a)
deriving (Foldable)
```

These will generate the following instances:

```
instance Foldable RoseTree where
foldr f z (RoseTree a1 a2)
= f a1 ((\ b3 b4 -> foldr (\ b1 b2 -> foldr f b2 b1) b4 b3) a2 z)
foldMap f (RoseTree a1 a2)
= mappend (f a1) (foldMap (foldMap f) a2)
null (RoseTree _ _) = False
instance Foldable Tree where
foldr f z (Leaf a1) = f a1 z
foldr f z (Branch a1 a2)
= (\ b1 b2 -> foldr f b2 b1) a1 ((\ b3 b4 -> foldr f b4 b3) a2 z)
foldMap f (Leaf a1) = f a1
foldMap f (Branch a1 a2) = mappend (foldMap f a1) (foldMap f a2)
null (Leaf _) = False
null (Branch a1 a2) = (&&) (null a1) (null a2)
```

## DeriveTraversable

Just as with Functor and Foldable, many `Traversable`

instances for single-paramater datatypes of kind `* -> *`

have trivial implementations of the `traverse`

function which can also be derived automatically. By enabling `DeriveTraversable`

we can use stock deriving to derive these instances for us.

```
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE PartialTypeSignatures #-}
data Tree a = Node a [Tree a]
deriving (Show, Functor, Foldable, Traversable)
tree :: Maybe [Int]
tree = foldMap go (Node [1] [Node [2] [], Node [3,4] []])
where
go [] = Nothing
go xs = Just xs
```

## DeriveGeneric

Data types in Haskell can derived by GHC with the DeriveGenerics extension which is able to define the entire structure of the Generic instance and associated type families. See Generics for more details on what these types mean.

For example the simple custom List type deriving Generic:

```
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
data List a
= Cons a (List a)
| Nil deriving
(Generic)
```

Will generate the following `Generic`

instance:

```
instance Generic (List a) where
type
Rep (List a) =
D1
('MetaData "List" "Ghci3" "MyModule" 'False)
( C1
('MetaCons "Cons" 'PrefixI 'False)
( S1
( 'MetaSel
'Nothing
'NoSourceUnpackedness
'NoSourceStrictness
'DecidedLazy
)
(Rec0 a)
:*: S1
( 'MetaSel
'Nothing
'NoSourceUnpackedness
'NoSourceStrictness
'DecidedLazy
)
(Rec0 (List a))
)
:+: C1 ('MetaCons "Nil" 'PrefixI 'False) U1
)
from x = M1
( case x of
Cons g1 g2 -> L1 (M1 ((:*:) (M1 (K1 g1)) (M1 (K1 g2))))
Nil -> R1 (M1 U1)
)
to (M1 x) = case x of
(L1 (M1 ((:*:) (M1 (K1 g1)) (M1 (K1 g2))))) -> Cons g1 g2
(R1 (M1 U1)) -> Nil
```

## DeriveAnyClass

With `-XDeriveAnyClass`

we can derive any class. The deriving logic generates an instance declaration for the type with no explicitly-defined methods or with all instances having a specific default implementation given. These are used extensively with Generics when instances provide empty Minimal Annotations which are all derived from generic logic.

A contrived example of a class with an empty minimal set might be the following:

```
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass #-}
class MinimalClass a where
const1 :: a -> Int
default const1 :: a -> Int
const1 _ = 1
const2 :: a -> Int
default const2 :: a -> Int
const2 _ = 2
data Example = Example
deriving (MinimalClass)
main :: IO ()
main = do
print (const1 Example)
print (const2 Example)
```

## DuplicateRecordFields

GHC 8.0 introduced the `DuplicateRecordFields`

extensions which loosens GHC’s restriction on records in the same module with identical accessors. The precise type that is being projected into is now deferred to the callsite.

```
{-# LANGUAGE DuplicateRecordFields #-}
data Person = Person { id :: Int }
data Animal = Animal { id :: Int }
data Vegetable = Vegetable { id :: Int }
test :: (Person, Animal, Vegetable)
test = (Person {id = 1}, Animal {id = 2}, Vegetable {id = 3})
```

Using just `DuplicateRecordFields`

, projection is still not supported so the following will not work.

## OverloadedLabels

GHC 8.0 also introduced the `OverloadedLabels`

extension which allows a limited form of polymorphism over labels that share the same name.

To work with overloaded label types we also need to enable several language extensions that allow us to use the promoted strings and multiparam typeclasses that underlay its implementation.

```
{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE ExistentialQuantification #-}
import GHC.Records (HasField(..)) -- Since base 4.10.0.0
import GHC.OverloadedLabels (IsLabel(..))
data S = MkS { foo :: Int }
data T x y z = forall b . MkT { foo :: y, bar :: b }
instance HasField x r a => IsLabel x (r -> a) where
fromLabel = getField
main :: IO ()
main = do
print (#foo (MkS 42))
print (#foo (MkT True False))
```

This is used in more advanced libraries like Selda which do object relational mapping between Haskell datatype fields and database columns.

See:

## CPP

The C++ preprocessor is the fallback whenever we really need to separate out logic that has to span multiple versions of GHC and language changes while maintaining backwards compatibility. It can dispatch on the version of GHC being used to compile a module.

```
{-# LANGUAGE CPP #-}
#if (__GLASGOW_HASKELL__ > 710)
-- Imports for GHC 7.10.x
#else
-- Imports for other GHC
#endif
```

It can also demarcate code based on the operating system compiled on.

```
{-# LANGUAGE CPP #-}
#ifdef OS_Linux
-- Linux specific logic
#else
# ifdef OS_Win32
-- Windows specific logic
# else
# ifdef OS_Mac
-- Mac specific logic
# else
-- Other operating systems
# endif
# endif
#endif
```

For another example, it can distinguish the version of the base library used.

One can also use the CPP extension to emit Haskell source at compile-time. This is used in some libraries which have massive boilerplate obligations. Of course, this can be abused quite easily and doing this sort of compile-time string-munging should be a last resort.

## TypeApplications

The type system extension `TypeApplications`

allows you to use explicit annotations for subexpressions. For example if you have a subexpression which has the inferred type `a -> b -> a`

you can name the types of `a`

and `b`

by explicitly stating `@Int @Bool`

to assign `a`

to `Int`

and `b`

to `Bool`

. This is particularly useful when working with typeclasses where type inference cannot deduce the types of all subexpressions from the toplevel signature and results in an overly specific default. This is quite common when working with roundtrips of `read`

and `show`

. For example:

```
{-# LANGUAGE TypeApplications #-}
import Data.Proxy
a :: Proxy Int
a = Proxy @Int
b :: String
b = show (read @Int "42")
```

## DerivingVia

`DerivingVia`

is an extension of `GeneralizedNewtypeDeriving`

. Just as newtype deriving allows us to derive instances in terms of instances for the underlying representation of the newtype, DerivingVia allows deriving instances by specifying a custom type which has a runtime representation equal to the desired behavior we’re deriving the instance for. The derived instance can then be `coerced`

to behave as if it were operating over the given type. This is a powerful new mechanism that allows us to derive many typeclasses in terms of other typeclasses.

```
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE KindSignatures #-}
import Control.Applicative
import Data.Functor.Const (Const (..))
import GHC.Exts (Any)
-- Deriving Eq in terms of Const functor
newtype Age = MkAge Int
deriving
(Eq)
via Const Int Any
-- Deriving Num across a nested functor
newtype FNum f a = FNum (f a)
deriving stock (Functor)
deriving newtype (Applicative)
instance (Applicative f, Num a) => Num (FNum f a) where
(+) = liftA2 (+)
(-) = liftA2 (-)
(*) = liftA2 (*)
abs = fmap abs
signum = fmap signum
fromInteger = FNum . pure . fromInteger
newtype Example a b = Example (Either a b)
deriving stock (Show, Functor)
deriving newtype (Applicative)
deriving (Num) via FNum (Either a) b
a :: Example Integer Integer
a = Example (Left 1)
b :: Example Integer Integer
b = Example (Right 1)
example :: IO ()
example = do
print (a + a)
print (a + b)
print (b + b)
```

## DerivingStrategies

Deriving has proven a powerful mechanism to add typeclass instances and as such there have been a variety of bifurcations in its use. Since GHC 8.2 there are now four different algorithms that can be used to derive typeclass instances. These are enabled by different extensions and now have specific syntax for invoking each algorithm specifically. Turning on `DerivingStrategies`

allows you to disambiguate which algorithm GHC should use for individual class derivations.

`stock`

- Standard GHC builtin deriving (i.e.`Eq`

,`Ord`

,`Show`

)`anyclass`

- Deriving via minimal annotations with DeriveAnyClass.`newtype`

- Deriving with [GeneralizedNewtypeDeriving].`via`

- Deriving with DerivingVia.

These can be stacked and combined on top of a data or newtype declaration.

```
newtype Example = Example Int
deriving stock (Read, Show)
deriving newtype (Num, Floating)
deriving anyclass (ToJSON, FromJSON, ToSQL, FromSQL)
deriving (Eq) via (Const Int Any)
```

## Historical Extensions

Several language extensions have either been absorbed into the core language or become deprecated in favor of others. Others are just considered misfeatures.

`Rank2Types`

- Rank2Types has been subsumed by`RankNTypes`

`XPolymorphicComponents`

- Was an implementation detail of higher-rank polymorphism that no longer exists.`NPlusKPatterns`

- These were largely considered an ugly edge-case of pattern matching language that was best removed.`TraditionalRecordSyntax`

- Traditional record syntax was an extension to the Haskell 98 specification for what we now consider standard record syntax.`OverlappingInstances`

- Subsumed by explicit`OVERLAPPING`

pragmas.`IncoherentInstances`

- Subsumed by explicit`INCOHERENT`

pragmas.`NullaryTypeClasses`

- Subsumed by explicit Multiparameter Typeclasses with no parameters.`TypeInType`

- Is deprecated in favour of the combination of`PolyKinds`

and`DataKinds`

and extensions to the GHC typesystem after GHC 8.0.

# Type Class Extensions

Typeclasses are the bread and butter of abstractions in Haskell, and even out of the box in Haskell 98 they are quite powerful. However classes have grown quite a few extensions, additional syntax and enhancements over the years to augment their utility.

```
-- +-----+------------------ Typeclass Context
-- | | +------ Typeclass Head
-- | | |
-- ^^^^^^^^^^^^^^^ ^^^^^^^^^^^
class (Ctx1 a, Ctx2 b) => MyClass a b where
method1 :: a -> b
-- |
-- +------------------------ Typeclass Method
```

## Standard Hierarchy

In the course of writing Haskell there are seven core instances you will use and derive most frequently. Each of them are lawful classes with several equations associated with their methods.

`Semigroup`

`Monoid`

`Functor`

`Applicative`

`Monad`

`Foldable`

`Traversable`

## Instance Search

Whenever a typeclass method is invoked at a callsite, GHC will perform an instance search over all available instances defined for the given typeclass associated with the method. This instance search is quite similar to backward chaining in logic programming languages. The search is performed during compilation after all types in all modules are known and is performed *globally* across all modules and all packages available to be linked. The instance search can either result in no instances, a single instance or multiple instances which satisfy the conditions of the call site.

## Orphan Instances

Normally typeclass definitions are restricted to be defined in one of two places:

- In the same module as the declaration of the datatype in the instance head.
- In the same module as the class declaration.

These two restrictions restrict the instance search space to a system where a solution (if it exists) can always be found. If we allowed instances to be defined in any modules then we could potentially have multiple class instances defined in multiple modules and the search would be ambiguous.

This restriction can however be disabled with the `-fno-warn-orphans`

flag.

This will allow you to define orphan instances in the current module. But beware this will make the instance search contingent on your import list and may result in clashes in your codebase where the linker will fail because there are multiple modules which define the same instance head.

When used appropriately this can be the way to route around the fact that upstream modules may define datatypes that you use, but they have not defined the instances for other downstream libraries that you also use. You can then write these instances for your codebase without modifying either upstream library.

## Minimal Annotations

In the presence of default implementations for typeclass methods, there may be several ways to implement a typeclass. For instance Eq is entirely defined by either defining when two values are equal or not equal by implying taking the negation of the other. We can define equality in terms of non-equality and vice-versa.

Before 7.6.1 there was no way to specify what was the “minimal” definition required to implement a typeclass

```
class Eq a where
(==), (/=) :: a -> a -> Bool
x == y = not (x /= y)
x /= y = not (x == y)
{-# MINIMAL (==) #-}
{-# MINIMAL (/=) #-}
```

Minimal pragmas are boolean expressions. For instance, with `|`

as logical `OR`

, *either* definition of the above functions must be defined. Comma indicates logical `AND`

where *both* definitions must be defined.

```
{-# MINIMAL (==) | (/=) #-} -- Either (==) or (/=)
{-# MINIMAL (==) , (/=) #-} -- Both (==) and (/=)
```

Compiling the `-Wmissing-methods`

will warn when an instance is defined that does not meet the minimal criterion.

## TypeSynonymInstances

Normally type class definitions are restricted to being defined only over fully expanded types with all type synonym indirections removed. Type synonyms introduce a “naming indirection” that can be included in the instance search to allow you to write synonym instances for multiple synonyms which expand to concrete types.

This is used quite often in modern Haskell.

```
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}
type IntList = [Int]
class MyClass a
-- Without type synonym instances, we're forced to manually expand out type
-- synonyms in the typeclass head.
instance MyClass [Int]
-- With it GHC will do this for us automatically. Type synonyms still need to
-- be fully applied.
instance MyClass IntList
```

## FlexibleInstances

Normally the head of a typeclass instance must contain only a type constructor applied to any number of type variables. There can be no nesting of other constructors or non-type variables in the head. The `FlexibleInstances`

extension loosens this restriction to allow arbitrary nesting and non-type variables to be mentioned in the head definition. This extension also implicitly enables `TypeSynonymInstances`

.

```
{-# LANGUAGE FlexibleInstances #-}
class MyClass a
-- Without flexible instances, all instance heads must be type variable. The
-- following would be legal.
instance MyClass (Maybe a)
-- With flexible instances, typeclass heads can be arbitrary nested types. The
-- following would be forbidden without it.
instance MyClass (Maybe Int)
```

## FlexibleContexts

Just as with instances, contexts normally are also constrained to consist entirely of constraints where a class is applied to just type variables. The `FlexibleContexts`

extension lifts this restriction and allows any type of type variable and nesting to occur the class constraint head. There is however still a global restriction that all class hierarchies must not contain cycles.

```
{-# LANGUAGE FlexibleContexts #-}
class MyClass a
-- Without flexible contexts, all contexts must be type variable. The
-- following would be legal.
instance (MyClass a) => MyClass (Either a b)
-- With flexible contexts, typeclass contexts can be arbitrary nested types. The
-- following would be forbidden without it.
instance (MyClass (Maybe a)) => MyClass (Either a b)
```

## OverlappingInstances

Typeclasses are normally globally coherent, there is only ever one instance that can be resolved for a type unambiguously at any call site in the program. There are however extensions to loosen this restriction and perform more manual direction of the instance search.

Overlapping instances loosens the coherent condition (there can be multiple instances) but introduces a criterion that it will resolve to the most specific one.

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverlappingInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
class MyClass a b where
fn :: (a,b)
instance MyClass Int b where
fn = error "b"
instance MyClass a Int where
fn = error "a"
instance MyClass Int Int where
fn = error "c"
example :: (Int, Int)
example = fn
```

Historically enabling on the module-level was not the best idea, since generally we define multiple classes in a module only a subset of which may be incoherent. As of GHC 7.10 we now have the capacity to just annotate instances with the `OVERLAPPING`

and `INCOHERENT`

inline pragmas.

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
class MyClass a b where
fn :: (a,b)
instance {-# OVERLAPPING #-} MyClass Int b where
fn = error "b"
instance {-# OVERLAPPING #-} MyClass a Int where
fn = error "a"
instance {-# OVERLAPPING #-} MyClass Int Int where
fn = error "c"
example :: (Int, Int)
example = fn
```

## IncoherentInstances

Incoherent instances loosens the restriction that there be only one specific instance, it will be chosen based on a more complex search procedure which tries to identify a *prime instance* based on information incorporated form `OVERLAPPING`

pragmas on instances in the search tree. Unless one is doing very advanced type-level programming use class constraints, this is usually a poor design decision and a sign to rethink the class hierarchy.

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE IncoherentInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
class MyClass a b where
fn :: (a,b)
instance MyClass Int b where
fn = error "a"
instance MyClass a Int where
fn = error "b"
example :: (Int, Int)
example = fn
```

An example with `INCOHERENT`

annotations:

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
class MyClass a b where
fn :: (a,b)
instance {-# INCOHERENT #-} MyClass a Int where
fn = error "general"
instance {-# INCOHERENT #-} MyClass Int Int where
fn = error "specific"
example :: (Int, Int)
example = fn
```

# Laziness

Haskell is a unique language that explores an alternative evaluation model called *lazy evaluation*. Lazy evaluation implies that expressions will be evaluated only when needed. In truth, this evaluation may even be indefinitely deferred. Consider the example in Haskell of defining an infinite list:

The primary advantage of lazy evaluation in the large is that algorithms that operate over both unbounded and bounded data structures can inhabit the same type signatures and be composed without any additional need to restructure their logic or force intermediate computations.

Still, it’s important to recognize that this is another subject on which much ink has been spilled. In fact, there is an ongoing discussion in the land of Haskell about the compromises between lazy and strict evaluation, and there are nuanced arguments for having either paradigm be the default.

Haskell takes a hybrid approach where it allows strict evaluation when needed while it uses laziness by default. Needless to say, we can always find examples where strict evaluation exhibits worse behavior than lazy evaluation and vice versa. These days Haskell can be both as lazy or as strict as you like, giving you options for however you prefer to program.

Languages that attempt to bolt laziness on to a strict evaluation model often bifurcate classes of algorithms into ones that are hand-adjusted to consume unbounded structures and those which operate over bounded structures. In strict languages, mixing and matching between lazy vs. strict processing often necessitates manifesting large intermediate structures in memory when such composition would “just work” in a lazy language.

By virtue of Haskell being the only language to actually explore this point in the design space, knowledge about lazy evaluation is not widely absorbed into the collective programmer consciousness and can often be non-intuitive to the novice. Some time is often needed to fully grok how lazy evaluation works

## Strictness

For a more strict definition of strictnees, consider that there are several evaluation models for the lambda calculus:

**Strict**- Evaluation is said to be strict if all arguments are evaluated before the body of a function.**Non-strict**- Evaluation is non-strict if the arguments are not necessarily evaluated before entering the body of a function.

These ideas give rise to several models, Haskell itself uses the *call-by-need* model.

Model | Strictness | Description |
---|---|---|

Call-by-value | Strict | Arguments evaluated before function entered |

Call-by-name | Non-strict | Arguments passed unevaluated |

Call-by-need | Non-strict | Arguments passed unevaluated but an expression is only evaluated once |

## Seq and WHNF

On the subject of laziness and evaluation, we have names for how fully evaluated an expression is. A term is said to be in *weak head normal-form* if the outermost constructor or lambda expression cannot be reduced further. A term is said to be in *normal form* if it is fully evaluated and all sub-expressions and thunks contained within are evaluated.

```
-- In Normal Form
42
(2, "foo")
\x -> x + 1
-- Not in Normal Form
1 + 2
(\x -> x + 1) 2
"foo" ++ "bar"
(1 + 1, "foo")
-- In Weak Head Normal Form
(1 + 1, "foo")
\x -> 2 + 2
'f' : ("oo" ++ "bar")
-- Not In Weak Head Normal Form
1 + 1
(\x -> x + 1) 2
"foo" ++ "bar"
```

In Haskell, normal evaluation only occurs at the outer constructor of case-statements in Core. If we pattern match on a list, we don’t implicitly force all values in the list. An element in a data structure is only evaluated up to the outermost constructor. For example, to evaluate the length of a list we need only scrutinize the outer Cons constructors without regard for their inner values:

```
λ: length [undefined, 1]
2
λ: head [undefined, 1]
Prelude.undefined
λ: snd (undefined, 1)
1
λ: fst (undefined, 1)
Prelude.undefined
```

For example, in a lazy language the following program terminates even though it contains diverging terms.

In a strict language like OCaml (ignoring its suspensions for the moment), the same program diverges.

## Thunks

In Haskell a *thunk* is created to stand for an unevaluated computation. Evaluation of a thunk is called *forcing* the thunk. The result is an *update*, a referentially transparent effect, which replaces the memory representation of the thunk with the computed value. The fundamental idea is that a thunk is only updated once (although it may be forced simultaneously in a multi-threaded environment) and its resulting value is shared when referenced subsequently.

The GHCi command `:sprint`

can be used to introspect the state of unevaluated thunks inside an expression without forcing evaluation. For instance:

```
λ: let a = [1..] :: [Integer]
λ: let b = map (+ 1) a
λ: :sprint a
a = _
λ: :sprint b
b = _
λ: a !! 4
5
λ: :sprint a
a = 1 : 2 : 3 : 4 : 5 : _
λ: b !! 10
12
λ: :sprint a
a = 1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : 10 : 11 : _
λ: :sprint b
b = _ : _ : _ : _ : _ : _ : _ : _ : _ : _ : 12 : _
```

While a thunk is being computed its memory representation is replaced with a special form known as *blackhole* which indicates that computation is ongoing and allows for a short circuit when a computation might depend on itself to complete.

The `seq`

function introduces an artificial dependence on the evaluation of order of two terms by requiring that the first argument be evaluated to WHNF before the evaluation of the second. The implementation of the `seq`

function is an implementation detail of GHC.

For one example where laziness can bite you, the infamous foldl is well-known to leak space when used carelessly and without several compiler optimizations applied. The strict foldl’ variant uses seq to overcome this.

```
foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' _ z [] = z
foldl' f z (x:xs) = let z' = f z x in z' `seq` foldl' f z' xs
```

In practice, a combination between the strictness analyzer and the inliner on `-O2`

will ensure that the strict variant of `foldl`

is used whenever the function is inlinable at call site so manually using `foldl'`

is most often not required.

Of important note is that GHCi runs without any optimizations applied so the same program that performs poorly in GHCi may not have the same performance characteristics when compiled with GHC.

## BangPatterns

The extension `BangPatterns`

allows an alternative syntax to force arguments to functions to be wrapped in seq. A bang operator on an argument forces its evaluation to weak head normal form before performing the pattern match. This can be used to keep specific arguments evaluated throughout recursion instead of creating a giant chain of thunks.

```
{-# LANGUAGE BangPatterns #-}
sum :: Num a => [a] -> a
sum = go 0
where
go !acc (x:xs) = go (acc + x) xs
go acc [] = acc
```

This is desugared into code effectively equivalent to the following:

```
sum :: Num a => [a] -> a
sum = go 0
where
go acc _ | acc `seq` False = undefined
go acc (x:xs) = go (acc + x) xs
go acc [] = acc
```

Function application to seq’d arguments is common enough that it has a special operator.

## StrictData

As of GHC 8.0 strictness annotations can be applied to all definitions in a module automatically. In previous versions to make definitions strict it was necessary to use explicit syntactic annotations at call sites.

Enabling StrictData makes constructor fields strict by default on any module where the pragma is enabled:

Is equivalent to:

## Strict

Strict implies `-XStrictData`

and extends strictness annotations to all arguments of functions.

Is equivalent to the following function declaration with explicit bang patterns:

On a module-level this effectively makes Haskell a call-by-value language with some caveats. All arguments to functions are now explicitly evaluated and all data in constructors within this module are in head normal form by construction.

## Deepseq

There are often times when for performance reasons we need to deeply evaluate a data structure to normal form leaving no terms unevaluated. The `deepseq`

library performs this task.

The typeclass `NFData`

(Normal Form Data) allows us to seq all elements of a structure across any subtypes which themselves implement NFData.

```
class NFData a where
rnf :: a -> ()
rnf a = a `seq` ()
deepseq :: NFData a => a -> b -> b
($!!) :: (NFData a) => (a -> b) -> a -> b
```

```
instance NFData Int
instance NFData (a -> b)
instance NFData a => NFData (Maybe a) where
rnf Nothing = ()
rnf (Just x) = rnf x
instance NFData a => NFData [a] where
rnf [] = ()
rnf (x:xs) = rnf x `seq` rnf xs
```

To force a data structure itself to be fully evaluated we share the same argument in both positions of deepseq.

## Irrefutable Patterns

A lazy pattern doesn’t require a match on the outer constructor, instead it lazily calls the accessors of the values as needed. In the presence of a bottom, we fail at the usage site instead of the outer pattern match.

```
f :: (a, b) -> Int
f (a,b) = const 1 a
g :: (a, b) -> Int
g ~(a,b) = const 1 a
-- λ: f undefined
-- *** Exception: Prelude.undefined
-- λ: g undefined
-- 1
j :: Maybe t -> t
j ~(Just x) = x
k :: Maybe t -> t
k (Just x) = x
-- λ: j Nothing
-- *** Exception: src/05-laziness/lazy_patterns.hs:15:1-15: Irrefutable pattern failed for pattern (Just x)
--
-- λ: k Nothing
-- *** Exception: src/05-laziness/lazy_patterns.hs:18:1-14: Non-exhaustive patterns in function k
```

## The Debate

Laziness is a controversial design decision in Haskell. It is difficult to write production Haskell code that operates in constant memory without some insight into the evaluation model and the runtime. A lot of industrial codebases have a policy of marking all constructors as strict by default or enabling StrictData to prevent space leaks. If Haskell were being designed from scratch it probably would not choose laziness as the default model. Future implementations of Haskell compilers would not choose this point in the design space if given the option of breaking with the language specification.

There is a lot of fear, uncertainty and doubt spread about lazy evaluation that unfortunately loses the forest for the trees and ignores 30 years of advanced research on the type system. In industrial programming a lot of software is sold on the meme of being of *fast* instead of being *correct*, and lazy evaluation is an intellectually easy talking point about these upside-down priorities. Nevertheless the colloquial perception of laziness being “evil” is a meme that will continue to persist regardless of any underlying reality because software is intrinsically a social process.

# Prelude

## What to Avoid?

Haskell being a 30 year old language has witnessed several revolutions in the way we structure and compose functional programs. Yet as a result several portions of the Prelude still reflect old schools of thought that simply can’t be removed without breaking significant parts of the ecosystem.

Currently it really only exists in folklore which parts to use and which not to use, although this is a topic that almost all introductory books don’t mention and instead make extensive use of the Prelude for simplicity’s sake.

The short version of the advice on the Prelude is:

- Avoid String.
- Use
`fmap`

instead of`map`

. - Use Foldable and Traversable instead of the Control.Monad, and Data.List versions of traversals.
- Avoid partial functions like
`head`

and`read`

or use their total variants. - Avoid exceptions, use ExceptT or Either instead.
- Avoid boolean blind functions.

The instances of Foldable for the list type often conflict with the monomorphic versions in the Prelude which are left in for historical reasons. So oftentimes it is desirable to explicitly mask these functions from implicit import and force the use of Foldable and Traversable instead.

Of course oftentimes one wishes to only use the Prelude explicitly and one can explicitly import it qualified and use the pieces as desired without the implicit import of the whole namespace.

## What Should be in Prelude

To get work done on industrial projects you probably need the following libraries:

`text`

`containers`

`unordered-containers`

`mtl`

`transformers`

`vector`

`filepath`

`directory`

`process`

`bytestring`

`optparse-applicative`

`unix`

`aeson`

## Custom Preludes

The default Prelude can be disabled in its entirety by twiddling the `-XNoImplicitPrelude`

flag which allows us to replace the default import entirely with a custom prelude. Many industrial projects will roll their own `Prologue.hs`

module which replaces the legacy prelude.

For example if we wanted to build up a custom project prelude we could construct a Prologue module and dump the relevant namespaces we want from `base`

into our custom export list. Using the module reexport feature allows us to create an `Exports`

namespace which contains our Prelude’s symbols. Every subsequent module in our project will then have `import Prologue`

as the first import.

```
module Prologue (
module Exports,
) where
import Data.Int as Exports
import Data.Tuple as Exports
import Data.Maybe as Exports
import Data.String as Exports
import Data.Foldable as Exports
import Data.Traversable as Exports
import Control.Monad.Trans.Except
as Exports
(ExceptT(ExceptT), Except, except, runExcept, runExceptT,
mapExcept, mapExceptT, withExcept, withExceptT)
```

## Preludes

There are many approaches to custom preludes. The most widely used ones are all available on Hackage.

Different preludes take different approaches to defining what the Haskell standard library should be. Some are interoperable with existing code and others require an “all-in” approach that creates an ecosystem around it. Some projects are more community efforts and others are developed by consulting companies or industrial users wishing to standardise their commercial code.

In Modern Haskell there are many different perspectives on Prelude design and the degree to which more advanced ideas should be used. Which one is right for you is a matter of personal preference and constraints in your company.

## Protolude

Protolude is a minimalist Prelude which provides many sensible defaults for writing modern Haskell and is compatible with existing code.

Protolude is one of the more conservative preludes and is developed by the author of this document.

See:

## Partial Functions

A *partial function* is a function which doesn’t terminate and yield a value for all given inputs. Conversely a *total function* terminates and is always defined for all inputs. As mentioned previously, certain historical parts of the Prelude are full of partial functions.

The difference between partial and total functions is the compiler can’t reason about the runtime safety of partial functions purely from the information specified in the language and as such the proof of safety is left to the user to guarantee. They are safe to use in the case where the user can guarantee that invalid inputs cannot occur, but like any unchecked property its safety or not-safety is going to depend on the diligence of the programmer. This very much goes against the overall philosophy of Haskell and as such they are discouraged when not necessary.

A list of partial functions in the default prelude:

**Partial for all inputs**

`error`

`undefined`

`fail`

– For`Monad IO`

**Partial for empty lists**

`head`

`init`

`tail`

`last`

`foldl`

`foldr`

`foldl'`

`foldr'`

`foldr1`

`foldl1`

`cycle`

`maximum`

`minimum`

**Partial for Nothing**

`fromJust`

**Partial for invalid strings lists**

`read`

**Partial for infinite lists**

`sum`

`product`

`reverse`

**Partial for negative or unbounded numbers**

`(!)`

`(!!)`

`toEnum`

`genericIndex`

## Replacing Partiality

The Prelude has total variants of the historical partial functions (e.g. `Text.Read.readMaybe`

) in some cases, but often these are found in the various replacement preludes

The total versions provided fall into three cases:

`May`

- return Nothing when the function is not defined for the inputs`Def`

- provide a default value when the function is not defined for the inputs`Note`

- call`error`

with a custom error message when the function is not defined for the inputs. This is not safe, but slightly easier to debug!

```
-- Total
headMay :: [a] -> Maybe a
readMay :: Read a => String -> Maybe a
atMay :: [a] -> Int -> Maybe a
-- Total
headDef :: a -> [a] -> a
readDef :: Read a => a -> String -> a
atDef :: a -> [a] -> Int -> a
-- Partial
headNote :: String -> [a] -> a
readNote :: Read a => String -> String -> a
atNote :: String -> [a] -> Int -> a
```

## Boolean Blindness

Boolean blindness is a common problem found in many programming languages. Consider the following two definitions which deconstruct a Maybe value into a boolean. Is there anything wrong with the definitions and below and why is this not caught in the type system?

```
data Bool = True | False
isNotJust :: Maybe a -> Bool
isNotJust (Just x) = True -- ???
isNotJust Nothing = False
isJust :: Maybe a -> Bool
isJust (Just x) = True
isJust Nothing = False
```

The problem with the `Bool`

type is that there is effectively no difference between True and False at the type level. A proposition taking a value to a Bool takes any information given and destroys it. To reason about the behavior we have to trace the provenance of the proposition we’re getting the boolean answer from, and this introduces a whole slew of possibilities for misinterpretation. In the worst case, the only way to reason about safe and unsafe use of a function is by trusting that a predicate’s lexical name reflects its provenance!

For instance, testing some proposition over a Bool value representing whether the branch can perform the computation safely in the presence of a null is subject to accidental interchange. Consider that in a language like C or Python testing whether a value is null is indistinguishable to the language from testing whether the value is *not null*. Which of these programs encodes safe usage and which segfaults?

```
# This one?
if p(x):
# use x
elif not p(x):
# don't use x
# Or this one?
if p(x):
# don't use x
elif not p(x):
# use x
```

From inspection we can’t tell without knowing how p is defined, the compiler can’t distinguish the two either and thus the language won’t save us if we happen to mix them up. Instead of making invalid states *unrepresentable* we’ve made the invalid state *indistinguishable* from the valid one!

The more desirable practice is to match on terms which explicitly witness the proposition as a type (often in a sum type) and won’t typecheck otherwise.

```
case x of
Just a -> use x
Nothing -> don't use x
-- not ideal
case p x of
True -> use x
False -> don't use x
-- not ideal
if p x
then use x
else don't use x
```

To be fair though, many popular languages completely lack the notion of sum types (the source of many woes in my opinion) and only have product types, so this type of reasoning sometimes has no direct equivalence for those not familiar with ML family languages.

In Haskell, the Prelude provides functions like `isJust`

and `fromJust`

both of which can be used to subvert this kind of reasoning and make it easy to introduce bugs and should often be avoided.

## Foldable / Traversable

If coming from an imperative background retraining oneself to think about iteration over lists in terms of maps, folds, and scans can be challenging.

```
Prelude.foldl :: (a -> b -> a) -> a -> [b] -> a
Prelude.foldr :: (a -> b -> b) -> b -> [a] -> b
-- pseudocode
foldr f z [a...] = f a (f b ( ... (f y z) ... ))
foldl f z [a...] = f ... (f (f z a) b) ... y
```

For a concrete example consider the simple arithmetic sequence over the binary operator `(+)`

:

Foldable and Traversable are the general interface for all traversals and folds of any data structure which is parameterized over its element type ( List, Map, Set, Maybe, …). These two classes are used everywhere in modern Haskell and are extremely important.

A foldable instance allows us to apply functions to data types of monoidal values that collapse the structure using some logic over `mappend`

.

A traversable instance allows us to apply functions to data types that walk the structure left-to-right within an applicative context.

```
class (Functor f, Foldable f) => Traversable f where
traverse :: Applicative g => (a -> g b) -> f a -> g (f b)
class Foldable f where
foldMap :: Monoid m => (a -> m) -> f a -> m
```

The `foldMap`

function is extremely general and non-intuitively many of the monomorphic list folds can themselves be written in terms of this single polymorphic function.

`foldMap`

takes a function of values to a monoidal quantity, a functor over the values and collapses the functor into the monoid. For instance for the trivial Sum monoid:

For instance if we wanted to map a list of some abstract element types into a hashtable of elements based on pattern matching we could use it.

```
import Data.Foldable
import qualified Data.Map as Map
data Elt
= Elt Int Double
| Nil
foo :: [Elt] -> Map.Map Int Double
foo = foldMap go
where
go (Elt x y) = Map.singleton x y
go Nil = Map.empty
```

The full Foldable class (with all default implementations) contains a variety of derived functions which themselves can be written in terms of `foldMap`

and `Endo`

.

```
newtype Endo a = Endo {appEndo :: a -> a}
instance Monoid (Endo a) where
mempty = Endo id
Endo f `mappend` Endo g = Endo (f . g)
```

```
class Foldable t where
fold :: Monoid m => t m -> m
foldMap :: Monoid m => (a -> m) -> t a -> m
foldr :: (a -> b -> b) -> b -> t a -> b
foldr' :: (a -> b -> b) -> b -> t a -> b
foldl :: (b -> a -> b) -> b -> t a -> b
foldl' :: (b -> a -> b) -> b -> t a -> b
foldr1 :: (a -> a -> a) -> t a -> a
foldl1 :: (a -> a -> a) -> t a -> a
```

For example:

Most of the operations over lists can be generalized in terms of combinations of Foldable and Traversable to derive more general functions that work over all data structures implementing Foldable.

```
Data.Foldable.elem :: (Eq a, Foldable t) => a -> t a -> Bool
Data.Foldable.sum :: (Num a, Foldable t) => t a -> a
Data.Foldable.minimum :: (Ord a, Foldable t) => t a -> a
Data.Traversable.mapM :: (Monad m, Traversable t) => (a -> m b) -> t a -> m (t b)
```

Unfortunately for historical reasons the names exported by Foldable quite often conflict with ones defined in the Prelude, either import them qualified or just disable the Prelude. The operations in the Foldable class all specialize to the same and behave the same as the ones in Prelude for List types.

```
import Control.Applicative
import Control.Monad.Identity (runIdentity)
import Data.Foldable
import Data.Monoid
import Data.Traversable
import Prelude hiding (foldr, mapM_)
-- Rose Tree
data Tree a = Node a [Tree a] deriving (Show)
instance Functor Tree where
fmap f (Node x ts) = Node (f x) (fmap (fmap f) ts)
instance Traversable Tree where
traverse f (Node x ts) = Node <$> f x <*> traverse (traverse f) ts
instance Foldable Tree where
foldMap f (Node x ts) = f x `mappend` foldMap (foldMap f) ts
tree :: Tree Integer
tree = Node 1 [Node 1 [], Node 2 [], Node 3 []]
example1 :: IO ()
example1 = mapM_ print tree
example2 :: Integer
example2 = foldr (+) 0 tree
example3 :: Maybe (Tree Integer)
example3 = traverse (\x -> if x > 2 then Just x else Nothing) tree
example4 :: Tree Integer
example4 = runIdentity $ traverse (\x -> pure (x + 1)) tree
```

The instances we defined above can also be automatically derived by GHC using several language extensions. The automatic instances are identical to the hand-written versions above.

```
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
data Tree a = Node a [Tree a]
deriving (Show, Functor, Foldable, Traversable)
```

# Strings

The string situation in Haskell is a sad affair. The default String type is defined as linked list of pointers to characters which is an extremely pathological and inefficient way of representing textual data. Unfortunately for historical reasons large portions of GHC and Base depend on String.

The String problem is intrinsically linked to the fact that the default GHC Prelude provides a set of broken defaults that are difficult to change because GHC and the entire ecosystem historically depend on it. There are however high performance string libraries that can swapped in for the broken `String`

type and we will discuss some ways of working with high-performance and memory efficient replacements.

## String

The default Haskell string type is implemented as a naive linked list of characters, this is hilariously terrible for most purposes but no one knows how to fix it without rewriting large portions of all code that exists, and simply nobody wants to commit the time to fix it. So it remains broken, likely forever.

However, fear not as there are are two replacement libraries for processing textual data: `text`

and `bytestring`

.

`text`

- Used for handling unicode data.`bytestring`

- Used for handling ASCII data that needs to interchange with C code or network protocols.

For each of these there are two variants for both text and bytestring.

**lazy**- Lazy text objects are encoded as lazy lists of strict chunks of bytes.**strict**- Byte vectors are encoded as strict Word8 arrays of bytes or code points

Giving rise to the Cartesian product of the four common string types:

Variant | Module |
---|---|

strict text `Da |
ta.Text` |

lazy text `Da |
ta.Text.Lazy` |

strict bytestring `Da |
ta.ByteString` |

lazy bytestring `Da |
ta.ByteString.Lazy` |

## String Conversions

Conversions between strings types are done with several functions across the bytestring and text libraries. The mapping between text and bytestring is inherently lossy so there is some degree of freedom in choosing the encoding. We’ll just consider utf-8 for simplicity.

Data.Text | Data.Text.Lazy | Data.ByteString | Data.ByteString.Lazy | |
---|---|---|---|---|

Data.Text | id | fromStrict | encodeUtf8 | encodeUtf8 |

Data.Text.Lazy | toStrict | id | encodeUtf8 | encodeUtf8 |

Data.ByteString | decodeUtf8 | decodeUtf8 | id | fromStrict |

Data.ByteString.Lazy | decodeUtf8 | decodeUtf8 | toStrict | id |

Be careful with the functions (`decodeUtf8`

, `decodeUtf16LE`

, etc.) as they are partial and will throw errors if the byte array given does not contain unicode code points. Instead use one of the following functions which will allow you to explicitly handle the error case:

```
decodeUtf8' :: ByteString -> Either UnicodeException Text
decodeUtf8With :: OnDecodeError -> ByteString -> Text
```

## OverloadedStrings

With the `-XOverloadedStrings`

extension string literals can be overloaded without the need for explicit packing and can be written as string literals in the Haskell source and overloaded via the typeclass `IsString`

. Sometimes this is desirable.

For instance:

We can also derive IsString for newtypes using `GeneralizedNewtypeDeriving`

, although much of the safety of the newtype is then lost if it is used interchangeable with other strings.

**Import Conventions**

Since there are so many modules that provide string datatypes, and these modules are used ubiquitously, some conventions are often adopted to import these modules as specific agreed-upon qualified names. In many Haskell projects you will see the following social conventions used for distinguish text types.

For datatypes:

```
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Char8 as C
import qualified Data.ByteString.Lazy.Char8 as CL
```

For IO operations:

For encoding operations:

In addition many libraries and alternative preludes will define the following type synonyms:

## Text

The `Text`

type is a packed blob of Unicode characters.

```
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
-- From pack
myTStr1 :: T.Text
myTStr1 = T.pack ("foo" :: String)
-- From overloaded string literal.
myTStr2 :: T.Text
myTStr2 = "bar"
```

See: Text

## Text.Builder

```
toLazyText :: Builder -> Data.Text.Lazy.Internal.Text
fromLazyText :: Data.Text.Lazy.Internal.Text -> Builder
```

The Text.Builder allows the efficient monoidal construction of lazy Text types without having to go through inefficient forms like String or List types as intermediates.

```
{-# LANGUAGE OverloadedStrings #-}
import Data.Monoid (mconcat, (<>))
import Data.Text.Lazy.Builder (Builder, toLazyText)
import Data.Text.Lazy.Builder.Int (decimal)
import qualified Data.Text.Lazy.IO as L
beer :: Int -> Builder
beer n = decimal n <> " bottles of beer on the wall.\n"
wall :: Builder
wall = mconcat $ fmap beer [1..1000]
main :: IO ()
main = L.putStrLn $ toLazyText wall
```

## ByteString

ByteStrings are arrays of unboxed characters with either strict or lazy evaluation.

```
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
-- From pack
bstr1 :: S.ByteString
bstr1 = S.pack [102, 111, 111] -- ascii encoding of foo as [Word8]
-- From overloaded string literal.
bstr2 :: S.ByteString
bstr2 = "bar"
```

## Printf

Haskell also has a variadic `printf`

function in the style of C.

```
import Data.Text
import Text.Printf
a :: Int
a = 3
b :: Double
b = 3.14159
c :: String
c = "haskell"
example :: String
example = printf "(%i, %f, %s)" a b c
-- "(3, 3.14159, haskell)"
```

## Overloaded Lists

It is ubiquitous for data structure libraries to expose `toList`

and `fromList`

functions to construct various structures out of lists. As of GHC 7.8 we now have the ability to overload the list syntax in the surface language with the typeclass `IsList`

.

```
class IsList l where
type Item l
fromList :: [Item l] -> l
fromListN :: Int -> [Item l] -> l
toList :: l -> [Item l]
instance IsList [a] where
type Item [a] = a
fromList = id
toList = id
```

```
λ: :seti -XOverloadedLists
λ: :type [1,2,3]
[1,2,3] :: (Num (GHC.Exts.Item l), GHC.Exts.IsList l) => l
```

For example we could write an overloaded list instance for hash tables that simply converts to the hash table using `fromList`

. Some math libraries that use vector-like structures will use overloaded lists in this fashion.

```
{-# LANGUAGE OverloadedLists #-}
{-# LANGUAGE TypeFamilies #-}
import qualified Data.Map as Map
import GHC.Exts (IsList (..))
instance (Ord k) => IsList (Map.Map k v) where
type Item (Map.Map k v) = (k, v)
fromList = Map.fromList
toList = Map.toList
example1 :: Map.Map String Int
example1 = [("a", 1), ("b", 2)]
```

## Regex

`regex-tdfa`

implements POSIX extended regular expressions. These can operate over any of the major string types and with OverloadedStrings enabled allows you to write well-typed regex expressions as strings.

```
{-# LANGUAGE OverloadedStrings #-}
import Data.Text
import Text.Regex.TDFA
-- | Verify url address
url :: Text -> Bool
url input = input =~ urlRegex
where
urlRegex :: Text
urlRegex = "https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b([-a-zA-Z0-9()@:%_\\+.~#?&//=]*)"
-- | Verify email address
email :: Text -> Bool
email input = input =~ emailRegex
where
emailRegex :: Text
emailRegex = "[a-zA-Z0-9+._-]+@[a-zA-Z-]+\\.[a-z]+"
```

## Escaping Text

Haskell uses C-style single-character escape codes

Escape | Unicode | Character |
---|---|---|

\n | U+000A | newline |

\0 | U+0000 | null character |

\& | n/a | empty string |

\’ | U+0027 | single quote |

\\ | U+005C | backslash |

\a | U+0007 | alert |

\b | U+0008 | backspace |

\f | U+000C | form feed |

\r | U+000D | carriage return |

\t | U+0009 | horizontal tab |

\v | U+000B | vertical tab |

\" | U+0022 | double quote |

## String Splitting

The split package provides a variety of missing functions for splitting list and string types.

```
import Data.List.Split
example1 :: [String]
example1 = splitOn "." "foo.bar.baz"
-- ["foo","bar","baz"]
example2 :: [String]
example2 = chunksOf 10 "To be or not to be that is the question."
-- ["To be or n","ot to be t","hat is the"," question."]
```

# Applicatives

Like monads Applicatives are an abstract structure for a wide class of computations that sit between functors and monads in terms of generality.

```
pure :: Applicative f => a -> f a
(<$>) :: Functor f => (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
```

As of GHC 7.6, Applicative is defined as:

```
class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
(<$>) :: Functor f => (a -> b) -> f a -> f b
(<$>) = fmap
```

With the following laws:

```
pure id <*> v = v
pure f <*> pure x = pure (f x)
u <*> pure y = pure ($ y) <*> u
u <*> (v <*> w) = pure (.) <*> u <*> v <*> w
```

As an example, consider the instance for Maybe:

```
instance Applicative Maybe where
pure = Just
Nothing <*> _ = Nothing
_ <*> Nothing = Nothing
Just f <*> Just x = Just (f x)
```

As a rule of thumb, whenever we would use `m >>= return . f`

what we probably want is an applicative functor, and not a monad.

```
import Control.Applicative ((<$>), (<*>))
import Network.HTTP
example1 :: Maybe Integer
example1 = (+) <$> m1 <*> m2
where
m1 = Just 3
m2 = Nothing
-- Nothing
example2 :: [(Int, Int, Int)]
example2 = (,,) <$> m1 <*> m2 <*> m3
where
m1 = [1, 2]
m2 = [10, 20]
m3 = [100, 200]
-- [(1,10,100),(1,10,200),(1,20,100),(1,20,200),(2,10,100),(2,10,200),(2,20,100),(2,20,200)]
example3 :: IO String
example3 = (++) <$> fetch1 <*> fetch2
where
fetch1 = simpleHTTP (getRequest "http://www.python.org/") >>= getResponseBody
fetch2 = simpleHTTP (getRequest "http://www.haskell.org/") >>= getResponseBody
```

The pattern `f <$> a <*> b ...`

shows up so frequently that there is a family of functions to lift applicatives of a fixed number arguments. This pattern also shows up frequently with monads (`liftM`

, `liftM2`

, `liftM3`

).

```
liftA :: Applicative f => (a -> b) -> f a -> f b
liftA f a = pure f <*> a
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2 f a b = f <$> a <*> b
liftA3 :: Applicative f => (a -> b -> c -> d) -> f a -> f b -> f c -> f d
liftA3 f a b c = f <$> a <*> b <*> c
```

Applicative also has functions `*>`

and `<*`

that sequence applicative actions while discarding the value of one of the arguments. The operator `*>`

discards the left while `<*`

discards the right. For example in a monadic parser combinator library the `*>`

would parse with first parser argument but return the second.

The Applicative functions `<$>`

and `<*>`

are generalized by `liftM`

and `ap`

for monads.

```
import Control.Monad
import Control.Applicative
data C a b = C a b
mnd :: Monad m => m a -> m b -> m (C a b)
mnd a b = C `liftM` a `ap` b
apl :: Applicative f => f a -> f b -> f (C a b)
apl a b = C <$> a <*> b
```

See: Applicative Programming with Effects

## Alternative

Alternative is an extension of the Applicative class with a zero element and an associative binary operation respecting the zero.

```
class Applicative f => Alternative f where
-- | The identity of '<|>'
empty :: f a
-- | An associative binary operation
(<|>) :: f a -> f a -> f a
-- | One or more.
some :: f a -> f [a]
-- | Zero or more.
many :: f a -> f [a]
optional :: Alternative f => f a -> f (Maybe a)
when :: (Alternative f) => Bool -> f () -> f ()
when p s = if p then s else return ()
guard :: (Alternative f) => Bool -> f ()
guard True = pure ()
guard False = mzero
```

```
instance Alternative Maybe where
empty = Nothing
Nothing <|> r = r
l <|> _ = l
instance Alternative [] where
empty = []
(<|>) = (++)
```

These instances show up very frequently in parsers where the alternative operator can model alternative parse branches.

## Arrows

A category is an algebraic structure that includes a notion of an identity and a composition operation that is associative and preserves identities. In practice arrows are not often used in modern Haskell and are often considered a code smell.

```
(<<<) :: Category cat => cat b c -> cat a b -> cat a c
(<<<) = (.)
(>>>) :: Category cat => cat a b -> cat b c -> cat a c
f >>> g = g . f
```

Arrows are an extension of categories with the notion of products.

```
class Category a => Arrow a where
arr :: (b -> c) -> a b c
first :: a b c -> a (b,d) (c,d)
second :: a b c -> a (d,b) (d,c)
(***) :: a b c -> a b' c' -> a (b,b') (c,c')
(&&&) :: a b c -> a b c' -> a b (c,c')
```

The canonical example is for functions.

```
instance Arrow (->) where
arr f = f
first f = f *** id
second f = id *** f
(***) f g ~(x,y) = (f x, g y)
```

In this form, functions of multiple arguments can be threaded around using the arrow combinators in a much more pointfree form. For instance a histogram function has a nice one-liner.

```
import Data.List (group, sort)
histogram :: Ord a => [a] -> [(a, Int)]
histogram = map (head &&& length) . group . sort
```

**Arrow notation**

GHC has builtin syntax for composing arrows using `proc`

notation. The following are equivalent after desugaring:

```
{-# LANGUAGE Arrows #-}
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
y <- f -< x
z <- g -< x
returnA -< y + z
```

```
addA f g = arr (\ x -> (x, x)) >>>
first f >>> arr (\ (y, x) -> (x, y)) >>>
first g >>> arr (\ (z, y) -> y + z)
```

In practice this notation is not often used and may become deprecated in the future.

See: Arrow Notation

## Bifunctors

Bifunctors are a generalization of functors to include types parameterized by two parameters and include two map functions for each parameter.

```
class Bifunctor p where
bimap :: (a -> b) -> (c -> d) -> p a c -> p b d
first :: (a -> b) -> p a c -> p b c
second :: (b -> c) -> p a b -> p a c
```

The bifunctor laws are a natural generalization of the usual functor laws. Namely they respect identities and composition in the usual way:

The canonical example is for 2-tuples.

```
λ: first (+1) (1,2)
(2,2)
λ: second (+1) (1,2)
(1,3)
λ: bimap (+1) (+1) (1,2)
(2,3)
λ: first (+1) (Left 3)
Left 4
λ: second (+1) (Left 3)
Left 3
λ: second (+1) (Right 3)
Right 4
```

## Polyvariadic Functions

One surprising application of typeclasses is the ability to construct functions which take an arbitrary number of arguments by defining instances over function types. The arguments may be of arbitrary type, but the resulting collected arguments must either be converted into a single type or unpacked into a sum type.

```
{-# LANGUAGE FlexibleInstances #-}
class Arg a where
collect' :: [String] -> a
-- extract to IO
instance Arg (IO ()) where
collect' acc = mapM_ putStrLn acc
-- extract to [String]
instance Arg [String] where
collect' acc = acc
instance (Show a, Arg r) => Arg (a -> r) where
collect' acc = \x -> collect' (acc ++ [show x])
collect :: Arg t => t
collect = collect' []
example1 :: [String]
example1 = collect 'a' 2 3.0
example2 :: IO ()
example2 = collect () "foo" [1,2,3]
```

# Error Handling

There are a plethora of ways of handling errors in Haskell. While Haskell’s runtime supports throwing and handling exceptions, it is important to use the right method in the right context.

## Either Monad

In keeping with the Haskell tradition it is always preferable to use pure logic when possible. In many simple cases error handling can be done quite simply by using the `Monad`

instance of Either. Monadic bind simply threads a `Right`

value through the monad and “short-circuits” evaluation when a `Left`

is introduced. This is simple enough error handling which privileges the `Left`

constructor to hold the error. Many simple functions which can fail can simply use the `Either Error a`

in the result type to encode simple error handling.

The downside to this is that it forces every consumer of the function to pattern match on the result to handle the error case. It also assumes that all `Error`

types can be encoded inside of the sum type holding the possible failures.

```
saveDiv -> Float -> Float -> Either DivError Float
safeDiv x 0 = Left NoDivZero
safeDiv x y = Right (x `div` y)
```

## ExceptT

When using the `transformers`

style effect stacks it is quite common to need to have a layer of the stack which can fail. When using the style of composing effects a monad transformer (which is a wrapper around Either monad) can be added which lifts the error handling into an `ExceptT`

effect layer.

As of mtl 2.2 or higher, the `ErrorT`

class has been replaced by `ExceptT`

at the transformers level.

```
newtype ExceptT e m a = ExceptT (m (Either e a))
runExceptT :: ExceptT e m a -> m (Either e a)
runExceptT (ExceptT m) = m
instance (Monad m) => Monad (ExceptT e m) where
return a = ExceptT $ return (Right a)
m >>= k = ExceptT $ do
a <- runExceptT m
case a of
Left e -> return (Left e)
Right x -> runExceptT (k x)
fail = ExceptT . fail
throwE :: (Monad m) => e -> ExceptT e m a
throwE = ExceptT . return . Left
catchE :: (Monad m) =>
ExceptT e m a -- ^ the inner computation
-> (e -> ExceptT e' m a) -- ^ a handler for exceptions in the inner
-- computation
-> ExceptT e' m a
m `catchE` h = ExceptT $ do
a <- runExceptT m
case a of
Left l -> runExceptT (h l)
Right r -> return (Right r)
```

And also this can be extended to the mtl `MonadError`

instance for which we can write instances for IO and Either themselves:

```
instance MonadTrans (ExceptT e) where
lift = ExceptT . liftM Right
class (Monad m) => MonadError e m | m -> e where
throwError :: e -> m a
catchError :: m a -> (e -> m a) -> m a
instance MonadError IOException IO where
throwError = ioError
catchError = catch
instance MonadError e (Either e) where
throwError = Left
Left l `catchError` h = h l
Right r `catchError` _ = Right r
```

See:

## Control.Exception

GHC has a builtin system for propagating errors up at the runtime level, below the business logic level. These are used internally for all sorts of concurrency and system interfaces. The runtime provides builtin operations `throw`

and `catch`

functions which allow us to throw exceptions in pure code and catch the resulting exception within IO. Note that the return value of `throw`

inhabits all types.

```
throw :: Exception e => e -> a
catch :: Exception e => IO a -> (e -> IO a) -> IO a
try :: Exception e => IO a -> IO (Either e a)
evaluate :: a -> IO a
```

```
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
import Control.Exception
data MyException = MyException
deriving (Show, Typeable)
instance Exception MyException
evil :: [Int]
evil = [throw MyException]
example1 :: Int
example1 = head evil
example2 :: Int
example2 = length evil
main :: IO ()
main = do
a <- try (evaluate example1) :: IO (Either MyException Int)
print a
b <- try (return example2) :: IO (Either MyException Int)
print b
```

Because a value will not be evaluated unless needed, if one desires to know for sure that an exception is either caught or not it can be deeply forced into head normal form before invoking catch. The `strictCatch`

is not provided by the standard library but has a simple implementation in terms of `deepseq`

.

```
strictCatch :: (NFData a, Exception e) => IO a -> (e -> IO a) -> IO a
strictCatch = catch . (toNF =<<)
```

## Exceptions

The problem with the previous approach is having to rely on GHC’s asynchronous exception handling inside of IO to handle basic operations and the bifurcation of APIs which need to expose different APIs for any monad that has failure (`IO`

, `STM`

, `ExceptT`

, etc.).

The `exceptions`

package provides the same API as `Control.Exception`

but loosens the dependency on IO. It instead provides a granular set of typeclasses which can operate over different monads which require a precise subset of error handling methods.

`MonadThrow`

- Monads which expose an interface for throwing exceptions.`MonadCatch`

- Monads which expose an interface for handling exceptions.`MonadMask`

- Monads which expose an interface for masking asynchronous exceptions.

There are three core primitives that are used in handling runtime exceptions:

`finally`

- For handling guaranteed finalisation of code in the presence of exceptions.`onException`

- For handing exception case only if an exception is thrown.`bracket`

- For implementing resource handling with custom acquisition and finalizer logic, in the presence of exceptions.

`finally`

takes an `IO`

action to run as a computation and a secondary function to run after the evaluation of the first.

```
finally :: IO a -- ^ computation to run first
-> IO b -- ^ computation to run afterward (even if an exception was raised)
-> IO a -- returns the value from the first computation
```

`onException`

has a similar signature but the second function is run **only if** an exception is raised.

The `bracket`

function takes two functions, an acquisition function and a finalizer function which “bracket” the evaluation of the third. The finaliser will be run if the computation throwns an exception and unwinds.

```
bracket
:: IO a -- ^ computation to run first
-> (a -> IO b) -- ^ computation to run last
-> (a -> IO c) -- ^ computation to run in-between
-> IO c -- returns the value from the in-between computation
```

A simple example of usage is bracket logic that handles file descriptors which need to be explicitly closed after evaluation is done. The initialiser in this case will return a file descriptor to the body and then run `hClose`

on the file descriptor after the body is done with evaluation.

```
bracket
(openFile "myfile" ReadMode) -- acquisition
(hClose) -- finaliser
(\fileHandle -> ... ) -- body
```

In addition the `exceptions`

library exposes several functions for explicitly handling a variety of exceptions of various forms. Toplevel handlers that need to “catch em’ all” should use `catchAny`

for wildcard error handling.

```
catch :: (MonadCatch m, Exception e) => m a -> (e -> m a) -> m a
catchIO :: MonadCatch m => m a -> (IOException -> m a) -> m a
catchAny :: MonadCatch m => m a -> (SomeException -> m a) -> m a
catchAsync :: (MonadCatch m, Exception e) => m a -> (e -> m a) -> m a
```

A simple example of usage:

```
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
import Control.Monad.Catch
import Control.Monad.Identity
data MyException = MyException
deriving (Show, Typeable)
instance Exception MyException
example :: MonadCatch m => Int -> Int -> m Int
example x y | y == 0 = throwM MyException
| otherwise = return $ x `div` y
pure :: MonadCatch m => m (Either MyException Int)
pure = do
a <- try (example 1 2)
b <- try (example 1 0)
return (a >> b)
```

See: exceptions

## Spoon

Sometimes you’ll be forced to deal with seemingly pure functions that can throw up at any point. There are many functions in the standard library like this, and many more on Hackage. You’d like to handle this logic purely as if it were returning a proper `Maybe a`

but to catch the logic you’d need to install an IO handler inside IO to catch it. Spoon allows us to safely (and “purely”, although it uses a referentially transparent invocation of unsafePerformIO) to catch these exceptions and put them in Maybe where they belong.

The `spoon`

function evaluates its argument to head normal form, while `teaspoon`

evaluates to weak head normal form.

```
import Control.Spoon
goBoom :: Int -> Int -> Int
goBoom x y = x `div` y
-- evaluate to normal form
test1 :: Maybe [Int]
test1 = spoon [1, 2, undefined]
-- evaluate to weak head normal form
test2 :: Maybe [Int]
test2 = teaspoon [1, 2, undefined]
main :: IO ()
main = do
maybe (putStrLn "Nothing") (print . length) test1
maybe (putStrLn "Nothing") (print . length) test2
```

# Advanced Monads

When working with the wider library you will find there a variety of “advanced monads” which are higher-level constructions on top of of the monadic interface which enrich the structure with additional rules or build APIs for combining different types of monads. Some of the most-used cases are mentioned in this section.

## Function Monad

If one writes Haskell long enough one might eventually encounter the curious beast that is the `((->) r)`

monad instance. It generally tends to be non-intuitive to work with, but is quite simple when one considers it as an unwrapped Reader monad.

```
instance Functor ((->) r) where
fmap = (.)
instance Monad ((->) r) where
return = const
f >>= k = \r -> k (f r) r
```

This just uses a prefix form of the arrow type operator.

```
import Control.Monad
id' :: (->) a a
id' = id
const' :: (->) a ((->) b a)
const' = const
-- Monad m => a -> m a
fret :: a -> b -> a
fret = return
-- Monad m => m a -> (a -> m b) -> m b
fbind :: (r -> a) -> (a -> (r -> b)) -> (r -> b)
fbind f k = f >>= k
-- Monad m => m (m a) -> m a
fjoin :: (r -> (r -> a)) -> (r -> a)
fjoin = join
fid :: a -> a
fid = const >>= id
-- Functor f => (a -> b) -> f a -> f b
fcompose :: (a -> b) -> (r -> a) -> (r -> b)
fcompose = (.)
```

```
type Reader r = (->) r -- pseudocode
instance Monad (Reader r) where
return a = \_ -> a
f >>= k = \r -> k (f r) r
ask' :: r -> r
ask' = id
asks' :: (r -> a) -> (r -> a)
asks' f = id . f
runReader' :: (r -> a) -> r -> a
runReader' = id
```

## RWS Monad

The RWS monad combines the functionality of the three monads discussed above, the **R**eader, **W**riter, and **S**tate. There is also a `RWST`

transformer.

```
runReader :: Reader r a -> r -> a
runWriter :: Writer w a -> (a, w)
runState :: State s a -> s -> (a, s)
```

These three eval functions are now combined into the following functions:

```
runRWS :: RWS r w s a -> r -> s -> (a, s, w)
execRWS :: RWS r w s a -> r -> s -> (s, w)
evalRWS :: RWS r w s a -> r -> s -> (a, w)
```

```
import Control.Monad.RWS
type R = Int
type W = [Int]
type S = Int
computation :: RWS R W S ()
computation = do
e <- ask
a <- get
let b = a + e
put b
tell [b]
example = runRWS computation 2 3
```

The usual caveat about Writer laziness also applies to RWS.

## Cont

```
runCont :: Cont r a -> (a -> r) -> r
callCC :: MonadCont m => ((a -> m b) -> m a) -> m a
cont :: ((a -> r) -> r) -> Cont r a
```

In continuation passing style, composite computations are built up from sequences of nested computations which are terminated by a final continuation which yields the result of the full computation by passing a function into the continuation chain.

```
import Control.Monad
import Control.Monad.Cont
add :: Int -> Int -> Cont k Int
add x y = return $ x + y
mult :: Int -> Int -> Cont k Int
mult x y = return $ x * y
contt :: ContT () IO ()
contt = do
k <- do
callCC $ \exit -> do
lift $ putStrLn "Entry"
exit $ \_ -> do
putStrLn "Exit"
lift $ putStrLn "Inside"
lift $ k ()
callcc :: Cont String Integer
callcc = do
a <- return 1
b <- callCC (\k -> k 2)
return $ a+b
ex1 :: IO ()
ex1 = print $ runCont (f >>= g) id
where
f = add 1 2
g = mult 3
-- 9
ex2 :: IO ()
ex2 = print $ runCont callcc show
-- "3"
ex3 :: IO ()
ex3 = runContT contt print
-- Entry
-- Inside
-- Exit
main :: IO ()
main = do
ex1
ex2
ex3
```

```
newtype Cont r a = Cont { runCont :: ((a -> r) -> r) }
instance Monad (Cont r) where
return a = Cont $ \k -> k a
(Cont c) >>= f = Cont $ \k -> c (\a -> runCont (f a) k)
class (Monad m) => MonadCont m where
callCC :: ((a -> m b) -> m a) -> m a
instance MonadCont (Cont r) where
callCC f = Cont $ \k -> runCont (f (\a -> Cont $ \_ -> k a)) k
```

## MonadPlus

Choice and failure.

```
class (Alternative m, Monad m) => MonadPlus m where
mzero :: m a
mplus :: m a -> m a -> m a
instance MonadPlus [] where
mzero = []
mplus = (++)
instance MonadPlus Maybe where
mzero = Nothing
Nothing `mplus` ys = ys
xs `mplus` _ys = xs
```

MonadPlus forms a monoid with

```
asum :: (Foldable t, Alternative f) => t (f a) -> f a
asum = foldr (<|>) empty
msum :: (Foldable t, MonadPlus m) => t (m a) -> m a
msum = asum
```

```
import Safe
import Control.Monad
list1 :: [(Int,Int)]
list1 = [(a,b) | a <- [1..25], b <- [1..25], a < b]
list2 :: [(Int,Int)]
list2 = do
a <- [1..25]
b <- [1..25]
guard (a < b)
return $ (a,b)
maybe1 :: String -> String -> Maybe Double
maybe1 a b = do
a' <- readMay a
b' <- readMay b
guard (b' /= 0.0)
return $ a'/b'
maybe2 :: Maybe Int
maybe2 = msum [Nothing, Nothing, Just 3, Just 4]
```

## MonadFail

Before the great awakening, Monads used to be defined as the following class.

```
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
(>>) :: m a -> m b -> m b
return :: a -> m a
fail :: String -> m a
m >> k = m >>= \_ -> k
fail s = error s
```

This was eventually deemed not to be an great design and in particular the `fail`

function was a misplaced lawless entity that would generate bottoms. It was also necessary to define `fail`

for all monads, even those without a notion of failure. This was considered quite ugly and eventually a breaking change to base (landed in 4.9) was added which split out `MonadFail`

into a separate class where it belonged.

Some of the common instances of MonadFail are shown below:

```
instance MonadFail Maybe where
fail _ = Nothing
instance MonadFail [] where
{-# INLINE fail #-}
fail _ = []
instance MonadFail IO where
fail = failIO
```

## MonadFix

The fixed point of a monadic computation. `mfix f`

executes the action `f`

only once, with the eventual output fed back as the input.

```
class Monad m => MonadFix m where
mfix :: (a -> m a) -> m a
instance MonadFix Maybe where
mfix f = let a = f (unJust a) in a
where unJust (Just x) = x
unJust Nothing = error "mfix Maybe: Nothing"
```

The regular do-notation can also be extended with `-XRecursiveDo`

to accommodate recursive monadic bindings.

```
{-# LANGUAGE RecursiveDo #-}
import Control.Applicative
import Control.Monad.Fix
stream1 :: Maybe [Int]
stream1 = do
rec xs <- Just (1:xs)
return (map negate xs)
stream2 :: Maybe [Int]
stream2 = mfix $ \xs -> do
xs' <- Just (1:xs)
return (map negate xs')
```

## ST Monad

The ST monad models “threads” of stateful computations which can manipulate mutable references but are restricted to only return pure values when evaluated and are statically confined to the ST monad of a `s`

thread.

```
runST :: (forall s. ST s a) -> a
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
```

```
import Control.Monad
import Control.Monad.ST
import Control.Monad.State.Strict
import Data.STRef
example1 :: Int
example1 = runST $ do
x <- newSTRef 0
forM_ [1 .. 1000] $ \j -> do
writeSTRef x j
readSTRef x
example2 :: Int
example2 = runST $ do
count <- newSTRef 0
replicateM_ (10 ^ 6) $ modifySTRef' count (+ 1)
readSTRef count
example3 :: Int
example3 = flip evalState 0 $ do
replicateM_ (10 ^ 6) $ modify' (+ 1)
get
```

Using the ST monad we can create a class of efficient purely functional data structures that use mutable references in a referentially transparent way.

## Free Monads

```
Pure :: a -> Free f a
Free :: f (Free f a) -> Free f a
liftF :: (Functor f, MonadFree f m) => f a -> m a
retract :: Monad f => Free f a -> f a
```

Free monads are monads which instead of having a `join`

operation that combines computations, instead forms composite computations from application of a functor.

One of the best examples is the Partiality monad which models computations which can diverge. Haskell allows unbounded recursion, but for example we can create a free monad from the `Maybe`

functor which can be used to fix the call-depth of, for example the Ackermann function.

```
import Control.Monad.Fix
import Control.Monad.Free
type Partiality a = Free Maybe a
-- Non-termination.
never :: Partiality a
never = fix (Free . Just)
fromMaybe :: Maybe a -> Partiality a
fromMaybe (Just x) = Pure x
fromMaybe Nothing = Free Nothing
runPartiality :: Int -> Partiality a -> Maybe a
runPartiality 0 _ = Nothing
runPartiality _ (Pure a) = Just a
runPartiality _ (Free Nothing) = Nothing
runPartiality n (Free (Just a)) = runPartiality (n-1) a
ack :: Int -> Int -> Partiality Int
ack 0 n = Pure $ n + 1
ack m 0 = Free $ Just $ ack (m-1) 1
ack m n = Free $ Just $ ack m (n-1) >>= ack (m-1)
main :: IO ()
main = do
let diverge = never :: Partiality ()
print $ runPartiality 1000 diverge
print $ runPartiality 1000 (ack 3 4)
print $ runPartiality 5500 (ack 3 4)
```

The other common use for free monads is to build embedded domain-specific languages to describe computations. We can model a subset of the IO monad by building up a pure description of the computation inside of the IOFree monad and then using the free monad to encode the translation to an effectful IO computation.

```
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free
import System.Exit
data Interaction x
= Puts String x
| Gets (Char -> x)
| Exit
deriving (Functor)
type IOFree a = Free Interaction a
puts :: String -> IOFree ()
puts s = liftF $ Puts s ()
get :: IOFree Char
get = liftF $ Gets id
exit :: IOFree r
exit = liftF Exit
gets :: IOFree String
gets = do
c <- get
if c == '\n'
then return ""
else gets >>= \line -> return (c : line)
-- Collapse our IOFree DSL into IO monad actions.
interp :: IOFree a -> IO a
interp (Pure r) = return r
interp (Free x) = case x of
Puts s t -> putStrLn s >> interp t
Gets f -> getChar >>= interp . f
Exit -> exitSuccess
echo :: IOFree ()
echo = do
puts "Enter your name:"
str <- gets
puts str
if length str > 10
then puts "You have a long name."
else puts "You have a short name."
exit
main :: IO ()
main = interp echo
```

An implementation such as the one found in free might look like the following:

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
data Free f a
= Pure a
| Free (f (Free f a))
instance Functor f => Functor (Free f) where
fmap f (Pure a) = Pure (f a)
fmap f x = go x
where
go (Free fa) = Free (go <$> fa)
instance Applicative f => Applicative (Free f) where
pure = Pure
Pure a <*> Pure b = Pure $ a b
Pure a <*> Free mb = Free $ fmap a <$> mb
Free ma <*> Pure b = Free $ fmap ($ b) <$> ma
Free ma <*> Free mb = Free $ fmap (<*>) ma <*> mb
instance Applicative f => Monad (Free f) where
return = Pure
Pure a >>= f = f a
Free f >>= g = Free (fmap (>>= g) f)
class Monad m => MonadFree f m where
wrap :: f (m a) -> m a
instance Applicative f => MonadFree f (Free f) where
wrap = Free
liftF :: (Functor f, MonadFree f m) => f a -> m a
liftF = wrap . fmap return
iter :: Functor f => (f a -> a) -> Free f a -> a
iter _ (Pure a) = a
iter phi (Free m) = phi (iter phi <$> m)
retract :: Monad f => Free f a -> f a
retract (Pure a) = return a
retract (Free as) = as >>= retract
```

## Indexed Monads

Indexed monads are a generalisation of monads that adds an additional type parameter to the class that carries information about the computation or structure of the monadic implementation.

The canonical use-case is a variant of the vanilla State which allows type-changing on the state for intermediate steps inside of the monad. This indeed turns out to be very useful for handling a class of problems involving resource management since the extra index parameter gives us space to statically enforce the sequence of monadic actions by allowing and restricting certain state transitions on the index parameter at compile-time.

To make this more usable we’ll use the somewhat esoteric `-XRebindableSyntax`

allowing us to overload the do-notation and if-then-else syntax by providing alternative definitions local to the module.

```
{-# LANGUAGE RebindableSyntax #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
import Data.IORef
import Data.Char
import Prelude hiding (fmap, (>>=), (>>), return)
import Control.Applicative
newtype IState i o a = IState { runIState :: i -> (a, o) }
evalIState :: IState i o a -> i -> a
evalIState st i = fst $ runIState st i
execIState :: IState i o a -> i -> o
execIState st i = snd $ runIState st i
ifThenElse :: Bool -> a -> a -> a
ifThenElse b i j = case b of
True -> i
False -> j
return :: a -> IState s s a
return a = IState $ \s -> (a, s)
fmap :: (a -> b) -> IState i o a -> IState i o b
fmap f v = IState $ \i -> let (a, o) = runIState v i
in (f a, o)
join :: IState i m (IState m o a) -> IState i o a
join v = IState $ \i -> let (w, m) = runIState v i
in runIState w m
(>>=) :: IState i m a -> (a -> IState m o b) -> IState i o b
v >>= f = IState $ \i -> let (a, m) = runIState v i
in runIState (f a) m
(>>) :: IState i m a -> IState m o b -> IState i o b
v >> w = v >>= \_ -> w
get :: IState s s s
get = IState $ \s -> (s, s)
gets :: (a -> o) -> IState a o a
gets f = IState $ \s -> (s, f s)
put :: o -> IState i o ()
put o = IState $ \_ -> ((), o)
modify :: (i -> o) -> IState i o ()
modify f = IState $ \i -> ((), f i)
data Locked = Locked
data Unlocked = Unlocked
type Stateful a = IState a Unlocked a
acquire :: IState i Locked ()
acquire = put Locked
-- Can only release the lock if it's held, try release the lock
-- that's not held is a now a type error.
release :: IState Locked Unlocked ()
release = put Unlocked
-- Statically forbids improper handling of resources.
lockExample :: Stateful a
lockExample = do
ptr <- get :: IState a a a
acquire :: IState a Locked ()
-- ...
release :: IState Locked Unlocked ()
return ptr
-- Couldn't match type `Locked' with `Unlocked'
-- In a stmt of a 'do' block: return ptr
failure1 :: Stateful a
failure1 = do
ptr <- get
acquire
return ptr -- didn't release
-- Couldn't match type `a' with `Locked'
-- In a stmt of a 'do' block: release
failure2 :: Stateful a
failure2 = do
ptr <- get
release -- didn't acquire
return ptr
-- Evaluate the resulting state, statically ensuring that the
-- lock is released when finished.
evalReleased :: IState i Unlocked a -> i -> a
evalReleased f st = evalIState f st
example :: IO (IORef Integer)
example = evalReleased <$> pure lockExample <*> newIORef 0
```

## Lifted Base

The default prelude predates a lot of the work on monad transformers and as such many of the common functions for handling errors and interacting with IO are bound strictly to the IO monad and not to functions implementing stacks on top of IO or ST. The lifted-base provides generic control operations such as `catch`

can be lifted from IO or any other base monad.

#### monad-base

Monad base provides an abstraction over `liftIO`

and other functions to explicitly lift into a “privileged” layer of the transformer stack. It’s implemented as a multiparameter typeclass with the “base” monad as the parameter b.

```
-- | Lift a computation from the base monad
class (Applicative b, Applicative m, Monad b, Monad m)
=> MonadBase b m | m -> b where
liftBase ∷ b a -> m a
```

#### monad-control

Monad control builds on top of monad-base to extended lifting operation to control operations like `catch`

and `bracket`

can be written generically in terms of any transformer with a base layer supporting these operations. Generic operations can then be expressed in terms of a `MonadBaseControl`

and written in terms of the combinator `control`

which handles the bracket and automatic handler lifting.

For example the function catch provided by `Control.Exception`

is normally locked into IO.

`catch :: Exception e => IO a -> (e -> IO a) -> IO a`

By composing it in terms of control we can construct a generic version which automatically lifts inside of any combination of the usual transformer stacks that has `MonadBaseControl`

instance.

```
catch
:: (MonadBaseControl IO m, Exception e)
=> m a -- ^ Computation
-> (e -> m a) -- ^ Handler
-> m a
catch a handler = control $ \runInIO ->
E.catch (runInIO a)
(\e -> runInIO $ handler e)
```

# Quantification

In logic a predicate is a statement about a subject. For instance the statement: Socrates is a man, can be written as:

`Man(Socrates)`

A predicate assigned to a variable Man(x) has a truth value if the predicate holds for the subject. The domain of a variable is the set of all variables that may be assigned to the variable. A quantifier turns predicates into propositions by assigning values to all variables. For example the statement: All men are mortal. This is an example of a universal quantifier which describe a predicate that holds forall inhabitants of the domain of variables.

`Forall x. If Man(x) then Mortal(x)`

The truth value that that Socrates is mortal can be derived from above relation. Programming with quantifiers in Haskell follows this same kind of logical convention except we will be working with types and constraints on types.

## Universal Quantification

Universal quantification the primary mechanism of encoding polymorphism in Haskell. The essence of universal quantification is that we can express functions which operate the same way for a set of types and whose function behavior is entirely determined *only* by the behavior of all types in this span. These are represented at the type-level by in the introduction of a universal quantifier (`forall`

or `∀`

) over a set of the type variables in the signature.

```
{-# LANGUAGE ExplicitForAll #-}
-- ∀a. [a]
example1 :: forall a. [a]
example1 = []
-- ∀a. [a]
example2 :: forall a. [a]
example2 = [undefined]
-- ∀a. ∀b. (a → b) → [a] → [b]
map' :: forall a. forall b. (a -> b) -> [a] -> [b]
map' f = foldr ((:) . f) []
-- ∀a. [a] → [a]
reverse' :: forall a. [a] -> [a]
reverse' = foldl (flip (:)) []
```

Normally quantifiers are omitted in type signatures since in Haskell’s vanilla surface language it is unambiguous to assume to that free type variables are universally quantified. So the following two are equivalent:

## Free Theorems

A universally quantified type-variable actually implies quite a few rather deep properties about the implementation of a function that can be deduced from its type signature. For instance the identity function in Haskell is guaranteed to only have one implementation since the only information that the information that can present in the body:

These so called *free theorems* are properties that hold for any well-typed inhabitant of a universally quantified signature.

For example a free theorem of `fmap`

is that every implementation of functor *can only ever have the property* that composition of maps of functions is the same as maps of the functions composed together.

## Type Systems

**Hindley-Milner type system**

The Hindley-Milner type system is historically important as one of the first typed lambda calculi that admitted both polymorphism and a variety of inference techniques that could always decide principal types.

```
e : x
| λx:t.e -- value abstraction
| e1 e2 -- application
| let x = e1 in e2 -- let
t : t -> t -- function types
| a -- type variables
σ : ∀ a . t -- type scheme
```

In an type checker implementation, a *generalize* function converts all type variables within the type into polymorphic type variables yielding a type scheme. While a *instantiate* function maps a scheme to a type, but with any polymorphic variables converted into unbound type variables.

## Rank-N Types

System-F is the type system that underlies Haskell. System-F subsumes the HM type system in the sense that every type expressible in HM can be expressed within System-F. System-F is sometimes referred to in texts as the *Girald-Reynolds polymorphic lambda calculus* or *second-order lambda calculus*.

```
t : t -> t -- function types
| a -- type variables
| ∀ a . t -- forall
e : x -- variables
| λ(x:t).e -- value abstraction
| e1 e2 -- value application
| Λa.e -- type abstraction
| e_t -- type application
```

An example with equivalents of GHC Core in comments:

```
id : ∀ t. t -> t
id = Λt. λx:t. x
-- id :: forall t. t -> t
-- id = \ (@ t) (x :: t) -> x
tr : ∀ a. ∀ b. a -> b -> a
tr = Λa. Λb. λx:a. λy:b. x
-- tr :: forall a b. a -> b -> a
-- tr = \ (@ a) (@ b) (x :: a) (y :: b) -> x
fl : ∀ a. ∀ b. a -> b -> b
fl = Λa. Λb. λx:a. λy:b. y
-- fl :: forall a b. a -> b -> b
-- fl = \ (@ a) (@ b) (x :: a) (y :: b) -> y
nil : ∀ a. [a]
nil = Λa. Λb. λz:b. λf:(a -> b -> b). z
-- nil :: forall a. [a]
-- nil = \ (@ a) (@ b) (z :: b) (f :: a -> b -> b) -> z
cons : ∀ a. a -> [a] -> [a]
cons = Λa. λx:a. λxs:(∀ b. b -> (a -> b -> b) -> b).
Λb. λz:b. λf : (a -> b -> b). f x (xs_b z f)
-- cons :: forall a. a -> [a] -> [a]
-- cons = \ (@ a) (x :: a) (xs :: forall b. b -> (a -> b -> b) -> b)
-- (@ b) (z :: b) (f :: a -> b -> b) -> f x (xs @ b z f)
```

Normally when Haskell’s typechecker infers a type signature it places all quantifiers of type variables at the outermost position such that no quantifiers appear within the body of the type expression, called the prenex restriction. This restricts an entire class of type signatures that would otherwise be expressible within System-F, but has the benefit of making inference much easier.

`-XRankNTypes`

loosens the prenex restriction such that we may explicitly place quantifiers within the body of the type. The bad news is that the general problem of inference in this relaxed system is undecidable in general, so we’re required to explicitly annotate functions which use RankNTypes or they are otherwise inferred as rank 1 and may not typecheck at all.

```
{-# LANGUAGE RankNTypes #-}
-- Can't unify ( Bool ~ Char )
rank1 :: forall a. (a -> a) -> (Bool, Char)
rank1 f = (f True, f 'a')
rank2 :: (forall a. a -> a) -> (Bool, Char)
rank2 f = (f True, f 'a')
auto :: (forall a. a -> a) -> (forall b. b -> b)
auto x = x
xauto :: forall a. (forall b. b -> b) -> a -> a
xauto f = f
```

```
Monomorphic Rank 0: t
Polymorphic Rank 1: forall a. a -> t
Polymorphic Rank 2: (forall a. a -> t) -> t
Polymorphic Rank 3: ((forall a. a -> t) -> t) -> t
```

Of important note is that the type variables bound by an explicit quantifier in a higher ranked type may not escape their enclosing scope. The typechecker will explicitly enforce this by enforcing that variables bound inside of rank-n types (called skolem constants) will not unify with free meta type variables inferred by the inference engine.

```
{-# LANGUAGE RankNTypes #-}
escape :: (forall a. a -> a) -> Int
escape f = f 0
g x = escape (\a -> x)
```

In this example in order for the expression to be well typed, `f`

would necessarily have (`Int -> Int`

) which implies that `a ~ Int`

over the whole type, but since `a`

is bound under the quantifier it must not be unified with `Int`

and so the typechecker must fail with a skolem capture error.

```
Couldn't match expected type `a' with actual type `t'
`a' is a rigid type variable bound by a type expected by the context: a -> a
`t' is a rigid type variable bound by the inferred type of g :: t -> Int
In the expression: x In the first argument of `escape', namely `(\ a -> x)'
In the expression: escape (\ a -> x)
```

This can actually be used for our advantage to enforce several types of invariants about scope and use of specific type variables. For example the ST monad uses a second rank type to prevent the capture of references between ST monads with separate state threads where the `s`

type variable is bound within a rank-2 type and cannot escape, statically guaranteeing that the implementation details of the ST internals can’t leak out and thus ensuring its referential transparency.

## Existential Quantification

An existential type is a pair of a type and a term with a special set of packing and unpacking semantics. The type of the value encoded in the existential is known by the producer but not by the consumer of the existential value.

```
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}
-- ∃ t. (t, t → t, t → String)
data Box = forall a. Box
{ value :: a
, update :: a -> a
, print :: a -> String
}
boxa :: Box
boxa = Box 1 negate show
boxb :: Box
boxb = Box "foo" reverse show
apply :: Box -> String
apply (Box x f p) = p (f x)
-- ∃ t. Show t => t
data SBox = forall a. Show a => SBox a
boxes :: [SBox]
boxes = [SBox (), SBox 2, SBox "foo"]
showBox :: SBox -> String
showBox (SBox a) = show a
main :: IO ()
main = mapM_ (putStrLn . showBox) boxes
-- ()
-- 2
-- "foo"
```

The existential over `SBox`

gathers a collection of values defined purely in terms of their Show interface and an opaque pointer, no other information is available about the values and they can’t be accessed or unpacked in any other way.

Passing around existential types allows us to hide information from consumers of data types and restrict the behavior that functions can use. Passing records around with existential variables allows a type to be “bundled” with a fixed set of functions that operate over its hidden internals.

## Impredicative Types

Although extremely brittle, GHC also has limited support for impredicative polymorphism which allows instantiating type variable with a polymorphic type. Implied is that this loosens the restriction that quantifiers must precede arrow types and now they may be placed inside of type-constructors.

```
-- Can't unify ( Int ~ Char )
revUni :: forall a. Maybe ([a] -> [a]) -> Maybe ([Int], [Char])
revUni (Just g) = Just (g [3], g "hello")
revUni Nothing = Nothing
```

```
{-# LANGUAGE ImpredicativeTypes #-}
-- Uses higher-ranked polymorphism.
f :: (forall a. [a] -> a) -> (Int, Char)
f get = (get [1,2], get ['a', 'b', 'c'])
-- Uses impredicative polymorphism.
g :: Maybe (forall a. [a] -> a) -> (Int, Char)
g Nothing = (0, '0')
g (Just get) = (get [1,2], get ['a','b','c'])
```

Use of this extension is very rare, and there is some consideration that `-XImpredicativeTypes`

is fundamentally broken. Although GHC is very liberal about telling us to enable it when one accidentally makes a typo in a type signature!

Some notable trivia, the `($)`

operator is wired into GHC in a very special way as to allow impredicative instantiation of `runST`

to be applied via `($)`

by special-casing the `($)`

operator only when used for the ST monad.

For example if we define a function `apply`

which should behave identically to `($)`

we’ll get an error about polymorphic instantiation even though they are defined identically!

```
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST
f `apply` x = f x
foo :: (forall s. ST s a) -> a
foo st = runST $ st
bar :: (forall s. ST s a) -> a
bar st = runST `apply` st
```

```
Couldn't match expected type `forall s. ST s a'
with actual type `ST s0 a'
In the second argument of `apply', namely `st'
In the expression: runST `apply` st
In an equation for `bar': bar st = runST `apply` st
```

See:

## Scoped Type Variables

Normally the type variables used within the toplevel signature for a function are only scoped to the type-signature and not the body of the function and its rigid signatures over terms and let/where clauses. Enabling `-XScopedTypeVariables`

loosens this restriction allowing the type variables mentioned in the toplevel to be scoped within the value-level body of a function and all signatures contained therein.

```
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE ScopedTypeVariables #-}
poly :: forall a b c. a -> b -> c -> (a, a)
poly x y z = (f x y, f x z)
where
-- second argument is universally quantified from inference
-- f :: forall t0 t1. t0 -> t1 -> t0
f x' _ = x'
mono :: forall a b c. a -> b -> c -> (a, a)
mono x y z = (f x y, f x z)
where
-- b is not implicitly universally quantified because it is in scope
f :: a -> b -> a
f x' _ = x'
example :: IO ()
example = do
x :: [Int] <- readLn
print x
```

# GADTs

*Generalized Algebraic Data types* (GADTs) are an extension to algebraic datatypes that allow us to qualify the constructors to datatypes with type equality constraints, allowing a class of types that are not expressible using vanilla ADTs.

`-XGADTs`

implicitly enables an alternative syntax for datatype declarations ( `-XGADTSyntax`

) such that the following declarations are equivalent:

```
-- Vanilla
data List a
= Empty
| Cons a (List a)
-- GADTSyntax
data List a where
Empty :: List a
Cons :: a -> List a -> List a
```

For an example use consider the data type `Term`

, we have a term in which we `Succ`

which takes a `Term`

parameterized by `a`

which spans all types. Problems arise between the clash whether (`a ~ Bool`

) or (`a ~ Int`

) when trying to write the evaluator.

```
data Term a
= Lit a
| Succ (Term a)
| IsZero (Term a)
-- can't be well-typed :(
eval (Lit i) = i
eval (Succ t) = 1 + eval t
eval (IsZero i) = eval i == 0
```

And we admit the construction of meaningless terms which forces more error handling cases.

Using a GADT we can express the type invariants for our language (i.e. only type-safe expressions are representable). Pattern matching on this GADT then carries type equality constraints without the need for explicit tags.

```
{-# Language GADTs #-}
data Term a where
Lit :: a -> Term a
Succ :: Term Int -> Term Int
IsZero :: Term Int -> Term Bool
If :: Term Bool -> Term a -> Term a -> Term a
eval :: Term a -> a
eval (Lit i) = i -- Term a
eval (Succ t) = 1 + eval t -- Term (a ~ Int)
eval (IsZero i) = eval i == 0 -- Term (a ~ Int)
eval (If b e1 e2) = if eval b then eval e1 else eval e2 -- Term (a ~ Bool)
example :: Int
example = eval (Succ (Succ (Lit 3)))
```

This time around:

Explicit equality constraints (`a ~ b`

) can be added to a function’s context. For example the following expand out to the same types.

```
(Int ~ Int) => ...
(a ~ Int) => ...
(Int ~ a) => ...
(a ~ b) => ...
(Int ~ Bool) => ... -- Will not typecheck.
```

This is effectively the implementation detail of what GHC is doing behind the scenes to implement GADTs ( implicitly passing and threading equality terms around ). If we wanted we could do the same setup that GHC does just using equality constraints and existential quantification. Indeed, the internal representation of GADTs is as regular algebraic datatypes that carry coercion evidence as arguments.

```
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExistentialQuantification #-}
-- Using Constraints
data Exp a
= (a ~ Int) => LitInt a
| (a ~ Bool) => LitBool a
| forall b. (b ~ Bool) => If (Exp b) (Exp a) (Exp a)
-- Using GADTs
-- data Exp a where
-- LitInt :: Int -> Exp Int
-- LitBool :: Bool -> Exp Bool
-- If :: Exp Bool -> Exp a -> Exp a -> Exp a
eval :: Exp a -> a
eval e = case e of
LitInt i -> i
LitBool b -> b
If b tr fl -> if eval b then eval tr else eval fl
```

In the presence of GADTs inference becomes intractable in many cases, often requiring an explicit annotation. For example `f`

can either have `T a -> [a]`

or `T a -> [Int]`

and neither is principal.

## Kind Signatures

Haskell’s kind system (i.e. the “type of the types”) is a system consisting the single kind `*`

and an arrow kind `->`

.

There are in fact some extensions to this system that will be covered later ( see: PolyKinds and Unboxed types in later sections ) but most kinds in everyday code are simply either stars or arrows.

With the KindSignatures extension enabled we can now annotate top level type signatures with their explicit kinds, bypassing the normal kind inference procedures.

On top of default GADT declaration we can also constrain the parameters of the GADT to specific kinds. For basic usage Haskell’s kind inference can deduce this reasonably well, but combined with some other type system extensions that extend the kind system this becomes essential.

```
{-# Language GADTs #-}
{-# LANGUAGE KindSignatures #-}
data Term a :: * where
Lit :: a -> Term a
Succ :: Term Int -> Term Int
IsZero :: Term Int -> Term Bool
If :: Term Bool -> Term a -> Term a -> Term a
data Vec :: * -> * -> * where
Nil :: Vec n a
Cons :: a -> Vec n a -> Vec n a
data Fix :: (* -> *) -> * where
In :: f (Fix f) -> Fix f
```

## Void

The Void type is the type with no inhabitants. It unifies only with itself.

Using a newtype wrapper we can create a type where recursion makes it impossible to construct an inhabitant.

Or using `-XEmptyDataDecls`

we can also construct the uninhabited type equivalently as a data declaration with no constructors.

The only inhabitant of both of these types is a diverging term like (`undefined`

).

## Phantom Types

Phantom types are parameters that appear on the left hand side of a type declaration but which are not constrained by the values of the types inhabitants. They are effectively slots for us to encode additional information at the type-level.

```
import Data.Void
data Foo tag a = Foo a
combine :: Num a => Foo tag a -> Foo tag a -> Foo tag a
combine (Foo a) (Foo b) = Foo (a+b)
-- All identical at the value level, but differ at the type level.
a :: Foo () Int
a = Foo 1
b :: Foo t Int
b = Foo 1
c :: Foo Void Int
c = Foo 1
-- () ~ ()
example1 :: Foo () Int
example1 = combine a a
-- t ~ ()
example2 :: Foo () Int
example2 = combine a b
-- t0 ~ t1
example3 :: Foo t Int
example3 = combine b b
-- Couldn't match type `t' with `Void'
example4 :: Foo t Int
example4 = combine b c
```

Notice the type variable `tag`

does not appear in the right hand side of the declaration. Using this allows us to express invariants at the type-level that need not manifest at the value-level. We’re effectively programming by adding extra information at the type-level.

Consider the case of using newtypes to statically distinguish between plaintext and cryptotext.

```
newtype Plaintext = Plaintext Text
newtype Crytpotext = Cryptotext Text
encrypt :: Key -> Plaintext -> Cryptotext
decrypt :: Key -> Cryptotext -> Plaintext
```

Using phantom types we use an extra parameter.

```
import Data.Text
data Cryptotext
data Plaintext
data Msg a = Msg Text
encrypt :: Msg Plaintext -> Msg Cryptotext
encrypt = undefined
decrypt :: Msg Cryptotext -> Msg Plaintext
decrypt = undefined
```

Using `-XEmptyDataDecls`

can be a powerful combination with phantom types that contain no value inhabitants and are “anonymous types”.

The tagged library defines a similar `Tagged`

newtype wrapper.

## Typelevel Operations

With a richer language for datatypes we can express terms that witness the relationship between terms in the constructors, for example we can now express a term which expresses propositional equality between two types.

The type `Eql a b`

is a proof that types `a`

and `b`

are equal, by pattern matching on the single `Refl`

constructor we introduce the equality constraint into the body of the pattern match.

```
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ExplicitForAll #-}
-- a ≡ b
data Eql a b where
Refl :: Eql a a
-- Congruence
-- (f : A → B) {x y} → x ≡ y → f x ≡ f y
cong :: Eql a b -> Eql (f a) (f b)
cong Refl = Refl
-- Symmetry
-- {a b : A} → a ≡ b → a ≡ b
sym :: Eql a b -> Eql b a
sym Refl = Refl
-- Transitivity
-- {a b c : A} → a ≡ b → b ≡ c → a ≡ c
trans :: Eql a b -> Eql b c -> Eql a c
trans Refl Refl = Refl
-- Coerce one type to another given a proof of their equality.
-- {a b : A} → a ≡ b → a → b
castWith :: Eql a b -> a -> b
castWith Refl = id
-- Trivial cases
a :: forall n. Eql n n
a = Refl
b :: forall. Eql () ()
b = Refl
```

As of GHC 7.8 these constructors and functions are included in the Prelude in the Data.Type.Equality module.

# Interpreters

The lambda calculus forms the theoretical and practical foundation for many languages. At the heart of every calculus is three components:

**Var**- A variable**Lam**- A lambda abstraction**App**- An application

There are many different ways of modeling these constructions and data structure representations, but they all more or less contain these three elements. For example, a lambda calculus that uses String names on lambda binders and variables might be written like the following:

A lambda expression in which all variables that appear in the body of the expression are referenced in an outer lambda binder is said to be *closed* while an expression with unbound free variables is *open*.

## HOAS

Higher Order Abstract Syntax (*HOAS*) is a technique for implementing the lambda calculus in a language where the binders of the lambda expression map directly onto lambda binders of the host language ( i.e. Haskell ) to give us substitution machinery in our custom language by exploiting Haskell’s implementation.

```
{-# LANGUAGE GADTs #-}
data Expr a where
Con :: a -> Expr a
Lam :: (Expr a -> Expr b) -> Expr (a -> b)
App :: Expr (a -> b) -> Expr a -> Expr b
i :: Expr (a -> a)
i = Lam (\x -> x)
k :: Expr (a -> b -> a)
k = Lam (\x -> Lam (\y -> x))
s :: Expr ((a -> b -> c) -> (a -> b) -> (a -> c))
s = Lam (\x -> Lam (\y -> Lam (\z -> App (App x z) (App y z))))
eval :: Expr a -> a
eval (Con v) = v
eval (Lam f) = \x -> eval (f (Con x))
eval (App e1 e2) = (eval e1) (eval e2)
skk :: Expr (a -> a)
skk = App (App s k) k
example :: Integer
example = eval skk 1
-- 1
```

Pretty printing HOAS terms can also be quite complicated since the body of the function is under a Haskell lambda binder.

## PHOAS

A slightly different form of HOAS called PHOAS uses lambda datatype parameterized over the binder type. In this form evaluation requires unpacking into a separate Value type to wrap the lambda expression.

```
{-# LANGUAGE RankNTypes #-}
data ExprP a
= VarP a
| AppP (ExprP a) (ExprP a)
| LamP (a -> ExprP a)
| LitP Integer
data Value
= VLit Integer
| VFun (Value -> Value)
fromVFun :: Value -> (Value -> Value)
fromVFun val = case val of
VFun f -> f
_ -> error "not a function"
fromVLit :: Value -> Integer
fromVLit val = case val of
VLit n -> n
_ -> error "not a integer"
newtype Expr = Expr { unExpr :: forall a . ExprP a }
eval :: Expr -> Value
eval e = ev (unExpr e) where
ev (LamP f) = VFun(ev . f)
ev (VarP v) = v
ev (AppP e1 e2) = fromVFun (ev e1) (ev e2)
ev (LitP n) = VLit n
i :: ExprP a
i = LamP (\a -> VarP a)
k :: ExprP a
k = LamP (\x -> LamP (\y -> VarP x))
s :: ExprP a
s = LamP (\x -> LamP (\y -> LamP (\z -> AppP (AppP (VarP x) (VarP z)) (AppP (VarP y) (VarP z)))))
skk :: ExprP a
skk = AppP (AppP s k) k
example :: Integer
example = fromVLit $ eval $ Expr (AppP skk (LitP 3))
```

See:

## Final Interpreters

Using typeclasses we can implement a *final interpreter* which models a set of extensible terms using functions bound to typeclasses rather than data constructors. Instances of the typeclass form interpreters over these terms.

For example we can write a small language that includes basic arithmetic, and then retroactively extend our expression language with a multiplication operator without changing the base. At the same time our interpreter logic remains invariant under extension with new expressions.

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE NoMonomorphismRestriction #-}
class Expr repr where
lit :: Int -> repr
neg :: repr -> repr
add :: repr -> repr -> repr
mul :: repr -> repr -> repr
instance Expr Int where
lit n = n
neg a = -a
add a b = a + b
mul a b = a * b
instance Expr String where
lit n = show n
neg a = "(-" ++ a ++ ")"
add a b = "(" ++ a ++ " + " ++ b ++ ")"
mul a b = "(" ++ a ++ " * " ++ b ++ ")"
class BoolExpr repr where
eq :: repr -> repr -> repr
tr :: repr
fl :: repr
instance BoolExpr Int where
eq a b = if a == b then tr else fl
tr = 1
fl = 0
instance BoolExpr String where
eq a b = "(" ++ a ++ " == " ++ b ++ ")"
tr = "true"
fl = "false"
eval :: Int -> Int
eval = id
render :: String -> String
render = id
expr :: (BoolExpr repr, Expr repr) => repr
expr = eq (add (lit 1) (lit 2)) (lit 3)
result :: Int
result = eval expr
-- 1
string :: String
string = render expr
-- "((1 + 2) == 3)"
```

## Finally Tagless

Writing an evaluator for the lambda calculus can likewise also be modeled with a final interpreter and a Identity functor.

```
import Prelude hiding (id)
class Expr rep where
lam :: (rep a -> rep b) -> rep (a -> b)
app :: rep (a -> b) -> (rep a -> rep b)
lit :: a -> rep a
newtype Interpret a = R { reify :: a }
instance Expr Interpret where
lam f = R $ reify . f . R
app f a = R $ reify f $ reify a
lit = R
eval :: Interpret a -> a
eval e = reify e
e1 :: Expr rep => rep Int
e1 = app (lam (\x -> x)) (lit 3)
e2 :: Expr rep => rep Int
e2 = app (lam (\x -> lit 4)) (lam $ \x -> lam $ \y -> y)
example1 :: Int
example1 = eval e1
-- 3
example2 :: Int
example2 = eval e2
-- 4
```

See: Typed Tagless Interpretations and Typed Compilation

## Datatypes

The usual hand-wavy way of describing algebraic datatypes is to indicate the how natural correspondence between sum types, product types, and polynomial expressions arises.

```
data Void -- 0
data Unit = Unit -- 1
data Sum a b = Inl a | Inr b -- a + b
data Prod a b = Prod a b -- a * b
type (->) a b = a -> b -- b ^ a
```

Intuitively it follows the notion that the cardinality of set of inhabitants of a type can always be given as a function of the number of its holes. A product type admits a number of inhabitants as a function of the product (i.e. cardinality of the Cartesian product), a sum type as the sum of its holes and a function type as the exponential of the span of the domain and codomain.

Recursive types correspond to infinite series of these terms.

```
-- pseudocode
-- μX. 1 + X
data Nat a = Z | S Nat
Nat a = μ a. 1 + a
= 1 + (1 + (1 + ...))
-- μX. 1 + A * X
data List a = Nil | Cons a (List a)
List a = μ a. 1 + a * (List a)
= 1 + a + a^2 + a^3 + a^4 ...
-- μX. A + A*X*X
data Tree a f = Leaf a | Tree a f f
Tree a = μ a. 1 + a * (List a)
= 1 + a^2 + a^4 + a^6 + a^8 ...
```

## F-Algebras

The *initial algebra* approach differs from the final interpreter approach in that we now represent our terms as algebraic datatypes and the interpreter implements recursion and evaluation occurs through pattern matching.

```
type Algebra f a = f a -> a
type Coalgebra f a = a -> f a
newtype Fix f = Fix { unFix :: f (Fix f) }
cata :: Functor f => Algebra f a -> Fix f -> a
ana :: Functor f => Coalgebra f a -> a -> Fix f
hylo :: Functor f => Algebra f b -> Coalgebra f a -> a -> b
```

In Haskell a F-algebra is a functor `f a`

together with a function `f a -> a`

. A coalgebra reverses the function. For a functor `f`

we can form its recursive unrolling using the recursive `Fix`

newtype wrapper.

```
Fix f = f (f (f (f (f (f ( ... ))))))
newtype T b a = T (a -> b)
Fix (T a)
Fix T -> a
(Fix T -> a) -> a
(Fix T -> a) -> a -> a
...
```

In this form we can write down a generalized fold/unfold function that are datatype generic and written purely in terms of the recursing under the functor.

```
cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix
ana :: Functor f => Coalgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg
```

We call these functions *catamorphisms* and *anamorphisms*. Notice especially that the types of these two functions simply reverse the direction of arrows. Interpreted in another way they transform an algebra/coalgebra which defines a flat structure-preserving mapping between `Fix f`

`f`

into a function which either rolls or unrolls the fixpoint. What is particularly nice about this approach is that the recursion is abstracted away inside the functor definition and we are free to just implement the flat transformation logic!

For example a construction of the natural numbers in this form:

```
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
type Algebra f a = f a -> a
type Coalgebra f a = a -> f a
newtype Fix f = Fix {unFix :: f (Fix f)}
-- catamorphism
cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix
-- anamorphism
ana :: Functor f => Coalgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg
-- hylomorphism
hylo :: Functor f => Algebra f b -> Coalgebra f a -> a -> b
hylo f g = cata f . ana g
type Nat = Fix NatF
data NatF a = S a | Z deriving (Eq, Show)
instance Functor NatF where
fmap f Z = Z
fmap f (S x) = S (f x)
plus :: Nat -> Nat -> Nat
plus n = cata phi
where
phi Z = n
phi (S m) = s m
times :: Nat -> Nat -> Nat
times n = cata phi
where
phi Z = z
phi (S m) = plus n m
int :: Nat -> Int
int = cata phi
where
phi Z = 0
phi (S f) = 1 + f
nat :: Integer -> Nat
nat = ana (psi Z S)
where
psi f _ 0 = f
psi _ f n = f (n -1)
z :: Nat
z = Fix Z
s :: Nat -> Nat
s = Fix . S
type Str = Fix StrF
data StrF x = Cons Char x | Nil
instance Functor StrF where
fmap f (Cons a as) = Cons a (f as)
fmap f Nil = Nil
nil :: Str
nil = Fix Nil
cons :: Char -> Str -> Str
cons x xs = Fix (Cons x xs)
str :: Str -> String
str = cata phi
where
phi Nil = []
phi (Cons x xs) = x : xs
str' :: String -> Str
str' = ana (psi Nil Cons)
where
psi f _ [] = f
psi _ f (a : as) = f a as
map' :: (Char -> Char) -> Str -> Str
map' f = hylo g unFix
where
g Nil = Fix Nil
g (Cons a x) = Fix $ Cons (f a) x
type Tree a = Fix (TreeF a)
data TreeF a f = Leaf a | Tree a f f deriving (Show)
instance Functor (TreeF a) where
fmap f (Leaf a) = Leaf a
fmap f (Tree a b c) = Tree a (f b) (f c)
depth :: Tree a -> Int
depth = cata phi
where
phi (Leaf _) = 0
phi (Tree _ l r) = 1 + max l r
example1 :: Int
example1 = int (plus (nat 125) (nat 25))
-- 150
```

Or for example an interpreter for a small expression language that depends on a scoping dictionary.

```
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
import Control.Applicative
import qualified Data.Map as M
type Algebra f a = f a -> a
type Coalgebra f a = a -> f a
newtype Fix f = Fix { unFix :: f (Fix f) }
cata :: Functor f => Algebra f a -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix
ana :: Functor f => Coalgebra f a -> a -> Fix f
ana coalg = Fix . fmap (ana coalg) . coalg
hylo :: Functor f => Algebra f b -> Coalgebra f a -> a -> b
hylo f g = cata f . ana g
type Id = String
type Env = M.Map Id Int
type Expr = Fix ExprF
data ExprF a
= Lit Int
| Var Id
| Add a a
| Mul a a
deriving (Show, Eq, Ord, Functor)
deriving instance Eq (f (Fix f)) => Eq (Fix f)
deriving instance Ord (f (Fix f)) => Ord (Fix f)
deriving instance Show (f (Fix f)) => Show (Fix f)
eval :: M.Map Id Int -> Fix ExprF -> Maybe Int
eval env = cata phi where
phi ex = case ex of
Lit c -> pure c
Var i -> M.lookup i env
Add x y -> liftA2 (+) x y
Mul x y -> liftA2 (*) x y
expr :: Expr
expr = Fix (Mul n (Fix (Add x y)))
where
n = Fix (Lit 10)
x = Fix (Var "x")
y = Fix (Var "y")
env :: M.Map Id Int
env = M.fromList [("x", 1), ("y", 2)]
compose :: (f (Fix f) -> c) -> (a -> Fix f) -> a -> c
compose x y = x . unFix . y
example :: Maybe Int
example = eval env expr
-- Just 30
```

What is especially elegant about this approach is how naturally catamorphisms compose into efficient composite transformations.

## Recursion Schemes & The Morphism Zoo

Recursion schemes are a generally way of classifying a families of traversal algorithms that modify data structures recursively. Recursion schemes give rise to a rich set of algebraic structures which can be composed to devise all sorts of elaborate term rewrite systems. Most applications of recursion schemes occur in the context of graph rewriting or abstract syntax tree manipulation.

Several basic recursion schemes form the foundation of these rules. Grossly, a *anamorphism* is an unfolding of a data structure into a list of terms, while a *catamorphism* is a is the refolding of a data structure from a list of terms.

Name | Type Signature |
---|---|

Catamorphism | `cata :: (a -> b -> b) -> b -> [a] -> b` |

Anamorphism | `ana :: (b -> Maybe (a, b)) -> b -> [a]` |

Paramorphism | `para :: (a -> ([a], b) -> b) -> b -> [a] -> b` |

Apomorphism | `apo :: (b -> (a, Either [a] b)) -> b -> [a]` |

Hylomorphism | `hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b` |

For a `Fix`

point type over a type with a Functor instance for the parameter `f`

we can write down the recursion schemes as the following definitions:

```
-- | A fix-point type.
newtype Fix f = Fix { unFix :: f (Fix f) }
-- | Catamorphism or generic function fold.
cata :: Functor f => (f a -> a) -> (Fix f -> a)
cata f = f . fmap (cata f) . unFix
-- | Anamorphism or generic function unfold.
ana :: Functor f => (a -> f a) -> (a -> Fix f)
ana f = Fix . fmap (ana f) . f
-- | Hylomorphism
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo f g = h where h = f . fmap h . g
-- Paramorphism
para :: Functor f => (f (Fix f, t) -> t) -> Fix f -> t
para f (Fix x) = psi (fmap l x) where
l x = (x, para f x)
```

One can also construct monadic versions of these functions which have a result type inside of a monad. Instead of using function composition we use Kleisi composition.

```
-- Monadic catamorphism
cataM :: (Traversable f, Monad m) => (f a -> m a) -> Fix f -> m a
cataM f = f <=< traverse (cataM f) . unfix
```

The library `recursion-schemes`

implements these basic recursion schemes as well as whole family of higher-order combinators off the shelf. These are implemented in terms of two typeclases `Recursive`

and `Corecursive`

which extend an instance of Functor with default methods for catamorphisms and anamorphisms. For the `Fix`

type above these functions expand into the following definitions:

```
class Functor t => Recursive t where
project :: t -> t t
cata :: (t a -> a) -> t -> a
cata f = c where c = f . fmap c . project
class Functor t => Corecursive t where
embed :: t -> t t
ana :: (a -> t a) -> a -> t
ana g = a where a = embed . fmap a . g
-- Additional ListF helper
data ListF a b = Nil | Cons a b
```

The canonical example of a catamorphism is the factorial function which is a composition of a coalgebra which creates a list from `n`

to `1`

and an algebra which multiplies the resulting list to a single result:

```
import Data.Functor.Foldable
factorial :: Int -> Int
factorial = hylo alg coalg
where
coalg :: Int -> ListF Int Int
coalg m
| m <= 1 = Nil
| otherwise = Cons m (m - 1)
alg :: ListF Int Int -> Int
alg Nil = 1
alg (Cons a x) = a * x
```

Another example is unfolding of lambda calculus to perform a substitution over a variable. We can define a catamoprhism for traversing over the AST.

```
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeSynonymInstances #-}
import Control.Monad hiding (forM_, mapM, sequence)
import qualified Data.Map as M
import Data.Traversable
import Prelude hiding (mapM)
newtype Fix (f :: * -> *) = Fix {outF :: f (Fix f)}
-- Catamorphism
cata :: Functor f => (f a -> a) -> Fix f -> a
cata f = f . fmap (cata f) . outF
-- Monadic catamorphism
cataM :: (Traversable f, Monad m) => (f a -> m a) -> Fix f -> m a
cataM f = f <=< mapM (cataM f) . outF
data ExprF r
= EVar String
| EApp r r
| ELam r r
deriving (Show, Eq, Ord, Functor)
type Expr = Fix ExprF
instance Show (Fix ExprF) where
show (Fix f) = show f
instance Eq (Fix ExprF) where
Fix x == Fix y = x == y
instance Ord (Fix ExprF) where
compare (Fix x) (Fix y) = compare x y
mkApp :: Fix ExprF -> Fix ExprF -> Fix ExprF
mkApp x y = Fix (EApp x y)
mkVar :: String -> Fix ExprF
mkVar x = Fix (EVar x)
mkLam :: Fix ExprF -> Fix ExprF -> Fix ExprF
mkLam x y = Fix (ELam x y)
i :: Fix ExprF
i = mkLam (mkVar "x") (mkVar "x")
k :: Fix ExprF
k = mkLam (mkVar "x") $ mkLam (mkVar "y") $ (mkVar "x")
subst :: M.Map String (ExprF Expr) -> Expr -> Expr
subst env = cata alg
where
alg (EVar x) | Just e <- M.lookup x env = Fix e
alg e = Fix e
```

Another use case would be to collect the free variables inside of the AST. This example use the `recursion-schemes`

library.

```
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE TypeFamilies #-}
import Data.Functor.Foldable
type Var = String
data Exp
= Var Var
| App Exp Exp
| Lam [Var] Exp
deriving (Show)
data ExpF a
= VarF Var
| AppF a a
| LamF [Var] a
deriving (Functor)
type instance Base Exp = ExpF
instance Recursive Exp where
project (Var a) = VarF a
project (App a b) = AppF a b
project (Lam a b) = LamF a b
instance Corecursive Exp where
embed (VarF a) = Var a
embed (AppF a b) = App a b
embed (LamF a b) = Lam a b
fvs :: Exp -> [Var]
fvs = cata phi
where
phi (VarF a) = [a]
phi (AppF a b) = a ++ b
phi (LamF a b) = foldr (filter . (/=)) a b
```

See:

## Hint and Mueval

GHC itself can actually interpret arbitrary Haskell source on the fly by hooking into the GHC’s bytecode interpreter ( the same used for GHCi ). The hint package allows us to parse, typecheck, and evaluate arbitrary strings into arbitrary Haskell programs and evaluate them.

```
import Language.Haskell.Interpreter
foo :: Interpreter String
foo = eval "(\\x -> x) 1"
example :: IO (Either InterpreterError String)
example = runInterpreter foo
```

This is generally not a wise thing to build a library around, unless of course the purpose of the program is itself to evaluate arbitrary Haskell code ( something like an online Haskell shell or the likes ).

Both hint and mueval do effectively the same task, designed around slightly different internals of the GHC Api.

See:

# Testing

Unit testing frameworks are an important component in the Haskell ecosystem. Program correctness is a central philosophical concept and unit testing forms the third part of the ecosystem that includes strong type system and property testing. Generally speaking unit tests tend to be of less importance in Haskell since the type system makes an enormous amount of invalid programs completely inexpressible by construction. Unit tests tend to be written later in the development lifecycle and generally tend to be about the core logic of the program and not the intermediate plumbing.

A prominent school of thought on Haskell library design tends to favor constructing programs built around strong equational laws which guarantee strong invariants about program behavior under composition. Many of the testing tools are built around this style of design.

## QuickCheck

Probably the most famous Haskell library, QuickCheck is a testing framework. This is a framework for generating large random tests for arbitrary functions automatically based on the types of their arguments.

```
quickCheck :: Testable prop => prop -> IO ()
(==>) :: Testable prop => Bool -> prop -> Property
forAll :: (Show a, Testable prop) => Gen a -> (a -> prop) -> Property
choose :: Random a => (a, a) -> Gen a
```

```
import Test.QuickCheck
qsort :: [Int] -> [Int]
qsort [] = []
qsort (x:xs) = qsort lhs ++ [x] ++ qsort rhs
where lhs = filter (< x) xs
rhs = filter (>= x) xs
prop_maximum :: [Int] -> Property
prop_maximum xs = not (null xs) ==>
last (qsort xs) == maximum xs
main :: IO ()
main = quickCheck prop_maximum
```

```
$ runhaskell qcheck.hs
*** Failed! Falsifiable (after 3 tests and 4 shrinks):
[0]
[1]
$ runhaskell qcheck.hs
+++ OK, passed 1000 tests.
```

The test data generator can be extended with custom types and refined with predicates that restrict the domain of cases to test.

```
import Test.QuickCheck
data Color = Red | Green | Blue deriving Show
instance Arbitrary Color where
arbitrary = do
n <- choose (0,2) :: Gen Int
return $ case n of
0 -> Red
1 -> Green
2 -> Blue
example1 :: IO [Color]
example1 = sample' arbitrary
-- [Red,Green,Red,Blue,Red,Red,Red,Blue,Green,Red,Red]
```

See: QuickCheck: An Automatic Testing Tool for Haskell

## SmallCheck

Like QuickCheck, SmallCheck is a property testing system but instead of producing random arbitrary test data it instead enumerates a deterministic series of test data to a fixed depth.

```
smallCheck :: Testable IO a => Depth -> a -> IO ()
list :: Depth -> Series Identity a -> [a]
sample' :: Gen a -> IO [a]
```

```
λ: list 3 series :: [Int]
[0,1,-1,2,-2,3,-3]
λ: list 3 series :: [Double]
[0.0,1.0,-1.0,2.0,0.5,-2.0,4.0,0.25,-0.5,-4.0,-0.25]
λ: list 3 series :: [(Int, String)]
[(0,""),(1,""),(0,"a"),(-1,""),(0,"b"),(1,"a"),(2,""),(1,"b"),(-1,"a"),(-2,""),(-1,"b"),(2,"a"),(-2,"a"),(2,"b"),(-2,"b")]
```

It is useful to generate test cases over *all* possible inputs of a program up to some depth.

```
import Test.SmallCheck
distrib :: Int -> Int -> Int -> Bool
distrib a b c = a * (b + c) == a * b + a * c
cauchy :: [Double] -> [Double] -> Bool
cauchy xs ys = (abs (dot xs ys))^2 <= (dot xs xs) * (dot ys ys)
failure :: [Double] -> [Double] -> Bool
failure xs ys = abs (dot xs ys) <= (dot xs xs) * (dot ys ys)
dot :: Num a => [a] -> [a] -> a
dot xs ys = sum (zipWith (*) xs ys)
main :: IO ()
main = do
putStrLn "Testing distributivity..."
smallCheck 25 distrib
putStrLn "Testing Cauchy-Schwarz..."
smallCheck 4 cauchy
putStrLn "Testing invalid Cauchy-Schwarz..."
smallCheck 4 failure
```

```
$ runhaskell smallcheck.hs
Testing distributivity...
Completed 132651 tests without failure.
Testing Cauchy-Schwarz...
Completed 27556 tests without failure.
Testing invalid Cauchy-Schwarz...
Failed test no. 349.
there exist [1.0] [0.5] such that
condition is false
```

Just like for QuickCheck we can implement series instances for our custom datatypes. For example there is no default instance for Vector, so let’s implement one:

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Test.SmallCheck
import Test.SmallCheck.Series
import Control.Applicative
import qualified Data.Vector as V
dot :: Num a => V.Vector a -> V.Vector a -> a
dot xs ys = V.sum (V.zipWith (*) xs ys)
cauchy :: V.Vector Double -> V.Vector Double -> Bool
cauchy xs ys = (abs (dot xs ys))^2 <= (dot xs xs) * (dot ys ys)
instance (Serial m a, Monad m) => Serial m (V.Vector a) where
series = V.fromList <$> series
main :: IO ()
main = smallCheck 4 cauchy
```

SmallCheck can also use Generics to derive Serial instances, for example to enumerate all trees of a certain depth we might use:

```
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
import Test.SmallCheck.Series
data Tree a = Null | Fork (Tree a) a (Tree a)
deriving (Show, Generic)
instance Serial m a => Serial m (Tree a)
example :: [Tree ()]
example = list 3 series
main = print example
```

## QuickSpec

Using the QuickCheck arbitrary machinery we can also rather remarkably enumerate a large number of combinations of functions to try and deduce algebraic laws from trying out inputs for small cases. Of course the fundamental limitation of this approach is that a function may not exhibit any interesting properties for small cases or for simple function compositions. So in general case this approach won’t work, but practically it still quite useful.

```
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
import Data.List
import Data.Typeable
import QuickSpec hiding (arith, bools, lists)
import Test.QuickCheck.Arbitrary
type Var k a = (Typeable a, Arbitrary a, CoArbitrary a, k a)
listCons :: forall a. Var Ord a => a -> Sig
listCons a =
background
[ "[]" `fun0` ([] :: [a]),
":" `fun2` ((:) :: a -> [a] -> [a])
]
lists :: forall a. Var Ord a => a -> [Sig]
lists a =
[ -- Names to print arbitrary variables
funs',
funvars',
vars',
-- Ambient definitions
listCons a,
-- Expressions to deduce properties of
"sort" `fun1` (sort :: [a] -> [a]),
"map" `fun2` (map :: (a -> a) -> [a] -> [a]),
"id" `fun1` (id :: [a] -> [a]),
"reverse" `fun1` (reverse :: [a] -> [a]),
"minimum" `fun1` (minimum :: [a] -> a),
"length" `fun1` (length :: [a] -> Int),
"++" `fun2` ((++) :: [a] -> [a] -> [a])
]
where
funs' = funs (undefined :: a)
funvars' = vars ["f", "g", "h"] (undefined :: a -> a)
vars' = ["xs", "ys", "zs"] `vars` (undefined :: [a])
tvar :: A
tvar = undefined
main :: IO ()
main = quickSpec (lists tvar)
```

Running this we rather see it is able to deduce most of the laws for list functions.

```
$ runhaskell src/quickspec.hs
-- background functions --
id :: A -> A
(:) :: A -> [A] -> [A]
(.) :: (A -> A) -> (A -> A) -> A -> A
[] :: [A]
-- variables --
f, g, h :: A -> A
xs, ys, zs :: [A]
== Equations about map ==
1: map f [] == []
2: map id xs == xs
3: map (f.g) xs == map f (map g xs)
== Equations about minimum ==
4: minimum [] == undefined
== Equations about (++) ==
5: xs++[] == xs
6: []++xs == xs
7: (xs++ys)++zs == xs++(ys++zs)
== Equations about sort ==
8: sort [] == []
9: sort (sort xs) == sort xs
== Equations about id ==
10: id xs == xs
== Equations about reverse ==
11: reverse [] == []
12: reverse (reverse xs) == xs
== Equations about several functions ==
13: minimum (xs++ys) == minimum (ys++xs)
14: length (map f xs) == length xs
15: length (xs++ys) == length (ys++xs)
16: sort (xs++ys) == sort (ys++xs)
17: map f (reverse xs) == reverse (map f xs)
18: minimum (sort xs) == minimum xs
19: minimum (reverse xs) == minimum xs
20: minimum (xs++xs) == minimum xs
21: length (sort xs) == length xs
22: length (reverse xs) == length xs
23: sort (reverse xs) == sort xs
24: map f xs++map f ys == map f (xs++ys)
25: reverse xs++reverse ys == reverse (ys++xs)
```

Keep in mind the rather remarkable fact that this is all deduced automatically from the types alone!

## Tasty

Tasty is the commonly used unit testing framework. It combines all of the testing frameworks (Quickcheck, SmallCheck, HUnit) into a common API for forming runnable batches of tests and collecting the results.

```
import Test.Tasty
import Test.Tasty.HUnit
import Test.Tasty.QuickCheck
import qualified Test.Tasty.SmallCheck as SC
arith :: Integer -> Integer -> Property
arith x y = (x > 0) && (y > 0) ==> (x+y)^2 > x^2 + y^2
negation :: Integer -> Bool
negation x = abs (x^2) >= x
suite :: TestTree
suite = testGroup "Test Suite" [
testGroup "Units"
[ testCase "Equality" $ True @=? True
, testCase "Assertion" $ assert $ (length [1,2,3]) == 3
],
testGroup "QuickCheck tests"
[ testProperty "Quickcheck test" arith
],
testGroup "SmallCheck tests"
[ SC.testProperty "Negation" negation
]
]
main :: IO ()
main = defaultMain suite
```

```
$ runhaskell TestSuite.hs
Unit tests
Units
Equality: OK
Assertion: OK
QuickCheck tests
Quickcheck test: OK
+++ OK, passed 100 tests.
SmallCheck tests
Negation: OK
11 tests completed
```

## Silently

Often in the process of testing IO heavy code we’ll need to redirect stdout to compare it some known quantity. The `silently`

package allows us to capture anything done to stdout across any library inside of IO block and return the result to the test runner.

```
import Test.Tasty
import Test.Tasty.HUnit
import System.IO.Silently
test :: Int -> IO ()
test n = print (n * n)
testCapture n = do
(stdout, result) <- capture (test n)
assert (stdout == show (n*n) ++ "\n")
suite :: TestTree
suite = testGroup "Test Suite" [
testGroup "Units"
[ testCase "Equality" $ testCapture 10
]
]
main :: IO ()
main = defaultMain suite
```

# Type Families

Type families are a powerful extension the Haskell type system, developed in 2005, that provide type-indexed data types and named functions on types. This allows a whole new level of computation to occur at compile-time and opens an entire arena of type-level abstractions that were previously impossible to express. Type families proved to be nearly as fruitful as typeclasses and indeed, many previous approaches to type-level programming using classes are achieved much more simply with type families.

## MultiParam Typeclasses

Resolution of vanilla Haskell 98 typeclasses proceeds via very simple context reduction that minimizes interdependency between predicates, resolves superclasses, and reduces the types to head normal form. For example:

If a single parameter typeclass expresses a property of a type ( i.e. whether it’s in a class or not in class ) then a multiparameter typeclass expresses relationships between types. For example if we wanted to express the relation that a type can be converted to another type we might use a class like:

```
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Char
class Convertible a b where
convert :: a -> b
instance Convertible Int Integer where
convert = toInteger
instance Convertible Int Char where
convert = chr
instance Convertible Char Int where
convert = ord
```

Of course now our instances for `Convertible Int`

are not unique anymore, so there no longer exists a nice procedure for determining the inferred type of `b`

from just `a`

. To remedy this let’s add a functional dependency `a -> b`

, which tells GHC that an instance `a`

uniquely determines the instance that b can be. So we’ll see that our two instances relating `Int`

to both `Integer`

and `Char`

conflict.

```
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
import Data.Char
class Convertible a b | a -> b where
convert :: a -> b
instance Convertible Int Char where
convert = chr
instance Convertible Char Int where
convert = ord
```

```
Functional dependencies conflict between instance declarations:
instance Convertible Int Integer
instance Convertible Int Char
```

Now there’s a simpler procedure for determining instances uniquely and multiparameter typeclasses become more usable and inferable again. Effectively a functional dependency `| a -> b`

says that we can’t define multiple multiparamater typeclass instances with the same `a`

but different `b`

.

Now let’s make things not so simple. Turning on `UndecidableInstances`

loosens the constraint on context reduction that can only allow constraints of the class to become structural smaller than its head. As a result implicit computation can now occur *within in the type class instance search*. Combined with a type-level representation of Peano numbers we find that we can encode basic arithmetic at the type-level.

```
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE UndecidableInstances #-}
data Z
data S n
type Zero = Z
type One = S Zero
type Two = S One
type Three = S Two
type Four = S Three
zero :: Zero
zero = undefined
one :: One
one = undefined
two :: Two
two = undefined
three :: Three
three = undefined
four :: Four
four = undefined
class Eval a where
eval :: a -> Int
instance Eval Zero where
eval _ = 0
instance Eval n => Eval (S n) where
eval m = 1 + eval (prev m)
class Pred a b | a -> b where
prev :: a -> b
instance Pred Zero Zero where
prev = undefined
instance Pred (S n) n where
prev = undefined
class Add a b c | a b -> c where
add :: a -> b -> c
instance Add Zero a a where
add = undefined
instance Add a b c => Add (S a) b (S c) where
add = undefined
f :: Three
f = add one two
```