Beginning Haskell

Stephen Diehl ( @smdiehl )

December 10, 2013

Essentials

Why Haskell?

I used to think the best skill for programmers was the ability to hold the whole system in your head. In retrospect this was a crappy way to work on large codebases over extended periods of time. It doesn't scale, and you can't communicate deep understanding to others. The wise thing to do is to eliminate the need to load everything into one person's head and offload that responsibility to systems with strong theoretic safeguards.

Some personal views:

The Bare Minimum You Need to Know about Haskell Syntax

Arguments applied to functions separated by spaces, like a Unix shell. Application is left associative.

f x y z 
((f x) y) z 

Functions are written at the toplevel simply.

f x y = x + y

Let statements are expressions which allows us to extend the scope of other expressions.

f x y = let z = 3 in x + y + z

Where statements can proceed definitions to add additional variables. The variables of the function's arguments are scoped within where statements.

f x y = x + y + a + b
  where
    a = 3 
    b = a + 1

Where statements are bound a closest surrounding syntactic construct, while let statements are first class expressions that are valid anywhere in a definition.

Anonymous lambdas are written with a backslash:

f = \x -> \y -> x + y
f = \x y -> x + y

Infix operators in Haskell can be written in several ways

Binary functions can also be written as infix expressions using the backtick notation with the left hand side applied to the first argument.


op a b = a * b + 1
res = 2 `op` 3

When an argument is pattern matched but not used on the right hand side we can write a underscore to indicate it is irrelevant.

const x _ = x

When pattern matching against patterns arbitrary variables can be assigned to segments of the matched expression using as-patterns.

f x@(Foo a y@(Bar b c)) = (x, y)

When the contents of a datatype are not necessary on the right hand side the we can write the constructor followed by empty braces.

f (Foo {}) = True

The Bare Minimum You Need to Know about Types

Type signatures are often optional in the presence of type inference, though recommended.

f :: Integer -> Integer
f x = x + 1

Expressions can have rigid type signatures

f = 3 :: Integer

The type operator (->) is right associative:

a -> a -> a 
a -> (a -> a)

Type variables in the signature are polymorphic

id :: forall a. a -> a

forall is usually implicit.

id :: a -> a
id x = x

All functions take 1 argument, which may return another function or a value.

const :: forall a b. a -> b -> a
const x y = x

The function (const 3) returns a function which will always return 3.

f :: b -> Int
f = const 3 

Equality constraints on type variables are propagated across the whole signature through a process called unification. The implementation of this typing scheme is called parametric polymorphism.

map :: forall a b. (a -> b) -> [a] -> [b]
g :: [Int] -> [Int]
g = map (+1)

We can rename compound types with type synonyms, GHC will expand them out when needed.

type String = [Char]

Type constructors are "functions" for types, they can be parameterized by type variables.

data Bool = False | True 
data Either a b = Left a | Right b
a :: Either Bool b
a = Left True

b :: Either a String
b = Right "foo"

Certain types have syntactic sugar.

data List a = Nil | Cons a (List a)
data [a] = [] | a : [a]  -- pseudocode

data Tuple a b = MkTuple a b
data (a,b) = (a,b) -- pseudocode

data () = () -- pseudocode

Types also have types called kinds. Haskell only has one type of types (*).

Int :: *
[] :: * -> *
(->) :: * -> * -> *
Maybe :: * -> *
Either :: * -> * -> *

Type constructors build up data structures, pattern matching tears down data structures.

head :: [a] -> a
head (x:xs) = x

tail :: [a] -> [a]
tail (x:xs) = xs
either :: (a -> c) -> (b -> c) -> Either a b -> c
either f g (Left x)  =  f x
either f g (Right y) =  g y

Pattern matching and case statements are equivelant.

either :: (a -> c) -> (b -> c) -> Either a b -> c
either f g a = case a of 
  (Left x)  -> f x
  (Right y) -> g y

Records are data constructors with named fields.

data MyRecordType a = MyRecord { foo :: a, bar :: a }

Arguments can be passed explictly by name or by order.

MyRecord False True
MyRecord {foo = False, bar = True}

We can update fields on records selectively.

rec = MyRecord {foo = False, bar = True} 
rec2 = rec { bar = False }

The field names are scoped as functions in the current module.

foo :: MyRecordType a -> a
bar :: MyRecordType a -> a
foo (MyRecord False True) -- False
bar (MyRecord False True) -- True

Types can be qualified by constraints (shown on the left hand side of the =>) which indicate that type variables belong to instances of type class. As an example (+) requires instances of Num in it's arguments.

(+) :: Num a => a -> a -> a

Type classes allow overloading different behavior for functions when called with different types. This is called ad-hoc polymorphism.

data Animal = Dog | Cat
data Car = Honda | Toyota

class EqClass t where
  equal :: t -> t -> Bool
  neq   :: t -> t -> Bool

  neq a b = not (equal a b)

instance EqClass Animal where
  equal Dog Dog = True
  equal Cat Cat = True
  equal _ _ = False

instance EqClass Car where
  equal Honda Honda = True
  equal Toyota Toyota = True
  equal _ _ = False

You can extend this typeclass by writing down a new implmentation of equal for any type we like. The function neq has a default implementation that we get for free, since it can be written in terms of equal and is independent of type.

Equal can now be used for both Car and Animal types and a constraint appears in it's type signature.

equal :: EqClass t => t -> t -> Bool

But typeclasses aren't magic. You could implement them without buitin support by passing around parameterized records of functions. GHC does something quite similar behind the scenes.

data EqDict t = EqDict { equal' :: t -> t -> Bool }

equalAnimal Dog Dog = True
equalAnimal Cat Cat = True
equalAnimal _ _ = False

animalEq :: EqDict Animal
animalEq = EqDict equalAnimal


-- With type classes
neqAnimal :: EqClass t => t -> t -> Bool
neqAnimal a b = neq a b

-- With dictionary passing
neqAnimal' :: EqDict t -> t -> t -> Bool
neqAnimal' dict a b = not (equal' dict a b)

Many type classes have obvious implementations that can be written trivially in terms of the data declaration. GHC can write many of them for us using deriving.

data Day = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday deriving (Eq, Ord)

Monday == Monday -- True
Saturday > Monday -- True

As an example, if we know how to print a type (a) then we automatically know how to print a list of them.

instance Show a => Show [a] where
  show (x:xs) = ...

newtype declarations are similar to data declarations, but while data constructors create new runtime containers for values, a newtype wrapper is only a type introduction and looks the same to the runtime as it's underlying type.

newtype Age = Age { unAge :: Int }

This introduces both a constructor and a deconstructor.

Age   :: Int -> Age
unAge :: Age -> Int

Newtype wrappers and type classes form the foundation for much of the utility of the transformers and mtl libraries.


Don't get caught by the the monomorphism restriction.

-- sum :: [Integer] -> Integer -- Inferred type!
sum :: (Num a) => [a] -> a
sum = foldl (+) 0

This is just the beginning.

The Bare Minimum You Need to Know about GHCi

Functions must be let bound:

λ: let f x = x + 1

We extend the current scope with external modules by using :module:

λ: :m + Data.List

The current module can be switched using :load.

λ: :l Example.hs

Introspection:

λ: :type 3
3 :: Num a => a
λ: :kind Either
Either :: * -> * -> *
λ: :info Functor
class Functor f where
  fmap :: (a -> b) -> f a -> f b
  (<$) :: a -> f b -> f a
    -- Defined in `GHC.Base'
  ...

Code reload will drastically improve our workflow for testing code.

λ: :r
... hack hack hack ...

λ: :r
 <Test.hs>:6:3:
 Couldn't match expected type `Int' with actual type `Bool'

... fix fix fix ...

λ: :r
λ:

GHCi is slow. As in orders of magnitude slower than ghc -O2. Don't base your performance assumptions on it.

cabal install hoogle

Hoogle can be added to our prompt by adding the following snippet to ~/.ghc/ghci.conf:

:def hoogle \s -> return $ ":! hoogle --count=15 \"" ++ s ++ "\""

The Bare Minimum You Need to Know about Cabal

To start a new Haskell project run.

$ cabal init

A .cabal file will be created.

Sandboxes ( in cabal > 1.8 ) are self contained paths of Haskell packages. They fix much of the earlier "cabal hell" problems.

$ cabal sandbox init
$ cabal update
$ cabal install
$ cabal install dependencies

To run the "executable" for a library under the cabal sandbox:

$ cabal run

To load the "library" into a GHCi shell under the cabal sandbox:

$ cabal repl

An example cabal file:

name:               mylibrary
version:            0.1
cabal-version:      >= 1.10
author:             Paul Atreides 
license:            MIT
license-file:       LICENSE
synopsis:           It does some things.
category:           Math
tested-with:        GHC
build-type:         Simple
  
library                    
    build-depends: 
      base >= 4 && < 5
    default-language: Haskell2010

executable "example"
    build-depends: 
        base >= 4 && < 5
    default-language: Haskell2010
    main-is: Main.hs    

The Bare Minimum You Need to Know about Language Extensions

Many language extensions are ubiquities and define modern Haskell development. These are safe to enable, but may complicate type inference and require explicit annotations.

GHC will sometimes casually tell us to enable dangerous extensions that shouldn't be used in normal code. Avoid:

Eightfold Path to Monad Satori

  1. Don't read the monad tutorials.
  2. No really, don't read the monad tutorials.
  3. Learn about Haskell types.
  4. Learn what a typeclass is.
  5. Read the Typeclassopedia.
  6. Read the monad definitions.
  7. Use monads in real code.
  8. Don't write monad-analogy tutorials.

Monads

Monads are not complicated structures.

class Monad m where
    (>>=)  :: m a -> (a -> m b) -> m b
    return :: a -> m a
return a >>= f   ≡  f a
m >>= return     ≡  m
(m >>= f) >>= g  ≡  m >>= (\x -> f x >>= g)

(>>=) is pronounced "bind".

Also (>>) in terms of (>>=):

(>>) :: Monad m => m a -> (a -> m b) -> m b
m >> k = m >>= \_ -> k

Ok, so about Monads...

Implementing monads in other languages has become a small cottage industry. Some implementations are proper monads, many are not.

The most common error is an implementation which implements bind and return, but fails to consider the laws entirely. The burden of proof to be a monad rests on having an instance which demonstrably adheres to the monad laws. If we don't prove the laws, our pseudo-monad won't compose properly.


"Monads are just monoids in the category of endofunctors. What's the problem?".

This was originally a joke and while true, is a very ambigious way of describing them. Monads are not algebraic monoids, but are monoid objects in the specific monoidal category, End the category of endofunctors. This is irrelevant to most Haskell programming.

Maybe Monad

data Maybe a = Just a | Nothing
instance Functor Maybe where
    fmap _ Nothing  = Nothing
    fmap f (Just a) = Just (f a)
instance  Monad Maybe where
  (Just x) >>= k = k x
  Nothing  >>= k = Nothing
  return         = Just
(Just 3) >>= (\x -> return (x + 1))
-- Just 4

Nothing >>= (\x -> return (x + 1))
-- Nothing

return 4 :: Maybe Int
-- Just 4

State Monad Tutorial

Our state is held inside of a tuple that is implictly threaded through each bind.

newtype State s a = State { runState :: s -> (a,s) }

instance Monad (State s) where
  return a = State $ \s -> (a, s)

  State act >>= k = State $ \s ->
    let (a, s') = act s
    in runState (k a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

modify :: (s -> s) -> State s ()
modify f = get >>= \x -> put (f x)

The state monad is purely functional. We can evaluate it and extract it's value either with it's state or without.

runState  :: State s a -> s -> (a, s)
evalState :: State s a -> s ->  a
execState :: State s a -> s ->     s
evalState :: State s a -> s -> a
evalState act = fst . runState act

execState :: State s a -> s -> s
execState act = snd . runState act

Trying it out...

test :: State Int Int
test = do
  put 3
  modify (+1)
  get

main = print $ execState test 0

Do Notation

do  pattern <- expression
    morelines
expression >>= (\ pattern -> do morelines)

Desugaring...

main :: IO ()
main = do putStrLn "What is your name: "
          name <- getLine
          putStrLn name
main :: IO ()
main = putStrLn "What is your name:" >>= 
       \_    -> getLine >>=
       \name -> putStrLn name
main :: IO ()
main = putStrLn "What is your name: " >> getLine >>= (\name -> putStrLn name)

Let's talk about IO

Likely everything you've heard about IO and side-effects in Haskell is wrong! There is a ton of misinformation out there.

The usual FUD you'll read on the Hacker News:

"I would say that the difference is that the Web is almost entirely about I/O, which Haskell treats as an unfortunate aspect of programming that is to be made as unpleasant as possible so it will be avoided except where unavoidable."

Some straight talk from Gabriel Gonzalez on /r/haskell.

"Haskell is useful because the real world is all about side effects. In my opinion, Haskell is the only language that gets side effects correct, since it reifies side effects as first-class values within the language. This makes it far easier to manipulate and reason about side effects since they are ordinary values rather than mysterious machinery lurking in the background of my program."

Haskell is used to handle some very heavy IO tasks.

"We also show that with Mio, McNettle (an SDN controller written in Haskell), can scale effectively to 40+ cores, reach a throughput of over 20 million new requests per second on a single machine, and hence become the fastest of all existing SDN controllers."

Mio: A High-Performance Multicore IO Manager for GHC

Loops and Iteration

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f a [b1, b2, b3, b4]
=> f (f (f (f a b1) b2) b3) b4
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f a [b1, b2, b3, b4]
=> f b1 (f b2 (f b3 (f b4 a)))

Most loops can be written as cases of higher order functions. Explicit recursion is often unnecessary.

Kitten Break

Questions?

Diagrams

$ cabal install diagrams

runhaskell diagram1.hs -w 256 -h 256 -o diagram1.svg
import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

example :: Diagram SVG R2
example = eqTriangle 5

main :: IO ()
main = defaultMain example

($) :: (a -> b) -> a -> b
(#) :: a -> (a -> b) -> b

($) operator is extremely common in place of nested parentheses.

succ $ succ $ succ 1 == 4
succ (succ (succ 1)) == 4
1 # succ # succ # succ == 4

Place two diagrams vertically adjacent to one another:

a === b

Place two diagrams horizontally adjacent to one another:

a ||| b

Foreground colour

fc :: HasStyle a => Colour Double -> a -> a

import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

example :: Diagram SVG R2
example = s ||| s
  where
    s = fc black $ eqTriangle 5

main :: IO ()
main = defaultMain example

import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

sierpinksi :: Int -> Diagram SVG R2
sierpinksi 1 = eqTriangle 1
sierpinksi n = 
      s 
     === 
  (s ||| s) # centerX
  where
    s = sierpinksi (n - 1)

example :: Diagram SVG R2
example = sierpinksi 5 # fc black

main = defaultMain example

Quickcheck

import Test.QuickCheck

prop_reverse :: [Int] -> [Int] -> Bool
prop_reverse xs ys = reverse (xs ++ ys) == reverse xs ++ reverse ys

main ::  IO ()
main = quickCheck prop_reverse
$ runhaskell qcheck.hs
*** Failed! Falsifiable (after 3 tests and 4 shrinks):    
[0]
[1]

$ ghci
λ: reverse ([0] ++ [1]) == reverse [0] ++ reverse [1]
False

import Test.QuickCheck

prop_reverse :: [Int] -> [Int] -> Bool
prop_reverse xs ys = reverse (ys ++ xs) == reverse xs ++ reverse ys

main ::  IO ()
main = quickCheck prop_reverse
$ runhaskell qcheck.hs
+++ OK, passed 100 tests.

Parsec

module Parser (parseExpr) where

import Text.Parsec
import Text.Parsec.String (Parser)
import Text.Parsec.Language (haskellStyle)

import qualified Text.Parsec.Expr as Ex
import qualified Text.Parsec.Token as Tok

data Binop = Add | Sub | Mul deriving (Show)

type Id = String
data Expr = Lam Id Expr
          | App Expr Expr
          | Var Id
          | Num Int
          | Op Binop Expr Expr
          deriving (Show)

--
-- Lexer
--

lexer :: Tok.TokenParser ()
lexer = Tok.makeTokenParser style
  where ops = ["->","\\","+","*","-","="]
        style = haskellStyle {Tok.reservedOpNames = ops }

reservedOp :: String -> Parser ()
reservedOp = Tok.reservedOp lexer

identifier :: Parser String
identifier = Tok.identifier lexer

parens :: Parser a -> Parser a
parens = Tok.parens lexer

contents :: Parser a -> Parser a
contents p = do
  Tok.whiteSpace lexer
  r <- p
  eof
  return r

--
-- Parser
--

natural :: Parser Integer
natural = Tok.natural lexer

variable :: Parser Expr
variable = do
  x <- identifier
  return (Var x)

number :: Parser Expr
number = do
  n <- natural
  return (Num (fromIntegral n))

lambda :: Parser Expr
lambda = do
  reservedOp "\\"
  x <- identifier
  reservedOp "->"
  e <- expr
  return (Lam x e)

aexp :: Parser Expr
aexp =  parens expr
    <|> variable
    <|> number
    <|> lambda

term :: Parser Expr
term = Ex.buildExpressionParser table aexp
  where infixOp x f = Ex.Infix (reservedOp x >> return f)
        table = [[infixOp "*" (Op Mul) Ex.AssocLeft],
                 [infixOp "+" (Op Add) Ex.AssocLeft]]

expr :: Parser Expr
expr = do
  es <- many1 term
  return (foldl1 App es)

parseExpr :: String -> Expr
parseExpr input = 
  case parse (contents expr) "<stdin>" input of
    Left err -> error (show err)
    Right ast -> ast

main :: IO ()
main = getLine >>= print . parseExpr >> main

In about 70 lines of code we have the start to an interpreter for a small functional language.

λ: main
1+2
Op Add (Num 1) (Num 2)

\i -> \x -> x
Lam "i" (Lam "x" (Var "x"))

\s -> \f -> \g -> \x -> f x (g x)
Lam "s" (Lam "f" (Lam "g" (Lam "x" (App (App (Var "f") (Var "x")) (App (Var "g") (Var "x"))))))

Web Applications

$ cabal install blaze
$ cabal install scotty
{-# LANGUAGE OverloadedStrings #-}

import Web.Scotty

import qualified Text.Blaze.Html5 as H
import Text.Blaze.Html5 hiding (html, param)
import Text.Blaze.Html.Renderer.Text (renderHtml)

greet :: String -> Html
greet user = H.html $ do
  H.head $
    H.title "Welcome!"
  H.body $ do
    H.h1 "Greetings!"
    H.p ("Hello " >> toHtml user >> "!")

main :: IO ()
main = scotty 8000 $ do
         get "/" $
           text "Home Page"

         get "/greet/:name" $ do
           name <- param "name"
           html $ renderHtml (greet name)

Questions?