# Common GHC Language Extensions
# RankNTypes
Imagine the following situation:
foo :: Show a => (a -> String) -> String -> Int -> IO ()
foo show' string int = do
putStrLn (show' string)
putStrLn (show' int)
Here, we want to pass in a function that converts a value into a String, apply that function to both a string parameter and and int parameter and print them both. In my mind, there is no reason this should fail! We have a function that works on both types of the parameters we're passing in.
Unfortunately, this won't type check! GHC infers the a
type based off of its first occurrence in the function body. That is, as soon as we hit:
putStrLn (show' string)
GHC will infer that show' :: String -> String
, since string
is a String
. It will proceed to blow up while trying to show' int
.
RankNTypes
lets you instead write the type signature as follows, quantifying over all functions that satisfy the show'
type:
foo :: (forall a. Show a => (a -> String)) -> String -> Int -> IO ()
This is rank 2 polymorphism: We are asserting that the show'
function must work for all a
s within our function, and the previous implementation now works.
The RankNTypes
extension allows arbitrary nesting of forall ...
blocks in type signatures. In other words, it allows rank N polymorphism.
# OverloadedStrings
Normally, string literals in Haskell have a type of String
(which is a type alias for [Char]
). While this isn't a problem for smaller, educational programs, real-world applications often require more efficient storage such as Text
or ByteString
.
OverloadedStrings
simply changes the type of literals to
"test" :: Data.String.IsString a => a
Allowing them to be directly passed to functions expecting such a type. Many libraries implement this interface for their string-like types including Data.Text (opens new window) and Data.ByteString (opens new window) which both provide certain time and space advantages over [Char]
.
There are also some unique uses of OverloadedStrings
like those from the Postgresql-simple (opens new window) library which allows SQL queries to be written in double quotes like a normal string, but provides protections against improper concatenation, a notorious source of SQL injection attacks.
To create a instance of the IsString
class you need to impliment the fromString
function. Example†:
data Foo = A | B | Other String deriving Show
instance IsString Foo where
fromString "A" = A
fromString "B" = B
fromString xs = Other xs
tests :: [ Foo ]
tests = [ "A", "B", "Testing" ]
† This example courtesy of Lyndon Maydwell (sordina
on GitHub) found here (opens new window).
# BinaryLiterals
Standard Haskell allows you to write integer literals in decimal (without any prefix), hexadecimal (preceded by 0x
or 0X
), and octal (preceded by 0o
or 0O
). The BinaryLiterals
extension adds the option of binary (preceded by 0b
or 0B
).
0b1111 == 15 -- evaluates to: True
# ExistentialQuantification
This is a type system extension that allows types that are existentially quantified, or, in other words, have type variables that only get instantiated at runtime†.
A value of existential type is similar to an abstract-base-class reference in OO languages: you don't know the exact type in contains, but you can constrain the class of types.
data S = forall a. Show a => S a
or equivalently, with GADT syntax:
{-# LANGUAGE GADTs #-}
data S where
S :: Show a => a -> S
Existential types open the door to things like almost-heterogenous containers: as said above, there can actually be various types in an S
value, but all of them can be show
n, hence you can also do
instance Show S where
show (S a) = show a -- we rely on (Show a) from the above
Now we can create a collection of such objects:
ss = [S 5, S "test", S 3.0]
Which also allows us to use the polymorphic behaviour:
mapM_ print ss
Existentials can be very powerful, but note that they are actually not necessary very often in Haskell. In the example above, all you can actually do with the Show
instance is show (duh!) the values, i.e. create a string representation. The entire S
type therefore contains exactly as much information as the string you get when showing it. Therefore, it is usually better to simply store that string right away, especially since Haskell is lazy and therefore the string will at first only be an unevaluated thunk anyway.
On the other hand, existentials cause some unique problems. For instance, the way the type information is “hidden” in an existential. If you pattern-match on an S
value, you will have the contained type in scope (more precisely, its Show
instance), but this information can never escape its scope, which therefore becomes a bit of a “secret society”: the compiler doesn't let anything escape the scope except values whose type is already known from the outside. This can lead to strange errors like Couldn't match type ‘a0’ with ‘()’ ‘a0’ is untouchable
(opens new window).
†Contrast this with ordinary parametric polymorphism, which is generally resolved at compile time (allowing full type erasure).
Existential types are different from Rank-N types – these extensions are, roughly speaking, dual to each other: to actually use values of an existential type, you need a (possibly constrained-) polymorphic function, like show
in the example. A polymorphic function is universally quantified, i.e. it works for any type in a given class, whereas existential quantification means it works for some particular type which is a priori unknown. If you have a polymorphic function, that's sufficient, however to pass polymorphic functions as such as arguments, you need {-# LANGUAGE Rank2Types #-}
:
genShowSs :: (∀ x . Show x => x -> String) -> [S] -> [String]
genShowSs f = map (\(S a) -> f a)
# LambdaCase
A syntactic extension that lets you write \case
in place of \arg -> case arg of
.
Consider the following function definition:
dayOfTheWeek :: Int -> String
dayOfTheWeek 0 = "Sunday"
dayOfTheWeek 1 = "Monday"
dayOfTheWeek 2 = "Tuesday"
dayOfTheWeek 3 = "Wednesday"
dayOfTheWeek 4 = "Thursday"
dayOfTheWeek 5 = "Friday"
dayOfTheWeek 6 = "Saturday"
If you want to avoid repeating the function name, you might write something like:
dayOfTheWeek :: Int -> String
dayOfTheWeek i = case i of
0 -> "Sunday"
1 -> "Monday"
2 -> "Tuesday"
3 -> "Wednesday"
4 -> "Thursday"
5 -> "Friday"
6 -> "Saturday"
Using the LambdaCase extension, you can write that as a function expression, without having to name the argument:
{-# LANGUAGE LambdaCase #-}
dayOfTheWeek :: Int -> String
dayOfTheWeek = \case
0 -> "Sunday"
1 -> "Monday"
2 -> "Tuesday"
3 -> "Wednesday"
4 -> "Thursday"
5 -> "Friday"
6 -> "Saturday"
# FunctionalDependencies
If you have a multi-parameter type-class with arguments a, b, c, and x, this extension lets you express that the type x can be uniquely identified from a, b, and c:
class SomeClass a b c x | a b c -> x where ...
When declaring an instance of such class, it will be checked against all other instances to make sure that the functional dependency holds, that is, no other instance with same a b c
but different x
exists.
You can specify multiple dependencies in a comma-separated list:
class OtherClass a b c d | a b -> c d, a d -> b where ...
For example in MTL we can see:
class MonadReader r m| m -> r where ...
instance MonadReader r ((->) r) where ...
Now, if you have a value of type MonadReader a ((->) Foo) => a
, the compiler can infer that a ~ Foo
, since the second argument completely determines the first, and will simplify the type accordingly.
The SomeClass
class can be thought of as a function of the arguments a b c
that results in x
. Such classes can be used to do computations in the typesystem.
# FlexibleInstances
Regular instances require:
All instance types must be of the form (T a1 ... an)
where a1 ... an are *distinct type variables*,
and each type variable appears at most once in the instance head.
That means that, for example, while you can create an instance for [a]
you can't create an instance for specifically [Int]
.; FlexibleInstances
relaxes that:
class C a where
-- works out of the box
instance C [a] where
-- requires FlexibleInstances
instance C [Int] where
# GADTs
Conventional algebraic datatypes are parametric in their type variables. For example, if we define an ADT like
data Expr a = IntLit Int
| BoolLit Bool
| If (Expr Bool) (Expr a) (Expr a)
with the hope that this will statically rule out non-well-typed conditionals, this will not behave as expected since the type of IntLit :: Int -> Expr a
is universially quantified: for any choice of a
, it produces a value of type Expr a
. In particular, for a ~ Bool
, we have IntLit :: Int -> Expr Bool
, allowing us to construct something like If (IntLit 1) e1 e2
which is what the type of the If
constructor was trying to rule out.
Generalised Algebraic Data Types allows us to control the resulting type of a data constructor so that they are not merely parametric. We can rewrite our Expr
type as a GADT like this:
data Expr a where
IntLit :: Int -> Expr Int
BoolLit :: Bool -> Expr Bool
If :: Expr Bool -> Expr a -> Expr a -> Expr a
Here, the type of the constructor IntLit
is Int -> Expr Int
, and so IntLit 1 :: Expr Bool
will not typecheck.
Pattern matching on a GADT value causes refinement of the type of the term returned. For example, it is possible to write an evaluator for Expr a
like this:
crazyEval :: Expr a -> a
crazyEval (IntLit x) =
-- Here we can use `(+)` because x :: Int
x + 1
crazyEval (BoolLit b) =
-- Here we can use `not` because b :: Bool
not b
crazyEval (If b thn els) =
-- Because b :: Expr Bool, we can use `crazyEval b :: Bool`.
-- Also, because thn :: Expr a and els :: Expr a, we can pass either to
-- the recursive call to `crazyEval` and get an a back
crazyEval $ if crazyEval b then thn else els
Note that we are able to use (+)
in the above definitions because when e.g. IntLit x
is pattern matched, we also learn that a ~ Int
(and likewise for not
and if_then_else_
when a ~ Bool
).
# TupleSections
A syntactic extension that allows applying the tuple constructor (which is an operator) in a section way:
(a,b) == (,) a b
-- With TupleSections
(a,b) == (,) a b == (a,) b == (,b) a
# N-tuples
It also works for tuples with arity greater than two
(,2,) 1 3 == (1,2,3)
# Mapping
This can be useful in other places where sections are used:
map (,"tag") [1,2,3] == [(1,"tag"), (2, "tag"), (3, "tag")]
The above example without this extension would look like this:
map (\a -> (a, "tag")) [1,2,3]
# OverloadedLists
added in GHC 7.8.
OverloadedLists, similar to OverloadedStrings (opens new window), allows list literals to be desugared as follows:
[] -- fromListN 0 []
[x] -- fromListN 1 (x : [])
[x .. ] -- fromList (enumFrom x)
This comes handy when dealing with types such as Set
, Vector
and Map
s.
['0' .. '9'] :: Set Char
[1 .. 10] :: Vector Int
[("default",0), (k1,v1)] :: Map String Int
['a' .. 'z'] :: Text
IsList
(opens new window) class in GHC.Exts
is intended to be used with this extension.
IsList
is equipped with one type function, Item
, and three functions, fromList :: [Item l] -> l
, toList :: l -> [Item l]
and fromListN :: Int -> [Item l] -> l
where fromListN
is optional. Typical implementations are:
instance IsList [a] where
type Item [a] = a
fromList = id
toList = id
instance (Ord a) => IsList (Set a) where
type Item (Set a) = a
fromList = Set.fromList
toList = Set.toList
Examples taken from OverloadedLists – GHC (opens new window).
# MultiParamTypeClasses
It's a very common extension that allows type classes with multiple type parameters. You can think of MPTC as a relation between types.
{-# LANGUAGE MultiParamTypeClasses #-}
class Convertable a b where
convert :: a -> b
instance Convertable Int Float where
convert i = fromIntegral i
The order of parameters matters.
MPTCs can sometimes be replaced with type families.
# UnicodeSyntax
An extension that allows you to use Unicode characters in lieu of certain built-in operators and names.
ASCII | Unicode | Use(s) |
---|---|---|
:: | ∷ | has type |
-> | → | function types, lambdas, case branches, etc. |
=> | ⇒ | class constraints |
forall | ∀ | explicit polymorphism |
<- | ← | do notation |
* | ★ | the type (or kind) of types (e.g., Int :: ★ ) |
>- | ⤚ | proc notation (opens new window) for Arrows (opens new window) |
-< | ⤙ | proc notation (opens new window) for Arrows (opens new window) |
>>- | ⤜ | proc notation (opens new window) for Arrows (opens new window) |
-<< | ⤛ | proc notation (opens new window) for Arrows (opens new window) |
For example:
runST :: (forall s. ST s a) -> a
would become
runST ∷ (∀ s. ST s a) → a
Note that the *
vs. ★
example is slightly different: since *
isn't reserved, ★
also works the same way as *
for multiplication, or any other function named (*)
, and vice-versa. For example:
ghci> 2 ★ 3
6
ghci> let (*) = (+) in 2 ★ 3
5
ghci> let (★) = (-) in 2 * 3
-1
# PatternSynonyms
Pattern synonyms (opens new window) are abstractions of patterns similar to how functions are abstractions of expressions.
For this example, let's look at the interface Data.Sequence
(opens new window) exposes, and let's see how it can be improved with pattern synonyms. The Seq
(opens new window) type is a data type that, internally, uses a complicated representation (opens new window) to achieve good asymptotic complexity for various operations, most notably both O(1) (un)consing and (un)snocing.
But this representation is unwieldy and some of its invariants cannot be expressed in Haskell's type system. Because of this, the Seq
type is exposed to users as an abstract type, along with invariant-preserving accessor and constructor functions, among them:
empty :: Seq a
(<|) :: a -> Seq a -> Seq a
data ViewL a = EmptyL | a :< (Seq a)
viewl :: Seq a -> ViewL a
(|>) :: Seq a -> a -> Seq a
data ViewR a = EmptyR | (Seq a) :> a
viewr :: Seq a -> ViewR a
But using this interface can be a bit cumbersome:
uncons :: Seq a -> Maybe (a, Seq a)
uncons xs = case viewl xs of
x :< xs' -> Just (x, xs')
EmptyL -> Nothing
We can use view patterns (opens new window) to clean it up somewhat:
{-# LANGUAGE ViewPatterns #-}
uncons :: Seq a -> Maybe (a, Seq a)
uncons (viewl -> x :< xs) = Just (x, xs)
uncons _ = Nothing
Using the PatternSynonyms
language extension, we can give an even nicer interface by allowing pattern matching to pretend that we have a cons- or a snoc-list:
{-# LANGUAGE PatternSynonyms #-}
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq
pattern Empty :: Seq a
pattern Empty <- (Seq.viewl -> Seq.EmptyL)
pattern (:<) :: a -> Seq a -> Seq a
pattern x :< xs <- (Seq.viewl -> x Seq.:< xs)
pattern (:>) :: Seq a -> a -> Seq a
pattern xs :> x <- (Seq.viewr -> xs Seq.:> x)
This allows us to write uncons
in a very natural style:
uncons :: Seq a -> Maybe (a, Seq a)
uncons (x :< xs) = Just (x, xs)
uncons _ = Nothing
# ScopedTypeVariables
ScopedTypeVariables
let you refer to universally quantified types inside of a declaration. To be more explicit:
import Data.Monoid
foo :: forall a b c. (Monoid b, Monoid c) => (a, b, c) -> (b, c) -> (a, b, c)
foo (a, b, c) (b', c') = (a :: a, b'', c'')
where (b'', c'') = (b <> b', c <> c') :: (b, c)
The important thing is that we can use a
, b
and c
to instruct the compiler in subexpressions of the declaration (the tuple in the where
clause and the first a
in the final result). In practice, ScopedTypeVariables
assist in writing complex functions as a sum of parts, allowing the programmer to add type signatures to intermediate values that don't have concrete types.
# RecordWildCards
See RecordWildCards (opens new window)
# Remarks
These language extensions are typically available when using the Glasgow Haskell Compiler (GHC) as they are not part of the approved Haskell 2010 language Report (opens new window). To use these extensions, one must either inform the compiler using a flag (opens new window) or place a LANGUAGE
programa (opens new window) before the module
keyword in a file. The official documentation can be found in section 7 (opens new window) of the GCH users guide.
The format of the LANGUAGE
programa is {-# LANGUAGE ExtensionOne, ExtensionTwo ... #-}
. That is the literal {-#
followed by LANGUAGE
followed by a comma separated list of extensions, and finally the closing #-}
. Multiple LANGUAGE
programas may be placed in one file.