# Lists

# Processing lists

To process lists, we can simply pattern match on the constructors of the list type:

listSum :: [Int] -> Int
listSum []          = 0
listSum (x:xs) = x + listSum xs

We can match more values by specifying a more elaborate pattern:

sumTwoPer :: [Int] -> Int
sumTwoPer [] = 0
sumTwoPer (x1:x2:xs) = x1 + x2 + sumTwoPer xs
sumTwoPer (x:xs) = x + sumTwoPer xs

Note that in the above example, we had to provide a more exhaustive pattern match to handle cases where an odd length list is given as an argument.

The Haskell Prelude defines many built-ins for handling lists, like map, filter, etc.. Where possible, you should use these instead of writing your own recursive functions.

# List basics

The type constructor for lists in the Haskell Prelude is []. The type declaration for a list holding values of type Int is written as follows:

xs :: [Int]    -- or equivalently, but less conveniently,
xs :: [] Int

Lists in Haskell are homogeneous sequences, which is to say that all elements must be of the same type. Unlike tuples, list type is not affected by length:

[1,2,3]   :: [Int]
[1,2,3,4] :: [Int]

Lists are constructed using two constructors:

  • `[]` constructs an empty list.
  • `(:)`, pronounced "cons", prepends elements to a list. Consing `x` (a value of type `a`) onto `xs` (a list of values of the same type `a`) creates a new list, whose **head** (the first element) is `x`, and **tail** (the rest of the elements) is `xs`.
  • We can define simple lists as follows:

    ys :: [a]
    ys = []
    
    xs :: [Int]
    xs = 12 : (99 : (37 : []))   
    -- or  = 12 : 99 : 37 : []     -- ((:) is right-associative)
    -- or  = [12, 99, 37]          -- (syntactic sugar for lists)
    
    

    Note that (++), which can be used to build lists is defined recursively in terms of (:) and [].

    # Ranges

    Creating a list from 1 to 10 is simple using range notation:

    [1..10]    -- [1,2,3,4,5,6,7,8,9,10]
    
    

    To specify a step, add a comma and the next element after the start element:

    [1,3..10]  -- [1,3,5,7,9]
    
    

    Note that Haskell always takes the step as the arithmetic difference between terms, and that you cannot specify more than the first two elements and the upper bound:

    [1,3,5..10] -- error
    [1,3,9..20] -- error
    
    

    To generate a range in descending order, always specify the negative step:

    [5..1]     -- []
    
    [5,4..1]   -- [5,4,3,2,1]
    
    

    Because Haskell is non-strict, the elements of the list are evaluated only if they are needed, which allows us to use infinite lists. [1..] is an infinite list starting from 1. This list can be bound to a variable or passed as a function argument:

    take 5 [1..]   -- returns [1,2,3,4,5] even though [1..] is infinite
    
    

    Be careful when using ranges with floating-point values, because it accepts spill-overs up to half-delta, to fend off rounding issues:

    [1.0,1.5..2.4]    -- [1.0,1.5,2.0,2.5] , though 2.5 > 2.4
    
    [1.0,1.1..1.2]    -- [1.0,1.1,1.2000000000000002] , though 1.2000000000000002 > 1.2
    
    

    Ranges work not just with numbers but with any type that implements Enum typeclass. Given some enumerable variables a, b, c, the range syntax is equivalent to calling these Enum methods:

    [a..]    == enumFrom a
    [a..c]   == enumFromTo a c
    [a,b..]  == enumFromThen a b
    [a,b..c] == enumFromThenTo a b c
    
    

    For example, with Bool it's

    
    [False ..]      -- [False,True]
    
    

    Notice the space after False, to prevent this to be parsed as a module name qualification (i.e. False.. would be parsed as . from a module False).

    # List Literals

    emptyList     = []
    
    singletonList = [0]               -- = 0 : []
    
    listOfNums    = [1, 2, 3]         -- = 1 : 2 : [3]
    
    listOfStrings = ["A", "B", "C"]
    
    

    # List Concatenation

    listA      = [1, 2, 3]
    
    listB      = [4, 5, 6]
    
    listAThenB = listA ++ listB       -- [1, 2, 3, 4, 5, 6]
    
    (++) xs     [] = xs
    (++) []     ys = ys
    (++) (x:xs) ys = x : (xs ++ ys)
    
    

    # Accessing elements in lists

    Access the nth element of a list (zero-based):

    list = [1 .. 10]
    
    firstElement = list !! 0           -- 1
    
    

    Note that !! is a partial function, so certain inputs produce errors:

    list !! (-1)     -- *** Exception: Prelude.!!: negative index  
    
    list !! 1000     -- *** Exception: Prelude.!!: index too large
    
    

    There's also Data.List.genericIndex, an overloaded version of !!, which accepts any Integral value as the index.

    import Data.List (genericIndex)
    
    list `genericIndex` 4              -- 5
    
    

    When implemented as singly-linked lists, these operations take O(n) time. If you frequently access elements by index, it's probably better to use Data.Vector (from the vector package) or other data structures.

    # Basic Functions on Lists

    head [1..10]       --    1
    
    last [1..20]       --    20
    
    tail [1..5]        --    [2, 3, 4, 5]
    
    init [1..5]        --    [1, 2, 3, 4]
    
    length [1 .. 10]   --    10
    
    reverse [1 .. 10]  --    [10, 9 .. 1]
    
    take 5 [1, 2 .. ]  --    [1, 2, 3, 4, 5]
    
    drop 5 [1 .. 10]   --    [6, 7, 8, 9, 10]
    
    concat [[1,2], [], [4]]   --    [1,2,4]
    
    

    # foldl

    This is how the left fold is implemented. Notice how the order of the arguments in the step function is flipped compared to foldr (the right fold):

    foldl :: (b -> a -> b) -> b -> [a] -> b
    foldl f acc []     =  acc
    foldl f acc (x:xs) =  foldl f (f acc x) xs         -- = foldl f (acc `f` x) xs  
    
    

    The left fold, foldl, associates to the left. That is:

    foldl (+) 0 [1, 2, 3]     -- is equivalent to ((0 + 1) + 2) + 3
    
    

    The reason is that foldl is evaluated like this (look at foldl's inductive step):

    foldl (+) 0 [1, 2, 3]                        --  foldl (+)    0   [ 1,   2,   3 ]
    foldl (+) ((+) 0 1) [2, 3]                   --  foldl (+)   (0 + 1)   [ 2,   3 ]
    foldl (+) ((+) ((+) 0 1) 2) [3]              --  foldl (+)  ((0 + 1) + 2)   [ 3 ]
    foldl (+) ((+) ((+) ((+) 0 1) 2) 3) []       --  foldl (+) (((0 + 1) + 2) + 3) []
    ((+) ((+) ((+) 0 1) 2) 3)                    --            (((0 + 1) + 2) + 3)
    
    

    The last line is equivalent to ((0 + 1) + 2) + 3. This is because (f a b) is the same as (afb) in general, and so ((+) 0 1) is the same as (0 + 1) in particular.

    # foldr

    This is how the right fold is implemented:

    foldr :: (a -> b -> b) -> b -> [a] -> b
    foldr f z []     = z
    foldr f z (x:xs) = f x (foldr f z xs)              -- = x `f` foldr f z xs
    
    

    The right fold, foldr, associates to the right. That is:

    foldr (+) 0 [1, 2, 3]      -- is equivalent to 1 + (2 + (3 + 0))
    
    

    The reason is that foldr is evaluated like this (look at the inductive step of foldr):

    foldr (+) 0 [1, 2, 3]                        --          foldr (+) 0  [1,2,3]
    (+) 1 (foldr (+) 0 [2, 3])                   -- 1 +        foldr (+) 0  [2,3]
    (+) 1 ((+) 2 (foldr (+) 0 [3]))              -- 1 + (2 +     foldr (+) 0  [3])
    (+) 1 ((+) 2 ((+) 3 (foldr (+) 0 [])))       -- 1 + (2 + (3 +  foldr (+) 0 []))
    (+) 1 ((+) 2 ((+) 3 0))                      -- 1 + (2 + (3 +            0   ))
    
    

    The last line is equivalent to 1 + (2 + (3 + 0)), because ((+) 3 0) is the same as (3 + 0).

    # Transforming with map

    Often we wish to convert, or transform the contents of a collection (a list, or something traversable). In Haskell we use map:

    
    -- Simple add 1
     map (+ 1) [1,2,3]
     [2,3,4]
     
     map odd [1,2,3]
     [True,False,True]
     
     data Gender = Male | Female deriving Show
     data Person = Person String Gender Int deriving Show
    
     -- Extract just the age from a list of people
     map (\(Person n g a) -> a) [(Person "Alex" Male 31),(Person "Ellie" Female 29)]
     [31,29]
    
    

    # Filtering with filter

    Given a list:

    
    li = [1,2,3,4,5]
    
    

    we can filter a list with a predicate using filter :: (a -> Bool) -> [a] -> [a]:

    
    filter (== 1) li       -- [1]
     
     filter (even) li       -- [2,4]
     
     filter (odd) li        -- [1,3,5]
     
     -- Something slightly more complicated
     comfy i = notTooLarge && isEven
       where 
          notTooLarge = (i + 1) < 5
          isEven = even i
     
     filter comfy li        -- [2]
    
    

    Of course it's not just about numbers:

    
    data Gender = Male | Female deriving Show
     data Person = Person String Gender Int deriving Show
     
     onlyLadies :: [Person] -> Person
     onlyLadies x = filter isFemale x
       where 
         isFemale (Person _ Female _) = True
         isFemale _ = False
     
     onlyLadies [(Person "Alex" Male 31),(Person "Ellie" Female 29)]
     -- [Person "Ellie" Female 29]
    
    

    # Zipping and Unzipping Lists

    zip takes two lists and returns a list of corresponding pairs:

    zip []     _      = []
    zip _      []     = []
    zip (a:as) (b:bs) = (a,b) : zip as bs
    
    > zip [1,3,5] [2,4,6]
    > [(1,2),(3,4),(5,6)]
    
    

    Zipping two lists with a function:

    zipWith f  []     _      = []
    zipWith f  _      []     = []
    zipWith f  (a:as) (b:bs) = f a b : zipWith f as bs
    
    > zipWith (+) [1,3,5] [2,4,6]
    > [3,7,11]
    
    

    Unzipping a list:

    unzip = foldr (\(a,b) ~(as,bs) -> (a:as,b:bs)) ([],[])
    
    > unzip [(1,2),(3,4),(5,6)]
    > ([1,3,5],[2,4,6])
    
    

    # Syntax

  • empty list constructor `[] :: [a]`
  • non-empty list constructor `(:) :: a -> [a] -> [a]`
  • head - returns the first value of a list `head :: [a] -> a`
  • last - returns the last value of a list `last :: [a] -> a`
  • tail - returns a list without the first item `tail :: [a] -> [a]`
  • init - returns a list without the last item `init :: [a] -> [a]`
  • xs !! i - return the element at an index i in list xs `(!!) :: Int -> [a] -> a`
  • take n xs - return new list containing n first elements of the list xs `take :: Int -> [a] -> [a]`
  • map :: (a -> b) -> [a] -> [b]
  • filter :: (a -> Bool) -> [a] -> [a]
  • (++) :: [a] -> [a]
  • concat :: [[a]] -> [a]
  • # Remarks

    1. The type [a] is equivalent to [] a.
    2. [] constructs the empty list.
    3. [] in a function definition LHS, e.g. f [] = ..., is the empty list pattern.
    4. x:xs constructs a list where an element x is prepended to the list xs
    5. f (x:xs) = ... is a pattern match for a non-empty list where x is the head and xs is the tail.
    6. f (a:b:cs) = ... and f (a:(b:cs)) = ... are the same. They are a pattern match for a list of at least two elements where the first element is a, the second element is b, and the rest of the list is cs.
    7. f ((a:as):bs) = ... is NOT the same as f (a:(as:bs)) = .... The former is a pattern match for a non-empty list of lists, where a is the head of the head, as is the tail of the head, and bs is the tail.
    8. f (x:[]) = ... and f [x] = ... are the same. They are a pattern match for a list of exactly one element.
    9. f (a:b:[]) = ... and f [a,b] = ... are the same. They are a pattern match for a list of exactly two elements.
    10. f [a:b] = ... is a pattern match for a list of exactly one element where the element is also a list. a is the head of the element and b is the tail of the element.
    11. [a,b,c] is the same as (a:b:c:[]). Standard list notation is just syntactic sugar for the (:) and [] constructors.
    12. You can use all@(x:y:ys) in order to refer to the whole list as all (or any other name you choose) instead of repeating (x:y:ys) again.