Data.Text
Text Literals
The OverloadedStrings language extension allows ordinary string literals to stand for Text values.

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

myText :: T.Text
myText = "overloaded"
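As a complement to the snippet above, here is a minimal sketch that compares a literal typed through OverloadedStrings with an explicit T.pack conversion. The names greeting and packed are illustrative only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- With OverloadedStrings, the type signature decides what the literal becomes.
greeting :: T.Text
greeting = "overloaded"

-- T.pack converts a String explicitly and works without the extension too.
packed :: T.Text
packed = T.pack "overloaded"

main :: IO ()
main = do
  TIO.putStrLn greeting        -- prints: overloaded
  print (greeting == packed)   -- prints: True
```

Both routes build the same value; the extension only removes the need to write T.pack around every literal.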
Checking if a Text is a substring of another Text

ghci> :set -XOverloadedStrings
ghci> import qualified Data.Text as T

isInfixOf :: Text -> Text -> Bool checks whether a Text is contained anywhere within another Text.

ghci> "rum" `T.isInfixOf` "crumble"
True

isPrefixOf :: Text -> Text -> Bool checks whether a Text appears at the beginning of another Text.

ghci> "crumb" `T.isPrefixOf` "crumble"
True

isSuffixOf :: Text -> Text -> Bool checks whether a Text appears at the end of another Text.

ghci> "rumble" `T.isSuffixOf` "crumble"
True
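The three predicates take the text being searched for first and the text being searched second. The sketch below wraps them in small helpers to make that argument order visible; isHaskellSource, isLineComment and mentionsTodo are hypothetical names chosen for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- True when the file name ends in ".hs".
isHaskellSource :: T.Text -> Bool
isHaskellSource name = ".hs" `T.isSuffixOf` name

-- True when the line, ignoring leading whitespace, starts with "--".
isLineComment :: T.Text -> Bool
isLineComment line = "--" `T.isPrefixOf` T.stripStart line

-- True when "TODO" occurs anywhere in the line.
mentionsTodo :: T.Text -> Bool
mentionsTodo line = "TODO" `T.isInfixOf` line

main :: IO ()
main = do
  print (isHaskellSource "Main.hs")      -- True
  print (isLineComment "  -- a note")    -- True
  print (mentionsTodo "nothing here")    -- False
```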
Stripping whitespace

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

myText :: T.Text
myText = "\n\r\t leading and trailing whitespace \t\r\n"

strip removes whitespace from the start and end of a Text value.

ghci> T.strip myText
"leading and trailing whitespace"

stripStart removes whitespace only from the start.

ghci> T.stripStart myText
"leading and trailing whitespace \t\r\n"

stripEnd removes whitespace only from the end.

ghci> T.stripEnd myText
"\n\r\t leading and trailing whitespace"

filter can be used to remove whitespace, or other characters, from the middle. The predicate must be parenthesised when it is written as an operator section.

ghci> T.filter (/= ' ') "spaces in the middle of a text string"
"spacesinthemiddleofatextstring"
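The filter example above removes only the space character. As a rough sketch of two common variations, the code below removes every kind of whitespace with Data.Char.isSpace, and collapses runs of whitespace into single spaces with T.words and T.unwords (functions not covered above). The helper names are illustrative.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Char (isSpace)
import qualified Data.Text as T

-- Drop every whitespace character, including tabs and newlines.
removeAllWhitespace :: T.Text -> T.Text
removeAllWhitespace = T.filter (not . isSpace)

-- Trim both ends and squeeze each internal run of whitespace to one space.
collapseWhitespace :: T.Text -> T.Text
collapseWhitespace = T.unwords . T.words

main :: IO ()
main = do
  print (removeAllWhitespace " a b\tc\n")    -- "abc"
  print (collapseWhitespace "  a \t b\n c ") -- "a b c"
```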
Splitting Text Values

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

myText :: T.Text
myText = "mississippi"

splitOn breaks a Text up into a list of Texts on occurrences of a substring.

ghci> T.splitOn "ss" myText
["mi","i","ippi"]

splitOn is the inverse of intercalate.

ghci> T.intercalate "ss" (T.splitOn "ss" "mississippi")
"mississippi"

split breaks a Text value into chunks on characters that satisfy a Boolean predicate.

ghci> T.split (== 'i') myText
["m","ss","ss","pp",""]
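To show how these functions combine, here is a small sketch that splits a comma-separated line into trimmed fields and checks the splitOn/intercalate round trip. parseFields and roundTrips are hypothetical helpers, and the sketch assumes the separator is non-empty, since splitOn rejects an empty delimiter.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- Split on commas and trim whitespace around each field.
parseFields :: T.Text -> [T.Text]
parseFields = map T.strip . T.splitOn ","

-- Joining the pieces with the same non-empty separator restores the input.
roundTrips :: T.Text -> T.Text -> Bool
roundTrips sep t = T.intercalate sep (T.splitOn sep t) == t

main :: IO ()
main = do
  print (parseFields "name, age ,city")   -- ["name","age","city"]
  print (T.splitOn "," "a,,b")            -- ["a","","b"]
  print (roundTrips "ss" "mississippi")   -- True
```

Adjacent separators produce empty fields rather than being merged, which is what makes the round trip exact.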
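Encoding and Decoding Text

Encoding and decoding functions for a variety of Unicode encodings can be found in the Data.Text.Encoding module.

ghci> import Data.Text.Encoding
ghci> decodeUtf8 (encodeUtf8 "my text")
"my text"

Note that decodeUtf8 will throw an exception on invalid input. If you want to handle invalid UTF-8 yourself, use decodeUtf8With. Its first argument is a handler that receives an error description and the offending byte, if any; returning Nothing drops the invalid input, while returning Just c substitutes the character c. In the example below, messyOutsideData stands for a ByteString from an untrusted source.

ghci> decodeUtf8With (\errorDescription input -> Nothing) messyOutsideData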
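As a sketch of two other ways to avoid the exception, the code below uses decodeUtf8', which returns an Either instead of throwing, and decodeUtf8With together with lenientDecode from Data.Text.Encoding.Error, which substitutes U+FFFD for invalid input. The names invalidBytes, decodeStrict and decodeLenient are illustrative, and the sample bytes are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)

-- "hi" followed by 0xFF, a byte that never appears in valid UTF-8.
invalidBytes :: BS.ByteString
invalidBytes = BS.pack [0x68, 0x69, 0xFF]

-- Report failure as Nothing instead of throwing.
decodeStrict :: BS.ByteString -> Maybe T.Text
decodeStrict = either (const Nothing) Just . decodeUtf8'

-- Keep going, replacing invalid input with U+FFFD.
decodeLenient :: BS.ByteString -> T.Text
decodeLenient = decodeUtf8With lenientDecode

main :: IO ()
main = do
  print (decodeStrict (encodeUtf8 "my text"))  -- Just "my text"
  print (decodeStrict invalidBytes)            -- Nothing
  print (decodeLenient invalidBytes)           -- "hi\65533"
```

Which variant fits depends on whether malformed input should be rejected outright or tolerated with a visible replacement character.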
Indexing Text

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

myText :: T.Text
myText = "mississippi"

Characters at specific indices can be returned by the index function.

ghci> T.index myText 2
's'

The findIndex function takes a predicate of type (Char -> Bool) and a Text, and returns the index of the first character that satisfies the predicate, or Nothing if no character does.

ghci> T.findIndex ('s'==) myText
Just 2
ghci> T.findIndex ('c'==) myText
Nothing

The count function returns the number of times a query Text occurs within another Text.

ghci> T.count ("miss"::T.Text) myText
1
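index fails with a runtime error when the position is out of range, so a total variant can be useful. The following is one possible sketch built from T.drop and T.uncons; safeIndex is a hypothetical helper, not part of Data.Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

-- Total lookup: Nothing for negative or out-of-range positions.
safeIndex :: T.Text -> Int -> Maybe Char
safeIndex t i
  | i < 0     = Nothing
  | otherwise = fmap fst (T.uncons (T.drop i t))

word :: T.Text
word = "mississippi"

main :: IO ()
main = do
  print (safeIndex word 2)            -- Just 's'
  print (safeIndex word 42)           -- Nothing
  print (T.findIndex (== 'p') word)   -- Just 8
  print (T.count "ss" word)           -- 2
```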
Remarks

Text is a more efficient alternative to Haskell’s standard String type. String is defined as a linked list of characters in the standard Prelude, per the Haskell Report:

type String = [Char]

Text is represented as a packed array of Unicode characters. This is similar to how most other high-level languages represent strings, and gives much better time and space efficiency than the list version.
Text should be preferred over String for all production usage. A notable exception is depending on a library which has a String API, but even in that case there may be a benefit of using Text internally and converting to a String just before interfacing with the library.
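A minimal sketch of converting at the boundary, assuming a library that only accepts String: legacyShout below is a made-up stand-in for such a function, and shout wraps it so the rest of the program can stay in Text.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Char (toUpper)
import qualified Data.Text as T

-- Hypothetical stand-in for a third-party function with a String-only API.
legacyShout :: String -> String
legacyShout = map toUpper

-- Convert to String just before the call and back to Text right after it.
shout :: T.Text -> T.Text
shout = T.pack . legacyShout . T.unpack

main :: IO ()
main = print (shout "text inside")   -- "TEXT INSIDE"
```

T.unpack and T.pack each traverse the whole value, so it is usually worth keeping such conversions at the edges rather than inside loops.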
All of the examples in this topic use the OverloadedStrings language extension.