This document is a reference for Haskell style and usage. It covers libraries, language features, and code style.
- Stack with Hpack is the build system of choice. On a per-project basis, this permits concise package.yaml files that are generated into version-compatible .cabal files.
- -Wall, -Werror, and -fwarn-tabs for warnings
- -threaded (executables only) for the threaded runtime
- -O2 for runtime performance (recommended), though it slows compilation
- -rtsopts, -with-rtsopts=-N, and -with-rtsopts=-T (executables only) for enabling runtime options, using all processors, and showing runtime statistics, respectively
Pragmas to use anytime (and should be turned on by default):
- ConstraintKinds to use constraints as type synonyms
- DataKinds to use data constructors as types
- DefaultSignatures to allow default implementations inside a class declaration for all or some of its methods
- DeriveDataTypeable for deriving Data and Typeable
- DeriveGeneric for deriving Generic
- EmptyDataDecls for empty data declarations like data Foo
- FlexibleContexts to relax some restrictions on the form of contexts in type signatures
- FlexibleInstances to relax some restrictions on the form of an instance head
- FunctionalDependencies to constrain the parameters of a multi-parameter class by stating that one of the parameters can be determined from the others
- GADTs to allow generalized ADTs
- GeneralizedNewtypeDeriving to derive (almost) any class through a newtype
- InstanceSigs to give type signatures in instances, which can be useful documentation
- LambdaCase to desugar \case to \x -> case x of
- MultiParamTypeClasses to have classes with multiple parameters
- NamedFieldPuns to bind local names to field names of a record
- NoImplicitPrelude to allow use of classy prelude
- NoMonomorphismRestriction to prevent the compiler from filling in free types with defaults
- OverloadedStrings to allow string literals to be interpreted as different string representations
- PackageImports to allow imports qualified by package name
- PatternSynonyms to provide names to pattern matches
- PolyKinds to allow declarations to have kind variables, like data Proxy (a :: k)
- QuasiQuotes to enable quasi-quotation syntax, used alongside Template Haskell
- RankNTypes to allow forall quantifiers nested inside a type, not only at the outermost level
- RecordWildCards to allow wildcards (..) in record patterns and record construction
- ScopedTypeVariables to allow type variables bound by an explicit forall to be used in the body
- StandaloneDeriving to allow deriving separately from the type declaration
- TemplateHaskell to turn on Template Haskell
- TupleSections to allow (a,) to desugar to \b -> (a, b)
- TypeApplications to provide explicit type arguments to polymorphic functions
- TypeFamilies to allow type families
- TypeOperators to allow operator symbols as type names
- ViewPatterns to allow function application on values as they are pattern matched
Pragmas to use sparingly:
- UndecidableInstances (see this explanation for details)
Pragmas to probably panic about:
- OverlappingInstances
Essential libraries and frameworks:
- aeson for JSON parsing and serialization
- classy-prelude like prelude, but way better
- ekg and associated libraries for statistics
- lens for high powered operations on data types
- opaleye to access a database
- servant for web applications
More high-powered libraries:
- syb for fancy recursion
- vinyl for HLists and the like
In general, adhere to a certain flavor of style for a cohesive feel:
- Indentation level is 2 spaces
- Column width is 100
- Dealing with multiple lines:
  - When in doubt, adhere to Stylish Haskell
  - Multiline function calls should add an indentation level to any extra lines in order to distinguish those lines from surrounding code
  - Multiline function definitions should align arrows (->)
    - Function definition constraints may go on the same line as the function name
    - Start a newline and add an indent level (no need to be aligned under ::)
  - Multiline lists should add a space after the list-opening character ([ or (), align the first character of each line, and start each line with a comma
  - Multiline expressions should have operators at the beginning of each continued line, indented from the first.
    - Exception: place $ at the end of the line, mostly.
  - Multiline imports should add a newline after the module name and indent (maintained by stylish-haskell)
- No wildcard imports except for ClassyPrelude
  - Okay in tests if you wildcard import the module you are testing
  - When you do this, import it in a separate section and call out the fact that you are doing it only because it's a test
module FooSpec where

import ClassyPrelude
import LiterallyAnythingElseDontYouDareWildcard ()

-- imported wildcard because it's being tested
import Foo
- In general, no orphan instances (though sometimes they may be necessary)
  - Okay in tests
  - Always write these in a separate module named <ClassName>OrphanInstances.hs
- Avoid importing multiple modules qualified under the same name (e.g. import qualified Foo as F and import qualified Bar as F together is bad)
- Record field names should be prefixed by the entire type name in camel case
  - If lenses are generated for this type, each record name should start with an underscore
  - An added advantage of declaring record types this way is that derived JSON instances do not have collisions
- Avoid & and the lens operators, e.g. .~.
  - Exception: certain APIs lead to very unwieldy code unless the lens operators are used, specifically swagger2. In that case we use them liberally.
- Avoid unnecessary parentheses
  - Exception: instance declarations that have constraints are easier to read if any constraints (or even a single constraint) are surrounded by parens. See example below.
Some examples as a guide. These aren't rules, but use best judgement to write code that is readable and easy to understand.
When using let clauses, try to define as many clauses as possible in the same block without reusing the let keyword. For pure functions, this is enforced by the compiler, but within a do block it is up to the programmer. For example:
foo :: a -> b -> c -> m d
foo x y z = do
  -- NOT THIS:
  let bar = baz x
  let bin = qux y
  ...
  -- this instead:
  let bar = baz x
      bin = qux y
  ...
Where clauses are nice when a function can be written concisely using functions that need not be at the module level. A simple example is a fold:
foo :: [a] -> Map b c
foo = foldr makeMap mempty
  where
    makeMap = ...
Where clauses are also nice when using explicit type signatures.
Where clauses should add an indentation level to the where keyword, then a newline, then another indentation level, unless the function (or value) can be inlined.
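As a runnable illustration of the fold-with-where pattern, here is a minimal sketch. The names countWords and bump are hypothetical, and it assumes Data.Map.Strict from the containers package is available.

```haskell
import qualified Data.Map.Strict as Map

-- | Count how often each word occurs (hypothetical example function).
countWords :: [String] -> Map.Map String Int
countWords = foldr bump mempty
  where
    -- Local helper with an explicit signature. It is only meaningful
    -- inside countWords, so it lives in the where clause.
    bump :: String -> Map.Map String Int -> Map.Map String Int
    bump word = Map.insertWith (+) word 1

main :: IO ()
main = print (countWords ["a", "b", "a"])
```

The helper stays out of the module's top level, and its type signature documents what the fold is accumulating.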
If the record type will be part of the "environment" for any code (i.e. with MonadReader), use makeClassy ''MyType to generate a HasMyType class (as well as lenses for the fields). Otherwise, most of the time it’s useful to generate lenses with makeLenses ''MyType.
The fields in the declaration of a record type should be aligned:
data SomeReallyCoolType = SomeReallyCoolType
  { _someReallyCoolTypeName       :: Text -- ^What name is for.
  , _someReallyCoolTypeDetailInfo :: Foo  -- ^Something about detail.
  }
makeLenses ''SomeReallyCoolType
If you run out of space, the haddock comments can be moved to the line following each field. If so, all of them should be moved. But the fields need not be aligned when constructing a value, because these tend to be indented and the rhs may be long:
...
pure $ SomeReallyCoolType
  { _someReallyCoolTypeName = foo bar baz $ flarp <$> mconcat zebras
  , _someReallyCoolTypeDetailInfo = anotherLongExpression of someValues
  }
Being point-free is great until it becomes confusing. For instance, don't abuse flip in favor of writing a function point-free. Operators can also be confusing in point-free syntax, as they often need extra parentheses. Here's a bad example of a point-free function:
foo :: a -> b -> c -> Either Text d
foo x y = either ((<> " messed up") . ("this value is " <>)) id . flip (flip (bar x) y)
This is better:
foo :: a -> b -> c -> Either Text d
foo x y z = either (\ msg -> "this value is " <> msg <> " messed up") id $ bar z x y
Provide an explicit type (or a comment!) when code is confusing. A type is preferable to a comment (if it’s equally explanatory) because a type is checked by the compiler.
In general, stay within the line limit, indent by 2 spaces when needed, and don't be too fussy:
-- Keep it simple:
foo :: MySite site => Server site Foo

-- Break long lines:
foo :: MySite site
  => Foo -> Bar -> Baz -> Server site Bin

-- When there are many constraints, use a space inside the "(" to line them up, and add a line break if it helps:
doAThing ::
  ( HasEnv r, HasRedisPool r, HasServerSettings r
  , MonadLogger m, MonadReader r m, MonadUnliftIO m )
  => Foo -> Bar -> Baz -> m Bin

-- Special case: always use parens with a constraint after "instance":
instance (HasSwagger sub) => HasSwagger (MyAuth :> sub) where

-- Operators on the left, and indented:
"A long string "
  <> tshow foo <> ": "
  <> tshow bar

-- OK to indent creatively for better alignment:
Q.where_ (     x Q.^. FieldFoo Q.==. Q.val foo
         Q.&&. x Q.^. FieldBar Q.==. Q.val bar )

-- Consider an alternative if you want things aligned:
concat
  [ "A long string "
  , tshow foo , ": "
  , ...

-- $ is special:
fooSpec = describe "foo" $
  it "is awesome" $
    pure True

-- But you can also put it at the front if it works better
foo <- either throwIO pure
  . f (name <> " string") (object bar)
  myFunction
  $ myValue
When to use...
newtypes are great for type safety. With GeneralizedNewtypeDeriving turned on, (almost) any class can be derived through a newtype. The type itself is erased at runtime, so there's no performance detriment to using these. Declarations should adhere to at least one of the following rules: either the newtype has one record, named unConstructorName, or there is no record name (just a constructor), but makePrisms is invoked for the type.
data is for constructors with multiple fields, ADTs, or GADTs. Always use data for types which use deriveJSON or similar to derive instances which depend on the field name, even if there is only one field. (This is in keeping with the rule about only naming that field unConstructorName for newtypes.) Best practice for using ADTs is to define a detail data type for each branch of the ADT so you get something like data Foo = Bar BarDetail | Baz BazDetail. This is especially useful for prisms (in the lens family).
type synonyms (aliases) are great for ascribing a specific name to a repeated union of types. With ConstraintKinds turned on, a type synonym can also name a set of constraints for use in class or instance declarations, like type FooM a m = (Monad m, Foo a) with class FooM a m => Bar a m where. Don't use a type alias with a simple type as in type ImportantThing = Int; instead use a newtype and enjoy the warm feeling of type safety.
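The difference between a newtype and a plain alias can be sketched as follows. Cents, UserId, and total are hypothetical names chosen for illustration. The record fields follow the unConstructorName convention described above.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- | An amount of money in cents. Num is derived through the newtype,
-- so arithmetic works without unwrapping.
newtype Cents = Cents { unCents :: Int }
  deriving (Eq, Ord, Show, Num)

-- | A user identifier. It wraps the same Int representation as Cents,
-- but the compiler will not let the two be confused.
newtype UserId = UserId { unUserId :: Int }
  deriving (Eq, Ord, Show)

-- | Sum a list of prices; only Cents values are accepted.
total :: [Cents] -> Cents
total = sum

main :: IO ()
main = do
  print (total [Cents 150, Cents 250])
  -- print (total [UserId 1])  -- rejected by the type checker
```

Had both been declared as type aliases for Int, the commented-out line would compile and silently add a user id to a price.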
Lenses (and prisms, for ADTs) should be generated at compile time for any type that will be a part of any significant business logic in the code. Types that are only used in an outward-facing API, for example, need not generate lenses as they are typically only used at the edge of the server.
Applicative provides sequential application (<*>) on a value, while Monad provides binding (>>=). Based on the type of (>>=) :: m a -> (a -> m b) -> m b, we can see that Monad requires an a to be present in order to compute m b. But with (<*>) :: f (a -> b) -> f a -> f b, effects are composed/combined independently of each other. An important consequence of this is that in validation, failures can accumulate. For this reason, consider using Applicative when parsing or validating. Finally, it's always useful to consider using the weakest constraint when writing code, so if a block can be written using Applicative instead of Monad, it could be useful to do so.
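To make the point about accumulating failures concrete, here is a minimal, dependency-free sketch. The Validation type below is written out by hand for illustration and is not the one from any particular library; User, validateName, validateAge, and the two mkUser functions are hypothetical.

```haskell
-- | A hand-rolled validation type: like Either, but with an
-- Applicative instance that combines errors.
data Validation e a = Failure e | Success a
  deriving (Eq, Show)

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2) -- both errors are kept
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

data User = User { userName :: String, userAge :: Int }
  deriving (Eq, Show)

validateName :: String -> Validation [String] String
validateName name
  | null name = Failure ["name is empty"]
  | otherwise = Success name

validateAge :: Int -> Validation [String] Int
validateAge age
  | age < 0 = Failure ["age is negative"]
  | otherwise = Success age

-- | Applicative composition: every field is checked.
mkUser :: String -> Int -> Validation [String] User
mkUser name age = User <$> validateName name <*> validateAge age

toEither :: Validation e a -> Either e a
toEither (Failure e) = Left e
toEither (Success a) = Right a

-- | The same checks through Either, which stops at the first Left.
mkUserEither :: String -> Int -> Either [String] User
mkUserEither name age =
  User <$> toEither (validateName name) <*> toEither (validateAge age)

main :: IO ()
main = do
  print (mkUser "" (-1))       -- reports both problems
  print (mkUserEither "" (-1)) -- reports only the first
```

No Monad instance is given for Validation: a bind would need the first result before it could run the second check, so it could not report both errors.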
Operator chains (i.e. >>=, >>, >=>) are useful for point-free expression across a few lines. If abused, they can lead to confusing code. When in doubt, use do or combine operations in a let/in/where block.
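A small sketch of the trade-off, using only base. The functions halve, quarterChain, and quarterDo are hypothetical. Both versions do the same thing, and the do form gives each intermediate value a name.

```haskell
import Control.Monad ((>=>))
import Text.Read (readMaybe)

-- | Halve a number, failing on odd input.
halve :: Int -> Maybe Int
halve n
  | even n = Just (n `div` 2)
  | otherwise = Nothing

-- | Point-free operator chain: compact, fine for a few steps.
quarterChain :: String -> Maybe Int
quarterChain = readMaybe >=> halve >=> halve

-- | The same logic in do notation: longer, but easier to extend and debug.
quarterDo :: String -> Maybe Int
quarterDo input = do
  n <- readMaybe input
  half <- halve n
  halve half

main :: IO ()
main = do
  print (quarterChain "12")
  print (quarterDo "12")
```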
Classes are useful when operator overloading is needed, also known as ad-hoc polymorphism. Useful when many types have similar operations (like a simple database insert that always takes an entity and returns the entity plus its created key).
Take care when writing a class, and document any assumptions into laws to which the class should adhere. If possible, write property tests for such laws and expose them so alternative implementations can be tested.
A type family is a type-level function that returns a type. Very useful when a class of functions returns the same kind of thing, and in such a case acts as a witness. A very powerful implementation of this is when it is used with functions on GADTs, so that the type family may be a witness to each GADT constructor's return type.
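One way to picture a type family acting as a witness for GADT constructors is the sketch below. Tag, Expr, Result, and eval are hypothetical names; the example assumes a GHC recent enough to provide Data.Kind.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

-- | Type-level tags describing what an expression evaluates to.
data Tag = IntTag | BoolTag

-- | A tiny expression language indexed by its result tag.
data Expr (t :: Tag) where
  IntLit  :: Int -> Expr 'IntTag
  BoolLit :: Bool -> Expr 'BoolTag
  Add     :: Expr 'IntTag -> Expr 'IntTag -> Expr 'IntTag
  IsZero  :: Expr 'IntTag -> Expr 'BoolTag

-- | Type-level function from a tag to the Haskell type of the result.
type family Result (t :: Tag) :: Type where
  Result 'IntTag  = Int
  Result 'BoolTag = Bool

-- | Matching on a constructor fixes t, so Result t reduces to a
-- concrete type in each branch.
eval :: Expr t -> Result t
eval (IntLit n)  = n
eval (BoolLit b) = b
eval (Add a b)   = eval a + eval b
eval (IsZero e)  = eval e == 0

main :: IO ()
main = do
  print (eval (Add (IntLit 2) (IntLit 3)))
  print (eval (IsZero (IntLit 0)))
```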
Constraints are more flexible, as the only constraints that are needed for a function are the ones that are actually used. Monad transformers are much more static, as they require the valid monad stack.
Always use ExceptT. It is older and more widely supported.
Old rule, used in some places: In general, try not to use exceptions (fail). Sometimes they are unavoidable, if MonadError (throwError) is not on the stack. MonadError is almost always preferable, since it is recoverable. In general an exception will be considered a server error.
New rule, used in some places: when IO (or MonadIO) is in the stack, it may be more sane to use exceptions (i.e. throwIO) because 1) some libraries already force it on us, 2) there is a useful semantics for what happens when exceptions are thrown asynchronously, and 3) something intelligent-sounding about masking and error handling. When throwing exceptions, always create an exception type which records some meaningful information and document where exceptions can be thrown, within reason.
Be aware of which rule is in force and be consistent. Consider switching to the new rule especially when doing more asynchronous IO.
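Under the new rule, a dedicated exception type might look like the following sketch. It uses Control.Exception from base rather than any particular prelude, and ConfigError and lookupPort are hypothetical names.

```haskell
import Control.Exception (Exception, throwIO, try)

-- | Exception type that records what went wrong, not just that something did.
data ConfigError
  = ConfigMissingKey String
    -- ^The named key was not present
  | ConfigBadValue String String
    -- ^The key, and the value that failed to parse
  deriving (Eq, Show)

instance Exception ConfigError

-- | Look up the "port" setting.
-- Throws 'ConfigError' if the key is missing or is not a number.
lookupPort :: [(String, String)] -> IO Int
lookupPort settings =
  case lookup "port" settings of
    Nothing -> throwIO (ConfigMissingKey "port")
    Just raw ->
      case reads raw of
        [(port, "")] -> pure port
        _ -> throwIO (ConfigBadValue "port" raw)

main :: IO ()
main = do
  result <- try (lookupPort [("port", "eighty")])
  case result of
    Left err -> putStrLn ("caught: " <> show (err :: ConfigError))
    Right port -> print port
```

The haddock comment on lookupPort is where the "document where exceptions can be thrown" part of the rule is satisfied.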
It's OK to use fail only at the top level of a script or main program, for example when validating inputs.
Never, ever, ever use error. Exception: error "not implemented" is a convenient way to create a “hole” that will allow the program to run (unlike both _ and undefined, which are both rejected by the compiler when -Werror is enabled). These should always be cleaned up before code is deployed, and usually before it is merged.
One can intuit some differences between these functions from the following examples:
foldr preserves ordering while foldl' reverses the list (as would foldl).
foldr (:) [] ['a', 'b', 'c']
"abc"
foldl' (flip (:)) [] ['a', 'b', 'c']
"cba"
foldl never finishes evaluation on an infinite list, but foldr can, because it produces its result lazily.
foldl (\ xs x -> (x+1):xs) [] [(0 :: Int)..]
-- lazy version never exits
foldr (\ x xs -> (x+1):xs) [] [(0 :: Int)..]
-- all the things
foldl' runs in space proportional to its output, while foldr needs space at least proportional to its input.
:set +s
foldl' (+) 0 [(1 :: Int)..1000000000]
500000000500000000
(11.43 secs, 96,000,471,976 bytes)
foldr (+) 0 [(1 :: Int)..1000000000]
*** Exception: stack overflow
You should use ordNub/ordNubBy whenever possible. nub is an O(n^2) function that removes duplicate elements from a list. In particular, it keeps only the first occurrence of each element. ordNub is an O(n log n) version of the nub function that uses comparisons via Ord instead of Eq.
ClassyPrelude exports ordNub, ordNubBy, and hashNub (an O(n log n) function requiring Hashable and Eq).
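For intuition about where the O(n log n) bound comes from, here is one possible implementation built on Data.Set. It is a sketch under the assumption that the containers package is available, and ClassyPrelude's own ordNub may differ in detail; the name ordNubSketch is hypothetical.

```haskell
import Data.List (nub)
import qualified Data.Set as Set

-- | Remove duplicates, keeping the first occurrence of each element.
-- Each membership test and insert costs O(log n), so the whole pass
-- is O(n log n) instead of nub's O(n^2).
ordNubSketch :: Ord a => [a] -> [a]
ordNubSketch = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs
      | otherwise = x : go (Set.insert x seen) xs

main :: IO ()
main = do
  let input = [3, 1, 3, 2, 1] :: [Int]
  print (ordNubSketch input)
  print (ordNubSketch input == nub input)
```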
The language option TypeApplications allows a more compact syntax when it's necessary to provide a type hint to the compiler. Instead of using ::, provide a type preceded by @ in argument position (with no space between the @ and the type it decorates).
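A dependency-free sketch of the syntax, using read from the Prelude. The function names are hypothetical. Both parse functions do the same thing; the second passes the type as an argument instead of annotating the result.

```haskell
{-# LANGUAGE TypeApplications #-}

-- | Type hint given with an annotation.
parseWithAnnotation :: String -> Int
parseWithAnnotation input = (read input :: Int) + 1

-- | Type hint given with a type application.
parseWithApplication :: String -> Int
parseWithApplication input = read @Int input + 1

-- | Without a hint, show (read input) is ambiguous and does not compile.
normalizeDouble :: String -> String
normalizeDouble input = show (read @Double input)

main :: IO ()
main = do
  print (parseWithAnnotation "41")
  print (parseWithApplication "41")
  putStrLn (normalizeDouble "1.50")
```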
encode $ toSchema (Proxy :: Proxy Text)
:set -XTypeApplications
encode $ toSchema $ Proxy @Text
encode $ toSchema $ Proxy @(Map Text Int)
The language option PatternSynonyms allows naming of pattern matches.
Pattern synonyms are useful when you want to hide the representation of a datatype. For example, the containers package defines a type Seq representing finite lists. It is implemented as a special sort of tree, but the implementation is not exposed. Instead, the package defines pattern synonyms like Empty and :<| which allow you to match on a Seq as if it were a list:
head :: Seq a -> Maybe a
head Empty = Nothing
head (x :<| _) = Just x
There are two validation libraries, validation and either.
Never use a default QuickCheck instance for Text.
It is okay to use Test.QuickCheck.Instances.Time () for UTCTime instances, though there is a known issue around this in haskellari/qc-instances#13.
It is okay to use Test.QuickCheck.Instances.Semigroup () to get NonEmpty instances.
Never import Test.QuickCheck.Instances () wholesale.
MonadResource (and MonadMask, to some extent) operates similarly to a Python context manager (aka a with statement). It's important to keep track of resources created within a runResourceT block because they will be cleaned up when the block exits. A common example of this is sinkCacheLength, which can be used to read the number of bytes in a file before streaming it to S3. The behavior of sinkCacheLength is to "stream the input data into a temp file and count the number of bytes present. When complete, return a new Source reading from the temp file together with the length of the input in bytes." The temp file exists for the span of the resource block. Following are two (contrived) examples illustrating improper (first) and proper (second) use of this.
let source = sourceLbs "any source can go here"
cacheLength <- runResourceT $ map (over _1 fromIntegral) $ runConduit $
  source .| sinkCacheLength
-- this won't work because the temp file doesn't exist out here
doSomethingToIt url cacheLength -- stream the thing into S3
When defining an ADT like data Foo = Bar | Baz, it's important that the data constructors Bar and Baz for the type Foo take only unnamed arguments. Using named arguments can result in runtime exceptions from partial functions. The REPL is helpful in illustrating this.
data Foo = Bar { unBar :: Int } | Baz
:t unBar
unBar :: Foo -> Int
unBar $ Bar 1
1
unBar $ Baz
*** Exception: No match in record selector unBar
Because of this fact, never use record syntax with ADTs.
Furthermore, it's not inherently bad to declare data Foo = Foo Int Int, but as a matter of style it's good to give names to arguments. Instead of data Foo = Foo Int Int try data Foo = Foo FooDetail | Bar where data FooDetail = FooDetail { fooDetailFirstInt :: Int, fooDetailSecondInt :: Int }.
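The detail-type pattern can be sketched as below. The accessors fooDetail and firstInt are hypothetical additions for illustration; the point is that every function is total, so nothing like the record-selector exception can occur.

```haskell
-- | The fields live in a single-constructor record, so the generated
-- selectors are total.
data FooDetail = FooDetail
  { fooDetailFirstInt  :: Int
  , fooDetailSecondInt :: Int
  } deriving (Eq, Show)

data Foo
  = Foo FooDetail
  | Bar
  deriving (Eq, Show)

-- | Total accessor: the Bar case is handled explicitly.
fooDetail :: Foo -> Maybe FooDetail
fooDetail (Foo detail) = Just detail
fooDetail Bar = Nothing

-- | Reach a field through the detail type without any partial function.
firstInt :: Foo -> Maybe Int
firstInt = fmap fooDetailFirstInt . fooDetail

main :: IO ()
main = do
  print (firstInt (Foo (FooDetail 1 2)))
  print (firstInt Bar)
```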
In general, write sum types like this:
- Newline before =
- Align = and | at 2 spaces of indentation
- Docs for constructors indented 4 spaces (or if they are all short, they can all be on the same line as the constructor they pertain to.)
- Also align deriving at 2 spaces of indentation
-- |Doc for Foo
data Foo
  = FooBlah
    -- ^Doc for FooBlah
  | FooBleh
    -- ^Doc for FooBleh
  | FooBlorp
    -- ^Doc for FooBlorp
  deriving (Bleep, Blorp, Bloop)
And in general, write product types like this:
- Fields are prefixed by an underscore, and then the type name with the first letter lowercased. Fields are camel-cased.
- Align {, the leading commas, and } at 2 spaces of indentation
- Align the :: (and note that stylish-haskell can do this for you)
- Put deriving on the same line as }
-- |Doc for FooDetail
data FooDetail = FooDetail
  { _fooDetailOne           :: SomeType
    -- ^Doc for fooDetailOne
  , _fooDetailAnotherDetail :: SomeOtherType
    -- ^Doc for fooDetailAnotherDetail
  } deriving (Bleep, Blorp, Bloop)
Typically, group such data definitions near the top of the module. After the data declarations, there should typically be a group of clauses to derive any needed instances, such as makePrisms ''Foo and makeLenses ''FooDetail. Typically, use makePrisms for all sum types and makeLenses for all product types. As mentioned above, use makeClassy instead if the product type is an "env" type.