tl;dr - In this post I go through a bunch of improvments and extras to the code from part 3, 2 years after this initial series was started. The improvements range from development aids to post-deployment “observability” (metrics, logging, tracing) and how I’ve handled them in the past. It’s pretty ridiculous to finish a blog series 2 years later, so I also released the code for a Job Board API I wrote (also the frontend that manages it) for those who want to see how a CRUD-y production Haskell app looks.
This is a multi part blog post with the following sections:
This is the (much delayed) final post in a 4-part series where we worked our way through various layers of DDD; first the domain models themselves, then components that manipulate them, using the mtl
(Monad Transformer Library) approach and ReaderT pattern (as opposed to Free monads or simple monad stacking), culminating in a working API. One particularly interesting bit of this post will likely be the discussion on monad transformers, mtl, the ReaderT pattern, and the approach I use which I’ve dubbed “classy mtl readers”, with a look at the horizon, which to me is Free/FreeR monads.
Without much further ado let’s jump right into how we can improve the code and conventions we’ve used in parts 0-3.
REPLs are excellent for improving development speed, and while stack ghci
works great for me in terms of simple recompiling (repeatedly typing :r<enter>
), a great way to improve this experience is to employ a continuous typechecker/recompiler. Yes, you could wire this up yourself with entr
(which is an excellent project btw), but luckily Haskell has some native projects to do it for you as well. I recently watched a fantastic talk on haskell Terminal UIs by the folks over at FPComplete on Youtube which brought this up and explained it very well.
Simply using stack
’s --file-watch
flag is enough to get some basic functionality working:
$ stack build --pedantic --fast --file-watch
Here’s an example straight from the stack
documentation:
stack build --pedantic --haddock --test --exec "echo Yay, it succeeded" --file-watch
Another option which is slightly different is ghcid, thanks to /u/Alexbrainbox on reddit for pointing this out to me in the comments surrounding Part 3 of this blog series. It’s got some great documentation as far as I can see, with a really clear description of what it is on the front:
Either “GHCi as a daemon” or “GHC + a bit of an IDE”. To a first approximation, it opens ghci and runs :reload whenever your source code changes, formatting the output to fit a fixed height console. Unlike other Haskell development tools, ghcid is intended to be incredibly simple. In particular, it doesn’t integrate with any editors, doesn’t depend on GHC the library and doesn’t start web servers.
Linting is a developer aid I’ve come to value more as time has progressed, and hfmt
seems to be the go-to tool for that in Haskell land. To use hfmt
in my project I needed to add the following extra dependency in stack.yaml
:
# ... other YAML ...
extra-deps:
- hindent-5.2.7
# ... other YAML ...
After that everything went just as the documentation states:
$ stack install hfmt
# ... lots of build output ...
$ hfmt
# .. big patch outlining all the changes to various files ..
$ hfmt -w
The last command, hfmt -w
, runs hfmt
and (over)writes the relevant files. I’ll leave it up to you to integrate this with your own environment (whether setting up hfmt
to run on every file edit or every file save, or whenever else).
In the vein of hfmt
there are other tools that help at various levels, in particular:
stylish-haskell
haskell-tools
’s refactorhindent
(used by hfmt
)And of course it goes without saying, hlint
is an excellent tool that you should be using, as I’ve mentioned in part 1 and part 2. I personally derive much more value from hlint
than hfmt
(mostly because I don’t like some of the choices that hfmt
makes).
NOTE - I foudn myself going back and re-formatting a bunch of things that hfmt
did… The idea is good (standardizing formatting), and I’m sure hfmt
has lots of ways to customize but I think for now I’m going to just keep styling somewhat manually.
The strictness annotation, !
, turns on strict evaluation in Haskell. One common practice that I haven’t done on the codebase so far is to make sure that all data
declarations have as many fields as is reasonable set to strict evaluation. An SO post on this has some pretty good advice, stating:
A common rule of thumb is to make data structures strict when:
- You expect to strictly traverse the whole thing and retain it, so there’s no point in the overhead of laziness; or
- The fields are “small”—less than or equal to the size of a pointer—and you’d like the compiler to unbox them to avoid unnecessary indirections. (-funbox-small-strict-fields is on by default since GHC 7.7.)
Strictness also recently came up in a thread on reddit, and as /u/ElvishJerricco noted in a comment there:
Frankly it requires knowledge of runtime semantics for it to be clear when strictness is a performance advantage. Laziness and strictness are, after all, mostly just optimizations in terminating computations. I might even avoid the concept of laziness until the students are at a point that runtime semantics are relevant to them.
While I don’t claim to understand enough about the Haskell runtimes, thunks, and evaluation to know every instance that you should use strict – most of the time I’m working with some data object, I can get behind having those computations by default be strictly evaluated (as in, the change to the object is made immediately).
We can make a bunch of improvements on our current code by straightening out our ToJSON
and FromJSON
instances (from aeson
). This was inspired by a recent talk by Joe Kachmar called “Production Haskell Demystified. There are tons of great tips in that talk but the one I want to highlight here is slightly easier-to-use generated JSON for our models. For example with Task
:
data Task = Task { tName :: TaskName
, tDesc :: TaskDesc
, tState :: TaskStateValue
} deriving (Eq, Show, Read, Generic)
instance ToJSON Task
The default ToJSON
instance here will produce objects like this:
{
"tName": "<name>",
"tDesc": "<description>",
"tState": "Finished | InProgress | NotStarted"
}
This is OK, but the prefixing to avoid Haskell’s issue with name clashes for record fields is a bit of an unfortunate side effect for any frontends that might be interacting with the API. Let’s fix this by slightly modifying the default ToJSON
instance:
import Data.Aeson(FromJSON (..), ToJSON (..),
Value (..), object, (.=), genericToJSON, defaultOptions, Options(..))
-- ... other code ...
instance ToJSON Task where
toJSON = genericToJSON Data.Aeson.defaultOptions { fieldLabelModifier = (drop 1) }
There are lots of great tips in the talk (for example testing the JSON instances using Arbitrary
auto-generated examples), so I would highly recommend watching it.
One of the best things to come out of the modern web development sphere has been an attempt to standardize on how we describe REST-ful APIs. While every API is REST-ful to different degrees (leaving aside the question of whether REST is the right paradigm at all), closely adhering to at least some of the tenets of REST across the board has enabled people to explore creating machinery to document and automate creating APIs – enter Swagger/OpenAPI, RAML and API Blueprint. Thanks to the work of some Haskellers, Haskell hasn’t been left behind – the servant-swagger
offers Swagger 2.0 (AKA OpenAPI v1) support (one step back from the most recent which is is OpenAPI v3).
servant-swagger
uses a annotation-through-typeclass-definition approach, along with exposing some functionality to make use of the annotations to generate/produce documentation as necessary. How I’ve found myself using it is writing the appropriate instances, and then writing some additional command line subcommand to handle generating swagger and writing it to a file. Here’s what that looks like for the codebase we have so far:
Types.hs
:
-- These imports were required for the ToSchema instance for (WithUUID a)
import Data.Swagger (ToSchema(..), schemaName, NamedSchema(..), Schema(..), toSchema, Referenced(..))
import Data.Proxy (Proxy(..))
import qualified Data.HashMap.Strict.InsOrd as DHMSIO
-- Easy ToSchema instances for the simpler data types
instance ToSchema Task
instance ToSchema (Complete TaskF)
instance ToSchema (Partial TaskF)
-- Manual ToSchema instance for the (WithUUID a) compound type
-- This needed to be manual since the "uuid" property needs to be injected
-- Why does "uuid" property need to be injected? Since that's how the custom ToJSON instance for (WithUUID a) works
-- The objects that ToJSON (WithUUID a) instance generates have the shape {"uuid": "...", "fielda": "...", ...}
instance ToSchema a => ToSchema (WithUUID a) where
declareNamedSchema _ = updated
where
inner = declareNamedSchema (Proxy :: Proxy a)
-- Inject the "uuid" property into the named schema for the existential a
uuidKey = "uuid" -- see ToJSON instance for (WithUUID a)
uuidReferencedSchema = Inline $ toSchema (Proxy :: Proxy UUID)
injectUuidProperty = DHMSIO.insert uuidKey uuidReferencedSchema
-- Modify and rebuild a NamedSchema to inject a "uuid" property
modifyProps ns@NamedSchema{..} =
ns { _namedSchemaSchema =
_namedSchemaSchema {
_schemaProperties = injectUuidProperty $ _schemaProperties _namedSchemaSchema
}
}
updated = modifyProps <$> inner
The ToSchema
instance is pretty hairy, but would probably look a lot less hairy if I was using lens
. Since I’m not it looks bad, but it should still be fairly legible. As far as the actual changes to add the subcommand that generates the Swagger JSON, changes were required in Main.hs
as shown below:
-- swaggerOutputPath was added to the command line options object
data Options = Options
{ cfgPath :: Maybe FilePath
, cmd :: Command
, swaggerOutputPath :: Maybe FilePath -- <--- new
}
data Command
= Serve
| ShowConfig
| SwaggerGenerate -- <--- new
deriving (Show, Eq, Read)
-- ... other code ...
-- | Parser for commands
parseCommands :: Parser Command
parseCommands = subparser commands
where
serverCmd :: ParserInfo Command
serverCmd = info (pure Serve) (progDesc "Start the server")
showConfigCmd :: ParserInfo Command
showConfigCmd = info (pure ShowConfig) (progDesc "Show configuration")
swaggerGenerateCmd :: ParserInfo Command
swaggerGenerateCmd = info (pure SwaggerGenerate) (progDesc "Generate swagger JSON")
commands :: Mod CommandFields Command
commands = command "server" serverCmd <>
command "show-config" showConfigCmd <>
command "swagger-generate" swaggerGenerateCmd
-- ... other code ...
-- | Top level optparse-applicative parser for the entire CLI
parseCmdLine :: Parser Options
parseCmdLine = Options <$> parseOptions <*> parseCommands <*> parseOptions
-- ... other code ...
swaggerGenerate :: Options -> IO ()
swaggerGenerate Options{..} = when (isNothing swaggerOutputPath) printPathWarning
>> pure (toSwagger (Proxy :: Proxy TodoAPI))
>>= pure . encode
>>= BSL.writeFile path
>> printSuccessMessage
where
printPathWarning = putStrLn "[WARN] no swagger output path specified (--swaggerOutputPath)"
printSuccessMessage = putStrLn $ "[INFO] Generated Swagger JSON @ [" <> path <> "])"
path = maybe "swagger.json" id swaggerOutputPath
-- ... other code ...
main :: IO ()
main = parseOptions >>= process
where
cmdParser :: ParserInfo Options
cmdParser = info parseCmdLine idm
parseOptions :: IO Options
parseOptions = execParser cmdParser
process :: Options -> IO ()
process opts =
case cmd opts of
Serve -> runServer opts
ShowConfig -> showConfig opts
SwaggerGenerate -> swaggerGenerate opts
And to run the new subcommand:
$ stack exec haskell-restish-todo-exe swagger-generate
Unfortunately, making these changes forced me to undo the earlier changes to the ToJSON
instance for Task
(where we stripped the t
prefix from tName
) – the generated swagger.json
seemed to be ignoring that output:
"definitions": {
"Task": {
"required": [
"tName",
"tDesc",
"tState"
],
"properties": {
"tName": {
"type": "string"
},
"tDesc": {
"type": "string"
},
"tState": {
"type": "string"
}
},
"type": "object"
},
For that reason I reverted the changes – no point in having nice looking output if the Swagger is wrong – and I didn’t want to spend too much time looking into whether servant-swagger
(and swagger2
that it depends on) had the appropriate customization points. Of course there are other ways to solve this problem (there are other solutions to the record naming issue), but looks like the Haskell records issue strikes again!
While I don’t actually retrofit all the code to use lens
I did find a fantastic introductory talk on Lens by Ben Kolera, that I thought would be worth sharing.
It’s a bit late to be considering this now, but another way we could have improved the code we wrote was by using a custom prelude – a group of functions to use and overwrite some of the partial functions that exist as defaults. The great @sdiehl has some good writing on the subject if you’d like to read more.
An easy (and oft-used) example of how this might make our code safer is the simple function head
– if you try and run head []
, that value will be undefined
(AKA “bottom”), and will crash your program when used, but a safer prelude would specify a total function for head
, with a type signature like head :: [a] -> Maybe a
.
I’m going to just swap the default prelude out in favor of protolude
, and as the documentation indicates, all it takes is adding the NoImplicitPrelude
default extension (in package.yaml
, since I’m using stack with hpack
) and importing Protolude
. There isn’t a ton of benefit here, since I wasn’t relying too heavily on unsafe functions in the prelude, but one thing using Protolude
gave out of the box was a warning for use of undefined
in the code, which is nice.
There were tons of changes (way too many to list here), but I did manage to fit all the changes into one commit so check it out below:
EntityStore
to avoid liftIO
(💪 🎩)You might have noticed all the liftIO
calls and type casting littering the code, for example:
getTodoByUUID :: UUID -> AppHandler (WithUUID Task)
getTodoByUUID uuid =
ask >>= \(AppState _ estore) ->
liftIO
(getByUUID estore uuid :: IO (Either EntityStoreError (WithUUID Task))) >>=
rightOrServantErr genericServerError
Since EntityStore
is written for the IO
monad, we are forced to use liftIO
to “lift” the IO
computation into our own monadic context AppHandler (WithUUID Task)
. While this isn’t the end of the world, we can get rid of these pesky liftIO
s with (you guessed it) more abstraction! We’re going to write EntityStore
so that it’s less picky about the monad that it’s being run in. Being less specific about which monad we run in can enable lots of different usecases (we could run in some MockLogContext
that just logs operations for example), but we’re going to be using it just to get rid of the liftIO
s littering our handler code. We can get this advanced functionality by abstracting the monad we’ve been using in our operations to a polymorphic m
! Let’s put on our strong typing tophats (đź’Ş 🎩) and figure it out – basically replace everywhere you see IO
with m
:
-- | Generalized typeclass for entity storage.
class Monad m => SQLEntityStore store m
-- | Create an entity
where
create ::
forall entity.
( SQLInsertable entity
, SQLInsertable (WithUUID entity)
, FromRow (WithUUID entity)
)
=> store
-> Validated entity
-> m (Either EntityStoreError (WithUUID entity))
-- | Get an entity by ID
getByUUID ::
forall entity. (SQLInsertable entity, FromRow (WithUUID entity))
=> store
-> UUID
-> m (Either EntityStoreError (WithUUID entity))
-- | Update an existing entity by ID
updateByUUID ::
forall (entity :: FBounded).
( SQLInsertable (Complete entity)
, SQLUpdatable (Partial entity)
, FromRow (Complete entity)
)
=> store
-> UUID
-> Validated (Partial entity)
-> m (Either EntityStoreError (WithUUID (Complete entity)))
-- | Delete an entity by ID
deleteByUUID ::
forall entity.
(SQLInsertable entity, SQLDeletable entity, FromRow entity)
=> store
-> UUID
-> m (Either EntityStoreError entity)
-- | Get a listing of all entities
list ::
forall entity. (SQLInsertable entity, FromRow entity)
=> store
-> m (Either EntityStoreError [entity])
We do have to get specific at some point about which type we’re using, and one place to do that is our AppState
:
data AppState = forall estore m.
( Monad m
, SQLEntityStore estore IO
) => AppState { appConfig :: !(Complete AppConfig)
, entityStore :: estore
}
So what have we gained here? It’s kind of hard to see at this stage, we’ve gotten a bit closer to separating what the code is doing from what it needs to do it’s job. By abstracting to m
, we’ve opened up the world to any monad (even if it’s likely a monad stack with IO
at the bottom) to run this code, as long as it can do the things we want it to do. This pattern isn’t suuuper immediately useful here, but it’s going to be well worth the effort in the future.
mtl
+ typeclass pattern) (💪 🎩)Abstracting the m
is good, but it also raises another problem – we want to be OK with any m
, but we need to know that the m
provides access to a SQLiteEntityStore
and other things! All our operations that work with the store require it. There’s a bunch of great talks about this approach that contrast the typeclass approach with the usual monad unstacking and unwrapping:
The second talk comes before the first time-wise but George’s talk cuts more to the meat of what we’re doing here. Long story short, the approach we can see from these two videos is to actually constrain the abstracted monads with good ‘ol Constraint
s! We can ensure that the monadic context m
that we run our handler in has what we need by just creating a typeclass that ensures it, and applying that Constraint
to m
. Let’s do this for our handlers & custom context:
-- | List all companies
allCompanies :: ( HasDBBackend m db
, HasCacheBackend m c
, MonadError ServerError m
, CompanyStore db m
) => SessionInfo -> Maybe Limit -> Maybe Offset -> m (EnvelopedResponse (PaginatedList (ModelWithID Company)))
allCompanies sessionInfo limit offset = requireRole Administrator sessionInfo
>> getDBBackend
>>= \db -> getCacheBackend
>>= \cache -> lookupOrComputeAndSaveEither
(getCompanyListing cacheKey cache) -- lookup
(getAllCompanies db limit offset) -- compute
(insertCompanyListing cacheKey cache) -- save
>>= ifLeftEnvelopeAndThrow Err.failedToRetrieveResource
>>= pure . EnvelopedResponse "success" "Successfully retrieved companies"
where
cacheKey = CompanyListing limit offset
We’ve gained a bit of flexibility here – if we change our AppHandler
monadic context later to some BetterAppHandler
, all we have to do is make sure it has the right instances and our handler code will work untouched! At the meta level, we’ve achieved a goal we didn’t even know we had we can start expressing the requirements of handlers in terms of the functionality they need. This expressiveness also becomes a limiting factor (in a good way) – an m
only constrained by HasDB
has no idea about other functionality that might be present in the monad, and thus can’t use it – it’s our old friend information hiding – if m
can send emails, your handler can’t know about it unless you constrain m
in the appropriate way, which makes your intent very very clear.
Another good example I’ve come across is a github gist by user ocharles.
To sum all of this up, the approach we’re using here I’ll call the “classy mtl readers” approach, it’s a combination of:
While the results of this approach are accomplishable by lesser paradigms (like just using IO
everywhere), the safety/information-hiding it offers is very nice. It’s beautifully abstracted and type-reinforced (đź’Ş 🎩) way to write programs.
While I’m pretty happy with the classy mtl readers approach, I’d be remiss if I didn’t tell you about what might be next – people have actually long since been working on this, but there’s something on the horizon that might offer even better semantics, abstraction and type protection: Free/FreeR monads and applicatives.
The link above ( to the gist by @ocharles) is actually a response to some articles about Free Monads:
Free monads interest me more and more these days as I ponder the benefits of expressing what a program (or service) can do as an operational algebra – so a EntityStore
can CreateEntity
, ReadEntity
, etc. After a while I came upon another great blog post, an opinionated guide to haskell in 2018, which laid out the Free/mtl
/FreeR dichotomy nicely, and introduced FreeR in a really approachable way to me.
Unfortunately, there are some issues with Free/FreeR namely:
mtl
for the compiler to optimize, Free is even worse (supposedly FreeR is fast enough for this inefficiency to not matter)I think FreeR Monads/Applicatives are incrementally better if not for the abstraction and forced-design they provide, and I want to start moving to them, but we’ll leave that as a topic for another time – maybe I’ll write a little app in both approaches and benchmark it locally in another post – it should be possible to write the actual functions that do the work (monadic functions in the mtl
case, executors in the FreeR case) in a completely separate module to make it a fair comparison.
Of course, if you read DeGoes’s second article, he actually covers this by introducing the “Free Transformers” – this moves the computation into the actual data type, but seems to solve basically all the problems with Free. As for the Applicative/Monad choice requirement, it looks like Free Applicative can just be used instead at all points – he also gave a great talk on the subject.
I posted about this on Reddit, and while I basically answered my questions myself after more reading/watching it was good to ask others about this. Thanks to the help of the folks in r/haskell (Edward Kmett even responded!), I understand the limitations a little better now – free applicatives are where I want to be.
polysemy
, fused-effects
& friendsBack when this post series was first being written, there was a massive penalty for using the Free/R Monads – that drawback has mostly disappeared these days, with the maturation of some awesome community libraries:
How do we know the performance penalty is gone? Some brave souls have done the benchmarking! – you can peruse the criterion directly as well. The best performer on the graph is fused-effects
(as of the writing of this post), almost mirroring MTL speed. The tradeoff with using fused-effects
is the increased boilerplate, but at this point it’s looking like it’s worth it!
Here are some general things I think are nice to add for robustness.
Let’s improve our error catching and error returning semantics, define some envelopes for the data we’re sending back.
data EnvelopedResponse a = EnvelopedResponse
{ status :: String
, message :: String
, respData :: a
} deriving (Generic, Eq, Show)
NOTE Using String
is not great here, but this is how the code sits so I’m leaving it in – Text
and strict Bytestring
s may be better choices.
Good APIs have good pagination. While I’m not going to do any fancy cursor-based pagination or something like that, here’s how I handle the usual offset+limit style pagination.
-- For the request side
data PaginationOptions = PaginationOptions { limit :: Maybe Limit
, offset :: Maybe Offset
, afterId :: Maybe ID
}
-- For the data modeling side
data PaginatedList a = PaginatedList
{ items :: [a]
, total :: Int
} deriving (Generic, Eq, Show)
makePaginatedList :: [a] -> PaginatedList a
makePaginatedList as = PaginatedList as (length as)
instance ToSchema a => ToSchema (PaginatedList a)
instance Functor PaginatedList where
fmap f list@(PaginatedList oldItems _) = list { items=fmap f oldItems }
instance Foldable PaginatedList where
foldMap f (PaginatedList oldItems _) = foldMap f oldItems
There’s a smattering of classes here, but hopefully they’re all pretty straight forward, there are a few
Server-side errors should never make it to users – all around the code I actually use rightOrThrow
which just throws an error rather than surfacing a proper error that could be shown to users.
throwServantErrorIfLeft :: (MonadError ServerError m, Exception e) => Either e a -> m a
throwServantErrorIfLeft (Left err) = throwError $ Err.makeGenericError (show err)
throwServantErrorIfLeft (Right v) = return v
makeGenericError :: String -> Servant.ServerError
makeGenericError s = Servant.err500 { Servant.errBody=errStr }
where
errStr = BSU.fromString $ "Server error occurred: " ++ s
ensureNotNothing :: (MonadError e m, Monad m) => Maybe a -> e -> m a
ensureNotNothing Nothing e = throwError e
ensureNotNothing (Just x) _ = return x
ifNothingThrowIOError :: (MonadError e m, Monad m) => e -> Maybe a -> m a
ifNothingThrowIOError = flip ensureNotNothing
ifNothingThrowError :: (Exception e, Monad m) => e -> Maybe a -> m a
ifNothingThrowError e Nothing = throw e
ifNothingThrowError e (Just v) = pure v
whenNothing :: (Monad m) => m a -> Maybe a -> m a
whenNothing action Nothing = action
whenNothing action (Just x) = return x
whenLeft :: (Monad m, Exception e) => (e -> m a) -> Either e a -> m a
whenLeft action (Left err) = action err
whenLeft action (Right v) = return v
throwServantErrorIf :: (MonadError e m) => Bool -> e -> m ()
throwServantErrorIf True err = throwError err
throwServantErrorIf False _ = return ()
ifLeftConvertAndThrow :: (MonadError ServerError m) => Either SomeException r -> m r
ifLeftConvertAndThrow (Left err) = throwError . Err.enveloped . Err.convertToServantError $ err
ifLeftConvertAndThrow (Right v) = pure v
servantConvertErr :: Exception e => e -> ServerError
servantConvertErr = Err.enveloped . Err.convertToServantError
ifNothingEnvelopeAndThrow :: (MonadError ServerError m) => ServerError -> Maybe r -> m r
ifNothingEnvelopeAndThrow e Nothing = throwError (Err.enveloped e)
ifNothingEnvelopeAndThrow e (Just v) = pure v
ifLeftEnvelopeAndThrow :: (MonadError ServerError m, Exception e) => ServerError -> Either e r -> m r
ifLeftEnvelopeAndThrow e (Left err) = throwError (Err.enveloped e)
ifLeftEnvelopeAndThrow e (Right v) = pure v
throwIf :: (Exception e, Monad m) => Bool -> e -> v -> m v
throwIf True err _ = throw err
throwIf False _ v = pure v
throwErrorIf :: Monad m => Bool -> String -> m ()
throwErrorIf True err = error err
throwErrorIf False _ = return ()
throwIfLeft :: (Exception l, Monad m) => Either l r -> m r
throwIfLeft (Left err) = throw err
throwIfLeft (Right v) = pure v
ifLeftThrow :: (Exception e, Monad m) => e -> Either l r -> m r
ifLeftThrow e (Left _) = throw e
ifLeftThrow _ (Right v) = pure v
That is a lot of utility functions! Let’s get a bit more conrete by showing the use in a real use case:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE FlexibleContexts #-}
module API.V1.Addresses ( Routes
, server
) where
import Control.Monad.Except (MonadError)
import Control.Monad.State (liftIO)
import Servant
import Types
import Util (ifNothingThrowIOError)
import qualified Errors as Err
-- ^ User information retrieval endpoints
type Routes = "addresses" :> Capture "addressId" AddressID :> Get '[JSON] (EnvelopedResponse (ModelWithID Address))
server :: (HasDBBackend m db, MonadError ServerError m) => ServerT Routes m
server = findAddressByID
findAddressByID :: (HasDBBackend m db, MonadError ServerError m) => AddressID -> m (EnvelopedResponse (ModelWithID Address))
findAddressByID cid = getDBBackend
>>= liftIO . flip getAddressByID cid
>>= ifNothingThrowIOError Err.failedToRetrieveResource
>>= pure . EnvelopedResponse "success" "Successfully retrieved address"
Well here’s an interesting idea that occurred to me as I wrote this – should configuration actually be a component in and of itself? All the really slick systems these days (and the well-built system of yore) can deal with changing configuration on the go, and it seems very easily within reach with the way we’ve written everything so far. Haskell has extremely ergonomic concurrency support, and state-of-the-art coordination primitives (whether message passing is your thing or shared memory), maybe I could whip up such a system really quick by moving configuration from a static built-once-at-startup thing to the purview of a component?
I was originally going to leave this out, because it seems too far off the beaten path, and then I saw Control.Concurrent.Chan just sitting there, and I couldn’t help myself. I’m not going to do anything crazy and implement an app-wide event bus or anything, just 1-to-N component communication, in particular, a way for distributing configurations! Wouldn’t a typeclass for components that could be persuaded to update their own configurations and re-initialize? What about the chaining mechanism? How does a theoretical ConfigComponent
let other components know that it has a new configuration available for them to see, and for those components to reinitialize?
NOTE The techniques in this section are 2+ years old. Please check
servant
and other libraries for better solutions, or easier to use abstractions than what I’ve laid out below
OK, before we mark our dinky little service as done, let’s handle one of the most important cross-cutting concerns – Authentication (AuthN) and Authorization (AuthZ). This bit can be a little difficult to tease out, but let’s do it. Here’s a scheme that I developed for working with Auth in servant:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
module AuthN
( genAuthServerContext
, getRoleFromSessionInfo
, getUserIDFromSessionInfo
, authCookieName
, sessionUserID
, sessionUserRole
, FullRequest
, AuthHandler
, CookieAuth
) where
import Data.Maybe (isNothing, fromJust, fromMaybe)
import Control.Monad.State (when, liftIO)
import Data.Aeson (decode)
import Data.ByteString (ByteString)
import Data.Typeable (Typeable)
import Network.HTTP.Types.Header (hCookie)
import Network.Wai (Request, vault, requestHeaders)
import Network.Wai.Session (Session)
import Servant
import Servant.Server.Experimental.Auth (AuthHandler, mkAuthHandler, AuthServerData)
import Servant.Server.Internal.Delayed (passToServer)
import System.Log.Logger (warningM)
import Types
import Util (ifNothingThrowIOError, throwErrorIf)
import Web.ClientSession (Key, decrypt)
import Web.Cookie (parseCookies)
import qualified Data.ByteString.Lazy.Char8 as BSL8
import qualified Data.Vault.Lazy as V
import qualified Errors as Err
import qualified Data.DateTime as DT
-- For Servant generalized auth
type instance AuthServerData (AuthProtect "cookie-auth") = SessionInfo
type CookieAuth = AuthProtect "cookie-auth"
sessionUserID :: SessionInfo -> UserID
sessionUserID = userID . sessionUserInfo
sessionUserRole :: SessionInfo -> Role
sessionUserRole = role . sessionUserInfo
authCookieName :: ByteString
authCookieName = "your-app-here"
-- Generate a cookie auth handler, given a session store cookie
genCookieAuthHandler :: Key -> AuthHandler Request SessionInfo
genCookieAuthHandler serverKey = mkAuthHandler handler
where
handler req = liftIO DT.getCurrentTime
>>= \now -> getAuthCookieValue req
>>= ifNothingThrowIOError Err.notLoggedIn
>>= decryptCookie serverKey
>>= ifNothingThrowIOError Err.invalidAuthHeaders
-- Throw error if is cookie has expired
>>= \c -> if now > expires c then throwError Err.invalidSession else pure c
getAuthCookieValue :: Request -> Handler (Maybe ByteString)
getAuthCookieValue = return . maybe Nothing (lookup authCookieName) . (parseCookies<$>) . lookup hCookie . requestHeaders
decryptCookie :: Key -> ByteString -> Handler (Maybe SessionInfo)
decryptCookie key str = return . fromMaybe Nothing $ decode . BSL8.fromStrict <$> decrypt key str
-- ^ Generate an context that contains the necessary auth handler, given a key to use for client side cookie encryption
genAuthServerContext :: Key -> Context '[AuthHandler Request SessionInfo]
genAuthServerContext = (:. EmptyContext) . genCookieAuthHandler
getRoleFromSessionInfo :: SessionInfo -> Role
getRoleFromSessionInfo = role . sessionUserInfo
getUserIDFromSessionInfo :: SessionInfo -> UserID
getUserIDFromSessionInfo = userID . sessionUserInfo
data FullRequest deriving Typeable
instance HasServer api context => HasServer (FullRequest :> api) context where
type ServerT (FullRequest :> api) m = Request -> ServerT api m
hoistServerWithContext _ pc nt s = hoistServerWithContext (Proxy :: Proxy api) pc nt . s
route Proxy context subserver =
route (Proxy :: Proxy api) context (passToServer subserver getRequest)
where getRequest req = req
While I use this scheme, it’s important to note that servant-auth
now exists (thanks to the hard work of many contributors to Servant, I’ve personally been helped many times by @jkarni and @phadej) – this is what you’re going to want to use for your auth, instead of cobbling it together like I did. Please do not take this code to production!
And on the side of AuthoriZation (“AuthZ”), I’ve got code that looks like this:
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
module AuthZ ( ensureUserIsAdminOrCompanyRep
, representsCompanyWithID
, requireRole
) where
import Control.Monad.Except (MonadError, when)
import Control.Monad.State (liftIO)
import Control.Monad.Catch (handle)
import Data.Maybe (fromJust)
import Servant (throwError)
import Servant.Server (Handler, ServerError)
import AuthN (sessionUserID, sessionUserRole)
import Util (throwServantErrorIfLeft, throwServantErrorIf, servantConvertErr)
import Types ( UserID
, CompanyID
, SessionInfo
, Role(..)
, ModelWithID(..)
, HasDBBackend(..)
, PaginatedList(..)
, DatabaseBackend(..)
, CompanyStore(..)
, EntityStore(..)
)
import qualified Errors as Err
ensureUserIsAdminOrCompanyRep :: ( HasDBBackend m db, MonadError ServerError m, CompanyStore db m )
=> CompanyID -> SessionInfo -> m ()
ensureUserIsAdminOrCompanyRep cid sessionInfo = userId `representsCompanyWithID` cid
>>= \isRep -> throwServantErrorIf (isNotAdmin && not isRep) Err.unauthorized
where
userId = sessionUserID sessionInfo
isNotAdmin = (/= Administrator) $ sessionUserRole sessionInfo
representsCompanyWithID :: ( HasDBBackend m db, MonadError ServerError m, CompanyStore db m )
=> UserID -> CompanyID -> m Bool
representsCompanyWithID uid cid = getDBBackend
>>= flip getCompaniesRepresentedByUserWithID uid
>>= throwServantErrorIfLeft
>>= pure . (cid `elem`) . (mId <$>) . items
requireRole :: MonadError ServerError m => Role -> SessionInfo -> m ()
requireRole r sessionInfo = throwServantErrorIf (sessionUserRole sessionInfo /= r) Err.unauthorized
An aside on web security – as far as I know the most secure, generalizable security configuration seems to be JWTs-as-tokens in cookies, allowing mobile clients to maybe take the API token out of the cookie and use it as a header or whatever else. As far as what to actually store, an opaque identifier is perfectly fine, but JWT confers some advantages/downside:
The real heavy lifting here is being done by HTTPOnly
and Secure
settings on cookies, so as long as you don’t do anything dumb like pass things that should be secret in a way that is accessible to the browser (in the case of an XSS attack) then you should be roughly OK. Anyway, read the OWASP top 10.
I am by no means a testing guru – I only find myself writing really good tests when I invest a lot of time up front in making testing easy for myself, or finding a perfect fit tooling wise. I find the most value for me comes from E2E tests, and that’s where I spend the most time – this is even more true in Haskell as the old adage (exaggeration really) goes: “if it compiles, it works”.
Outside of E2E testing, I also prioritize regression type testing – problems solved in production should almost always cause commits of tests that test for and ensure the problem never happens again. Engineering perfection means never solving the same problem twice. While many have really touted the effectivenss of property based testing with QuickCheck, I find it hard to value finding what can often be very obtuse/pathological failure cases – I’m often much more interested in ensuring that a certain happy path through the application is protected, and that’s where I spend my time.
I also very much dislike mocking and am much more comfortable with spinning up temporary resources locally before tests and/or suites of tests, going so far as to spin up a local observable instance of a dependency rather than trying to “stub out” calls or replace components with mocks. In my mind, mocking is a near-pointless endeavor – it requires knowledge of shifting internals and takes you further and further away from production (or even testing/staging environments) with every piece of clever code you write.
When it comes to testing, here’s my order of importance:
I don’t find myself writing a lot of unit tests (I’m not sure exactly how ashamed I should be), given the power of the type system (assuming I’m using any of the power) – but of course that would be the right tier for generative testing.
One of my favorite things about working in the node space is how easy it is to use tools like supertest
(built on top of superagent
) and simple testing frameworks like test along with the power of nodejs to write tests against a “real” API. Turns out it’s not that hard to do E2E testing in Haskell either, though I’m a bit verbose with my usage:
promotionTriggerTest :: SpecWith TestAppInstanceInfo
promotionTriggerTest = it "should be triggered after a JPR is promoted" runTest
where
runTest (port, _, _) = buildClientEnv port
-- | Do login and get the cookie for the admin user
>>= \clientEnv -> doUserLogin clientEnv adminUserAndPassword
>>= \authenticatedReq -> pure ()
-- | Create a company to be the employer
>> runClientM (createCompany authenticatedReq testCompany) clientEnv
>>= shouldBeRight
>>= \createCompanyResp -> when (status createCompanyResp /= "success") (error "Failed to create company")
-- | Create the job posting
>> let jpr = testJPR { rCompanyId=Just $ mId $ respData createCompanyResp }
in runClientM (createJobPostingRequest jpr) clientEnv
>>= shouldBeRight
>>= \createJPRResp -> when (status createJPRResp /= "success") (error "JPR creation failed")
-- | Get the available jobs (job search with no query/filters/etc)
>> runClientM (jobSearch Nothing [] [] Nothing Nothing []) clientEnv
>>= shouldBeRight
>>= \originalJobsResp -> when (status originalJobsResp /= "success") (error "Job retrieval failed")
-- | Approve the JPR, using a fake SessionInfo for the admin user
>> let jprId = mId $ respData createJPRResp
in runClientM (approveJPR authenticatedReq jprId) clientEnv
>>= shouldBeRight
>>= \approvedJPRResp -> when (status approvedJPRResp /= "success") (error "JPR approval failed")
-- | Retrieve the available jobs again
>> runClientM (jobSearch Nothing [] [] Nothing Nothing []) clientEnv
>>= shouldBeRight
>>= \updatedJobsResp -> when (status updatedJobsResp /= "success") (error "Updated retrieval failed")
-- | Ensure that totals changed, the approved job should have shown up (total should now be 1)
-- which means the cache should have been invalidated
>> let totalAfterUpdate = total $ respData updatedJobsResp
originalTotal = total $ respData originalJobsResp
in originalTotal `shouldNotBe` totalAfterUpdate
The easy Haskell integration of HTTP
in combination with Wai.Application
and ClientApi
means it’s pretty easy to generate a client to connect to your servant API.
There’s an excellent practical talk on the subject.
The particular project I based this post doesn’t have any property based testing but it should. I have to be honest, I’ve just not found that unit (and by extension property-based) testing that necessary in developing apps. I know it’s something I should definitely have but 99% of the time I’ve just forgotten a parameter and most peopl aren’t trying to put negative numbers where positive numbers go (which is something I’d just solve with Natural
). This isn’t a good reason to not do proper testing, but it’s my own rationalization.
hslogger
has worked fantastically for me as far as a basic logging solution for Haskell. One thing that I haven’t done with it, that I need to, is structured logging (basically using JSON/some format versus plain strings in your log messages). GELF is a reasonably nice standard for this and I’m thinking of adopting it (I first came upon it when using Graylog). Haskell also happens to have support for GELF with the graylog
package, so that’s nice.
I’ve written in the past about getting tracing working within my haskell applications, so I’d suggest you look there to see how I’ve gotten on with this. A lot of the libraries that I was trying to evaluate then have grown, so another survey of the landscape is almost surely necessary, but I won’t try and accomplish that in this post. Anyway here’s the code I used to write to Jaeger on a recent project:
data TracingBackend = TracingBackend { tbCfg :: TracingConfig
, tbLogger :: Maybe Logger
, tbSpans :: IORef [Span]
, tbMgr :: Maybe Manager
}
makeConnectedTracingBackend :: TracingConfig -> Maybe Logger -> IO (Either SomeException TracingBackend)
makeConnectedTracingBackend cfg maybeL = try makeBackend
where
makeManager = case tracingBackendType cfg of
LocalTracingBackend -> pure Nothing
Zipkin -> Just <$> newManager defaultManagerSettings
makeBackend = newIORef []
>>= \spans -> TracingBackend cfg maybeL spans <$> makeManager
Again, this code is 2 years old – I’m pretty sure the tracing libraries have gotten much better for Haskell in 2021. OpenTracing joined OpenCensus to become OpenTelemetry – lots of things have happened in this space.
I’ve also written about getting basic metrics working with prometheus in the past (basically exposing some basic warp
supported metrics to prometheus), but I haven’t done a whole lot of deep thinking on this subject, and don’t feel confident enough with knowing the ecosystem to suggest things here, outside of just using prometheus-related tooling that’s available.
As far as metrics go, I would really recommend reading up on the USE method and RED method for guiding which metrics you really care about and want to expose to ensure you have enough signal coming out of the noise.
So this post series took 2 years to reach it’s conclusion (only because I put off finishing up this post for 2 years). Since it’s pretty ridiculous to not finish a post for 2 years I thought I should do something for the readers who bothered to read this. What I’ve decided to do is completely open source the code or a job board I used to actively work on (TechJobs.Tokyo). I don’t know when I’ll work on it again (I was in the middle of lots of improvements too, adding OAuth, etc), but in the meantime even if the code is old I think it might be helpful to absolute beginners to have it out there in the world.
I don’t know how much Haskell has moved past the code as it sits (I expect libraries like servant
and sqlite-simple
don’t look too much different), but my basic assumption is having more production-aimed Haskell code out there that people can peruse (or maybe even fork and use? If you’re an absolute madperson) is probably good?
Anyway, check out the code that powers the job board on GitLab:
Well this has to be the longest running series that I’ve ever written – only because the last installment is 2 years late! I haven’t written any Haskell in a while (I’ve been making do with Rust and Typescript) but hopefully I can find time to shave some yaks in th enear future with what I consider to be the best programming language I know.