End To End Testing With Hspec (Almost)

tl;dr While the setup works, the most mature Haskell library I could find for driving WebDriver wasn’t able to keep up with the changes in Selenium WebDriver :(. Skim through the post to check out the relevant code snippets and tech that made it all (almost) work.

If you’re not familiar with Haskell, check it out. This isn’t an introductory post, so it might not be for you, but the language is amazing.

This is a story about setting up end-to-end (“E2E”, AKA acceptance) tests in Haskell. No well-architected/engineered web application should be devoid of end-to-end testing (I’m guilty of leaving it out or waiting until the last minute too… don’t worry). Here are some questions I asked myself when thinking about how to start:

Q: Where should E2E tests go?

A: Near the backend.

Why? E2E testing needs ALL the resources/components (or at least mocked versions) that an app needs to run. What I mean by this is that if your app sends emails, you need something to send emails (maybe from some staging environment), something that acts as a temporary/mocked database, etc. Some projects have the backend and frontend separated (in different repositories), and for those projects I’d recommend git submodule; it’s come a long way and works pretty well these days. This project in particular hasn’t split the front and back ends, so they’re pretty close, albeit in separate top-level folders (which could easily be git submodules).

Q: What about all the JS-based testing?

A: JS-level unit, component, and component-integration testing should be kept near the frontend.

In other words, I don’t think you should be testing your component with an E2E test at the app level. They should go inside the JS repository (technically they’re end to end tests there).

Tests that DO require a backend to run (for example, checking if an item listing page loads properly) should probably have mocked backends/objects, but this only works well IMO if you replicate your backend taxonomy on the frontend (frontend models that mimic backend ones). If you don’t replicate the models, then all your mocks will be wrong the moment the output of the backend changes. While it might seem like just moving the goalposts, if you have models as an intermediary, the test is less likely to break, as long as the adapter (usually the model constructor) still works on the changed output.

To illustrate: if your app has a User, the definitions for that model likely exist in multiple places, usually the frontend and the backend (why? Future research idea: maybe there should be a shared model description format here). If you have something like a UserService or UserStore on the frontend, you can safely mock that service’s responses in tests, because you control what your version of the user model looks like. Even if the backend’s idea of a User changes, if you’ve ensured that your frontend either takes the “User” you expect or nothing at all, you can safely test the happy path with it (and you won’t need to worry about backend changes). It is, however, your responsibility to add contract tests (I’m currently gathering my thoughts on this, but it’s basically an E2E test) to ensure that the backend delivers what you expect, and that your services work. It’s ideal to add those tests to the backend repository (or, if the backend code contains the frontend code in a submodule, to make sure they run the appropriate service-backend integration tests).
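To make the “takes the User you expect or nothing at all” idea concrete, here is a minimal sketch of such an adapter. It is written in Haskell only to stay consistent with the rest of this post (a real frontend would do the equivalent in JS), and the `User` fields, the `RawPayload` type, and `parseUser` are all hypothetical names made up for illustration.

```haskell
-- Hypothetical stand-in for whatever the backend sends (e.g. decoded JSON).
type RawPayload = [(String, String)]

-- The frontend's own idea of a user; the fields are made up for this sketch.
data User = User
  { userName  :: String
  , userEmail :: String
  } deriving (Show, Eq)

-- The adapter: either we get exactly the User we expect, or nothing at all.
parseUser :: RawPayload -> Maybe User
parseUser raw = User <$> lookup "name" raw <*> lookup "email" raw

main :: IO ()
main = do
  -- Extra fields added by the backend do not break the adapter...
  print (parseUser [("name", "ada"), ("email", "ada@example.com"), ("age", "36")])
  -- ...but a renamed or missing field is rejected instead of half-working.
  print (parseUser [("username", "ada"), ("email", "ada@example.com")])
```

Mocks built from `User` values stay valid as long as the adapter accepts the backend’s output; a contract test is then the place that checks real backend responses still make it through `parseUser`.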

OK, with that said, let’s get on to the libraries and actual machinery involved in making this a thing.

Test.Hspec

Hspec is an excellent testing framework for Haskell. It combines all the creature comforts of TDD and BDD with the type friendliness that keeps me using Haskell. There’s not a whole lot about it that the lengthy yet amazingly succinct and clear documentation doesn’t already say. While I don’t necessarily subscribe to TDD/BDD, I do appreciate the syntactic conveniences/conventions because they often lead to easier-to-read tests and more stable codebases.

Particularly important to me was the section on using hooks, as one of the main things you have to do with E2E tests is spend some time setting up the world before you test your small part of it.
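As a rough illustration of what “setting up the world” with a hook can look like, here is a small sketch using hspec’s `around`. The `World` type and `withWorld` are hypothetical placeholders; in a real E2E suite the resource would be a started server or browser process rather than an `IORef`.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import Test.Hspec

-- Hypothetical "world": a log the tests can write to. In a real suite this
-- would be a started server, a browser process, a temp database, etc.
type World = IORef [String]

-- Set the world up before each spec item and tear it down afterwards.
-- bracket makes sure teardown runs even if the spec item throws.
withWorld :: ActionWith World -> IO ()
withWorld = bracket setup teardown
  where
    setup      = newIORef ["world created"]
    teardown _ = return () -- nothing to clean up for an IORef

spec :: Spec
spec = around withWorld $
  describe "a spec item that needs a world" $
    it "receives the resource built by the hook" $ \world -> do
      modifyIORef world ("test ran" :)
      readIORef world `shouldReturn` ["test ran", "world created"]

main :: IO ()
main = hspec spec
```

`around` runs the setup and teardown once per spec item; for expensive resources, hspec also offers `beforeAll` and `afterAll`, which run once for the whole group.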

Test.Hspec.WebDriver

I’ve already done webdriver testing in other languages, so I knew what to search for, which is half the battle. I wasn’t sure what would exist in Haskell-land. The first library I came across that seemed functional/mature was Test.Hspec.WebDriver. I’m very grateful to the author (John Lenz, lenz@math.uic.edu) for creating it, as I didn’t have to mess with writing low-level webdriver stuff myself.

I immediately discovered the first pain point (which, to be fair, is how just about every one of these libraries works, in just about every language): you have to start PhantomJS before you use it. I knew I wanted whatever solution I developed for running my end-to-end tests to handle running PhantomJS for me (next level: even download/install it for me?). So right off the bat, I knew I’d need to use the hspec hooks to start a process and kill it after the tests finish. I also knew I’d need to do things like start the actual app server, so I’d need to pass a few pieces of information along.

Getting PhantomJS to start before the tests

Here’s what some initial versions of the code to start PhantomJS before the tests run looked like:

phantomCmdPrefix :: String
phantomCmdPrefix = "phantomjs --webdriver-loglevel=ERROR --webdriver="

startPhantomJS :: IO (Int, ProcessHandle)
startPhantomJS = do
  port <- getRandomPortForPhantomJS -- just picks a port in a predefined range
  ph <- spawnCommand (makeCmd port)
  -- Give PhantomJS time to come up. Note that threadDelay takes microseconds.
  threadDelay phantomStartupWaitInMS
  return (port, ph)
  where
    makeCmd = (phantomCmdPrefix ++) . show

withPhantomJSInstance :: ((Int, ProcessHandle) -> IO ()) -> IO ()
withPhantomJSInstance = bracket startPhantomJS (terminateProcess . snd)

main :: IO ()
main = withPhantomJSInstance $ \phandle -> hspec (loginTests phandle)

loginTests :: (Int, ProcessHandle) -> Spec
loginTests (phantomPort,_) = describe "A useless E2E test" $
                             sessionWith config "login page" $ using evergreenCapabilities $
                                         it "loads properly" $ runWD $
                                            openPage "http://localhost:5002/#/login" -- This ideally should be gotten from an instantiated app instance
    where
      config = defaultConfig {wdPort=phantomPort}

This worked great, except it looked like PhantomJS wasn’t closing properly, and before/it weren’t working as they should. That sent me back to the drawing board, but I kept the utility functions, figuring that I just wasn’t combining them correctly.

After working with it for a bit longer, I just gave up. It seemed like PhantomJS was going to leak process handles (and not close properly) no matter what I did; maybe someday I’ll switch to something that does close properly. Weirdly enough, I did have an instinct that chromedriver would be the thing I switched to (and that it would close consistently). Running headless (no browser window shows up) is important to me for that little bit of efficiency gain, but it’s less important than actually running tests.
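One possible contributor (a guess, not something diagnosed in this post): `spawnCommand` runs its argument through a shell, so `terminateProcess` may end up signalling the shell rather than PhantomJS itself. A sketch of an alternative, using only functions from the `process` package, is to launch the program directly with `proc` and wait for it to exit after terminating it. The names `phantomProcess` and `withChildProcess` are hypothetical, and `sleep` is used as a stand-in child so the sketch runs on a Unix-like system without PhantomJS installed.

```haskell
import Control.Exception (bracket)
import Data.Maybe (isJust)
import System.Process
  ( CreateProcess, ProcessHandle, createProcess, getProcessExitCode
  , proc, terminateProcess, waitForProcess )

-- Build the PhantomJS invocation as a program plus an argument list, so no
-- intermediate shell sits between us and the browser process.
phantomProcess :: Int -> CreateProcess
phantomProcess port =
  proc "phantomjs" ["--webdriver-loglevel=ERROR", "--webdriver=" ++ show port]

-- Run an action with a child process, then terminate the child and wait
-- for it to exit so that nothing is left behind.
withChildProcess :: CreateProcess -> (ProcessHandle -> IO a) -> IO a
withChildProcess cp action = bracket start stop (\(_, _, _, ph) -> action ph)
  where
    start = createProcess cp
    stop (_, _, _, ph) = terminateProcess ph >> waitForProcess ph

main :: IO ()
main = do
  -- Stand-in child; swap in `phantomProcess somePort` for real use.
  ph <- withChildProcess (proc "sleep" ["30"]) return
  exitCode <- getProcessExitCode ph
  putStrLn ("child has exited: " ++ show (isJust exitCode))
```

Whether this actually fixes the leak described above is untested here; it only removes one layer (the shell) that could be absorbing the termination signal.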

Kludgy working test

Here’s the first kludgy working test, building on those same initial utility functions from before.

main :: IO ()
main = withPhantomJSInstance $ \info -> hspec (loginTests info)

loginTests :: (Int, ProcessHandle) -> Spec
loginTests (phantomPort,_) =
    describe "A useless E2E test" $
             sessionWith config "login page" $
                     using evergreenCapabilities $ do
                       it "loads properly" $ runWD $ do
                                                     (port, tid) <- liftIO $ startAppForTest defaultTestConfig
                                                     openPage $ "http://localhost:" ++ show port ++ "/#/login"
                                                     liftIO $ killThread tid
    where
      config = defaultConfig { wdPort=phantomPort }

Some of the stuff like sessionWith, using evergreenCapabilities, and runWD comes from Test.Hspec.WebDriver, so feel free to ignore it if you’re not actually working with that library. withPhantomJSInstance was defined earlier. Basically all this working test does is wrap the entire test suite with PhantomJS process creation and cleanup. Despite that, there were some downsides:

  • Unfortunately/fortunately, the whole test suite now runs against a single PhantomJS instance rather than one per test case. This turns out to be fine because sessions are used: I don’t have to worry about tests colliding too much as long as they’re in separate sessions.
  • Had to use sessionWith and pass along the port that PhantomJS gets started on (randomly picked). The info argument used in the main declaration is basically a bunch of information produced by the IO action that started PhantomJS… things like the port it is running on, along with the ProcessHandle (for when it’s time to close it).
  • Since I don’t want to work on just setting up the machinery forever, I’ve only got one more code clean up in me: using the bracket pattern.

Adding in the bracket pattern

Here’s the code with my naive use of the bracket pattern:

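-- A bracket-like helper for the WD monad: acquire a resource in IO, run the
-- WD action with it, then release it in IO.
-- Naive: if the middle action throws, release never runs.
bracketWD :: IO a
          -> (a -> IO b)
          -> (a -> WD c)
          -> WD c
bracketWD acquire release middle = do
  resource <- liftIO acquire
  result <- middle resource
  _ <- liftIO $ release resource
  return result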

withAppInstance :: AppConfig -> ((Int, ThreadId) -> IO ()) -> IO ()
withAppInstance = flip bracket (killThread . snd) . startAppForTest

withAppInstanceWD :: AppConfig -> ((Int, ThreadId) -> WD ()) -> WD ()
withAppInstanceWD = flip bracketWD (killThread . snd) . startAppForTest

It’s pretty cool how similar withAppInstanceWD is to withAppInstance – I only needed to change bracket to bracketWD. For those wondering at home why I couldn’t just lift it – I couldn’t (I think) because the middle action (whatever’s actually being run) needs to be in the WD monad (not IO).
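To show why the bracketWD above counts as naive, here is a small sketch that compares the same shape, specialised to plain IO, against `Control.Exception.bracket` when the middle action throws. The names `naiveBracket` and `releasesAfterFailure` are hypothetical helpers for this demonstration only. For the WD monad itself, a lifted variant such as `Control.Exception.Lifted.bracket` from lifted-base might be an option if WD has a `MonadBaseControl IO` instance; that is an assumption worth checking against the hs-webdriver documentation.

```haskell
import Control.Exception (ErrorCall (..), SomeException, bracket, throwIO, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Same shape as bracketWD, specialised to IO: release is skipped if the
-- middle action throws.
naiveBracket :: IO a -> (a -> IO b) -> (a -> IO c) -> IO c
naiveBracket acquire release middle = do
  resource <- acquire
  result <- middle resource
  _ <- release resource
  return result

-- Run a failing "test" with the given bracket and report how many times
-- the release action ran.
releasesAfterFailure
  :: (IO () -> (() -> IO ()) -> (() -> IO ()) -> IO ())
  -> IO Int
releasesAfterFailure bracketLike = do
  released <- newIORef (0 :: Int)
  _ <- try (bracketLike (return ())
                        (\_ -> modifyIORef released (+ 1))
                        (\_ -> throwIO (ErrorCall "test blew up")))
         :: IO (Either SomeException ())
  readIORef released

main :: IO ()
main = do
  naive <- releasesAfterFailure naiveBracket
  safe  <- releasesAfterFailure bracket
  putStrLn ("naive bracket released " ++ show naive ++ " time(s)")
  putStrLn ("Control.Exception.bracket released " ++ show safe ++ " time(s)")
```

In an E2E setting the difference matters because a failing browser command surfaces as an exception, which is exactly the case where the app thread would be left running.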

Running it all (spoiler: it fails, for reasons I didn’t expect)

After running all this, it turns out the tests are failing, and I’m not sure why at first. When I load the page on my own (as I develop), everything works of course, but when I look at the output of the test, Selenium is complaining about the commands that are being used. :(

Turns out Selenium’s radical API change broke Test.Hspec.WebDriver. I tried to revert to old Selenium versions for a bit (painfully downloading and running random JARs from their site), hoping to find one that was sufficiently old, but in the end I just gave up (it wouldn’t have mattered anyway, since going back versions would create problems in the long run). At that point, I’d already poured quite a bit of time into understanding everything, but I decided to cut my losses and implement the E2E tests in JS land, as it’s more likely to have some project that kept up with the changes (again, thanks to the hs-webdriver project for even attempting this and giving such a good base to work on).

So I definitely discovered one disappointing bit in Haskell: there’s not a lot of work in the webdriver space (though there is at least one excellent project that is doing its best). But I’ve heard good things about nightwatchJS, and am excited to give it a shot.

And I would have gotten away with it too, if it weren’t for those meddling full API rewrites and inconsistent/unclear documentation.

PS: As I said above, I actually went with a JS solution, but it turned out NOT to be nightwatchJS; I switched to using chromedriver + webdriverio. Thank you to Selenium for innovating and pioneering the path to where we are today, but dear lord is it hard to use. I actually tried about 3 JS libraries before settling on webdriver.io. It won out because it worked, was easy enough to use, and didn’t try to make too many decisions for me. I have a certain test structure I like, and tools like nightwatch were very much hyped but a little too opinionated for my taste. I know what I want to accomplish, just give me the tools and abstractions to get there easily. For example, don’t require me to export certain functions from a file… just give me a function to run that returns a promise (or takes a callback, whatever) and let me go on my way.