A boring Haskell+Servant based sample backend API that hits a few public weather servers.
If you're coming here as an employer: this started as a tech task, but is now a bit of an illustration of how I develop software. Note, though, that it was built in a couple of evenings, so it's a bit rough around the edges.
It's a somewhat standard implementation of a backend API server in Haskell using:
- `servant-server` to define the API,
- `servant-client` for testing,
- `servant-openapi3` as the framework for OpenAPI generation, and
- `servant-swagger-ui` to produce a nice page for documentation and testing.
However, somewhat more uniquely, for the definitions of serialisers and OpenAPI, this uses the library Autodocodec, which combines serialisation, deserialisation and documentation into one explicit definition.
This avoids nasty situations such as:
- Serialisers, deserialisers and documentation getting out of sync, and
- otherwise innocent refactors (e.g. variable renaming, reorganising internal data structures) silently breaking serialisers.
Indeed, I like Autodocodec so much I've contributed numerous PRs to it.
There are a few features a standard backend app should generally have that this currently doesn't, such as:
- A backing database. (I do have experience using Opaleye for this, though.)
- Observability. (There are a few options here in the Haskell ecosystem; I've personally used Honeycomb in the past.)
I have developed backend Haskell applications with these features, but this was at a previous role which was closed source.
You'll see a lot of explicit definitions of datatypes in this codebase. That is, instead of something like:
```haskell
data Person = Person
  { name :: Text
  , age :: Int
  , email :: Text
  }
```
You'll see stuff like this:
```haskell
-- Name.hs
newtype Name = Name Text
  deriving newtype ...

-- Age.hs
newtype Age = Age Int
  deriving newtype ...

-- Email.hs
newtype Email = Email Text
  deriving newtype ...

-- Person.hs
data Person = Person
  { name :: Name
  , age :: Age
  , email :: Email
  }
```
Yes, this is more code. No, it isn't more "complex" (IMHO). It prevents silly mistakes like putting names in email addresses, adding ages together, and what not. As this is a sample app, the advantages of defining the invariants of your "real world" types aren't particularly obvious, but I would suggest looking at src/Data/Weather/Types/Temperature.hs, where we define a few reasonable operations on degrees-Celsius temperatures whilst forbidding others (like multiplying two temperatures). We wouldn't be able to do this if temperatures were just Floats. Ultimately it's not much effort to do this early (especially with AI writing the boilerplate), and it gives more flexibility later if we actually want to add custom behaviour to types (say, an email validator). It also allows us to generate more precise documentation through OpenAPI at a later date if we wish.
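To sketch the idea (this is an illustration of the pattern, not the actual contents of Temperature.hs; the names below are invented):

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

-- Degrees Celsius. Deliberately *no* Num instance, so the compiler
-- rejects nonsense like multiplying two temperatures together.
newtype Temperature = Temperature Double
  deriving stock (Show, Eq, Ord)

-- A *difference* between temperatures, which it is sensible to add and scale.
newtype TemperatureDelta = TemperatureDelta Double
  deriving stock (Show, Eq, Ord)
  deriving newtype Num

-- The only arithmetic we permit on temperatures themselves:
diffTemp :: Temperature -> Temperature -> TemperatureDelta
diffTemp (Temperature a) (Temperature b) = TemperatureDelta (a - b)

addDelta :: Temperature -> TemperatureDelta -> Temperature
addDelta (Temperature t) (TemperatureDelta d) = Temperature (t + d)
```

With this shape, writing `t1 * t2` on two `Temperature`s is a type error, which is exactly the point.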
So yes, there's more code, and more files, but I don't think this makes the code more "complex" or "over-engineered". In many ways it's simpler, because there are fewer ways you can interact with this codebase incorrectly without the compiler shouting at you. The whole point of writing code this way is that you shouldn't have to read the code to use it: automatically generated documentation and compiler errors should guide you most of the way.
I now tend to use one datatype per file. That's a lesson I learnt the hard way (I was previously lazy and didn't do this). So there are lots of files, which does somewhat increase the line count due to repeated imports. But:
- There are fewer merge conflicts with smaller files when working in a team.
- I've had the issue in the past where a file gets too big, and some new code both depends on that file and is depended on by existing code in it. So it has to be lumped into that file, and the file just gets bigger. Pulling this web apart is a bit of a nightmare, so I try not to build it up in the first place.
So that's why there's lots of files!
I'm also a fan of deriving via, i.e. encapsulating a datatype's custom behaviour elsewhere. ViaTextual.hs and CustomTime.hs are good examples of this: the behaviour defined there can then be used with deriving via. That means your datatype definition is just a list of where all its instances are derived from; the actual noise of the implementation details lives in a separate place.
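As a hedged sketch of the pattern (the class and wrapper names below are made up; ViaTextual.hs does the same kind of thing for this repo's own classes):

```haskell
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE DerivingStrategies #-}

import Data.Char (toLower)

-- Some class we want to hand out to lots of datatypes.
class Slug a where
  slug :: a -> String

-- The "via" wrapper: the noisy implementation lives here, once.
newtype ViaShow a = ViaShow a

instance Show a => Slug (ViaShow a) where
  slug (ViaShow x) = map toLower (show x)

-- Datatypes then just name the behaviour they want:
data Colour = Red | Green | Blue
  deriving stock Show
  deriving Slug via (ViaShow Colour)
```

The `Colour` definition reads as a list of capabilities; the mechanics of how `slug` works are elsewhere and shared.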
In short, you'll see a lot of what looks like boilerplate in my code. This is intentional: I spend quite a bit of my code budget not just coding what I think needs to be done right now, but telling the compiler the invariants of the system I expect to hold. I do this so the compiler can actually help me write correct code, and not only me, but also colleagues who may work on the code in the future, or indeed an agentic AI. Particularly in the latter case, I feel you get much better results when a compiler is guiding it, instead of it diving in blindly and relying on a few test cases to keep everything tied together. I'll talk more about that in the next section.
Not immediately upon starting this project, but early on, I started to tag all commits that were written by AI with the tag `[AI]`.
This does make the commit history a bit janky, and I probably wouldn't do this normally, but I thought it would be good for demonstration purposes to show my interactions back and forth with agentic AI.
I am just using the $20-a-month Cursor plan with no extra spend, so perhaps the model is not particularly advanced, but I prefer to keep the iterative cycle fairly brief so I can check what it's doing. A few notes:
- AI doesn't seem to know or care about "don't repeat yourself" (DRY). It regularly copy-pastes code and generally needs to be told to clean this up. A good approach is to refactor the code out into a function yourself, show the AI the refactoring you've done, and ask it to do the same elsewhere.
- It's also got a bad case of what pilots call "get-there-itis", in that its goal seems to be code that compiles/passes the test case, even if it's not correct. In particular, AI is very happy to use generally accepted unsafe functions in subtle ways; indeed, I caught it performing the Haskell equivalent of the recent Cloudflare issue. I put some notes in `.cursor-notes.md` in an attempt to tell it not to do this, but I think it's probably wiser when using AI to have strong compiler/linting warnings for potentially unsafe constructs. I myself used these unsafe constructs in `middleElemsNonEmpty` (one can assume the resulting list is non-empty if the passed list is non-empty), but one probably wants human intervention and an explicit warning exception in such cases.
- That being said, it's a great productivity improver. It's important to get the code and patterns right early (because it will copy its own existing garbage), but, for example, adding the third endpoint was relatively trivial once the code for the first two was sorted out. I just asked it to follow the same patterns.
- BUT, there are significant architectural decisions AI just won't make. It will just plough onward with whatever exists. It generally won't do any opinionated rearchitecting/refactoring unless it's specifically told what to do, so I suspect this will become even more of an issue in non-trivial codebases (this codebase being relatively small).
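For reference, the shape of the `middleElemsNonEmpty` situation mentioned above looks something like this (a sketch, not the repo's actual code):

```haskell
import qualified Data.List.NonEmpty as NE
import Data.List.NonEmpty (NonEmpty)

-- The middle one or two elements of a non-empty list.
-- `NE.fromList` is partial in general -- exactly the kind of function an
-- AI will happily reach for -- but here we *know* the slice is non-empty
-- because the input was. This is where one wants a lint warning plus an
-- explicit, human-reviewed exception, rather than a silent use.
middleElemsNonEmpty :: NonEmpty a -> NonEmpty a
middleElemsNonEmpty xs =
  let ys  = NE.toList xs
      n   = length ys
      mid = drop ((n - 1) `div` 2) (take (n `div` 2 + 1) ys)
  in NE.fromList mid
```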
- Install `stack`
- Run `stack run` (and get a coffee and/or go for a walk whilst everything compiles)
You should then have a webserver serving on http://localhost:3000/docs.
There's an OpenAPI spec at http://localhost:3000/openapi.json.
This implements a webserver which connects to three independent downstream weather endpoints.
The first thing the webserver does is expose those endpoints to the user, so one can test each of them individually.
But there's another endpoint, the aggregation endpoint, which:
- Takes a variety of location parameters, all optional.
- Attempts to hit all three endpoints simultaneously (plus two other intentionally broken endpoints, discussed below) for which the appropriate location data is provided. For example, one endpoint requires latitude+longitude, another city name, and another airport code. If only lat+long and airport code are provided, it won't attempt to contact the city name based endpoint.
- Waits for a timeout (a number of seconds passed to the server by the client).
- If at least one endpoint returns before the timeout, it returns the number of endpoints that returned, and the mean and median of those temperatures.
- If no endpoint returned, it returns an HTTP error.
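The aggregation step above can be sketched roughly like this (an illustration under my own assumptions, not the repo's actual code; a real version would also want exception handling and an early return once everything has arrived):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (modifyMVar_, newMVar, readMVar)
import Control.Monad (forM_)
import Data.List (sort)

-- Run every request in its own thread and collect whatever has arrived
-- once the timeout (in microseconds) expires. Requests that error out or
-- hang simply never contribute a result.
raceAll :: Int -> [IO a] -> IO [a]
raceAll timeoutMicros actions = do
  results <- newMVar []
  forM_ actions $ \act -> forkIO $ do
    r <- act
    modifyMVar_ results (pure . (r :))
  threadDelay timeoutMicros
  readMVar results

-- Aggregate statistics over the successful responses.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

median :: [Double] -> Double
median xs = mean mid
  where
    sorted = sort xs
    n      = length sorted
    mid    = drop ((n - 1) `div` 2) (take (n `div` 2 + 1) sorted)
```

If the collected list is empty after the timeout, the handler maps that to an HTTP error instead of computing statistics.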
Note, as mentioned above, there is an OpenAPI generated testing page at http://localhost:3000/docs and a spec at http://localhost:3000/openapi.json.
There are the following tests, run using `stack test`.
- A test for each of the downstream endpoints. These tests don't go through the webserver itself; they just attempt to hit the downstream endpoints.
- A test which starts up the webserver and attempts to hit the aggregate endpoint. Note that there are two intentionally broken endpoints, namely: a. an endpoint which immediately returns an HTTP error, and b. an endpoint which hangs forever and never returns. In a real app one wouldn't include these in the live version, but I made the aggregate endpoint hit these broken endpoints on purpose, for illustrative purposes.