Don't Mock What You Don't Own - a moderately-contrived story
A piece of automated testing guidance I find people often struggle with is Don't Mock What You Don't Own. The basic idea is that you should never use mocking to replace a third-party interface in your codebase.
Instead, you should define your own wrapper APIs around third-party interfaces, the design of which is driven by your application's requirements, rather than being dictated by the shape of the third-party interface.
These wrappers can then be safely mocked because you control the interface. This leads to a "hexagonal" architecture with low coupling.
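To make "wrapper API" less abstract, here's a minimal sketch of the shape this takes (all names are illustrative, not from any real library): a "port" describing the interface our application wants, and an adapter that satisfies it.

```python
from dataclasses import dataclass
from typing import Protocol

# The domain object our application actually needs - not the
# provider's full payload. (Illustrative names throughout.)
@dataclass
class Repo:
    name: str
    commit_count: int

# The "port": an interface shaped by our requirements,
# not by any particular third party's API.
class RepoSource(Protocol):
    def get_repo(self, repo_id: str) -> Repo: ...

# An adapter satisfying that port. In production this would wrap
# the third-party client; in tests it can be a simple fake.
class InMemoryRepoSource:
    def __init__(self, repos: dict):
        self._repos = repos

    def get_repo(self, repo_id: str) -> Repo:
        return self._repos[repo_id]

source: RepoSource = InMemoryRepoSource({"1": Repo(name="hello", commit_count=2)})
assert source.get_repo("1").commit_count == 2
```

Because both the port and the `Repo` type are ours, swapping one adapter for another (or for a fake in tests) never ripples beyond the boundary.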
The problem is that this sounds like vague academic nonsense: it demands extra code, with the only immediate benefit being that your system's internals are easier to test. It's hard to blame most engineers (who consider themselves finely-tuned bullshit detectors) for concluding that the Test Driven Development (TDD) acolytes have lost their minds in pursuit of a clean test suite.
In this post I'm going to give a concrete example of how this approach affects a system's design and demonstrate how it helps a system adapt to change. Hopefully by the end you'll see why this pattern is worth considering even if you're not practising TDD.
A worked example with third-party mocking
Imagine we're building a web application which displays data fetched from an imaginary third-party git hosting provider named "repo-host.com".
Our application has two end-points, `repo_view` and `other_view`, both of which query the repo-host.com API and render a template using the retrieved data.
```python
def repo_view(repo_id):
    response = http.get(f"repo-host.com/api/{repo_id}")
    repo = response.as_json()
    return render_template("""
        <h1>{{ repo['name'] }}</h1>...
        <span>{{ repo['commits'] | length }}</span>...
    """, {"repo": repo})

def other_view(repo_id):
    response = http.get(f"repo-host.com/api/{repo_id}")
    repo = response.as_json()
    return render_template("""
        ...<span>{{ repo['commits'] | length }}</span>...
    """, {"repo": repo})
```
A URL change
One day we receive a message from repo-host.com developer outreach telling us that unforeseen architectural challenges force a URL change: `repo-host.com/api/:repo_id` will soon be `repo-host.com/api/v1/:repo_id`. You grumble something about non-RESTful URLs but agree to update your application.
In doing so, you notice some repetition in your code: two lines reference the URL in question, and both must be updated. "Don't Repeat Yourself!" you cry, as you refactor the common code into a `get_repo` function.
```python
# Refactor common fetch code into a function
def get_repo(repo_id):
    response = http.get(f"repo-host.com/api/v1/{repo_id}")
    return response.as_json()

# Views are now "DRY"
def repo_view(repo_id):
    repo = get_repo(repo_id)
    return render_template("""
        <h1>{{ repo['name'] }}</h1>...
        <span>{{ repo['commits'] | length }}</span>...
    """, {"repo": repo})

def other_view(repo_id):
    repo = get_repo(repo_id)
    return render_template("""
        ...<span>{{ repo['commits'] | length }}</span>...
    """, {"repo": repo})
```
The response body changes
repo-host.com have realised that returning a list of all commits with every request is expensive and are applying rate limiting to `/api/v1`. To continue querying at the rate we need, they encourage us to use the rate-unlimited `simple=true` parameter, which returns a `commit_count` attribute instead of the full commits list.

As you come to make this change, you realise it's a bit annoying: you have two templates expecting a `commits` array attribute, and both need to be updated.
```python
def get_repo(repo_id):
    # UPDATED
    response = http.get(f"repo-host.com/api/v1/{repo_id}?simple=true")
    return response.as_json()

def repo_view(repo_id):
    repo = get_repo(repo_id)
    # UPDATED
    return render_template("""
        <h1>{{ repo['name'] }}</h1>...
        ...<span>{{ repo['commit_count'] }}</span>...
    """, {"repo": repo})

def other_view(repo_id):
    repo = get_repo(repo_id)
    # UPDATED
    return render_template("""
        ...<span>{{ repo['commit_count'] }}</span>...
    """, {"repo": repo})
```
Still, not the end of the world. You update your code and test mocks, run your CI and deploy.
Runtime exceptions in production
As soon as your release hits production, you start seeing runtime exceptions. It turns out someone else on the team had seen your helpful `get_repo` function and integrated it into their own view.
```python
# Someone else's view
def repo_commits_view(repo_id):
    repo = get_repo(repo_id)
    return render_template("""
        {% for commit in repo['commits'] %}
            {{ commit['sha'] }} {{ commit['message'] }}...
        {% endfor %}
    """, {"repo": repo})
```
This code depended on the `commits` array attribute in the return value, which we just replaced with `commit_count`. Why didn't the CI catch this?! Were there no tests?!

There were tests, but unfortunately they mocked the request object, which isn't owned by our system:
```python
def test_repo_commits_view():
    with mock.patch("http.get") as fake_get:
        fake_get.return_value.as_json.return_value = {
            "commits": [...]
        }
        test_client.get("/repo/commits")
        ...
```
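To see why a test like this is worthless as a safety net, here's a self-contained demonstration (the `Response` class, `http_get` function, and `repo_commits` function are hypothetical stand-ins, not the app's real code). The test replaces the HTTP layer and pins the old response shape, so it stays green no matter what the provider actually returns.

```python
# Hypothetical stand-ins for the third-party HTTP client and provider.
class Response:
    def __init__(self, body):
        self._body = body

    def as_json(self):
        return self._body

def http_get(url):
    # The provider now returns the new `simple=true` shape...
    return Response({"name": "repo", "commit_count": 3})

def repo_commits(repo_id):
    # ...but this production code still expects the old `commits` array.
    return http_get(f"repo-host.com/api/v1/{repo_id}").as_json()["commits"]

# The isolated test swaps out `http_get` (the same effect as mock.patch)
# and pins the OLD response shape, so it passes regardless.
real_http_get = http_get
http_get = lambda url: Response({"commits": [{"sha": "abc"}]})
assert repo_commits("1") == [{"sha": "abc"}]  # the CI build is green...
http_get = real_http_get

# ...while the unmocked call path blows up in production.
try:
    repo_commits("1")
    production_ok = True
except KeyError:
    production_ok = False
assert production_ok is False
```

The mock faithfully preserves an interface that no longer exists, which is exactly the failure mode "Don't mock what you don't own" warns about.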
You roll the deploy back and set to work adding a "simple" boolean parameter to your `get_repo` function. But before you can finish, the phone starts ringing again.
"Yeah, so HTTP isn't working out for us"
repo-host.com have decided that gRPC is the wave of the future and are deprecating their HTTP REST API. Your heart sinks as you realise exactly how many parts of the system call `get_repo` and therefore expect a JSON key-value object, all of which you'll now need to rewrite to match the gRPC values.
What went wrong?
This is, admittedly, a highly contrived example. I hope most third parties would provide a more stable API than repo-host.com.
Additionally, many of the problems here wouldn't have made it to production with extensive integrated testing. However, integrated tests have their own set of problems, and one of the goals of TDD is to arrive at designs which can be verified with as few integrated tests as possible.
The common root of our issues is that we've allowed our design to be affected by what's available rather than what we need. We've invited a data structure we don't own (the repo-host.com REST API response body) deep into our application, to the point where even our template layer's code is informed by it. The whole of our system is now "coupled" to this structure, and as soon as it changes, we have to change the entire system with it.
Take 2
This is why "Don't mock what you don't own" and Discovery Testing put a focus on describing the dependent layers of your system in terms of the interfaces you want to exist, rather than being guided by what's available.
Combined with YAGNI, you end up with smaller interfaces to third-party services that are tightly coupled to your domain model and loosely coupled to the third party.
Let's rewind to the start of our system, and imagine the sort of design I'd expect to arrive at following those design principles.
We start with our views, imagining the `get_repo` function we want to exist.
```python
def repo_view(repo_id):
    repo = get_repo(repo_id)
    return render_template("""
        <h1>{{ repo.name }}</h1>...
        <span>{{ repo.commit_count }}</span>...
    """, {"repo": repo})

def other_view(repo_id):
    repo = get_repo(repo_id)
    return render_template("""
        ...<span>{{ repo.commit_count }}</span>...
    """, {"repo": repo})
```
Because we're focussed on what we want, not what's available, the return value from our imaginary `get_repo` function is a simple object rather than a JSON dictionary, and we only reference `commit_count`, rather than taking the length of a `commits` array we otherwise don't use.

Now, we implement our imagined `get_repo` function, mapping the third-party interface into the first-party interface we just designed.
```python
from dataclasses import dataclass

def get_repo(repo_id):
    response = http.get(f"repo-host.com/api/{repo_id}")
    return Repo.build_from_response(
        response.as_json()
    )

# This class models only the attributes we need
@dataclass
class Repo:
    name: str
    commit_count: int

    @classmethod
    def build_from_response(cls, response_body):
        return cls(
            name=response_body['name'],
            commit_count=len(response_body['commits']),
        )
```
The first thing you'll notice is: this is a lot more lines of code! This is a valid concern - more lines means it takes longer to write and creates more space for bugs to hide in.
So what's the upside?
All translation between repo-host.com and our internal system is now encapsulated by the `get_repo` function. We "own" the entirety of the `get_repo` interface, including the return type, meaning that, according to "Don't mock what you don't own", this function is now fair game for mocking.
As such, our view tests can look like this:
```python
def test_repo_view():
    with mock.patch("get_repo") as fake_get_repo:
        fake_get_repo.return_value = Repo(name="hello", commit_count=0)
        ...
```
If we wanted to mock out our previous `get_repo` implementation, we had to specify an arbitrary JSON object as the return value. With our new implementation, we can specify an instance of our new `Repo` type.

This smaller, simpler return type makes for an easier-to-read test. Additionally, because `Repo` is a concrete class, we can be confident that our return value has the same fields as those in the production system.
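This benefit is easy to demonstrate: because the fake return value goes through the real `Repo` constructor, a stale or misspelled field in test setup fails immediately, where a raw dict or a bare `Mock` would silently accept it. A small sketch, assuming `Repo` is implemented as a dataclass:

```python
from dataclasses import dataclass

@dataclass
class Repo:
    name: str
    commit_count: int

# Building test data through the real type catches drift at
# construction time: the old `commits` field no longer exists.
try:
    Repo(name="hello", commits=[])
    caught_by_constructor = False
except TypeError:
    caught_by_constructor = True
assert caught_by_constructor

# A raw dict, by contrast, accepts any stale shape without complaint.
stale = {"name": "hello", "commits": []}
assert "commits" in stale  # nothing stops this from reaching a test
```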
As for testing the `get_repo` function itself: as it calls the third-party repo-host.com interface over HTTP, we cannot safely mock its internals and must rely on integration testing. In these situations I would typically reach for a tool like VCR.py. Thankfully, because the responsibilities of `get_repo` are very specific and limited, we shouldn't need many integrated tests to have sufficient confidence.
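For the curious, the record-and-replay idea behind tools like VCR.py can be sketched in miniature (a hand-rolled illustration of the concept, not the library's actual API): the first request goes through a real fetch function and is recorded; subsequent requests replay the stored response, keeping the integrated test fast and deterministic.

```python
import json

# A miniature of the record/replay idea behind tools like VCR.py.
# (Hand-rolled illustration - not the library's actual API.)
class Cassette:
    def __init__(self):
        self._tape = {}

    def fetch(self, url, real_fetch):
        if url not in self._tape:
            self._tape[url] = real_fetch(url)  # "record" on first use
        return self._tape[url]                 # "replay" thereafter

network_calls = []

def real_fetch(url):
    # Stands in for a real HTTP round-trip to repo-host.com.
    network_calls.append(url)
    return json.dumps({"name": "example", "commit_count": 3})

cassette = Cassette()
first = cassette.fetch("repo-host.com/api/v1/1", real_fetch)
second = cassette.fetch("repo-host.com/api/v1/1", real_fetch)

assert first == second
assert len(network_calls) == 1  # the "network" was only hit once
```

The recorded response is real third-party output, so the test still verifies our translation layer against the provider's actual shape - unlike a hand-written mock.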
Replaying the changes
So, what happens as we work through those same sets of required changes to the system?
First up, changing the URL to add `/v1`:
```python
def get_repo(repo_id):
    # One line change - update the URL
    response = http.get(f"repo-host.com/api/v1/{repo_id}")
    ...
```
Building repo_commits_view
Next, a step we didn't see happen with the first design: someone else building a `repo_commits_view`. Previously, they saw our `get_repo` function, observed that the `commits` attribute was an array, and built their view around it.

With our new implementation, there's no opportunity for that accidental coupling, as our first-party `Repo` class only models the attributes we use: `name` and `commit_count`.

While the temptation still exists to be guided by what's available and extend `get_repo` to add the commits to the `Repo` class, let's again imagine the interface we want to exist.
```python
def repo_commits_view(repo_id):
    # rather than reaching for `get_repo`, imagine the function we want...
    commits = get_repo_commits(repo_id)
    return render_template("""
        {% for commit in commits %}
            {{ commit.sha }} {{ commit.message }}...
        {% endfor %}
    """, {"commits": commits})
```
Now, we must code our imagined `get_repo_commits` function into existence:
```python
def get_repo_commits(repo_id):
    # Duplicated code
    response_body = http.get(f"repo-host.com/api/v1/{repo_id}").as_json()
    return [
        Commit.build_from_response(res) for res in response_body['commits']
    ]

@dataclass
class Commit:
    message: str
    sha: str

    @classmethod
    def build_from_response(cls, response_body):
        return cls(
            message=response_body['message'],
            sha=response_body['sha'],
        )
```
Again, we end up with a function (`get_repo_commits`) where we own the entire interface, and can therefore mock it safely when testing other functions. The function itself must be integration tested, as it communicates with a third party.
Don't Repeat Yourself again
In building this, we copy-and-pasted the repo-host.com HTTP GET call from `get_repo`. Once our integrated tests are passing, we decide to refactor the common `http.get` code into a shared `repo_host_fetch_repo` function:
```python
def get_repo(repo_id):
    response_body = repo_host_fetch_repo(repo_id)
    ...

def get_repo_commits(repo_id):
    response_body = repo_host_fetch_repo(repo_id)
    ...

def repo_host_fetch_repo(repo_id):
    return http.get(f"repo-host.com/api/v1/{repo_id}").as_json()
```
We're pretty happy that we've eliminated the duplication. We might be tempted to rewrite our integrated `get_repo` and `get_repo_commits` tests as isolated tests which mock out `repo_host_fetch_repo`, but since the return type is raw JSON from a third party, it fails the "Don't mock what you don't own" test. As such, the integrated tests stay.
Rate limiting
Now let's introduce the rate-limiting change which makes fetching `commits` expensive. This time our refactor is safer, as it's clear from the code which functions depend on `commits`. As an example, let's do the most naive thing possible and just add `simple=true` to `repo_host_fetch_repo`:
```python
def repo_host_fetch_repo(repo_id):
    return http.get(f"repo-host.com/api/v1/{repo_id}?simple=true").as_json()
```
Immediately, our integrated tests for `get_repo_commits` start failing, as `simple=true` means `repo_host_fetch_repo` no longer returns the `commits` array. This reveals that our optimistic, duplication-reducing refactor was folly. We decide to unroll our `repo_host_fetch_repo` function.
Here are the complete required changes:
```python
def get_repo(repo_id):
    # unroll `repo_host_fetch_repo`,
    # specify `simple=true`, we don't need commits
    response_body = http.get(f"repo-host.com/api/v1/{repo_id}?simple=true").as_json()
    ...

class Repo:
    ...

    @classmethod
    def build_from_response(cls, response_body):
        return cls(
            ...
            # Use `commit_count` rather than counting `commits`
            commit_count=response_body['commit_count'],
        )

def get_repo_commits(repo_id):
    # unroll `repo_host_fetch_repo`,
    # don't specify `simple=true`, we need commits
    response_body = http.get(f"repo-host.com/api/v1/{repo_id}").as_json()
    ...
```
That's it: three lines modified and no need to update any tests. The rest of the system is isolated from the change and requires no updates. Despite only writing integrated tests for `get_repo` and `get_repo_commits`, the tests were sufficient to catch integration mistakes and prevented us from shipping broken code to production. Had we not done our premature duplication-reducing refactor, we wouldn't have needed to touch `get_repo_commits` either.
HTTP -> gRPC
Switching from integrating with HTTP to gRPC sounds like a pretty big change, but when you're practising "Don't mock what you don't own", and designing interfaces based on your domain model's needs, it's actually not that crazy.
What would we have to change? Well, our integrated tests for our boundary functions `get_repo` and `get_repo_commits` would have to be completely rewritten, and we'd have to make them pass. But that might be it. The rest of the system is written in terms of what we want, and once we've translated the third-party boundary into these first-party representations, there may well be no reason for the rest of the system to change.
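To make that concrete, here's a sketch of what the switch might look like. The stub class and its method are invented stand-ins for grpc-generated client code, not a real API. Only the boundary function changes; `Repo` and every view keep exactly the same shape.

```python
from dataclasses import dataclass

@dataclass
class Repo:
    name: str
    commit_count: int

# Stand-in for a grpc-generated client stub. In reality this would be
# produced from the provider's .proto files and called over a channel.
class FakeRepoServiceStub:
    class _Reply:
        name = "example"
        commit_count = 3

    def GetRepo(self, repo_id):
        return self._Reply()

stub = FakeRepoServiceStub()

def get_repo(repo_id):
    # Same first-party interface as before: callers never learn
    # that HTTP+JSON became gRPC under the hood.
    reply = stub.GetRepo(repo_id)
    return Repo(name=reply.name, commit_count=reply.commit_count)

repo = get_repo("123")
assert repo.name == "example"
assert repo.commit_count == 3
```

The translation from gRPC reply to `Repo` lives entirely inside `get_repo`, so the blast radius of the protocol change is one function plus its integrated tests.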
Was it worth it?
The gains here may seem small, but they scale as your application grows.
I'm not here to pretend that creating bespoke wrappers for your third-party interfaces isn't more work up-front. In the early stages of your application you are certain to write more code, much of which may seem like unnecessary boiler-plate. Do not follow this pattern if you're building something in a 48 hour hack weekend.
However, like many TDD practices, this pattern helps you write code with low coupling, which at its core means modifications to your application require fewer parts of your system to change.
This means your delivery cadence is more likely to be stable, without dramatic spikes for unexpected new requirements. Upgrading to newer versions of libraries is easier, so you can use the latest versions of tools and apply critical security updates quickly. Because your system is easier to adapt, you can say "Yes" to big changes ("gRPC? No problem") or pounce upon opportunities (What if we cached the whole of repo-host.com in a database?) that a more tightly coupled system might preclude.
And lastly, it lays out a framework for deciding where to use integrated tests (which are expensive to write and maintain) while providing safe spaces to use isolated unit tests with mocks without worrying about compromising your test suite's ability to catch errors.