DI in Python

I’ve been worried lately about doing dependency injection in Python.

Having programmed largely in C#, and being used to decoupling things via constructor injection with the aid of an Inversion of Control (IoC) container, I’ve tried to apply the same patterns in the Python world, but somehow it has never felt really Pythonic.

The lack of a community-accepted IoC container also felt like a warning sign.

I started writing this post on 2020-10-20, and it was even longer than it is now. I've reduced its scope and left the main motivating question unanswered, as it has remained unanswered for me during this period. I hope to follow this post with others that answer it, at least partially, with my current understanding of how to tackle this kind of thing in Python. My thoughts have changed a little, but the drawings are nice, so I decided to post it before moving on.

Continuing with the IoC problem, the first thing one does is to STFW, and that means Google and Stack Overflow. What I’ve found, phrased in various ways, is in summary:

You do not need a DI container in Python.

Yeah, but I don’t want to lose all the benefits that I was used to getting from DI. How do you do it in Python?

What are inversion of control and dependency injection, and why do we need them?

Going back to the classics, I’ll use Fowler’s example with a liberal translation to python.

In Fowler’s example an application obtains a list of movies directed by someone.

from lister import MovieLister

if __name__ == '__main__':
    lister = MovieLister()
    spielberg_movies = lister.movies_directed_by('Steven Spielberg')
    print(spielberg_movies)
main.py

It does that with the aid of a service MovieLister. This service is the one that has the business logic.

Everything in this example is reduced to the bare minimum that still illustrates the necessary architecture and fits in a blog post. Here, the business logic is just filtering by director.

from finder import MovieFinder

class MovieLister:

    def __init__(self):
        self.finder = MovieFinder()

    def movies_directed_by(self, director):
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]
lister.py

This class depends on another service MovieFinder from the module finder

from movies import Movie

data = [
    Movie(year=1972, name="The Lift", director="Robert Zemeckis"),
    Movie(year=1973, name="A Field of Honor", director="Robert Zemeckis"),
    Movie(year=1978, name="I Wanna Hold Your Hand", director="Robert Zemeckis"),
    Movie(year=1979, name="1941", director="Steven Spielberg"),
    Movie(year=1980, name="Used Cars", director="Robert Zemeckis"),
    Movie(year=1984, name="Romancing the Stone", director="Robert Zemeckis"),
    Movie(year=1985, name="Back to the Future", director="Robert Zemeckis"),
    Movie(year=1988, name="Who Framed Roger Rabbit", director="Robert Zemeckis"),
    Movie(year=1989, name="Back to the Future Part II", director="Robert Zemeckis"),
    Movie(year=1990, name="Back to the Future Part III", director="Robert Zemeckis"),
    Movie(year=1992, name="Death Becomes Her", director="Robert Zemeckis"),
    Movie(year=1992, name="Trespass", director="Walter Hill"),
    Movie(year=1994, name="Forrest Gump", director="Robert Zemeckis"),
    Movie(year=1996, name="Bordello of Blood", director="Gilbert Adler"),
    Movie(year=1997, name="Contact", director="Robert Zemeckis"),
    Movie(year=2000, name="What Lies Beneath", director="Robert Zemeckis"),
    Movie(year=2000, name="Cast Away", director="Robert Zemeckis"),
    Movie(year=2004, name="The Polar Express", director="Robert Zemeckis"),
    Movie(year=2007, name="Beowulf", director="Robert Zemeckis"),
    Movie(year=2009, name="A Christmas Carol", director="Robert Zemeckis"),
    Movie(year=2012, name="Flight", director="Robert Zemeckis"),
    Movie(year=2015, name="The Walk", director="Robert Zemeckis"),
    Movie(year=2015, name="Doc Brown Saves the World", director="Robert Zemeckis"),
    Movie(year=2016, name="Allied", director="Robert Zemeckis"),
    Movie(year=2018, name="Welcome to Marwen", director="Robert Zemeckis"),
    Movie(year=2020, name="The Witches", director="Robert Zemeckis"),
]

class MovieFinder:

    def find_all(self):
        return data
finder.py

which, in turn, depends on the model Movie from the module movies

from dataclasses import dataclass

@dataclass
class Movie:
    name: str
    director: str
    year: int
movies.py

Again, to simplify, here the MovieFinder obtains a list of movies that is simply hardcoded in the module. In a real-world situation, the finder module would use a database or an external service to fetch the data; here I opted to hardcode some movies from the Robert Zemeckis entry on Wikipedia.

This approach works as expected:

(.venv) paudirac $ python main.py
[Movie(name='1941', director='Steven Spielberg', year=1979)]

so maybe it is time to pause and watch that movie. I strongly recommend that you do it right now!

1941

So, what’s the problem?

Still here? Have you watched the movie? I’ll wait.

Ok, let’s do some testing.

Yeah, I know, I know. Forget for a moment that we should have started by writing the tests in the first place, and allow me this little sin for the sake of storytelling.

We can easily write a test

from lister import MovieLister

def test_lister():
    lister = MovieLister()
    spielberg_movies = lister.movies_directed_by('Spielberg')
    assert len(spielberg_movies) > 0
    assert all(movie.director == 'Spielberg' for movie in spielberg_movies)
test_lister.py

that breaks

(.venv) paudirac $ pytest test_lister.py
============================= test session starts ==============================
platform linux -- Python 3.8.2, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /home/pcervera/tmp
collected 1 item

test_lister.py F                                                         [100%]

=================================== FAILURES ===================================
_________________________________ test_lister __________________________________

    def test_lister():
        lister = MovieLister()
        spielberg_movies = lister.movies_directed_by('Spielberg')
>       assert len(spielberg_movies) > 0
E       assert 0 > 0
E        +  where 0 = len([])

test_lister.py:8: AssertionError
=========================== short test summary info ============================
FAILED test_lister.py::test_lister - assert 0 > 0
============================== 1 failed in 0.03s ===============================

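The test fails even though the implementation of the movies_directed_by method is correct! The hardcoded data lists the director as 'Steven Spielberg', not 'Spielberg', and the test has no way to control that data.

The problem is that we can’t test MovieLister in isolation, because MovieFinder is hardwired into it.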

from finder import MovieFinder # <- here

class MovieLister:

    def __init__(self):
        self.finder = MovieFinder() # <- and especially here

We say that MovieLister depends on MovieFinder.

Monkey patching

In Python (and dynamic languages in general) we can temporarily circumvent this and use a patch to mock the dependency during the test.

from unittest import mock
from lister import MovieLister
from movies import Movie

def test_lister():
    spielberg = [
        Movie(year=1979, name="1941", director="Spielberg"),
    ]
    with mock.patch('lister.MovieFinder') as mocked_finder_class:
        mocked_finder_class.return_value.find_all.return_value = spielberg
        movie_lister = MovieLister()
        spielberg_movies = movie_lister.movies_directed_by('Spielberg')
        mocked_finder_class.return_value.find_all.assert_called_once()
        assert len(spielberg_movies) > 0
        assert all(movie.director == 'Spielberg' for movie in spielberg_movies)
test_lister.py

See:

(.venv) paudirac $ pytest test_lister.py
============================= test session starts ==============================
platform linux -- Python 3.8.2, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
rootdir: /home/pcervera/tmp
collected 1 item

test_lister.py .                                                         [100%]

============================== 1 passed in 0.06s ===============================

unittest.mock.patch

How to use the patch function from the unittest.mock module deserves a post of its own. I encourage you to read Python Mock Gotchas and the docs about where to patch to fully understand what is going on here. Roughly, though, we're dynamically rebinding the MovieFinder name in the lister module to a mock that substitutes the MovieFinder class, and then configuring the find_all method of the resulting mock instance to return what the test needs. And we do this without touching the lister module code at all.

This monkey patching trickery allows us to write some tests, and it can be a great option when you depend on some module that you don’t own and can’t modify.
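To make the "where to patch" point concrete, here is a minimal single-file sketch. The modules demo_finder and demo_lister are hypothetical in-memory stand-ins for finder.py and lister.py, and the make_module helper exists only so the example runs without extra files. It shows that patching the name in the module that defines the class has no effect on the lister, while patching the name where it is looked up does.

```python
import sys
import types
from unittest import mock


def make_module(name, source):
    """Create an in-memory module so this sketch runs as a single file."""
    module = types.ModuleType(name)
    sys.modules[name] = module  # make it importable by name
    exec(source, module.__dict__)
    return module


# Hypothetical stand-ins for finder.py and lister.py.
make_module(
    "demo_finder",
    "class MovieFinder:\n"
    "    def find_all(self):\n"
    "        return ['real']\n",
)
demo_lister = make_module(
    "demo_lister",
    "from demo_finder import MovieFinder\n"
    "def titles():\n"
    "    return MovieFinder().find_all()\n",
)

# Patching the defining module leaves demo_lister untouched:
# demo_lister already holds its own reference to the original class.
with mock.patch("demo_finder.MovieFinder") as fake_class:
    fake_class.return_value.find_all.return_value = ["fake"]
    unaffected = demo_lister.titles()

# Patching the name where it is looked up is what takes effect.
with mock.patch("demo_lister.MovieFinder") as fake_class:
    fake_class.return_value.find_all.return_value = ["fake"]
    patched = demo_lister.titles()

# Leaving the with block restores the original binding.
restored = demo_lister.titles()

print(unaffected, patched, restored)  # ['real'] ['fake'] ['real']
```

This is why the test above patches 'lister.MovieFinder' rather than 'finder.MovieFinder'.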

But the dependencies are still wrong.

Do dependencies exist when you’re not watching?

The current situation is as follows.

MovieLister depends on MovieFinder, that, in turn, depends on a specific backend (simplified here with hardcoded data, but imagine a database instead).

The application depends on the MovieLister and, transitively, on the MovieFinder and on the database also.

implicit dependencies: MovieLister  -> MovieFinder

Abstractions

If you look carefully, though, there’s a subtlety here.

MovieLister only depends on some abstraction that is named find_all. Python being dynamic and lightweight often hides this subtlety, but it exists anyway.

The need to patch this exact dependency on the test to isolate MovieLister and be able to unit test it, supports this point of view.

Actually, the lister module also depends on the Movie class from the movies module.

We can make this a little bit more explicit by adding some typing:

import typing

from movies import Movie # <- depends on Movie
from finder import MovieFinder # <- here

class MovieLister:

    def __init__(self):
        self.finder = MovieFinder() # <- and especially here

    def movies_directed_by(self, director: str) -> typing.List[Movie]: # <- depends on Movie
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]
lister.py

In a language with a type system like Haskell’s, the compiler would infer that abstraction during static type analysis, without needing any type annotations.

It would infer a typeclass with a signature (converted to Pythonic syntax) like

def find_all(self) -> typing.List[Movie]

In languages like C# or Java this would correspond to an interface, and in dynamic languages it is usually referred to as duck typing.

In any case, I do think the abstractions exist.

From the design viewpoint, it does not matter much if the compiler is able to detect them or not. For me, the important point is that when we humans detect them, we seem to be able to write much better code. Better in the sense that the code does what we expect the code to do.

So, yes, dependencies exist when we’re not watching. Abstractions exist even when we’re not watching.

Making the dependencies explicit

The dependencies are like those in the picture:

wrong dependencies

With this insight, we adapt the code to explicitly communicate this.

Modeling abstractions

Python doesn’t have interfaces, but we can use an abstract class with an empty method to represent our abstraction. Following C#-style naming conventions, we prepend an I to the class name to denote the interface intent (this is not idiomatic Python, just a convention we found useful).

The IMovieFinder abstraction gets represented as

import abc
import typing

class IMovieFinder(abc.ABC):

    @abc.abstractmethod
    def find_all(self) -> typing.List[Movie]:
        pass

Next, we modify MovieFinder to implement this behaviour

class MovieFinder(IMovieFinder):

    def find_all(self) -> typing.List[Movie]:
        return data

Now the code reflects what dependencies it has and who depends on whom.
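What the abstract class buys us at runtime can be checked with a small self-contained sketch. IncompleteFinder and the can_instantiate helper are hypothetical names added for illustration, and the one-movie list stands in for the hardcoded data. Neither the abstract class nor a subclass that forgets find_all can be instantiated.

```python
import abc
import typing
from dataclasses import dataclass


@dataclass
class Movie:
    name: str
    director: str
    year: int


class IMovieFinder(abc.ABC):

    @abc.abstractmethod
    def find_all(self) -> typing.List[Movie]:
        pass


class IncompleteFinder(IMovieFinder):
    """Hypothetical subclass that forgets to implement find_all."""


class MovieFinder(IMovieFinder):

    def find_all(self) -> typing.List[Movie]:
        return [Movie(name="1941", director="Steven Spielberg", year=1979)]


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


results = {
    cls.__name__: can_instantiate(cls)
    for cls in (IMovieFinder, IncompleteFinder, MovieFinder)
}
print(results)
# {'IMovieFinder': False, 'IncompleteFinder': False, 'MovieFinder': True}
```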

We can summarize the dependencies in the following table:

Class        Depends on
-----------  ------------
MovieLister  IMovieFinder
MovieLister  MovieFinder
MovieFinder  IMovieFinder

According to the dependency inversion principle we’ve got some things wrong.

Dependency inversion principle

The dependency inversion principle says

  1. High level modules should not depend upon low level modules. Both should depend upon abstractions.
  2. Abstractions should not depend upon details. Details should depend upon abstractions.

Clearly, the second dependency in the table is wrong: MovieLister depends on MovieFinder, which is a lower level module.

We can solve this by _inverting the dependency_.

Lower?

When we say that the module MovieFinder is a lower level module than MovieLister, we mean that it knows more about the real implementation.

High level modules and classes express the business use cases in terms of objects, modules and classes in the domain itself, independently of how those are implemented.

MovieLister doesn’t care if you’re using a mock implementation, an implementation that extracts data from a hardcoded array in memory, or an implementation that queries a real database or service.

On the contrary, MovieFinder does. So it is lower level: it has more details and is less abstract.

up down

How to invert a dependency

A simple way to invert a dependency is removing the call site where it is constructed.

We can remove the dependency of MovieLister on the MovieFinder module, by requiring the dependency to be passed in from outside instead of instantiating it on the module.

Now the module depends only on the high level abstraction IMovieFinder and no longer on MovieFinder, so we can get rid of that import as well.

import typing

from movies import Movie, IMovieFinder

class MovieLister:

    def __init__(self, finder: IMovieFinder):
        self.finder = finder

    def movies_directed_by(self, director: str) -> typing.List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]
lister.py

We could also have dropped IMovieFinder and relied on duck typing, since Python is dynamic. But here I prefer to keep the import that lets me type annotate the initializer. This adds a little overhead, but it also documents the code without resorting to comments. And, unlike comments, which are legacy from the moment they are written (because comments can't be tested), annotations give us an extra test in the form of static type checking.

From a philosophical point of view, I think the abstraction exists, so I find it useful to document it in code. I find it valuable that the business abstractions appear explicitly in the design.
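There is also a middle ground between pure duck typing and an abstract base class: typing.Protocol, available since Python 3.8. The following is only a sketch of that alternative, not what this post uses. The names SupportsFindAll and InMemoryFinder are made up for the example. The finder does not inherit from anything; a static type checker such as mypy would accept it because it has a matching find_all method, and nothing is enforced at runtime.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Movie:
    name: str
    director: str
    year: int


class SupportsFindAll(Protocol):
    """Structural interface: anything with a matching find_all qualifies."""

    def find_all(self) -> List[Movie]:
        ...


class InMemoryFinder:
    """Hypothetical finder; note that it does not inherit from anything."""

    def __init__(self, movies: List[Movie]):
        self._movies = list(movies)

    def find_all(self) -> List[Movie]:
        return list(self._movies)


class MovieLister:

    def __init__(self, finder: SupportsFindAll):
        self.finder = finder

    def movies_directed_by(self, director: str) -> List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]


finder = InMemoryFinder([
    Movie(name="1941", director="Steven Spielberg", year=1979),
    Movie(name="Used Cars", director="Robert Zemeckis", year=1980),
])
lister = MovieLister(finder)
result = lister.movies_directed_by("Steven Spielberg")
print(result)
```

The trade-off is that an isinstance guard against a plain Protocol is not available unless the protocol is decorated with typing.runtime_checkable, so this style leans on the static checker rather than on runtime checks.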

Now the dependencies are right:

right dependencies

It is obvious from the diagram that the MovieFinder implementation doesn’t need to know anything about the domain except IMovieFinder (and Movie).

Better still, the domain doesn’t even need to know that MovieFinder exists.
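To illustrate that last point, here is a minimal sketch in which a second implementation is plugged in without touching MovieLister. CsvMovieFinder and the CSV_DATA sample are hypothetical, invented for this example; the domain classes are repeated so the snippet runs on its own.

```python
import abc
import csv
import io
import typing
from dataclasses import dataclass


@dataclass
class Movie:
    name: str
    director: str
    year: int


class IMovieFinder(abc.ABC):

    @abc.abstractmethod
    def find_all(self) -> typing.List[Movie]:
        pass


class MovieLister:

    def __init__(self, finder: IMovieFinder):
        self.finder = finder

    def movies_directed_by(self, director: str) -> typing.List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]


class CsvMovieFinder(IMovieFinder):
    """Hypothetical alternative implementation that parses CSV text."""

    def __init__(self, csv_text: str):
        self._csv_text = csv_text

    def find_all(self) -> typing.List[Movie]:
        reader = csv.DictReader(io.StringIO(self._csv_text))
        return [
            Movie(name=row["name"], director=row["director"], year=int(row["year"]))
            for row in reader
        ]


# Sample data made up for the example.
CSV_DATA = (
    "name,director,year\n"
    "1941,Steven Spielberg,1979\n"
    "Used Cars,Robert Zemeckis,1980\n"
)

lister = MovieLister(finder=CsvMovieFinder(CSV_DATA))
zemeckis = lister.movies_directed_by("Robert Zemeckis")
print(zemeckis)
# [Movie(name='Used Cars', director='Robert Zemeckis', year=1980)]
```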

Testable code

This enables testing automatically. We can simply mock the dependency and inject the mock. There’s no need to patch anything anymore:

from unittest import mock
from lister import MovieLister
from movies import Movie

def test_lister():
    spielberg = [
        Movie(year=1979, name="1941", director="Spielberg"),
    ]
    movie_finder_mock = mock.MagicMock()
    movie_finder_mock.find_all.return_value = spielberg
    movie_lister = MovieLister(finder=movie_finder_mock)
    spielberg_movies = movie_lister.movies_directed_by('Spielberg')
    movie_finder_mock.find_all.assert_called_once()
    assert len(spielberg_movies) > 0
    assert all(movie.director == 'Spielberg' for movie in spielberg_movies)

Guards

We can even assert at runtime that the dependency is fulfilled. First we add tests for the edge cases:

from unittest import mock
from lister import MovieLister
from movies import Movie, IMovieFinder
import pytest

def test_lister():
    spielberg = [
        Movie(year=1979, name="1941", director="Spielberg"),
    ]
    movie_finder_mock = mock.create_autospec(spec=IMovieFinder)
    movie_finder_mock.find_all.return_value = spielberg
    movie_lister = MovieLister(finder=movie_finder_mock)
    spielberg_movies = movie_lister.movies_directed_by('Spielberg')
    movie_finder_mock.find_all.assert_called_once()
    assert len(spielberg_movies) > 0
    assert all(movie.director == 'Spielberg' for movie in spielberg_movies)

def test_lister_needs_finder():
    with pytest.raises(TypeError):
        movie_lister = MovieLister()

def test_lister_needs_not_None_finder():
    with pytest.raises(TypeError):
        movie_lister = MovieLister(finder=None)

def test_lister_needs_fails_without_an_IMovieFinder():
    finder_mock = mock.MagicMock()
    with pytest.raises(TypeError):
        movie_lister = MovieLister(finder=finder_mock)

and we can narrow the interface as much as we want via guards or assertions on the initializer:

import typing

from movies import Movie, IMovieFinder

class MovieLister:

    def __init__(self, finder: IMovieFinder):
        if finder is None:
            raise TypeError("finder cannot be None")
        if not isinstance(finder, IMovieFinder):
            raise TypeError("finder should be an IMovieFinder")
        self.finder = finder

    def movies_directed_by(self, director: str) -> typing.List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]
lister.py
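One detail in the tests above is easy to miss: the happy-path test switched from mock.MagicMock() to mock.create_autospec, while the last test deliberately passes a plain MagicMock and expects a TypeError. The sketch below, with a stripped-down IMovieFinder and using the instance=True option of create_autospec, shows why. A mock created from a spec reports the spec class as its class, so it passes the isinstance guard, whereas a plain MagicMock does not.

```python
import abc
from unittest import mock


class IMovieFinder(abc.ABC):

    @abc.abstractmethod
    def find_all(self):
        pass


plain = mock.MagicMock()
specced = mock.create_autospec(IMovieFinder, instance=True)

# Only the mock built from the interface passes the isinstance guard.
plain_passes = isinstance(plain, IMovieFinder)
specced_passes = isinstance(specced, IMovieFinder)

# The autospec mock also rejects attributes the interface does not define.
try:
    specced.find_everything()
    rejects_unknown = False
except AttributeError:
    rejects_unknown = True

print(plain_passes, specced_passes, rejects_unknown)  # False True True
```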

Push behaviour to the domain

Having broken the dependency enables us to push the behaviour further and put it all inside the domain.

right dependencies use case

import typing
import abc

from movies import Movie, IMovieFinder


class IMovieLister(abc.ABC):

    @abc.abstractmethod
    def movies_directed_by(self, director: str) -> typing.List[Movie]:
        pass


class MovieLister(IMovieLister):

    def __init__(self, finder: IMovieFinder):
        if finder is None:
            raise TypeError("finder cannot be None")
        if not isinstance(finder, IMovieFinder):
            raise TypeError("finder should be an IMovieFinder")
        self.finder = finder

    def movies_directed_by(self, director: str) -> typing.List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]
lister.py

We can create a use case (here, a plain function) that knows everything about the abstractions that make up the use case, but nothing about the concrete implementations.

from lister import IMovieLister

def print_movies_directed_by(director: str, lister: IMovieLister):
    # Use the director passed in instead of a hardcoded name.
    movies = lister.movies_directed_by(director)
    print(movies)
app.py

Only in the application layer is the system assembled with concrete objects that fulfill the abstractions needed by the use cases.

#!/usr/bin/env python
import app
from lister import MovieLister
from finder import MovieFinder

if __name__ == '__main__':
    finder = MovieFinder()
    lister = MovieLister(finder=finder)
    app.print_movies_directed_by('Steven Spielberg', lister)
main.py

Who assembles the application layer is the question that motivated this post in the first place, so we are back where we started.

In this simple example everything we’ve done is probably overkill, but in a more realistic situation, where the application interacts with other applications (probably not ours), the ability to test the business logic without needing the whole setup is crucial.

We can easily imagine a system that has more dependencies. For instance, just adding a simple database will increase the boilerplate to set up the application:

import app
from lister import MovieLister
from finder import MovieFinder
from db import Connection
from config import sql_connection_string

if __name__ == '__main__':
    # init boilerplate
    connection = Connection(conn_string=sql_connection_string)
    finder = MovieFinder(connection=connection)
    lister = MovieLister(finder=finder)
    # end boilerplate
    app.print_movies_directed_by('Steven Spielberg', lister)
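One low-tech way to keep that boilerplate contained, offered only as a sketch and not as the answer to the IoC question, is to gather all the wiring in a single function. Everything here is an assumption made for illustration: the Connection class with its fetch_movies method is a stand-in for the db module, build_lister is a made-up name, and the connection string is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Movie:
    name: str
    director: str
    year: int


class Connection:
    """Hypothetical stand-in for the db.Connection used above."""

    def __init__(self, conn_string: str):
        self.conn_string = conn_string

    def fetch_movies(self) -> List[Movie]:
        # A real connection would query a database here.
        return [Movie(name="1941", director="Steven Spielberg", year=1979)]


class MovieFinder:

    def __init__(self, connection: Connection):
        self.connection = connection

    def find_all(self) -> List[Movie]:
        return self.connection.fetch_movies()


class MovieLister:

    def __init__(self, finder):
        self.finder = finder

    def movies_directed_by(self, director: str) -> List[Movie]:
        all_movies = self.finder.find_all()
        return [movie for movie in all_movies if movie.director == director]


def build_lister(conn_string: str) -> MovieLister:
    """All the wiring lives here; nothing else constructs concrete classes."""
    connection = Connection(conn_string=conn_string)
    finder = MovieFinder(connection=connection)
    return MovieLister(finder=finder)


lister = build_lister("placeholder-connection-string")
movies = lister.movies_directed_by("Steven Spielberg")
print(movies)
```

The boilerplate does not disappear, it just lives in one place, and it still grows with every new dependency.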

DI is also beneficial in Python

We don’t know how to solve the IoC part yet, but the benefits of DI are available in Python too. In short:

  • We managed to invert dependencies (with a little bit of effort and ugliness).
  • We can test the domain without taking the banana, the gorilla and the jungle.
  • The code might (arguably) be more understandable with explicit modeling of the abstractions that arose during the coding process.
  • We don’t yet have a nice way to have the application composed with concrete implementations for us.

What might be an antipattern is an IoC container in Python. That may be why there is no community-accepted container in Python.

Further explorations of the topic will follow.

References