smhk

Replacing Tox with Poetry and pre-commit for CI linting

TL;DR: These notes meander a little. The initial problem was Tox being an inefficient way to do linting with my setup. I arrive at the solution of replacing Tox with pre-commit for running linting tools, both locally and in CI.

Background §

For Python development, I use the following tools:

black
For formatting the code.
isort
For sorting imports.
flake8
For linting.
pylint
For linting.

Ideally I’d like a way to do the following:

  • Run all the tools in one go.
  • Run a single tool, so that I can have a quick feedback loop on the results of that specific tool.

I’m also using the following (for lack of a better term) “runners”:

Tox
Primarily for running unit tests in CI. Runs the tests under multiple Python versions. Also runs all the tools listed above in “check” mode (i.e. --check is passed to black and isort so that they do not modify files on disk, but still return an exit code to indicate whether any changes would have been made).
pre-commit
Currently used only for local development. Runs the tools listed above, and does modify files on disk (if changes are necessary).
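
To make the difference between “check” mode and “fix” mode concrete, here is a toy sketch in Python. It is not how black or isort are implemented; strip_trailing_whitespace is a hypothetical stand-in formatter, used only to show the exit-code behaviour described above: in check mode the file is left alone and a non-zero code reports that changes would be made, while in fix mode the file is rewritten.

```python
import tempfile
from pathlib import Path


def strip_trailing_whitespace(path: Path, check: bool = False) -> int:
    """Hypothetical toy formatter that returns an exit code like a CLI tool."""
    original = path.read_text()
    formatted = "".join(line.rstrip() + "\n" for line in original.splitlines())
    if formatted == original:
        return 0  # Already formatted: nothing to do in either mode.
    if check:
        return 1  # Check mode: report the problem, leave the file untouched.
    path.write_text(formatted)  # Fix mode: rewrite the file on disk.
    return 0


with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "example.py"
    example.write_text("x = 1   \n")

    check_code = strip_trailing_whitespace(example, check=True)
    unchanged = example.read_text() == "x = 1   \n"
    fix_code = strip_trailing_whitespace(example)
    recheck_code = strip_trailing_whitespace(example, check=True)

print(check_code, unchanged, fix_code, recheck_code)
```

The first check fails without touching the file, the fix rewrites it, and the second check then passes. That is the split I have between CI (check only) and local development (fix).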

Starting configuration §

I have pre-commit configured this way, so that it only runs black and isort upon commit:

.pre-commit-config.yaml
fail_fast: true
repos:
- repo: local
  hooks:
    - id: black
      name: black
      entry: poetry run black
      language: system
      types: [file, python]
    - id: isort
      name: isort
      entry: poetry run isort
      language: system
      types: [file, python]

I have Tox configured like this, so that it runs black, isort, flake8 and pylint:

tox.ini
[testenv:black]
commands =
    black setup.py conftest.py docs/ src/ tests/ --exclude="IgnoreThisFile\.py" --check

[testenv:isort]
commands =
    isort . --check --diff

[testenv:flake8]
commands =
    flake8 src/my_app

[testenv:pylint]
commands =
    pylint src/my_app --recursive=true

This then runs under GitLab CI:

.gitlab-ci.yml
lint:combined:
  # ...
  script:
    - tox -e black,isort,flake8,pylint

Run all the tools §

With this configuration, I can manually run all the tools by:

  • Using pre-commit:
    • If I attempt to commit with git commit -m "blah", then pre-commit will run all the configured hooks. If any fail, the commit is aborted.
    • If I run poetry run pre-commit run, then pre-commit will run all the configured hooks without my having to make a commit.
  • Using Tox:
    • Since I have a Tox configuration for each tool, I have to do tox -e isort,black,flake8,pylint. Tox creates a separate virtualenv for each testenv, making this slower than the pre-commit approach.

This last point is particularly painful. Splitting lots of separate tools into separate Tox testenvs can make it quite tedious to run them. Perhaps it’s not the best approach?

Run a single tool §

With this configuration, I can run a single tool by:

  • Using pre-commit:
    • Using poetry run pre-commit run <id>, where <id> is defined in .pre-commit-config.yaml, I can run an individual tool, e.g. poetry run pre-commit run isort to just run isort.
    • By default it will only run on the staged files (i.e. those with changes added since the last commit, if any), so use --all-files (which can be abbreviated to --all) to force it to run against all files (if needed).
  • Using Tox:
    • I can run a specific tool by doing tox -e <testenv>, e.g. tox -e isort to just run isort.
    • Tox is unaware of which files have changed since the last commit, so it will always run against all files specified.
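
As a mental model of “run everything” versus “run one hook by id”, here is a small Python sketch. It is not pre-commit’s implementation: run_hooks and the HOOKS table are hypothetical, and the commands are stand-ins that simply exit with a fixed code instead of invoking the real tools. It also shows the effect of fail_fast, which stops at the first failing hook.

```python
import subprocess
import sys

# Hypothetical stand-ins for the real tools: each command just exits with a
# fixed code (0 = pass, non-zero = fail).
HOOKS = {
    "black": [sys.executable, "-c", "raise SystemExit(0)"],
    "isort": [sys.executable, "-c", "raise SystemExit(1)"],
    "flake8": [sys.executable, "-c", "raise SystemExit(0)"],
}


def run_hooks(hooks, only=None, fail_fast=True):
    """Run every hook in order, or just the one whose id matches `only`."""
    results = {}
    for hook_id, command in hooks.items():
        if only is not None and hook_id != only:
            continue
        passed = subprocess.run(command).returncode == 0
        results[hook_id] = passed
        if not passed and fail_fast:
            break  # Skip the remaining hooks after the first failure.
    return results


results_all = run_hooks(HOOKS)
results_single = run_hooks(HOOKS, only="flake8")
print(results_all)
print(results_single)
```

Running everything stops after the failing isort stand-in, so flake8 never runs. Selecting flake8 by id runs only that hook, which is the quick feedback loop I was after.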

Improving the Tox approach §

A real pain point is that running tox -e black,isort,flake8,pylint is slow, because it uses a separate Python virtual environment for each testenv. It would be helpful to reuse testenvs, or to have a more dynamic way to specify which command(s) to run in the testenv. Though perhaps that is going against the Tox approach? I think so, as we will find out next…

Approach 1: Reuse testenv §

It would speed up Tox if I could reuse the same testenv for all the tools.

This would enable me to either run all linting stages back-to-back using the same testenv:

$ tox -e isort,black,flake8,pylint

Or to run them individually, for example, if I just want to test I’ve fixed some flake8 linting issue:

$ tox -e flake8

Unfortunately, this does not appear to be possible in Tox 4 without using an experimental plugin.

In Tox 3 this was possible. However, as of Tox 4 it is no longer supported, and the developers do not intend to add support for it.

The Tox 4 plugin tox-ignore-env-name-mismatch claims to make this possible again, but I have not tried it since I do not want to rely upon a tool marked as experimental.

Approach 2: Duplicate the testenv §

Since reusing testenvs is not looking so good, an alternative approach is to create a new testenv that contains a copy of all the commands:

tox.ini
# Commands for running tools individually.
[testenv:black]
commands =
    black setup.py conftest.py docs/ src/ tests/ --exclude="IgnoreThisFile\.py" --check

[testenv:isort]
commands =
    isort . --check --diff

[testenv:flake8]
commands =
    flake8 src/my_app

[testenv:pylint]
commands =
    pylint src/my_app --recursive=true

# Command for running all tools.
[testenv:combined]
commands =
    black setup.py conftest.py docs/ src/ tests/ --exclude="IgnoreThisFile\.py" --check
    isort . --check --diff
    flake8 src/my_app
    pylint src/my_app --recursive=true

Then update our GitLab CI to:

.gitlab-ci.yml
lint:combined:
  # ...
  script:
    - tox -e combined

This is not great. It goes against DRY, and adds yet another testenv (which means another Python virtualenv has to be created). On the plus side, running tox -e combined is faster than tox -e isort,black,flake8,pylint.

Approach 3: Just use pre-commit §

It seems redundant to define entry points for the linting tools in both pre-commit and Tox. Since there is no nice way to achieve the flexibility I need within Tox, and everything I need can be done with pre-commit, I will just use pre-commit for these tools and remove them from Tox. This means CI will also use pre-commit to run the tools, which may seem a little unusual, but in practice works well. In fact, the pre-commit website has a section on using pre-commit in CI:

pre-commit can also be used as a tool for continuous integration. For instance, adding pre-commit run --all-files as a CI step will ensure everything stays in tip-top shape. To check only files which have changed, which may be faster, use something like pre-commit run --from-ref origin/HEAD --to-ref HEAD

Replacing Tox with pre-commit §

In these steps I’ll be replacing Tox with pre-commit for the linting tools, for the reasons covered earlier. Tox will continue to be used for running unit tests.

Update the pre-commit config §

The following is the new pre-commit config. The main change is that it now includes flake8 and pylint. Since not all of src/ passes linting, the files option is used to specify that only src/my_app should be linted. This is equivalent to how, in tox.ini, we were passing specific file paths into flake8 and pylint. Remember that pre-commit is a bit smarter than Tox, in that by default it automatically passes in just the staged files (i.e. those with changes added since the last commit), so we have to use pre-commit’s own configuration to add any additional filtering.

.pre-commit-config.yaml
fail_fast: true
repos:
- repo: local
  hooks:
    - id: black
      name: black
      entry: poetry run black
      language: system
      types: [file, python]
    - id: isort
      name: isort
      entry: poetry run isort
      language: system
      types: [file, python]
    - id: flake8
      name: flake8
      entry: poetry run flake8
      language: system
      types: [file, python]
      files: 'src/my_app'
      exclude: 'IgnoreThisFile\.py'
    - id: pylint
      name: pylint
      entry: poetry run pylint
      language: system
      types: [file, python]
      files: 'src/my_app'
      exclude: 'IgnoreThisFile\.py'
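
As a rough illustration of the filtering used for flake8 and pylint, the following Python sketch mimics how the files and exclude patterns narrow down a list of candidate filenames. It assumes that both patterns are regular expressions searched for anywhere in the repository-relative path (rather than anchored at the start), which is my understanding of pre-commit’s behaviour. select_files and the example paths are hypothetical, not part of pre-commit.

```python
import re


def select_files(paths, files=r"", exclude=r"^$"):
    """Keep paths that match `files` and do not match `exclude`.

    Assumption: both patterns are searched for anywhere in the path.
    """
    include_re = re.compile(files)
    exclude_re = re.compile(exclude)
    return [
        path
        for path in paths
        if include_re.search(path) and not exclude_re.search(path)
    ]


# Hypothetical set of files that pre-commit would otherwise pass to a hook.
candidates = [
    "src/my_app/main.py",
    "src/my_app/IgnoreThisFile.py",
    "src/other_pkg/util.py",
    "tests/test_main.py",
]

selected = select_files(
    candidates, files=r"src/my_app", exclude=r"IgnoreThisFile\.py"
)
print(selected)
```

Under that assumption, an unanchored pattern such as src/my_app would also match a path like vendor/src/my_app/x.py. If that ever matters, a stricter pattern such as ^src/my_app/ would restrict the match to the top-level directory.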

Update CI §

We can then update our GitLab CI to run via pre-commit instead of Tox:

.gitlab-ci.yml
lint:combined:
  # ...
  script:
    - poetry install --with dev
    - poetry run pre-commit run --all

In the GitLab pipeline, the log looks something like:

$ poetry run pre-commit run --all
black....................................................................Passed
isort....................................................................Passed
flake8...................................................................Passed
pylint...................................................................Passed
Cleaning up project directory and file based variables
Job succeeded

Let’s give this a try and see how it works out.

Future work §

I’ve heard good things about ruff. It has “Drop-in parity with Flake8, isort and Black”, so it should be able to replace several of the disparate tools I use with just one tool. It looks worth a try.