Ruff - the new king of Python code formatting

2024-07-19

Jul 18, 2024

After long hesitation, I switched to Ruff in a new project, and I realised I should have done that long ago.

Why do we use automated code formatters and linters? How do we do it exactly?

Black and code formatting

We all know that readability matters most. Code is read far more often than it is written (a common rule of thumb is ten times more), so any friction in reading it wastes your time.

Code can be written in many different styles, and there is an endless debate about which one is best.

Black solved this by enforcing a single uncompromising standard. We all know the famous quote:

“Customers can have any colour they want, as long as it is black.” - Henry Ford

You have next to no options. Any time you save your code, Black formats it to this standard.
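As a small illustration, here is a sketch of what that looks like in practice. The function names `greet_before` and `greet_after` are made up for this example; they differ only so that both versions can sit in one file. The second version shows the layout I would expect from Black with its default settings.

```python
# Before: valid Python, but spacing and quotes are inconsistent.
def greet_before( name,greeting = 'Hello' ):
    return  greeting+', '+name


# After: the same function laid out in Black's default style.
def greet_after(name, greeting="Hello"):
    return greeting + ", " + name


# Formatting changes only the layout, never the behaviour.
print(greet_before("Ada") == greet_after("Ada"))
```

Both functions return the same string; the only difference is how much effort it takes to read them.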

Linting

But Black doesn’t do linting. Linting means identifying (and ideally eliminating) code that is valid but should be avoided.

A typical example is calling zip() without the strict parameter (available since Python 3.10). If you pass two lists of different lengths and don’t set strict=True, zip() silently throws away the excess elements. This is almost always a hidden bug, so it is better to force the programmer to make the decision explicitly, or to fail when zip() is called with inputs of different lengths.

This and many other situations are checked with linting.

Setup

VSCode

Add the following to your `.vscode/settings.json`:

    "editor.formatOnSave": true,
    "[python]": {
        "editor.defaultFormatter": "charliermarsh.ruff",
        "editor.codeActionsOnSave": {
            "source.organizeImports": "explicit"
        },
        "editor.insertSpaces": true,
        "editor.tabSize": 4
    },

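This assumes the Ruff extension (charliermarsh.ruff) is installed. Any time you press Cmd-S (Ctrl-S on Windows and Linux), VSCode will reformat the code with Ruff and organise the imports. It will also underline code that violates a lint rule.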

pyproject.toml

[tool.ruff]
line-length = 120

exclude = ["excluded_file.py"]
lint.select = [
    "E",  # pycodestyle errors (settings from FastAPI, thanks, @tiangolo!)
    "W",  # pycodestyle warnings
    "F",  # pyflakes
    "I",  # isort
    "C",  # flake8-comprehensions (C4) and mccabe (C90)
    "B",  # flake8-bugbear
]
lint.ignore = [
    "E501",  # line too long
    "C901",  # too complex
]

[tool.ruff.format]
quote-style = "preserve"

[tool.ruff.lint.isort]
order-by-type = true
relative-imports-order = "closest-to-furthest"
extra-standard-library = ["typing"]
section-order = ["future", "standard-library", "third-party", "first-party", "local-folder"]
known-first-party = []

This sets the line length the formatter aims for to 120 (E501 is ignored, so lines that still end up longer are not reported), keeps your existing single or double quotes, and gives you sensible linting and import-sorting rules.
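To make one of the selected rule families concrete, here is a sketch of the kind of bug the "B" (flake8-bugbear) rules are meant to catch: a mutable default argument, reported as B006. The function names `append_bad` and `append_good` are invented for this example.

```python
def append_bad(item, bucket=[]):  # B006: the default list is created once and shared
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Create a fresh list on every call unless the caller passes one in.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad("a")
second = append_bad("b")
print(first is second)  # both calls returned the very same list
print(append_good("a"), append_good("b"))
```

The code runs without any error, which is exactly why a linter is useful here: the problem only shows up later as data leaking between calls.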

Pre-commit hooks

Add these to the `repos:` list in your `.pre-commit-config.yaml`:

  - repo: local
    hooks:
      # Lint fixes run first, because they can leave code that still needs formatting.
      - id: ruff
        name: ruff
        entry: poetry run ruff check --fix
        language: system
        types: [python]
      - id: ruff-format
        name: ruff-format
        entry: poetry run ruff format
        language: system
        types: [python]

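Once the hooks are installed with `pre-commit install`, every git commit will format the staged Python files and fix any auto-fixable linting errors in them. If a hook changes a file, the commit is aborted so you can review and re-stage the changes. So you don’t have to worry; your repo will always have a consistent style.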

GitHub Actions

Add the following step to `.github/workflows/python-app.yml`:

    - name: Lint with ruff
      run: |
        poetry run ruff check .

If the workflow is triggered on pull requests, Ruff will run on every new PR and double-check all the code in the repository, not just the part you modified.

That’s it!

The above are just a few techniques that lead to an efficient workflow. If you are interested in learning more about setting these up, check out my interactive “Python Project Essentials” course on the MLOps Community’s learning platform:

https://learn.mlops.community/courses/languages/project-essentials/

See you there!
