After long hesitation, I switched to Ruff in a new project, and I realised I should have done that long ago.
Why do we use automated code formatters and linters? How do we do it exactly?
Black and code formatting
We all know that code readability is supreme. You read code about ten times more often than you write it, so any friction in reading it is a waste of your time.
Code can be written in many different styles, and there is an ongoing debate about which one is best.
Black solved this by enforcing a single uncompromising standard. We all know the famous quote:
“Customers can have any colour they want, as long as it is black.” - Henry Ford
You simply have no options. Any time you save your code, black formats it to this standard.
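To make the effect concrete, here is a small sketch. The function names are made up for illustration, and the second version is roughly what Black produces from the first with its default settings (by default it also normalises string quotes to double quotes).

```python
# Hand-written in an inconsistent style: valid Python, but harder to scan.
def area_messy( w,h ) :
    return {'width':w,'height':h,'area':w*h}


# Roughly what Black's default style turns it into (hypothetical example).
def area_formatted(w, h):
    return {"width": w, "height": h, "area": w * h}


# Formatting changes only the layout, never the behaviour.
print(area_messy(2, 3) == area_formatted(2, 3))  # True
```

Both functions return the same dictionary; only the second one looks the same no matter who wrote it.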
Linting
But black doesn’t do linting. Linting is identifying (and eliminating) programming techniques that are valid but should be avoided.
A typical example is zip() without the strict=True option. If you don’t specify the parameter and call zip() with two lists of different lengths, it silently throws away the excess elements. Since this is almost always a hidden bug, it is better to force the programmer to make the decision explicitly, or to fail when zip() is called with arguments of different lengths.
Linting catches this and many similar situations. Ruff covers both jobs: it formats code in a Black-compatible style and lints it as well.
Setup
VSCode
Add the following part to your `.vscode/settings.json`:
"editor.formatOnSave": true,
"[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.codeActionsOnSave": {
        "source.organizeImports": "explicit"
    },
    "editor.insertSpaces": true,
    "editor.tabSize": 4
},
Any time you save (Cmd-S on macOS), VSCode will reformat the code with Ruff. The Ruff extension will also underline code that violates a lint rule.
pyproject.toml
[tool.ruff]
line-length = 120
exclude = ["excluded_file.py"]
lint.select = [
    "E", # pycodestyle errors (settings from FastAPI, thanks, @tiangolo!)
    "W", # pycodestyle warnings
    "F", # pyflakes
    "I", # isort
    "C", # flake8-comprehensions (C4) and mccabe (C90)
    "B", # flake8-bugbear
]
lint.ignore = [
    "E501", # line too long
    "C901", # too complex
]

[tool.ruff.format]
quote-style = "preserve"

[tool.ruff.lint.isort]
order-by-type = true
relative-imports-order = "closest-to-furthest"
extra-standard-library = ["typing"]
section-order = ["future", "standard-library", "third-party", "first-party", "local-folder"]
known-first-party = []
This sets the line length to 120, keeps whichever quote style (single or double) you already use, and gives you sensible linting and import-sorting rules.
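To give a feel for what these rule families catch, here is a small sketch of one flake8-bugbear finding. The function names are hypothetical; with "B" selected, Ruff should report the first definition as B006 (mutable default argument).

```python
# Flagged by B006: the default list is created once and shared between calls.
def add_tag_buggy(tag, tags=[]):
    tags.append(tag)
    return tags


# The usual fix: default to None and create a fresh list inside the function.
def add_tag_fixed(tag, tags=None):
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


add_tag_buggy("a")
print(add_tag_buggy("b"))  # ['a', 'b'] - the first call leaked into the second
add_tag_fixed("a")
print(add_tag_fixed("b"))  # ['b']
```

The buggy version is perfectly valid Python, which is exactly why a linter is needed to point it out.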
Pre-commit-hooks
Add these under the `repos:` key of your `.pre-commit-config.yaml`:
- repo: local
  hooks:
    - id: ruff-format
      name: ruff-format
      entry: poetry run ruff format
      language: system
      types: [python]
    - id: ruff
      name: ruff
      entry: poetry run ruff check --fix
      language: system
      types: [python]
Whenever you create a commit in git, the hooks format the staged Python files and fix every lint error that Ruff can fix automatically. So you don’t have to worry; your repo will always have a consistent style.
GitHub Actions
Add the following step to `.github/workflows/python-app.yml`:
- name: Lint with ruff
  run: |
    poetry run ruff check .
Whenever you create a PR (assuming the workflow is triggered on pull requests), Ruff will double-check all the code, not just the part you modified.
That’s it!
The above are just a few techniques leading to an efficient workflow. If you are interested in learning more about setting these up, check out my interactive “Python Project Essentials” course on the MLOps Community’s learning platform:
https://learn.mlops.community/courses/languages/project-essentials/
See you there!