SOFTWARE ENGINEERING

Formatting code automatically

Improve your Python code quality

5 min read · May 20, 2022

Every software engineer has a personal style when writing code, and that is not going to change. However, there are some simple guidelines that we can follow, which can benefit all of us. After all, code formatting is often your reader’s first encounter with your system.

Why even bother?

“If my code works, it works! And that’s enough” 🤦‍♀️ Nope, this is wrong. Formatting Python code the right way is essential for the following reasons:

Readability: most probably, you will not be the only developer reading your code or working on your system. A well-formatted and organised script is easier to read, understand and maintain, which also makes bugs easier to spot.
Consistency: everyone has their own personal style. Following shared formatting rules when coding in Python allows the team to focus on the actual code instead of dealing with conflicts about formatting.

📦 Which packages?

black

black is a Python code formatter. It should not change the behavior of your code, because after reformatting it checks that the new code produces an abstract syntax tree (AST) equivalent to the original one.
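
To make the idea of a behavior-preserving reformat concrete, here is a minimal sketch that compares the syntax trees of a messy snippet and a tidy one using only the standard library. The two snippets and the helper name same_ast are made up for illustration, and this is not how black is implemented internally; it only shows that layout changes leave the parsed program untouched.

```python
import ast

# Hypothetical snippets: the same program before and after tidying.
MESSY = "x=1\ny = [1,2,\n     3]\nprint( x,y )\n"
TIDY = "x = 1\ny = [1, 2, 3]\nprint(x, y)\n"


def same_ast(source_a: str, source_b: str) -> bool:
    """Return True when both sources parse to the same syntax tree.

    ast.dump omits line and column numbers by default, so pure layout
    differences do not show up in the comparison.
    """
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


if __name__ == "__main__":
    print(same_ast(MESSY, TIDY))       # layout-only change
    print(same_ast(MESSY, "x = 2\n"))  # a real change in the program
```

A comparison like this treats spacing and line breaks as irrelevant, while any change to names, values or statements shows up as a difference.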

flake8

flake8 is a great toolkit for checking your code base against coding style (PEP 8) and programming errors (like “library imported but unused” and “undefined name”).

flake8 also has some cool features. You can ignore specific errors on a line by appending # noqa: <error>, e.g., # noqa: E234. Multiple codes can be given, separated by commas. The noqa token is case insensitive, and the colon before the list of codes is required; otherwise the part after noqa is ignored.
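
As a small illustration of the noqa comment, the sketch below silences one specific code on each of two lines. The function name parse_settings and the URL are placeholders invented for this example; F401 (imported but unused) and E501 (line too long) are standard flake8 codes.

```python
import json

# The import below is unused on purpose; only F401 is silenced for that line.
import os  # noqa: F401


def parse_settings(text: str) -> dict:
    """Parse a JSON string into a dictionary."""
    return json.loads(text)


# A long literal that cannot be split sensibly; only E501 is silenced here.
LONG_URL = "https://example.com/a/deliberately/long/path/that/goes/past/the/default/line/length/limit"  # noqa: E501
```

Listing the code explicitly keeps every other check active on that line, which is usually safer than a bare # noqa.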

Does black replace flake8?

Not at all. black only reformats your code; you still need to run flake8 to check for the various issues that black does not cover.

isort

isort is a Python library that sorts imports alphabetically and automatically separates them into sections and by type.
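
For a rough picture of the result, the sketch below shows an import block laid out the way isort's default settings are expected to arrange it. Only standard-library modules are used so that the snippet runs anywhere, and the function describe is a hypothetical helper that exists only to make use of the imports.

```python
# Standard-library section: plain "import x" lines first, then
# "from x import y" lines, each group in alphabetical order.
import json
import os
from pathlib import Path

# Third-party packages would form the next section, followed by the
# project's own modules, with a blank line between sections.


def describe(path: str) -> str:
    """Return a small JSON description of a path."""
    return json.dumps({"name": Path(path).name, "separator": os.sep})
```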

💻 How?

Who doesn’t love clean and well-ordered codebases? I know I do. I love them even more if I don’t have to spend hours tidying them up by hand. But not everyone shares the same passion. That is ok though, since all the magic is going to happen automatically.

The ideal moment to introduce proper formatting checks is before committing our changes to our working branch. This is where pre-commit comes into the picture. It is known for its simple setup and easy configuration. Let’s see how we can set it up 👀

I really recommend creating a requirements-dev.txt file in your repository, which will look like this:

-r requirements.txt
black==22.3.0
flake8==4.0.1
isort==5.10.1
pre-commit

This file is very similar to the well-known requirements.txt, but it is only used during project development. Thus, in your Dockerfile you will only include requirements.txt and not requirements-dev.txt.

📝: you can use any package version you wish; what I’ve written is only a suggestion.

The next step is to create a virtual environment. If you do not know how, read this.

Install the requirements by running this in your terminal:

pip install -r requirements-dev.txt

And now it is time to create the .pre-commit-config.yaml file. This contains the pre-commit hooks you want to run every time before you commit. An example file looks like this:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-added-large-files # prevents giant files from being committed
      - id: check-merge-conflict # checks for files that contain merge conflict strings
      - id: check-yaml # checks yaml files for parseable syntax
      - id: detect-private-key # detects the presence of private keys
      - id: end-of-file-fixer # ensures that a file is either empty, or ends with one newline
      - id: mixed-line-ending # replaces or checks mixed line ending
      - id: requirements-txt-fixer # sorts entries in requirements.txt
      - id: trailing-whitespace # trims trailing whitespace
      - id: pretty-format-json # checks that all your JSON files are pretty

  - repo: https://github.com/psf/black
    rev: 22.3.0 # be careful to use the same version as in the requirements-dev.txt
    hooks:
      - id: black
        language_version: python3

  - repo: https://github.com/PyCQA/flake8
    rev: 4.0.1
    hooks:
      - id: flake8
        language_version: python3

  - repo: https://github.com/PyCQA/isort
    rev: 5.10.1
    hooks:
      - id: isort
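
Since the versions in requirements-dev.txt and the rev values in the config are meant to stay in sync, a small check can catch drift. The sketch below is a hypothetical helper, not part of pre-commit: the function names are invented, it scans the files line by line with regular expressions instead of a real YAML parser, and it assumes each repository URL ends with the package name.

```python
import re


def pinned_versions(requirements_text: str) -> dict:
    """Map package name to version for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_.-]+)==([^\s#]+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def hook_revisions(config_text: str) -> dict:
    """Map the last URL segment of each repo to its rev (leading 'v' dropped)."""
    revs = {}
    repo = None
    for line in config_text.splitlines():
        repo_match = re.match(r"^\s*-\s*repo:\s*(\S+)", line)
        rev_match = re.match(r"^\s*rev:\s*v?([^\s#]+)", line)
        if repo_match:
            repo = repo_match.group(1).rstrip("/").rsplit("/", 1)[-1].lower()
        elif rev_match and repo:
            revs[repo] = rev_match.group(1)
    return revs


def mismatches(requirements_text: str, config_text: str) -> dict:
    """Return {package: (pinned version, hook rev)} where the two differ."""
    pins = pinned_versions(requirements_text)
    revs = hook_revisions(config_text)
    return {
        name: (pins[name], revs[name])
        for name in pins
        if name in revs and pins[name] != revs[name]
    }
```

Called with the contents of the two files, mismatches returns an empty dictionary when the pins agree, so it could be run as a quick sanity check whenever a tool is upgraded.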

The last step of our setup is to activate pre-commit in the repository so that the checks run before each commit. You can simply run the following in your terminal:

pre-commit install

Let’s look at an example. Imagine we have the following script that performs PCA on some dummy data. Nothing is wrong with this file, meaning that the code will run and you will get the result. But is it pretty? Can you quickly tell me what is happening in this code if you spend only 5 seconds on it?

Extra lines for no reason, no spaces around the = operator, matplotlib imported although it is not used, etc.

If we did not have pre-commit configured then we would have to manually make the changes. Who has time for that?? So, let’s see what will happen if I git commit this file.

The file now looks like this:

and I also got this error:

src/dummy.py:11:1: F401 'matplotlib.pyplot as plt' imported but unused
src/dummy.py:11:1: E402 module level import not at top of file

Since flake8 found some errors, the file cannot be committed until we fix them.

After removing the unnecessary import and running git commit once again:

the file looks perfect!

So, is there a way to be a lazy developer and still have a clean codebase?

Oh yes! The answer is pre-commit😉