Trying to understand what is refactoring, follow the guide from Real Python.

1. Functions That Should Be Objects

Without reading the code, you will not know if it will modify the original image or create a new image.

def load_image(path):
    with open(path, "rb") as file:
        fb = file.load()
    image = img_lib.parse(fb)
    return image

def crop_image(image, width, height):
    return image

def get_image_thumbnail(image, resolution=100):
    return image

To call the codes:

from imagelib import load_image, crop_image, get_image_thumbnail

image = load_image('~/face.jpg')
image = crop_image(image, 400, 500)
thumb = get_image_thumbnail(image)

Symptoms of code using functions that could be refactored into classes:

  • Similar arguments across functions
  • Higher number of Halstead h2 unique operands (All the variables and constants are considered operands)
  • Mix of mutable and immutable functions
  • Functions spread across multiple Python files

Here is a refactored version of those 3 functions, where the following happens:

  • .__init__() replaces load_image().
  • crop() becomes a class method.
  • get_image_thumbnail() becomes a property.

The thumbnail resolution has become a class property, so it can be changed globally or on that particular instance:

class Image(object):
    thumbnail_resolution = 100
    def __init__(self, path):

    def crop(self, width, height):

    def thumbnail(self):
        return thumb refactored

This is how the refactored example would look:

from imagelib import Image

image = Image('~/face.jpg')
image.crop(400, 500)
thumb = image.thumbnail
usage refactored

In the resulting code, we have solved the original problems:

  • It is clear that thumbnail returns a thumbnail since it is a property, and that it doesn’t modify the instance.
  • The code no longer requires creating new variables for the crop operation.

2. Objects That Should Be Functions

Here are some tell-tale signs of incorrect use of classes:

  • Classes with 1 method (other than .__init__())
  • Classes that contain only static methods
class Authenticator(object):
    def __init__(self, username, password):
        self.username = username
        self.password = password

    def authenticate(self):
        return result
authentication class

It would make more sense to just have a simple function named authenticate() that takes username and password as arguments:

def authenticate(username, password):
    return result

3. Converting “Triangular” Code to Flat Code

These are the symptoms of highly nested code:

  • A high cyclomatic complexity because of the number of code branches
  • A low Maintainability Index because of the high cyclomatic complexity relative to the number of lines of code
def contains_errors(data):
    if isinstance(data, list):
        for item in data:
            if isinstance(item, str):
                if item == "error":
                    return True
    return False

Refactor this function by “returning early”

def contains_errors(data):
    if not isinstance(data, list):
    	return False
	return data.count("error") > 0
function refactored

Another technique to reduce nesting by list comprehension

Common practise to create list, loop through and check for criteria.

results = []
for item in iterable:
    if item == match:

Replace with:

results = [item for item in iterable if item == match]

4. Handling Complex Dictionaries With Query Tools

It does have one major side-effect: when dictionaries are highly nested, the code that queries them becomes nested too.

data = {
 "network": {
  "lines": [
     "name.en": "Ginza",
     "": "銀座線",
     "color": "orange",
     "number": 3,
     "sign": "G"
     "name.en": "Marunouchi",
     "": "丸ノ内線",
     "color": "red",
     "number": 4,
     "sign": "M"

If you wanted to get the line that matched a certain number, this could be achieved in a small function:

def find_line_by_number(data, number):
    matches = [line for line in data if line['number'] == number]
    if len(matches) > 0:
        return matches[0]
        raise ValueError(f"Line {number} does not exist.")
find by line
find_line_by_number(data["network"]["lines"], 3)

There are third party tools for querying dictionaries in Python. Some of the most popular are JMESPath, glom, asq, and flupy.

import jmespath"network.lines", data)
[{'name.en': 'Ginza', '': '銀座線',
  'color': 'orange', 'number': 3, 'sign': 'G'}, 
 {'name.en': 'Marunouchi', '': '丸ノ内線',
  'color': 'red', 'number': 4, 'sign': 'M'}]

If you wanted to get the line number for every line, you could do this:

>>>"network.lines[*].number", data)
[3, 4]

You can provide more complex queries, like a == or <. The syntax is a little unusual for Python developers, so keep the documentation handy for reference.

Find the line with the number 3

>>>"network.lines[?number==`3`]", data)
[{'name.en': 'Ginza', '': '銀座線', 'color': 'orange', 'number': 3, 'sign': 'G'}]