Discover more from Deliberate Machine Learning
Recursions and functools.cache: 12000x improvement!!!
We often face problems that are difficult (or impossible) to express in closed form but admit a relatively straightforward recursive solution.
The problem with recursions
The problem is that these recursions rarely take into account the realities of practical computing. Unless you are a Haskell wiz, good luck solving anything in a short time.
But this is where memoisation helps. On each call, we check a cache that stores argument-result pairs from all previous calls to the function. If the arguments are already there, we return the stored value instantly instead of descending into the recursive branch, so everything only needs to be calculated once.
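A hand-rolled version might look like this (fib here is just a stand-in for any expensive pure recursive function):

```python
# Hand-rolled memoisation: a module-level dict keyed by the argument.
_memo = {}

def fib(n):
    if n in _memo:
        return _memo[n]  # cache hit: return the stored result immediately
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    _memo[n] = result    # cache miss: compute once, then store
    return result

print(fib(90))  # fast, even though the naive recursion is exponential
```

The dict plays the role of the cache: without it, fib(90) makes an astronomical number of calls; with it, each value is computed exactly once.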
The problem with memoisation
Memoisation is fiddly to implement. But here comes functools to the rescue.
Just add the @cache decorator to the function, and you are done.
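A minimal sketch (again with fib standing in for the real workload):

```python
from functools import cache

@cache
def fib(n):
    # Naively exponential without the decorator; linear with it.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))  # returns instantly
```

Note that @cache was added in Python 3.9; on older versions, @lru_cache(maxsize=None) is the equivalent.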
A test problem: Partition function
The example below calculates the partition number of an integer. A partition breaks a number down into non-increasing positive parts.
For example, a partition of 11 is (4, 3, 2, 1, 1) because 11 = 4 + 3 + 2 + 1 + 1.
The partition number is the total number of different ways this can be done for a number. For 11, this is 56.
There is no closed-form solution for the problem, but the recursion is relatively straightforward (see the comments for a short video deriving it).
Of course, sympy implemented it, so I used a simple test to ensure I didn't make an error.
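The original listing is not reproduced here, but a sketch along those lines might look like this, where part(n, k) counts the partitions of n into exactly k parts via the recurrence p(n, k) = p(n - 1, k - 1) + p(n - k, k); the function names are mine:

```python
from functools import cache

@cache
def part(n, k):
    """Number of partitions of n into exactly k parts."""
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    # Either some part equals 1 (drop it: n-1 into k-1 parts),
    # or every part is >= 2 (subtract 1 from each: n-k into k parts).
    return part(n - 1, k - 1) + part(n - k, k)

def partition_number(n):
    """Total number of partitions of n."""
    return sum(part(n, k) for k in range(1, n + 1))

print(partition_number(11))  # 56, matching the example above

# Cross-check against sympy, if it is installed:
try:
    from sympy import npartitions
    assert all(partition_number(n) == npartitions(n) for n in range(1, 60))
except ImportError:
    pass
```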
The decorator wraps the original function into a class that stores the cache. You can get information and clear the cache as shown below.
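Sketched on a small cached function of my own:

```python
from functools import cache

@cache
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(10)
print(fib.cache_info())   # CacheInfo(hits=8, misses=11, maxsize=None, currsize=11)

fib.cache_clear()         # empty the cache explicitly
print(fib.cache_info())   # CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
```

maxsize=None appears because @cache is simply an unbounded lru_cache.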
With the cache, the function runs in trivial time; without it, it is infeasibly slow, even for relatively small numbers.
Comparing cached vs non-cached version
To evaluate the benefits, we need a non-cached version:
Instead of the cache, we increment a counter every time we call the function. Comparing this counter with the number of hits and misses in CacheInfo gives us an idea of how much work caching saves.
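A sketch of the uncached counterpart, assuming the same p(n, k) recurrence as before (the counter is a plain global, and the names are mine):

```python
calls = 0  # incremented on every invocation, the analogue of hits + misses

def part_plain(n, k):
    """Partitions of n into exactly k parts, no cache."""
    global calls
    calls += 1
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return part_plain(n - 1, k - 1) + part_plain(n - k, k)

def partition_number_plain(n):
    return sum(part_plain(n, k) for k in range(1, n + 1))

print(partition_number_plain(30), calls)  # 5604 partitions, far more calls
```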
Let’s do a test at n=70.
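A harness along these lines reproduces the comparison; I show it at n=50 so the uncached run finishes quickly (the numbers quoted below are from n=70):

```python
import time
from functools import cache

# Cached and plain versions of the same p(n, k) recurrence (names are mine).
@cache
def part(n, k):
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return part(n - 1, k - 1) + part(n - k, k)

def part_plain(n, k):
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return part_plain(n - 1, k - 1) + part_plain(n - k, k)

def bench(f, n):
    start = time.perf_counter()
    result = sum(f(n, k) for k in range(1, n + 1))
    return result, time.perf_counter() - start

N = 50
part.cache_clear()                 # time a cold cache
cached, t_cached = bench(part, N)
plain, t_plain = bench(part_plain, N)
assert cached == plain
print(f"cached:   {t_cached:.4f}s")
print(f"uncached: {t_plain:.4f}s")
print(part.cache_info())
```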
As you can see, the cached version has about 5000 hits+misses, and the cache size is 3710. This makes sense: the function is computed at most once per (n, k) combination, so the upper limit is about 70 × 70 = 4900.
The cached version returned nearly instantly; the non-cached one took about 11 seconds. No wonder: it needed to churn through 61 million function calls! (I chose n=70 because slightly larger values simply hung my computer.)
The improvement is 12000x!!! Talking about improvements…
The cache doesn’t clear between function runs (hence the explicit cache_clear() call), so the speedup can be even higher if the function is called multiple times.
Recurrence for partitions into k parts (visual proof):
Functools source code:
Sympy code from the "On-Line Encyclopedia of Integer Sequences":