Mini-Post: Exploring Python Functions Through Experimentation

Read the docs. Look at the source code. Yes, these are great tips to figure out what’s happening underneath the hood. But sometimes you want to do a bit of experimentation and exploration of your own first.

It’s more fun this way. And sometimes more instructive, too.

Here’s something I came across today which I decided to explore. I was using random.choice() and dictionaries. At one point I wrote the following by mistake:

import random

points = {
    "Stephen": 19,
    "Mary": 24,
    "James": 44,
}

print(random.choice(points))

You can’t use dictionaries as arguments for random.choice() since the function needs a sequence. How silly of me. I actually knew that already.

But, had I not made that mistake, I would have denied myself a fun and instructive half an hour of exploration and experimentation.

Here’s the error when you run this code:

Traceback (most recent call last):
  ...
KeyError: 0

The KeyError piqued my interest. You get that error when you try to access a key that doesn’t exist in a dictionary. I ran the code again and I got the same error (of course I did) but with a different value. Here are 10 errors I got from 10 successive runs of this script:

KeyError: 0
KeyError: 0
KeyError: 2
KeyError: 0
KeyError: 1
KeyError: 0
KeyError: 0
KeyError: 2
KeyError: 1
KeyError: 1

Let’s look at the clues. I know that random.choice() should pick a random item from a sequence. Therefore, the random nature of the numbers I got didn’t surprise me.

I also know I have three items in my dictionary, and all the KeyError values are in the range from 0 to 2.

This made me assume that random.choice() is trying to access a value using an index chosen at random.

I could have gone straight to the source code to read what’s inside random.choice(). But instead, I decided to explore this Python function myself first. Experimentation is fun!

Let’s Create A Test Class To Experiment

If random.choice() is trying to access a value based on the position in the sequence, it’s likely to be using indexing. This would explain the error: if it’s trying to index the dictionary using points[1], say, this raises a KeyError since 1 is not a key in the dictionary.
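Here’s a quick sketch of that reasoning. The numbered dictionary is a made-up example, not part of the original script: its keys happen to be 0, 1, and 2, so indexing by position and looking up by key coincide.

```python
import random

points = {"Stephen": 19, "Mary": 24, "James": 44}

# Square brackets on a dictionary look up a key, not a position
try:
    points[1]
except KeyError as error:
    message = f"KeyError: {error}"

print(message)  # KeyError: 1

# Hypothetical dictionary whose keys happen to match valid indices
numbered = {0: "Stephen", 1: "Mary", 2: "James"}
print(random.choice(numbered))  # one of the three names, no error
```

The second call only works by coincidence of the keys, so it’s not something to rely on, but it supports the idea that an integer index is being used.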

Let’s test this with a Test class:

import random

class Test:
    def __getitem__(self, item):
        print("This is the __getitem__() method")

dummy_item = Test()
print(dummy_item[0])

The class Test only has the __getitem__() method defined. This is the method that’s called when you try to get an item using the square brackets. If the object is a sequence, then this refers to indexing. But the method is used for non-sequences too, such as dictionaries.

In the last two lines, you create an instance of Test and you use square brackets to confirm that this calls __getitem__(). Here’s the output:

This is the __getitem__() method
None

The line showing us that __getitem__() was called is printed out. You also get None printed out since there is no explicit return statement in __getitem__(), so it returns None.
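As a small aside, you can see exactly what __getitem__() receives by returning the argument unchanged. The Echo class below is a hypothetical helper written just for this illustration:

```python
class Echo:
    """Hypothetical helper: returns whatever is placed inside the square brackets."""

    def __getitem__(self, item):
        return item


echo = Echo()
print(echo[0])       # 0
print(echo["Mary"])  # Mary
print(echo[1:3])     # slice(1, 3, None)
```

The same method handles an integer index, a dictionary-style key, and a slice. It’s up to the class to decide what each one means.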

Let’s try passing an instance of Test to random.choice() now:

import random

class Test:
    def __getitem__(self, item):
        print("This is the __getitem__() method")

dummy_item = Test()
print(random.choice(dummy_item))

We still get an error. Here it is:

Traceback (most recent call last):
  ...
TypeError: object of type 'Test' has no len()

This gives us another clue. The object used in random.choice() must have a length defined by the len() function.

Indeed, this is the core of what makes an object a sequence: it must have both __getitem__() and __len__() defined. __getitem__() needs to support integer indices and __len__() needs to return the length of the sequence.
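One way to see how central these two methods are is through collections.abc.Sequence in the standard library, which treats __getitem__() and __len__() as the methods you must supply and builds the rest on top of them. The Scores class below is a hypothetical example, a sketch rather than anything random.choice() requires:

```python
from collections.abc import Sequence


class Scores(Sequence):
    """Hypothetical example class wrapping a list of scores."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)


scores = Scores([19, 24, 44])
print(len(scores))              # 3
print(scores[1])                # 24
print(24 in scores)             # True
print(scores.index(44))         # 2
print(list(reversed(scores)))   # [44, 24, 19]
```

Membership tests, index(), and reversed() were never written in Scores. They come from Sequence, which implements them using only indexing and the length.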

So, let’s add the __len__() method, too:

import random

class Test:
    def __getitem__(self, item):
        print("This is the __getitem__() method")

    def __len__(self):
        print("This is the __len__() method")

dummy_item = Test()
print(random.choice(dummy_item))

When you run this, you’ll get two outputs. The first is the string showing you that the __len__() method was called. The second ‘output’ is another error:

This is the __len__() method

Traceback (most recent call last):
  ...
TypeError: 'NoneType' object cannot be interpreted as an integer

Let’s put our detective hat on again.

random.choice() needs the __len__() method to be defined. It makes sense since it needs to choose a random index and this depends on how many items there are in the sequence. But the __len__() method you defined so far doesn’t have an explicit return statement. Therefore, it returns None. And None is not an integer.

random.choice() is trying to interpret the value returned by __len__() as an integer. Of course it is. A sequence should have a __len__() method which returns an integer, as stated in the definition of a sequence.
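You can check this requirement without random.choice() at all, since the built-in len() enforces it. The sketch below uses a hypothetical BadLength class whose __len__() returns whatever value you give it:

```python
class BadLength:
    """Hypothetical class whose __len__() returns a configurable value."""

    def __init__(self, value):
        self.value = value

    def __len__(self):
        return self.value


results = {}
for value in (None, 2.5, -1, 5):
    try:
        results[value] = len(BadLength(value))
    except (TypeError, ValueError) as error:
        # Record the type of error raised instead of the length
        results[value] = type(error).__name__

print(results)
# {None: 'TypeError', 2.5: 'TypeError', -1: 'ValueError', 5: 5}
```

Only the non-negative integer gets through. None and a float raise a TypeError, and a negative integer raises a ValueError.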

Let’s fix this by returning an integer:

import random

class Test:
    def __getitem__(self, item):
        print("This is the __getitem__() method")

    def __len__(self):
        print("This is the __len__() method")
        return 5

dummy_item = Test()
print(random.choice(dummy_item))

random.choice() no longer complains about not having an integer coming back from __len__() and you get the following output:

This is the __len__() method
This is the __getitem__() method
None

Both __getitem__() and __len__() were called as you can see from the strings printed out. You also have a None that’s returned by random.choice(). It’s likely that random.choice() is returning whatever __getitem__() returns. And at the moment, __getitem__() returns None.

Let’s test this:

import random

class Test:
    def __getitem__(self, item):
        print("This is the __getitem__() method")
        return "This is what's returned by __getitem__()"

    def __len__(self):
        print("This is the __len__() method")
        return 5

dummy_item = Test()
print(random.choice(dummy_item))

And the output shows that our final hypothesis is correct:

This is the __len__() method
This is the __getitem__() method
This is what's returned by __getitem__()

It seems as though random.choice() chooses a random index depending on the length of the sequence and then fetches that item using indexing.
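One more experiment can back this up. The Recorder class below is a hypothetical helper that stores every index it’s asked for, so you can inspect what random.choice() requests over many calls:

```python
import random


class Recorder:
    """Hypothetical helper that records every index passed to __getitem__()."""

    def __init__(self, length):
        self.length = length
        self.seen = []

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        self.seen.append(index)
        return index


recorder = Recorder(3)
for _ in range(1000):
    random.choice(recorder)

# Show the distinct indices that were requested
print(sorted(set(recorder.seen)))
```

With a length of 3, every recorded index should be 0, 1, or 2, which matches the KeyError values from the dictionary example.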

Now, it’s time to check this by looking at the docs for random.choice() and its source code.

# from https://github.com/python/cpython/blob/main/Lib/random.py

def choice(self, seq):
    """Choose a random element from a non-empty sequence."""
    if not seq:
        raise IndexError('Cannot choose from an empty sequence')
    return seq[self._randbelow(len(seq))]

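You can see that the sequence seq is being indexed using square brackets, which calls __getitem__(). The index comes from another method, self._randbelow(), which takes len(seq) as its argument and returns a random integer below that value. Note that the emptiness check, not seq, also falls back on __len__() when a class doesn’t define __bool__(), so depending on your Python version you may see the __len__() message printed more than once.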
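With the indexing confirmed, the original mistake is easy to sidestep. Here’s one possible approach, a sketch rather than the only option: convert the dictionary’s keys, values, or items to a list first, since a list is a sequence. The variable names are only for illustration.

```python
import random

points = {
    "Stephen": 19,
    "Mary": 24,
    "James": 44,
}

# list(points) gives the keys; .values() and .items() give the other views
name = random.choice(list(points))
score = random.choice(list(points.values()))
pair = random.choice(list(points.items()))

print(name, score, pair)
```

Each call now receives an object that supports both len() and integer indexing, so there’s no KeyError.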

Final Words

Doing some detective work to explore what’s happening inside Python functions is both fun and instructive. It encourages you to experiment: making hypotheses and testing them. And as with all experimentation, the failures are just as useful as the successes, if not more so!

