Documenting Long Classes in Jupyter Notebook

In an earlier post, I described my experiment with using Jupyter Notebook and Jekyll together to write blog posts. This turns out to be very convenient for writing about scientific software, as it allows me to interlace code, figures, mathematics and explanatory prose in a way that I hope is helpful for my readers. My approach does, however, come with some annoying limitations. For instance, if I want to describe a Python class with several methods and properties, the whole class definition has to live in a single cell, which makes it difficult to explain each part of that class independently. Though addressing this is currently on the wishlist for the Jupyter Notebook project, the current version doesn't really have a "good" way of dealing with it.

In this post, I'll show a "bad" way of documenting long classes: using inheritance to break up long definitions. I say that this is "bad" because it creates a needlessly complicated class hierarchy and code that is far from production-quality, but I think it's still a useful technique in that it allows for clearer explanations.

We begin by importing print_function from __future__ and with_metaclass from the future library to provide better Python 2/3 compatibility.

In [1]:
from __future__ import print_function
from future.utils import with_metaclass

Next, to show that the inheritance trick works well for abstract classes, we import from the abc standard library package.

In [2]:
from abc import ABCMeta, abstractmethod

Getting to the example itself, suppose that you want to define a class with two methods, foo and bar. We represent this as an abstract base class (hence abc) with two abstract methods.

In [3]:
class AbstractBase(with_metaclass(ABCMeta, object)):
    @abstractmethod
    def foo(self, x):
        pass
    
    @abstractmethod
    def bar(self, y):
        pass

If we now try to describe an example concrete (that is, not abstract) class inheriting from AbstractBase, we must define both foo and bar. If we leave one out, the class definition itself succeeds, but we get a TypeError as soon as we try to instantiate the class.

In [4]:
class A(AbstractBase):
    def foo(self, x):
        return x
In [5]:
try:
    a = A()
except TypeError as ex:
    print(ex)
Can't instantiate abstract class A with abstract methods bar
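
To check which methods are still missing without triggering the error, one option is to look at the __abstractmethods__ attribute that ABCMeta maintains on each class. The following is a minimal sketch rather than part of the original notebook: it assumes Python 3 and uses abc.ABC in place of with_metaclass so that it needs no third-party packages, and the variable name missing is only for illustration.

```python
from abc import ABC, abstractmethod


# Python 3 spelling of the abstract base class from above (an assumption;
# the notebook itself uses with_metaclass for Python 2/3 compatibility).
class AbstractBase(ABC):
    @abstractmethod
    def foo(self, x):
        pass

    @abstractmethod
    def bar(self, y):
        pass


class A(AbstractBase):
    def foo(self, x):
        return x


# ABCMeta records the names that are still abstract on each class.
missing = sorted(A.__abstractmethods__)
print(missing)
```

An empty result would mean that the class is ready to be instantiated.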

Thankfully, we can use inheritance to provide the second method (in this case, bar) after we have defined our example class.

In [6]:
class A(A):
    def bar(self, y):
        return y ** 2
In [7]:
a = A()
In [8]:
a.bar(2)
Out[8]:
4

What's going on here? In particular, how can A inherit from itself? Remember that in Python, classes are themselves values that can be manipulated at runtime. Thus, if we define a class B, we can do things like print that new class out to the console.

In [9]:
class B(object):
    pass

print(B)
<class '__main__.B'>

As with any other Python value, we can ask for the id of a class. This returns a number that uniquely identifies that class for as long as it exists, even if we assign it to different variables.

In [10]:
print(id(B))
C = B
print(id(C))
67899912
67899912

Notably, if we define a new class that is also called B, this "hides" the old definition of B and gives us a class with a different id.

In [11]:
class B(object):
    pass

print(id(B))
67900856

The old class still exists, however, so we can still get to it if we assigned it to a different variable.

In [12]:
print(id(C))
print(C)
67899912
<class '__main__.B'>
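
One consequence worth spelling out is that objects created before the redefinition remain instances of the old class and are unrelated to the new one. The sketch below is written in Python 3 syntax and is not from the original notebook; the name old_b is hypothetical.

```python
class B:
    pass


C = B        # keep a second reference to the first class
old_b = B()  # an instance of the first class


class B:     # rebinds the name B to a brand-new class
    pass


print(type(old_b) is C)      # the instance still belongs to the first class
print(isinstance(old_b, B))  # it is not an instance of the new class
print(C is B)                # the two names now refer to different classes
```

This is why re-running a class cell in a notebook does not update objects that were created from the earlier definition.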

Thus, when we make a class that "inherits from itself," it doesn't really do that per se, but rather inherits from the old value of the variable that held that class. We can confirm this by looking at the special attribute __bases__, which holds a tuple of the direct base classes of a class.

In [13]:
class D(object):
    pass

print(id(D))

class D(D):
    pass

print(id(D))
print([id(base_class) for base_class in D.__bases__])
67901800
67882920
[67901800L]

Thus, the "old value" of our class still lives on, as referred to by the __bases__ attribute of our new class. Practically speaking, this is a terrible and confusing thing to do, but it has a very nice effect for our purposes, of letting us progressively add attributes such as methods and properties to a class.
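
The same chain can also be inspected through the method resolution order, which lists every class that attribute lookup will visit. This is a small sketch in Python 3 syntax, added for illustration; the names first_d and mro_names are hypothetical.

```python
class D:
    pass


first_d = D  # remember the original class for comparison


class D(D):  # the base is looked up before the name D is rebound
    pass


# The MRO contains the new class, the old class of the same name, then object.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)
print(D.__bases__ == (first_d,))
print(issubclass(D, first_d))
```

Both user-defined entries report the same name, which is a good illustration of why this hierarchy is confusing to debug.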

In [14]:
class E(object):
    x = 'foo'
    
class E(E):
    y = 'bar'
    
print(E.x, E.y)
foo bar
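
The same pattern extends to methods and properties, which is the case that matters when explaining a longer class piece by piece. The sketch below assumes Python 3; the class Vector and its members norm and normalized are made up for illustration and do not come from the original notebook. In a notebook, each class statement would sit in its own cell, with prose in between.

```python
import math


class Vector:
    """First part: construction."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Vector(Vector):
    """Second part: a read-only property."""
    @property
    def norm(self):
        return math.hypot(self.x, self.y)


class Vector(Vector):
    """Third part: a method that builds on the property."""
    def normalized(self):
        # Vector is looked up when the method runs, so this is the final class.
        return Vector(self.x / self.norm, self.y / self.norm)


v = Vector(3.0, 4.0)
print(v.norm)
unit = v.normalized()
print(unit.x, unit.y)
```

Each step adds one more layer to the hierarchy, so a class explained in three parts ends up with three classes of the same name between it and object.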

In this way, self-inheritance can provide a useful technique for splitting up long class definitions. That said, it's a bad idea, so please don't use this in "actual" code, and if you do, don't blame me for the confusion that results.

We finish the post by exporting to Jekyll:

In [15]:
!jupyter nbconvert --to html --template jekyll.tpl 2016-10-07-documenting-long-classes-jupyter-notebook
[NbConvertApp] Converting notebook 2016-10-07-documenting-long-classes-jupyter-notebook.ipynb to html
[NbConvertApp] Writing 19416 bytes to 2016-10-07-documenting-long-classes-jupyter-notebook.html