Python and Meta-Programming

Meta-programming is one of the lesser-known features of Python that can simplify (and sometimes obscure) your code.

This was initially intended to be a 5-minute lightning talk at PyCarolinas 2012, but it could not quite fit the timeframe.

As described in the Python documentation, meta-programming with metaclasses allows you to customize class creation. Why would you need that? Keep reading: I will discuss two common use cases I have encountered.

This is what a normal class definition looks like:

class A(object):
    """Normal class"""
    def __init__(self):
        pass

And this is a skeleton class with a meta-class defined (in Python 2 syntax, where the metaclass is declared through the __metaclass__ attribute):

class A(object):
    "Customizing the creation of class A"
    class __metaclass__(type):
        def __new__(mcs, name, bases, attributes):
            # Do something clever with name, bases or attributes
            cls = type.__new__(mcs, name, bases, attributes)
            # Do something cleverer with the class itself
            print "I've created class", name, attributes
            return cls

    def __init__(self):
        pass

class B(A):
    static = 1

Running this piece of code (without instantiating objects of class A or class B) will produce the following output:

I've created class A {'__module__': '__main__', '__metaclass__': <class '__main__.__metaclass__'>, '__doc__': 'Customizing the creation of class A', '__init__': <function __init__ at 0x...>}
I've created class B {'__module__': '__main__', 'static': 1}

So your code gets executed as the class gets defined, which gives you tremendous control over your class creation.

At some point, the metaclass’s __new__ method does need to call type.__new__ to let Python create your class, but you have the opportunity to modify the class name, base classes or class attributes before the class is created. Why would you want to do that? Let’s explore two use cases.
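
The skeleton above is Python 2. Python 3 silently ignores a nested __metaclass__ attribute and expects the metaclass as a keyword argument instead, so here is a rough sketch of the same idea in that syntax. The names Meta, created and created_by are made up for this illustration; the list only stands in for the print statement above.

```python
created = []  # hypothetical log of class names, for illustration only


class Meta(type):
    def __new__(mcs, name, bases, attributes):
        # Before creation: name, bases and attributes can still be changed.
        attributes.setdefault("created_by", mcs.__name__)
        cls = super().__new__(mcs, name, bases, attributes)
        # After creation: the finished class object is available.
        created.append(name)
        return cls


class A(metaclass=Meta):
    """Customizing the creation of class A"""


class B(A):
    static = 1


print(created)        # ['A', 'B']
print(B.created_by)   # Meta
```

As in the Python 2 version, nothing is instantiated: the metaclass runs once per class statement, including for the subclass B.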

Use case 1: Slots

Slots are also well described in the Python documentation, and should be used for several reasons. The first one is memory footprint. Slotted classes are more memory efficient, because the per-object dictionary (__dict__) is no longer allocated. Another good reason to use slots is a kind of “weak typing”: __slots__ describes a class’s interface, and assigning to an attribute that is not listed there fails. How many times have you assigned data to myobj.vaule and wondered why there’s no data in myobj.value?

Here is what a slotted class looks like:

class A(object):
    __slots__ = [ 'data' ]

Now this is valid:

a = A()
a.data = 1

While this is not:

a.dat = 1

Python will raise an AttributeError exception.
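
Both claims (no per-object dictionary, and typos being rejected) are easy to check. This is a small sketch in Python 3 syntax; the class names Slotted and Plain and the typo_accepted flag are mine, chosen only for the comparison.

```python
class Slotted:
    __slots__ = ["data"]


class Plain:
    pass


s = Slotted()
s.data = 1  # listed in __slots__, so this is fine

try:
    s.dat = 1  # typo: not listed in __slots__
    typo_accepted = True
except AttributeError:
    typo_accepted = False

print(typo_accepted)                  # False
print(hasattr(s, "__dict__"))         # False: no per-instance dictionary
print(hasattr(Plain(), "__dict__"))   # True
```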

So far so good. Now let’s bring inheritance into the mix.

class Base(object):
    __slots__ = ['data']

class A(Base):
    pass

So this should be invalid:

a = A()
a.dat = 1

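But it is not. The reason? A subclass that does not define __slots__ gets a __dict__ again, so any attribute can be assigned to its instances. Slots have to be defined in the base class as well as in each subclass, even if the subclass adds none of its own. So this is the proper definition: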

class Base(object):
    __slots__ = ['data']

class A(Base):
    __slots__ = []
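
To see the difference side by side, here is a short sketch in Python 3 syntax. The subclass names Forgot and Remembered are hypothetical and only serve to contrast the two definitions above.

```python
class Base:
    __slots__ = ["data"]


class Forgot(Base):
    pass  # no __slots__: instances get a __dict__ again


class Remembered(Base):
    __slots__ = []  # no new slots, but no __dict__ either


f = Forgot()
f.dat = 1  # silently accepted, stored in f.__dict__
print(f.__dict__)  # {'dat': 1}

r = Remembered()
r.data = 2  # the inherited slot still works
try:
    r.dat = 1
    rejected = False
except AttributeError:
    rejected = True

print(rejected)  # True
```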

It is rather unfortunate that you have to remember to define empty slots, so let’s try to simplify that. This is where modifying the class’ attributes at class creation time comes in handy.

class Base(object):
    __slots__ = []
    class __metaclass__(type):
        def __new__(mcs, name, bases, attributes):
            if '__slots__' not in attributes:
                attributes.update(__slots__=[])
            cls = type.__new__(mcs, name, bases, attributes)
            return cls

class A(Base):
    pass

Note how, if __slots__ is not present in the attribute dictionary, we add it as an empty list.

Now this works as expected (in that trying to assign to field .a will raise an AttributeError):

a = A()
a.a = 1
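
For readers on Python 3, the same trick could look roughly like this. It is a sketch, not a drop-in replacement: the metaclass name AutoSlots is my own, and I gave Base a data slot so that there is something valid to assign to.

```python
class AutoSlots(type):
    def __new__(mcs, name, bases, attributes):
        # Add empty slots only when the class body did not define any.
        attributes.setdefault("__slots__", [])
        return super().__new__(mcs, name, bases, attributes)


class Base(metaclass=AutoSlots):
    __slots__ = ["data"]


class A(Base):
    pass  # the metaclass supplies __slots__ = [] here


a = A()
a.data = 1  # inherited slot, fine

try:
    a.a = 1
    rejected = False
except AttributeError:
    rejected = True

print(A.__slots__)  # []
print(rejected)     # True
```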

Use case 2: A class registry

What we will explore now is an application that spawns different object types (instantiated from different classes), depending on the input. This is a common pattern when processing XML nodes as part of a SAX parser, where you would like to have a customized (non-generic) object created when the close tag in the XML stream is encountered. Doing this for very few variants of objects is not a problem (only a matter of the proper if/then/else construct), but it becomes cumbersome as soon as the number of classes grows.

In a very simplified scenario, we assume the input is plain text, and our custom classes define a Name attribute to indicate which input they are willing to handle. Here is the full example:

class Registry(object):
    _registry = {}

    @classmethod
    def register(cls, klass):
        cls._registry[klass.Name] = klass

    @classmethod
    def process(cls, text):
        print "-> Processing: %s" % text
        klass = cls._registry.get(text)
        if klass is None:
            return None
        return klass()

class Base(object):
    Name = None
    class __metaclass__(type):
        def __new__(mcs, name, bases, attributes):
            cls = type.__new__(mcs, name, bases, attributes)
            Registry.register(cls)
            return cls

class A(Base):
    Name = "A"

class B(Base):
    Name = "B"

Note the line in __metaclass__.__new__:

Registry.register(cls)

This is where our newly created class gets added to the registry.

Let’s run the above example and process some input:

print Registry.process('A')
print Registry.process('B')
print Registry.process('C')

The output will be:

-> Processing: A
<__main__.A object at 0x...>
-> Processing: B
<__main__.B object at 0x...>
-> Processing: C
None

Notice how, since there is no handler class for C, None is returned in the third call.
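
Here is a sketch of the same registry in Python 3 syntax. Two things are my own choices rather than part of the example above: the metaclass is named RegisteringMeta, and classes whose Name is None are skipped, so that Base itself does not end up in the registry under the key None.

```python
class Registry:
    _registry = {}

    @classmethod
    def register(cls, klass):
        cls._registry[klass.Name] = klass

    @classmethod
    def process(cls, text):
        klass = cls._registry.get(text)
        if klass is None:
            return None
        return klass()


class RegisteringMeta(type):
    def __new__(mcs, name, bases, attributes):
        cls = super().__new__(mcs, name, bases, attributes)
        # Assumption: a Name of None marks a class that handles no input.
        if cls.Name is not None:
            Registry.register(cls)
        return cls


class Base(metaclass=RegisteringMeta):
    Name = None


class A(Base):
    Name = "A"


class B(Base):
    Name = "B"


print(type(Registry.process("A")).__name__)  # A
print(Registry.process("C"))                 # None
```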

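An alternative, and probably easier to follow, implementation for this use case could use class decorators instead. This is how we would do it in that case (the Registry class is the same, except that register must end with return klass; as written above it returns None, so the decorated names A and B would be rebound to None):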

@Registry.register
class A(object):
    Name = "A"

@Registry.register
class B(object):
    Name = "B"
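
A complete, runnable version of the decorator variant might look like this (Python 3 syntax). The only change to Registry is the return klass line; process is kept as before, minus the print.

```python
class Registry:
    _registry = {}

    @classmethod
    def register(cls, klass):
        cls._registry[klass.Name] = klass
        # Return the class so the decorated name stays bound to it.
        return klass

    @classmethod
    def process(cls, text):
        klass = cls._registry.get(text)
        if klass is None:
            return None
        return klass()


@Registry.register
class A:
    Name = "A"


@Registry.register
class B:
    Name = "B"


print(A is Registry._registry["A"])  # True
print(Registry.process("C"))         # None
```

The trade-off is that every handler class has to remember its decorator, whereas the metaclass registers subclasses automatically.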

However, using decorators to slotify sub-classes as in use case 1 will not work, since slots have to be defined at class creation: a decorator only runs after the class object already exists.
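
To illustrate, here is a hypothetical slotify decorator (Python 3 syntax) that tries to add empty slots after the fact. The assignment goes through without complaint, but it has no effect on how instances are laid out.

```python
def slotify(klass):
    # Hypothetical decorator: by now the class has already been created,
    # so this only sets an ordinary class attribute.
    klass.__slots__ = []
    return klass


class Base:
    __slots__ = ["data"]


@slotify
class A(Base):
    pass


a = A()
a.dat = 1  # still accepted: A's instances kept their __dict__
print(A.__slots__)  # []
print(a.__dict__)   # {'dat': 1}
```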
