Everything’s an object

In Python, pretty much everything is an object, whether it is a number, a function… or even None.

>>> type(1)
<class 'int'>
>>> type('test')
<class 'str'>
>>> type('test'.__str__)
<class 'method-wrapper'>
>>> def func():
...     pass
...
>>> type(func)
<class 'function'>
>>>
>>> class MyClass(object):
...     pass
...
>>> obj = MyClass()
>>> type(MyClass)
<class 'type'>
>>> type(obj)
<class '__main__.MyClass'>
>>> type(None)
<class 'NoneType'>

Python uses a pure object model in which classes are instances of the metaclass “type” (in Python, the terms “type” and “class” are synonyms). And “type” is the only built-in class which is an instance of itself:

>>> type(42)
<class 'int'>
>>> type(int)    # same as type(type(42))
<class 'type'>
>>> type(type)   # same as type(type(type(42)))
<class 'type'>
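The same relationships can be checked with isinstance() and issubclass(). This is a minimal sketch; MyClass is only an illustrative name, and the assertions simply restate the chain shown above.

```python
# Every class is an instance of type, and type is an instance of itself.
assert type(int) is type
assert type(type) is type
assert isinstance(type, type)

# type is also a class like any other: it inherits from object...
assert issubclass(type, object)
# ...while object, being a class, is itself an instance of type.
assert isinstance(object, type)


class MyClass:
    pass


# A user-defined class is an instance of type too.
assert type(MyClass) is type
print(type(MyClass), type(type))
```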

This object model is useful when we want information about a particular resource in Python. Except for the Python keywords (e.g. if, def, global), using “type(<name>)” or “dir(<name>)”, or just typing the resource name and pressing Enter, will work on pretty much anything.

>>> super
<class 'super'>
>>> abs
<built-in function abs>
>>> str
<class 'str'>
>>> dir
<built-in function dir>
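To tell keywords apart from inspectable names, the standard keyword module can help. The sketch below uses only the standard library; the variable names are illustrative.

```python
import builtins
import keyword

# Keywords are part of the grammar, so there is no object to inspect.
assert keyword.iskeyword("if")
assert keyword.iskeyword("def")
assert keyword.iskeyword("global")

# globals, by contrast, is an ordinary built-in function object.
assert not keyword.iskeyword("globals")
assert hasattr(builtins, "globals")
globals_type_name = type(globals).__name__

# Names bound to objects can be inspected with type() and dir().
abs_type_name = type(abs).__name__
assert "__call__" in dir(abs)
print(globals_type_name, abs_type_name)
```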

Objects, methods and attributes

Objects can have methods and attributes. Let’s see how the two differ by looking at the following code and the disassembled bytecode behind method2() (the exact instructions vary between CPython versions):

>>> import dis
>>> class MyClass(object):
...     attr = 42
...     def method1(self):
...         pass
...     def method2(self):
...         self.method1()
...         return self.attr
...
>>> dis.dis(MyClass.method2)
  6           0 LOAD_FAST                0 (self)
              3 LOAD_ATTR                0 (method1)
              6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              9 POP_TOP

  7          10 LOAD_FAST                0 (self)
             13 LOAD_ATTR                1 (attr)
             16 RETURN_VALUE

The only difference between “self.method1()” and “self.attr” is the CALL_FUNCTION instruction, which tells the CPython runtime to call whatever is on top of the stack. From an object perspective, a method is just an attribute whose type is “method” or “method-wrapper”(1). You can make any attribute callable by defining __call__ on its class:

>>> class Attr(object):
...     value = 42
...
>>> class MyClass(object):
...     attr = Attr()
...
>>> obj = MyClass()
>>> Attr.__call__ = lambda self : self.value
>>> obj.attr.value
42
>>> obj.attr()
42

In the above example we did not even define __call__ inside the Attr class definition: we added it afterwards, as a class attribute holding a lambda function. The lambda needs a “self” parameter because Python passes the instance as the first argument when the attribute is called; without it, the function could only be used on the class (as a static method) and not on an instance.
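One detail worth checking is that __call__ has to live on the class rather than on the instance. The following sketch reuses the Attr name from the example; the flag and result variables are illustrative.

```python
class Attr:
    value = 42


attr = Attr()

# Setting __call__ on the instance does not make it callable:
# the call syntax looks the method up on type(attr), not on attr itself.
attr.__call__ = lambda: "from the instance"
try:
    attr()
    instance_call_worked = True
except TypeError:
    instance_call_worked = False

# Setting it on the class works, and the instance arrives as "self".
Attr.__call__ = lambda self: self.value
class_call_result = attr()

print(instance_call_worked, class_call_result)
```

This prints “False 42”: the instance-level attribute is ignored by the call syntax, while the class-level one is used.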

To take another example, “type(42)” is not a call to a built-in function like “dir(42)”, but is the same as “type.__call__(type, 42)”. In other words we’re calling the __call__ method for the class “type” and pass as argument the instance “type” (self) and the object 42(2).
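The equivalence can be verified directly, and the same mechanism is what creates instances of ordinary classes. In this sketch Point is a hypothetical class used only for illustration.

```python
class Point:
    def __init__(self, x):
        self.x = x


# type(42) goes through type.__call__ with type itself as "self".
assert type.__call__(type, 42) is int

# Instantiating a class is the same mechanism: Point(3) runs
# type.__call__(Point, 3), which calls Point.__new__ then Point.__init__.
p = type.__call__(Point, 3)
assert isinstance(p, Point)
assert p.x == 3

# The usual method-call equivalence, for comparison.
assert "abc".upper() == str.upper("abc")
```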

The fact that methods are just a special type of attribute helps explain why the function dir() returns a list of all the combined attributes and methods for a given object:

>>> class MyClass(object):
...     attr = 42
...     def test(self):
...         pass
...
>>> obj = MyClass()
>>> dir(obj)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__',
 '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__',
 '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__',
 'attr', 'test']

You can see that the last two elements in the list are the “attr” attribute and the “test” method as defined in the class. All the other attributes use the __<name>__ notation, which is reserved for special attributes and methods that Python itself relies on. For example, obj.__dir__() is called under the hood by dir(obj). Likewise, __le__ is a method that can be overridden in MyClass to implement obj1 <= obj2.
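As a short illustration of overriding these special methods, here is a sketch with a hypothetical Version class that customizes both __le__ and __dir__.

```python
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __le__(self, other):
        # Invoked for "self <= other".
        return (self.major, self.minor) <= (other.major, other.minor)

    def __dir__(self):
        # Invoked by dir(self); dir() sorts whatever this returns.
        return ["minor", "major"]


v1 = Version(1, 2)
v2 = Version(1, 10)

assert v1 <= v2
assert dir(v1) == ["major", "minor"]

# The method defined in the class is an ordinary attribute of the class.
assert "__le__" in vars(Version)
```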

Yes, number literals too

One could think that number literals do not behave like regular objects. Even though dir(42) shows a method __add__, trying to call 42.__add__ fails:

>>> 42.__add__(4)
  File "<stdin>", line 1
    42.__add__(4)
             ^
SyntaxError: invalid syntax

But the error comes from parsing, not from how the object 42 behaves. Python reads the first three characters (“42.”) as a float literal, and as a result does not know what to do with “__add__”. You can however call the method on 42 by using a few tricks:

>>> 42 .__add__(4)   # inserting a space after 42
46
>>> (42).__add__(4)  # wrapping 42 inside ()
46
>>> 42..__add__(4)   # using 42.0 instead of 42
46.0
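The parsing behavior can also be checked programmatically. This sketch uses the built-in compile() function; parses() is a hypothetical helper written for the illustration.

```python
def parses(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False


# "42." is read as a float literal, so the name that follows is unexpected.
assert not parses("42.__add__(4)")

# Separating the literal from the dot lets the parser see an attribute access.
assert parses("42 .__add__(4)")
assert parses("(42).__add__(4)")

# On its own, "42." really is a float.
assert isinstance(eval("42."), float)
assert eval("(42).__add__(4)") == 46
```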

Implementation under the hood

Let’s now dive into the C implementation to see how objects are represented.

Under the hood, objects are manipulated as a C structure called PyObject. Ironically, the CPython object model is implemented using C, a language which is not object-oriented.

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

You will notice the following two fields:

  • a reference count, which keeps track of how many other objects and variables reference it. This is changed in the C code through the macros Py_INCREF() and Py_DECREF().
  • a type (a pointer to a PyTypeObject structure), which allows Python to determine the type (aka class) of the object at runtime. That type holds the functions that describe the behavior of the class: which function to call to allocate an instance, to deallocate it, to treat it as a number, etc.
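Both fields can be observed from Python. The sketch below assumes a standard CPython build, where sys.getrefcount() exposes the reference count; Thing is an illustrative class name, and the exact counts printed may differ between versions.

```python
import sys


class Thing:
    pass


obj = Thing()

# getrefcount reports one extra reference: its own argument.
before = sys.getrefcount(obj)

alias = obj            # one more name referring to the same object
container = [obj]      # one more reference held by a list
after = sys.getrefcount(obj)

del alias, container   # dropping the references brings the count back down
restored = sys.getrefcount(obj)

print(before, after, restored)

# The ob_type field is what type() and __class__ expose.
assert type(obj) is Thing and obj.__class__ is Thing
```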

Built-in types vs user class

Python comes with some built-in classes, such as int, str and list, but also function and type.

Contrary to a language such as Ruby, where everything is also an object, Python does not allow adding new attributes or methods to built-in types such as int or str, let alone redefining existing methods:

>>> number = 10
>>> number.attr = 10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'attr'
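The restriction applies to the built-in type itself as well as to its instances, while a user-defined subclass is open to both kinds of change. In this sketch, MyInt and double are hypothetical names chosen for the illustration.

```python
# Built-in types reject new attributes on the type itself...
try:
    int.double = lambda self: self * 2
    patched_builtin = True
except TypeError:
    patched_builtin = False


# ...but a user-defined subclass is open to both kinds of change.
class MyInt(int):
    pass


MyInt.double = lambda self: self * 2   # new method on the class
number = MyInt(10)
number.attr = 10                        # new attribute on the instance

print(patched_builtin, number.double(), number.attr)
```

This prints “False 20 10”.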

The declarations of these objects are in the Include directory, and the Objects directory contains the implementations of the various types: int (Objects/longobject.c), str (Objects/unicodeobject.c), list (Objects/listobject.c), user-defined classes (Objects/typeobject.c), bound methods (Objects/classobject.c), functions (Objects/funcobject.c), etc.

Each of those files defines a PyTypeObject instance which represents the type. A PyTypeObject consists mostly of function pointers that describe the behavior of the type. For example, tp_getattro and tp_setattro, when defined, are the functions that respectively read and assign an attribute. The “int” definition below leaves tp_setattro at 0 (the slot is then inherited from “object” when the type is initialized) and tp_dictoffset at 0, so integers have no per-instance dictionary in which to store a new attribute; this is why it is not possible to add an attribute to an integer. tp_as_sequence and tp_as_mapping point to tables of functions that implement the standard operations on sequences and on mappings (such as dictionaries) respectively.

PyTypeObject PyLong_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "int",                                      /* tp_name */
    offsetof(PyLongObject, ob_digit),           /* tp_basicsize */
    sizeof(digit),                              /* tp_itemsize */
    long_dealloc,                               /* tp_dealloc */
    0,                                          /* tp_print */
    0,                                          /* tp_getattr */
    0,                                          /* tp_setattr */
    0,                                          /* tp_reserved */
    long_to_decimal_string,                     /* tp_repr */
    &long_as_number,                            /* tp_as_number */
    0,                                          /* tp_as_sequence */
    0,                                          /* tp_as_mapping */
    (hashfunc)long_hash,                        /* tp_hash */
    0,                                          /* tp_call */
    long_to_decimal_string,                     /* tp_str */
    PyObject_GenericGetAttr,                    /* tp_getattro */
    0,                                          /* tp_setattro */
    0,                                          /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE |
        Py_TPFLAGS_LONG_SUBCLASS,               /* tp_flags */
    long_doc,                                   /* tp_doc */
    0,                                          /* tp_traverse */
    0,                                          /* tp_clear */
    long_richcompare,                           /* tp_richcompare */
    0,                                          /* tp_weaklistoffset */
    0,                                          /* tp_iter */
    0,                                          /* tp_iternext */
    long_methods,                               /* tp_methods */
    0,                                          /* tp_members */
    long_getset,                                /* tp_getset */
    0,                                          /* tp_base */
    0,                                          /* tp_dict */
    0,                                          /* tp_descr_get */
    0,                                          /* tp_descr_set */
    0,                                          /* tp_dictoffset */
    0,                                          /* tp_init */
    0,                                          /* tp_alloc */
    long_new,                                   /* tp_new */
    PyObject_Del,                               /* tp_free */
};
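Some of these slots are visible from Python through attributes of the type object. The sketch below assumes CPython, where types expose __basicsize__, __itemsize__ and __dictoffset__; the printed sizes depend on the build and platform, and MyClass is an illustrative name.

```python
class MyClass:
    pass


# tp_name, tp_basicsize and tp_itemsize are visible from Python.
print(int.__name__, int.__basicsize__, int.__itemsize__)

# tp_dictoffset is 0 for int: its instances have no per-instance __dict__.
assert int.__dictoffset__ == 0
assert not hasattr(10, "__dict__")

# A plain user-defined class gets a non-zero dict offset instead.
assert MyClass.__dictoffset__ != 0
assert hasattr(MyClass(), "__dict__")

# tp_as_sequence is empty for int, so sequence operations are unsupported.
try:
    len(10)
    int_has_len = True
except TypeError:
    int_has_len = False
```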

User types

When a program defines a user class, the runtime creates a new type for that class. Here is an example of such a type inspected in the C debugger:

(gdb) print *tp
$67 = {ob_base = {ob_base = {_ob_next = 0x98a6034, _ob_prev = 0x98a3058, ob_refcnt = 6,
      ob_type = 0x82a9d60}, ob_size = 0},
  tp_name = 0x98a3070 "MyClass",
  tp_basicsize = 24,
  tp_itemsize = 0,
  tp_dealloc = 0x8073ac6 <subtype_dealloc>,
  tp_print = 0,
  tp_getattr = 0,
  tp_setattr = 0,
  tp_reserved = 0x0,
  tp_repr = 0x8078847 <object_repr>,
  tp_as_number = 0x98563c8,
  tp_as_sequence = 0x985645c,
  tp_as_mapping = 0x9856450,
  tp_hash = 0x805f28f <PyObject_HashNotImplemented>,
  tp_call = 0,
  tp_str = 0x8078a2e <object_str>,
  tp_getattro = 0x805ff70 <PyObject_GenericGetAttr>,
  tp_setattro = 0x806029e <PyObject_GenericSetAttr>,
  tp_as_buffer = 0x9856484,
  tp_flags = 808448,
  tp_doc = 0x0,
  tp_traverse = 0x8073801 <subtype_traverse>,
  tp_clear = 0x8073a31 <subtype_clear>,
  tp_richcompare = 0x8080b08 <slot_tp_richcompare>,
  tp_weaklistoffset = ,
  tp_iter = 0,
  tp_iternext = 0x805fa55 <_PyObject_NextNotImplemented>,
  tp_methods = 0x0,
  tp_members = 0x9856494,
  tp_getset = 0x82a9b40,
  tp_base = 0x82aa020,
  tp_dict = ,
  tp_descr_get = 0,
  tp_descr_set = 0,
  tp_dictoffset = 16,
  tp_init = 0x8078462 <object_init>,
  tp_alloc = 0x807360b <PyType_GenericAlloc>,
  tp_new = 0x8078515 <object_new>,
  tp_free = 0x810b295 <PyObject_GC_Del>,
  tp_is_gc = 0,
  tp_bases = (<type at remote 0x82aa020>,),
  tp_mro = (<type at remote 0x98562fc>, <type at remote 0x82aa020>),
  tp_cache = 0x0,
  tp_subclasses = 0x0,
  tp_weaklist = <weakref at remote 0x98a0974>,
  tp_del = 0,
  tp_version_tag = 189
}

Note how tp_getattro and tp_setattro point to the generic functions PyObject_GenericGetAttr and PyObject_GenericSetAttr.
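Several of the fields in this dump have a Python-level counterpart. As a sketch, the class can even be built by calling type() directly, which is roughly what a class statement does; the attribute names and values used here are illustrative.

```python
# type(name, bases, namespace) builds a class without a "class" statement.
MyClass = type("MyClass", (object,), {"attr": 42})

assert MyClass.__name__ == "MyClass"          # tp_name
assert MyClass.__bases__ == (object,)         # tp_bases
assert MyClass.__mro__ == (MyClass, object)   # tp_mro
assert MyClass.__base__ is object             # tp_base

# Instances accept new attributes, unlike instances of int.
obj = MyClass()
obj.extra = "added later"
assert obj.__dict__ == {"extra": "added later"}
assert obj.attr == 42
```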

In the next article we will see how attribute lookup works.


(1) While in CPython methods are just a particular type of attribute, this is not the case for every dynamically-typed language. In Ruby for example you only access attributes through methods. Something like “my_obj.my_attribute = 42” is syntactic sugar for calling the object’s setter method to modify its private attribute “my_attribute”.

(2) In most object-oriented languages, “my_object.my_method(argument)” is the same as “MyObjectClass.my_method(my_object, argument)”. In the case of type(42), the confusing part is that “type” is both the class and the instance.

4 thoughts on “Everything’s an object”

      • Thank you. However, if everything is an object in Python, keywords should also be an object of some class right? What class are keywords objects of?


  1. Not really. Keywords and syntax tokens such as “def”, “=” or “class” are just a way for Python to parse the source code. When you type “def myFunc() …”, a “myFunc” object gets created but no object is created for “def”, because it’s just a convention to tell Python that you are defining a function.

