Python Property & Descriptor: Zero to Expert

Let's dive deeper into Python properties and descriptors

Jul 13, 2024

Hi, everyone!

Today, I will explain Python properties and descriptors, which are useful when you build applications in an object-oriented style.

Let's start!

Instance attributes

Instance attributes in Python are variables bound to a specific instance of a class. Each instance can hold different values for its instance attributes, so the values are unique to that instance. These attributes are typically defined and initialized in the class's __init__ method (often called the constructor).

Let's see an example of instance attributes. The example defines MyClass with two instance variables: a and b.
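Here is a minimal sketch of such a class. The values 10 and 20 and the variable name x are assumptions chosen for illustration.

```python
class MyClass:
    def __init__(self, a, b):
        # Plain instance attributes: each one lands in the instance __dict__.
        self.a = a
        self.b = b


x = MyClass(10, 20)      # 10 and 20 are illustrative values
print(x.a)               # dot access
print(getattr(x, "b"))   # the same lookup, by name
print(x.__dict__)        # {'a': 10, 'b': 20}
```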

With instance variables defined like this, you can access them with x.a or getattr(x, 'a'). Each one is also added to the instance dictionary, x.__dict__, which is where Python looks the attribute up.

However, sometimes you need more control over those instance variables. For example, you might want to encapsulate the actual value or compute the final value when a user reads the variable. Without another tool, you would have to add methods for reading and changing the variable and call those methods from outside, which can mean refactoring a lot of code!

What you really want is to keep the access pattern of the instance variable unchanged while gaining more control over it.

In this case, Python's property function helps you achieve that goal.

Property

In Python, property is a built-in function that creates and returns a property object. Properties provide a way of customizing access to instance attributes. They allow for the encapsulation of getter, setter, and deleter methods for an attribute, enabling control over how attributes are accessed and modified.

Let's change the MyClass implementation without changing the access pattern. In the __init__ function, rename the instance variables with a leading underscore, the usual convention for internal names. Then, create two methods, a and b, with the property decorator.
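A sketch of that change could look like this, again with assumed values 10 and 20:

```python
class MyClass:
    def __init__(self, a, b):
        # Leading underscore: internal storage, not meant for direct use.
        self._a = a
        self._b = b

    @property
    def a(self):
        return self._a

    @property
    def b(self):
        return self._b


x = MyClass(10, 20)
print(x.a)              # same access pattern as before
print(getattr(x, "b"))
```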

As you can see, even though you did not change the access pattern (x.a or getattr(x, 'a')), you got the same result! And now you can add more logic that runs when the user accesses those instance variables. To control assignment as well, create a setter function for each property.

I just added setter functions for a and b. In the setter functions, I added logic that prevents the user from setting a negative value, so only non-negative values are allowed for these variables.

As you can see from the result, when I tried to set instance variable b to -10, it raised a ValueError exception.
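The setters might be sketched like this. The error message text is an assumption, and so is the choice to assign through the properties in __init__, which makes the check run at construction time too.

```python
class MyClass:
    def __init__(self, a, b):
        # Assigning via the properties (an illustrative choice) runs the checks.
        self.a = a
        self.b = b

    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        if value < 0:
            raise ValueError("a must be non-negative")
        self._a = value

    @property
    def b(self):
        return self._b

    @b.setter
    def b(self, value):
        if value < 0:
            raise ValueError("b must be non-negative")
        self._b = value


x = MyClass(10, 20)
x.a = 5                      # accepted
try:
    x.b = -10                # rejected by the setter
except ValueError as exc:
    print(f"ValueError: {exc}")
```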

How does property work?

You might have noticed that a and b are not in the instance's __dict__. If Python can't find a name in the instance __dict__, it looks in the class __dict__.

Let’s check the class __dict__ value.
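A quick way to check this, sketched with the read-only version of MyClass and assumed values:

```python
class MyClass:
    def __init__(self, a, b):
        self._a = a
        self._b = b

    @property
    def a(self):
        return self._a

    @property
    def b(self):
        return self._b


x = MyClass(10, 20)
print(x.__dict__)                    # only _a and _b live on the instance
print(type(MyClass.__dict__["a"]))   # <class 'property'>
print(type(MyClass.__dict__["b"]))   # <class 'property'>
```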

Yes! a and b are in the class __dict__. Those names are bound to property objects. Each property object holds a getter and a setter function, which Python calls when you get or set the value through an instance.

Let's check step by step. (One step is missing here; I will explain it later.)

A user accesses x.a.
The Python interpreter looks up "a" in the __dict__ of x but cannot find it.
Python looks up "a" in the class of x, that is, in MyClass.__dict__.
Python calls the getter function of the property object that class variable a points to. The getter function of the property object is fget().
Python automatically passes the instance to fget so that it returns the value of a that belongs to instance x.

If you want to see this for yourself, you can work with the property object directly. The property object belongs to the class, so you need to pass an instance to fget to read that instance's value.
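For example, a small sketch (the variable name prop is an assumption) that calls the getter by hand:

```python
class MyClass:
    def __init__(self, a):
        self._a = a

    @property
    def a(self):
        return self._a


x = MyClass(10)
prop = MyClass.__dict__["a"]      # the property object stored on the class
print(prop.fget(x))               # call the getter manually with the instance
print(prop.__get__(x, MyClass))   # what attribute access does for x.a
print(prop.fset)                  # None, because no setter was defined
```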

According to the help(property) result, you can define property manually with the fget, fset, and fdel functions.

Let's change the MyClass implementation. Now, I will write custom getter and setter functions and define the variables by calling the property function directly instead of using it as a decorator.

If you run this code, it returns the same result because it is essentially the same as the decorator version. Using the property decorator simply calls the property function and passes the decorated function as the fget parameter.
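One way this could look is sketched below. The method names get_a, set_a, get_b, and set_b are assumptions for illustration.

```python
class MyClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def get_a(self):
        return self._a

    def set_a(self, value):
        if value < 0:
            raise ValueError("a must be non-negative")
        self._a = value

    def get_b(self):
        return self._b

    def set_b(self, value):
        if value < 0:
            raise ValueError("b must be non-negative")
        self._b = value

    # Build the properties explicitly instead of using decorators.
    a = property(fget=get_a, fset=set_a)
    b = property(fget=get_b, fset=set_b)


x = MyClass(10, 20)
print(x.a, x.b)
```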

# These two implementations are equivalent (both go inside a class body).

# With the decorator:
@property
def a(self):
    return self._a

# Without the decorator:
def a(self):
    return self._a

a = property(fget=a)

The setter works the same way: the decorator is just a function call that sets fset.

# These two implementations are equivalent (a must already be a property).

# With the decorator:
@a.setter
def a(self, value):
    self._a = value

# Without the decorator:
def set_a(self, value):
    self._a = value

a = a.setter(set_a)

Limitation of property

Even though the property function gives you full control over an instance variable, it has some limitations. One critical limitation is that the logic is not reusable! Let's assume that you have 100 variables that should store only non-negative integer values. Or, you might have multiple classes that need a non-negative integer variable. Then you need to implement the same property getter and setter every time!

“Isn’t there any way to solve this limitation?”

→ Yes, you can use the descriptor protocol!

Descriptor Protocol

The descriptor protocol in Python allows objects to control attribute access in other objects. Descriptors are used to manage a class's attributes, particularly when custom behavior for attribute access, assignment, and deletion needs to be implemented.

There are two types of descriptors.

Data descriptors: These define __set__ or __delete__, usually together with __get__.
Non-data descriptors: These define only the __get__ method.

Descriptors are defined by implementing any of the following methods.

__get__(self, instance, owner): This method is called to get the attribute from an instance or a class. instance is the instance the attribute is accessed through (None when it is accessed through the class), and owner is the class that owns the descriptor.
__set__(self, instance, value): This method is called to set the attribute on an instance. instance is the instance that owns the attribute and value is the value to set.
__delete__(self, instance): This method is called to delete the attribute from an instance. instance is the instance that owns the attribute.

Let's create a descriptor for MyClass. First, I created the NonNegativeInteger class and implemented the __get__ and __set__ methods to follow the descriptor protocol. In MyClass, I defined class variables a and b as NonNegativeInteger instances. NonNegativeInteger is a data descriptor because it has a __set__ method.

Now, if you need another variable that requires a non-negative integer value, you can reuse the NonNegativeInteger class! A descriptor is a good choice when the same attribute logic is needed across an application.
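A first, deliberately naive sketch is below. Storing the value on the descriptor object itself (self.value) is an assumption made to match the behavior discussed next; the error message is also illustrative.

```python
class NonNegativeInteger:
    def __get__(self, instance, owner):
        # Naive: the value lives on the descriptor, not on the instance.
        return self.value

    def __set__(self, instance, value):
        if not isinstance(value, int) or value < 0:
            raise ValueError("value must be a non-negative integer")
        self.value = value


class MyClass:
    # The descriptors are class variables.
    a = NonNegativeInteger()
    b = NonNegativeInteger()

    def __init__(self, a, b):
        self.a = a   # goes through NonNegativeInteger.__set__
        self.b = b


x = MyClass(10, 20)
print(x.a, x.b)
try:
    x.a = -1
except ValueError as exc:
    print(f"ValueError: {exc}")
```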

The descriptor is used as a class variable!

When using a descriptor, you should know that it is a class attribute. A class attribute is shared by all instances. If the descriptor stores the value on itself and one instance changes it, then the other instances see the change too, because they are all bound to the same descriptor object.

Let’s check with two different MyClass instances.
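The sharing problem can be reproduced with a naive descriptor that keeps its value on itself (an assumed implementation, with validation trimmed for brevity):

```python
class NonNegativeInteger:
    def __get__(self, instance, owner):
        return self.value          # one value per descriptor, not per instance

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError("value must be non-negative")
        self.value = value


class MyClass:
    a = NonNegativeInteger()
    b = NonNegativeInteger()

    def __init__(self, a, b):
        self.a = a
        self.b = b


x1 = MyClass(10, 20)
print(x1.a, x1.b)    # 10 20
x2 = MyClass(30, 40)
print(x1.a, x1.b)    # 30 40: x1 now sees the values set through x2
```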

When I created the x1 instance, a was 10 and b was 20. However, after I created the second instance, x2, the values of x1 also changed to 30 and 40. This is because x1 and x2 share the class variables a and b.

To handle this correctly, keep a dictionary inside the descriptor that stores a value per instance. You can tell the instances apart because the instance is passed to the __get__ and __set__ methods. So, you can create a dictionary and use the instance as the key. Python hashes the instance to find the entry, and the hash stays the same for the lifetime of the instance.

Even though you created two instances, one does not affect the other. If you print the descriptors behind class variables a and b, you can see that each one stores its values keyed by instance.
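A sketch of that idea follows. The attribute name data is an assumption, and returning the descriptor itself for class-level access is a convenience added here so the dictionary is easy to inspect.

```python
class NonNegativeInteger:
    def __init__(self):
        self.data = {}             # instance -> value

    def __get__(self, instance, owner):
        if instance is None:       # accessed via the class, e.g. MyClass.a
            return self
        return self.data[instance]

    def __set__(self, instance, value):
        if not isinstance(value, int) or value < 0:
            raise ValueError("value must be a non-negative integer")
        self.data[instance] = value


class MyClass:
    a = NonNegativeInteger()
    b = NonNegativeInteger()

    def __init__(self, a, b):
        self.a = a
        self.b = b


x1 = MyClass(10, 20)
x2 = MyClass(30, 40)
print(x1.a, x1.b)        # 10 20
print(x2.a, x2.b)        # 30 40
print(MyClass.a.data)    # one entry per instance
```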

What if I delete x1? Does Python automatically delete the value from the descriptor?

Let’s check.
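The check can be sketched as follows, assuming the dictionary-based descriptor keyed by instance (the names data and revived are illustrative):

```python
class NonNegativeInteger:
    def __init__(self):
        self.data = {}             # instance -> value (strong references!)

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.data[instance]

    def __set__(self, instance, value):
        if value < 0:
            raise ValueError("value must be non-negative")
        self.data[instance] = value


class MyClass:
    a = NonNegativeInteger()
    b = NonNegativeInteger()

    def __init__(self, a, b):
        self.a = a
        self.b = b


x1 = MyClass(10, 20)
x2 = MyClass(30, 40)

del x1                               # removes the name, not the object
print(len(MyClass.a.data))           # 2: the dictionary key keeps it alive

# The first key inserted was the object formerly known as x1.
revived = next(iter(MyClass.a.data))
print(revived.a, revived.b)          # 10 20
```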

Even though I deleted x1, it is still alive in the descriptor! The garbage collector did not collect the x1 object because the dictionary key still references it, so I could recover the x1 object from that key.

See!? x1 is revived!!

This is not practical!

If I delete the object, it should really be deleted. Otherwise memory leaks, because the garbage collector can never collect objects that are unused but still referenced by the descriptor. To solve the problem, we should detect which object was deleted and remove the matching data inside the descriptor.

Weakref and a callback function

In Python, a “strong reference” is the standard way objects are referenced, ensuring they remain in memory as long as there is an active reference. This type of reference is common in typical variable assignments, where the object will not be eligible for garbage collection until all strong references to it are removed.

In contrast, a weak reference does not prevent an object from being garbage collected, even while the weak reference itself still exists. Weak references are helpful when you want to refer to an object without extending its lifetime unnecessarily. This is particularly important in patterns such as caches or observers, where holding onto objects indefinitely could lead to memory leaks.

For instance, using the weakref module, you can create a weak reference to an object, allowing it to be garbage collected when no strong references exist. This balance helps manage memory efficiently, ensuring that objects are only kept alive when necessary, preventing potential memory bloat and improving application performance.

In Python's weakref module, you can attach a callback function to a weak reference. This callback is called when the referenced object is about to be finalized (i.e., it is about to be garbage-collected). The callback receives the weak reference object as its only argument, and the original object is no longer available at that point, but you can still perform cleanup actions.

Let's use the weakref module to prevent memory leaks in MyClass. I store a weakref.ref object alongside each value to track the object's lifecycle, and register a callback function that finds the key and deletes it from the dictionary.
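A minimal sketch of this approach is below. Keying the dictionary by id(instance) and the attribute name data are assumptions; _cleanup follows the callback described in this post. The immediate cleanup after del relies on CPython's reference counting.

```python
import weakref


class NonNegativeInteger:
    def __init__(self):
        # id(instance) -> (weak reference to the instance, stored value)
        self.data = {}

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.data[id(instance)][1]

    def __set__(self, instance, value):
        if not isinstance(value, int) or value < 0:
            raise ValueError("value must be a non-negative integer")
        ref = weakref.ref(instance, self._cleanup)   # register the callback
        self.data[id(instance)] = (ref, value)

    def _cleanup(self, weak_ref):
        # Called with the dead weak reference; find and drop its entry.
        for key, (ref, _) in list(self.data.items()):
            if ref is weak_ref:
                del self.data[key]
                break


class MyClass:
    a = NonNegativeInteger()
    b = NonNegativeInteger()

    def __init__(self, a, b):
        self.a = a
        self.b = b


x1 = MyClass(10, 20)
x2 = MyClass(30, 40)
print(len(MyClass.a.data), len(MyClass.b.data))   # two entries each
del x1                                            # last strong reference gone
print(len(MyClass.a.data), len(MyClass.b.data))   # one entry each on CPython
```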

Here’s the result.

Before I delete x1, each descriptor's dictionary has two items. When I delete x1, the _cleanup function I registered as the callback is called, finds the key (the ID of x1), and removes the matching item from the dictionary.

Property Lookup Resolution

Above, I explained how Python looks up a property internally, but one step was left out.

Let's look at an example before learning the hidden step. I created a data descriptor and a non-data descriptor to test the lookup resolution.

Let’s check the lookup resolution!
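Such a test could be sketched like this. The class names DataDescriptor and NonDataDescriptor and the returned strings are assumptions for illustration.

```python
class DataDescriptor:
    def __get__(self, instance, owner):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    def __get__(self, instance, owner):
        return "from non-data descriptor"


class MyClass:
    data = DataDescriptor()
    non_data = NonDataDescriptor()


x = MyClass()
print(x.data)        # from data descriptor
print(x.non_data)    # from non-data descriptor

# Write straight into the instance dictionary, bypassing __set__.
x.__dict__["data"] = "from instance __dict__"
x.__dict__["non_data"] = "from instance __dict__"

print(x.data)        # from data descriptor (instance __dict__ is ignored)
print(x.non_data)    # from instance __dict__ (non-data descriptor is shadowed)
```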

As you can see from the result, the instance __dict__ overrides the non-data descriptor's value. However, it does not override the data descriptor's value.

Before searching the instance __dict__, Python checks whether the class attribute is a data descriptor. If it is, Python skips the instance __dict__ and uses the descriptor. So, even if the instance __dict__ has a different value under the same key as the data descriptor, it has no effect on attribute access.

A property object is also a data descriptor, even if you did not define a setter function, because property always implements __set__. So, the property will not be shadowed by a value in the instance __dict__.
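This is easy to confirm with a read-only property. The sketch below uses an assumed constant return value of 10.

```python
class MyClass:
    @property
    def a(self):
        return 10


x = MyClass()
x.__dict__["a"] = 99                 # try to shadow the property

print(x.a)                           # 10: the property still wins
print(hasattr(property, "__set__"))  # True: property is a data descriptor

try:
    x.a = 5                          # no setter, so __set__ raises
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```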

How CPython handles lookup resolution

The _PyObject_GenericGetAttrWithDict function inside CPython shows this. It calls the _PyType_LookupRef function, which finds the class attribute that matches the name by following the MRO (Method Resolution Order). This means that Python picks the first definition it finds in inheritance order.

Then, if the descriptor's type has tp_descr_set, the slot behind __set__, Python calls the descriptor's __get__ function and returns the result. No further resolution steps happen.

If the descriptor does not have __set__, Python next looks in the instance dictionary.

If it still cannot find the value there, it finally calls the __get__ function of the descriptor.
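The steps above can be modeled in pure Python. This is only a simplified sketch, not CPython's actual code: the names generic_getattr and find_in_mro are hypothetical, and details such as the __getattr__ fallback and __slots__ are ignored.

```python
_MISSING = object()  # sentinel: no class attribute with this name


def find_in_mro(cls, name):
    """Return the first class-level attribute called `name` along the MRO."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattr(obj, name):
    """Simplified, hypothetical model of the default attribute lookup."""
    cls = type(obj)
    descr = find_in_mro(cls, name)
    descr_type = type(descr)
    has_get = descr is not _MISSING and hasattr(descr_type, "__get__")
    is_data = hasattr(descr_type, "__set__") or hasattr(descr_type, "__delete__")

    # 1. A data descriptor on the class wins over the instance __dict__.
    if has_get and is_data:
        return descr_type.__get__(descr, obj, cls)

    # 2. Otherwise the instance __dict__ is checked.
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]

    # 3. Fall back to a non-data descriptor or a plain class attribute.
    if has_get:
        return descr_type.__get__(descr, obj, cls)
    if descr is not _MISSING:
        return descr
    raise AttributeError(name)


class Data:
    def __get__(self, instance, owner):
        return "data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class NonData:
    def __get__(self, instance, owner):
        return "non-data descriptor"


class MyClass:
    data = Data()
    non_data = NonData()
    plain = "class attribute"


x = MyClass()
x.__dict__.update(data="instance value", non_data="instance value")
for name in ("data", "non_data", "plain"):
    print(name, "->", generic_getattr(x, name))
```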

Summary

In short, you can use the Python property function if you want more control over instance variables. However, you should know that the property is stored as a class variable, which means it is shared among instances. Also, the property has a limitation in that its logic cannot be reused across other attributes or classes.

You can create custom descriptors to get reusable properties. The descriptor protocol requires __get__. If the descriptor also has a __set__ method, it is considered a data descriptor. If not, it is a non-data descriptor.
