Object-Oriented Programming
Object-Oriented Programming, abbreviated as OOP, is a programming paradigm. OOP treats objects as the basic units of programs, where an object contains data and functions that operate on that data.
Procedural programming views a computer program as a collection of commands, which means a sequence of function executions. To simplify program design, procedural programming continues to break functions down into sub-functions, reducing system complexity by dividing large functions into smaller ones.
In contrast, object-oriented programming views a computer program as a collection of objects, where each object can receive messages from other objects and process those messages. The execution of a computer program is a series of messages passed between various objects.
In Python, all data types can be considered objects, and you can also define custom objects. The custom object data type corresponds to the concept of classes in object-oriented programming.
Let's illustrate the differences between procedural and object-oriented programming with an example.
Suppose we want to handle a student's grade report. To represent a student's grade, a procedural program might use a dict:
std1 = { 'name': 'Michael', 'score': 98 }
std2 = { 'name': 'Bob', 'score': 81 }
Handling student grades can be implemented through functions, such as printing a student's grade:
def print_score(std):
    print('%s: %s' % (std['name'], std['score']))
If we adopt an object-oriented programming approach, we first think not about the execution flow of the program, but about how the data type Student should be treated as an object, which has two properties: name and score. To print a student's grade, we must first create the corresponding object for that student, and then send a print_score message to the object, allowing the object to print its own data.
class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def print_score(self):
        print('%s: %s' % (self.name, self.score))
Sending a message to an object is essentially calling a function associated with that object; such functions are called methods. An object-oriented program looks like this:
bart = Student('Bart Simpson', 59)
lisa = Student('Lisa Simpson', 87)
bart.print_score()
lisa.print_score()
The object-oriented design philosophy comes from nature, as the concepts of class and instance are very natural. A class is an abstract concept, such as the Student class we defined, which refers to the concept of a student, while an instance is a specific student, such as Bart Simpson or Lisa Simpson.
Thus, the object-oriented design philosophy abstracts out classes and creates instances based on those classes.
The level of abstraction in object-oriented programming is higher than that of functions because a class contains both data and methods that operate on that data.
Encapsulation, inheritance, and polymorphism are the three main characteristics of object-oriented programming.
Classes and Instances
The most important concepts in object-oriented programming are classes and instances. It is essential to remember that a class is an abstract template, such as the Student class, while an instance is a specific "object" created based on the class, where each object has the same methods but may have different data.
Taking the Student class as an example, in Python, defining a class is done using the class keyword:
class Student(object):
    pass
The class name follows immediately after the class keyword, which is Student. Class names are typically capitalized words, followed by (object), indicating which class the current class inherits from. Usually, if there is no suitable parent class, the object class is used, which is the ultimate parent class for all classes.
Once the Student class is defined, we can create instances of it by writing the class name followed by ():
>>> bart = Student()
>>> bart
<__main__.Student object at 0x10a67a590>
>>> Student
<class '__main__.Student'>
As we can see, the variable bart points to an instance of Student, and the address 0x10a67a590 is the memory address, which is unique for each object, while Student itself is a class.
We can freely bind attributes to an instance variable, for example, binding a name attribute to the instance bart:
>>> bart.name = 'Bart Simpson'
>>> bart.name
'Bart Simpson'
Since classes can serve as templates, we can enforce certain attributes to be filled in when creating instances. By defining the special __init__ method, we can bind attributes like name and score when creating an instance:
class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score
Note: The name of the special method __init__ begins and ends with two underscores!
Notice that the first parameter of the __init__ method is always self, which refers to the instance being created. Therefore, within the __init__ method, we can bind various attributes to self, as self refers to the instance itself.
With the __init__ method in place, we can no longer create an instance without arguments; we must provide arguments that match the parameters of __init__. However, self does not need to be passed, as the Python interpreter passes the instance automatically:
>>> bart = Student('Bart Simpson', 59)
>>> bart.name
'Bart Simpson'
>>> bart.score
59
Compared to ordinary functions, the only difference in functions defined within a class is that the first parameter is always the instance variable self, and this parameter does not need to be passed during the call. Other than that, methods defined in a class are no different from ordinary functions, so you can still use default parameters, variable parameters, keyword parameters, and named keyword parameters.
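As a small illustration of that last point, the sketch below gives a method a default parameter and keyword arguments. The describe method and its prefix and extra parameters are hypothetical names chosen only for this example; they are not part of the Student class used elsewhere in this chapter.

```python
class Student(object):
    def __init__(self, name, score=0):
        # score has a default value, so Student('Bob') is also valid
        self.name = name
        self.score = score

    def describe(self, prefix='Student', **extra):
        # self is supplied automatically; prefix and extra behave as in any function
        parts = ['%s %s: %s' % (prefix, self.name, self.score)]
        for key in sorted(extra):
            parts.append('%s=%s' % (key, extra[key]))
        return ', '.join(parts)


bob = Student('Bob')
print(bob.describe())                                    # Student Bob: 0
print(bob.describe(prefix='Pupil', city='Springfield'))  # Pupil Bob: 0, city=Springfield
```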
Data Encapsulation
One important feature of object-oriented programming is data encapsulation. In the Student class above, each instance has its own name and score data. We can access this data through functions, such as printing a student's grade:
>>> def print_score(std):
...     print('%s: %s' % (std.name, std.score))
...
>>> print_score(bart)
Bart Simpson: 59
However, since the Student instance itself has this data, there is no need to access it from an external function; we can directly define a function within the Student class to access the data, thus encapsulating the "data." These functions that encapsulate the data are associated with the Student class itself, and we refer to them as methods of the class:
class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def print_score(self):
        print('%s: %s' % (self.name, self.score))
To define a method, aside from the first parameter being self, the rest is the same as an ordinary function. To call a method, you only need to call it directly on the instance variable, passing other parameters normally, without needing to pass self.
This way, from an external perspective of the Student class, we only need to know that creating an instance requires providing name and score, while how to print is defined internally within the Student class. These data and logic are "encapsulated," making the call easy, but we do not need to know the internal implementation details.
Another benefit of encapsulation is that we can add new methods to the Student class, such as get_grade:
class Student(object):
    ...

    def get_grade(self):
        if self.score >= 90:
            return 'A'
        elif self.score >= 60:
            return 'B'
        else:
            return 'C'
Similarly, the get_grade method can be called directly on the instance variable without needing to know the internal implementation details.
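To make this concrete, here is a self-contained sketch that combines the __init__ method shown earlier with get_grade and calls it on two instances. The sample students are illustrative values only.

```python
class Student(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def get_grade(self):
        # The grading thresholds live inside the class; callers never see them
        if self.score >= 90:
            return 'A'
        elif self.score >= 60:
            return 'B'
        else:
            return 'C'


lisa = Student('Lisa', 99)
bart = Student('Bart', 59)
print(lisa.name, lisa.get_grade())  # Lisa A
print(bart.name, bart.get_grade())  # Bart C
```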
Summary
- A class is a template for creating instances, while instances are specific objects; each instance has independent data that does not affect the others.
- A method is a function bound to an instance, and unlike ordinary functions, methods can directly access instance data.
- By calling methods on instances, we directly manipulate the internal data of the object without needing to know the implementation details of the methods.
- Unlike static languages, Python allows binding any data to instance variables, meaning that two instances of the same class may end up with different sets of attributes:
>>> bart = Student('Bart Simpson', 59)
>>> lisa = Student('Lisa Simpson', 87)
>>> bart.age = 8
>>> bart.age
8
>>> lisa.age
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Student' object has no attribute 'age'
Access Restrictions
Within a class, there can be attributes and methods, and external code can manipulate data by directly calling instance methods, thus hiding the internal complex logic.
However, from the previous definition of the Student class, external code can still freely modify an instance's name and score attributes:
>>> bart = Student('Bart Simpson', 59)
>>> bart.score
59
>>> bart.score = 99
>>> bart.score
99
To prevent external access to internal attributes, we can prefix the attribute names with two underscores __. In Python, if an instance variable starts with __, it becomes a private variable, accessible only internally, not externally. Thus, we modify the Student class as follows:
class Student(object):
    def __init__(self, name, score):
        self.__name = name
        self.__score = score

    def print_score(self):
        print('%s: %s' % (self.__name, self.__score))
After this change, nothing seems to have changed for external code, but it is no longer possible to access the instance variables .__name and .__score from outside:
>>> bart = Student('Bart Simpson', 59)
>>> bart.__name
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Student' object has no attribute '__name'
This ensures that external code cannot arbitrarily modify the internal state of the object, making the code more robust through access restrictions.
But what if external code needs to access name and score? We can add methods like get_name and get_score to the Student class:
class Student(object):
    ...

    def get_name(self):
        return self.__name

    def get_score(self):
        return self.__score
What if we also want to allow external code to modify score? We can add a set_score method to the Student class:
class Student(object):
    ...

    def set_score(self, score):
        self.__score = score
You might ask, why not just modify it directly with bart.score = 99? Because in the method, we can check the parameters to avoid passing invalid values:
class Student(object):
    ...

    def set_score(self, score):
        if 0 <= score <= 100:
            self.__score = score
        else:
            raise ValueError('bad score')
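The following sketch assembles the pieces above into one runnable class and shows the setter rejecting an out-of-range value. It is only an illustration of the idea; the sample scores are arbitrary.

```python
class Student(object):
    def __init__(self, name, score):
        self.__name = name
        self.__score = score

    def get_score(self):
        return self.__score

    def set_score(self, score):
        # Validate before storing, so the object never holds a bad score
        if 0 <= score <= 100:
            self.__score = score
        else:
            raise ValueError('bad score')


bart = Student('Bart Simpson', 59)
bart.set_score(75)
print(bart.get_score())  # 75

try:
    bart.set_score(150)
except ValueError as e:
    print('rejected:', e)  # rejected: bad score

print(bart.get_score())  # still 75, the invalid value was not stored
```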
It is important to note that in Python, variable names like __xxx__, which both start and end with double underscores, are special variables. Special variables can be accessed directly, meaning they are not private variables. Therefore, names like __name__ and __score__ should not be used for private variables.
Sometimes, you may see instance variable names starting with one underscore, like _name. Such instance variables can be accessed externally, but by convention, when you see such a variable, it means, “Although I can be accessed, please treat me as a private variable and do not access it casually.”
Are instance variables starting with double underscores completely inaccessible from outside? Not necessarily. The reason you cannot directly access __name is that the Python interpreter changes the name to _Student__name for external access, so you can still access __name using _Student__name:
>>> bart._Student__name
'Bart Simpson'
However, it is strongly advised not to do this: it bypasses the interface the class intends to expose and ties your code to the mangled name _Student__name, which changes if the class is renamed.
In summary, Python itself has no mechanism to prevent you from doing bad things; it all relies on self-discipline.
Finally, note the following erroneous code:
>>> bart = Student('Bart Simpson', 59)
>>> bart.get_name()
'Bart Simpson'
>>> bart.__name = 'New Name' # Setting the __name variable!
>>> bart.__name
'New Name'
On the surface, it appears that external code has "successfully" set the __name variable, but in reality, this __name variable is not the same as the __name variable inside the class! The internal __name variable has been automatically changed by the Python interpreter to _Student__name, while the external code has added a new __name variable to bart.
>>> bart.get_name() # get_name() internally returns self.__name
'Bart Simpson'
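One way to see both variables side by side is to inspect the instance's attribute dictionary with the built-in vars(). The sketch below assumes the Student class with double-underscore attributes and a get_name method, as defined above.

```python
class Student(object):
    def __init__(self, name, score):
        self.__name = name      # stored as _Student__name
        self.__score = score    # stored as _Student__score

    def get_name(self):
        return self.__name


bart = Student('Bart Simpson', 59)
bart.__name = 'New Name'   # outside the class body, so no name mangling happens

# The instance now holds the two mangled names plus a separate '__name'
print(sorted(vars(bart)))  # ['_Student__name', '_Student__score', '__name']
print(bart.get_name())     # Bart Simpson
```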
Inheritance and Polymorphism
In OOP design, when we define a class, we can inherit from an existing class. The new class is called a subclass, while the class being inherited from is called a base class, parent class, or super class.
For example, if we have already written a class named Animal with a run() method that prints directly:
class Animal(object):
    def run(self):
        print('Animal is running...')
When we need to write Dog and Cat classes, we can directly inherit from the Animal class:
class Dog(Animal):
    pass

class Cat(Animal):
    pass
For Dog, Animal is its parent class, and for Animal, Dog is its subclass. Cat and Dog are similar.
What are the benefits of inheritance? The biggest benefit is that the subclass inherits all the functionality of the parent class. Since Animal implements the run() method, Dog and Cat, as its subclasses, automatically have the run() method without doing anything:
dog = Dog()
dog.run()
cat = Cat()
cat.run()
The output is as follows:
Animal is running...
Animal is running...
Of course, we can also add some methods to the subclass, such as in the Dog class:
class Dog(Animal):
    def run(self):
        print('Dog is running...')

    def eat(self):
        print('Eating meat...')
The second benefit of inheritance requires us to make some improvements to the code. You see, whether it’s Dog or Cat, they both print Animal is running... when they run(), which is logically incorrect. The logical approach is to display Dog is running... and Cat is running..., so we improve the Dog and Cat classes as follows:
class Dog(Animal):
    def run(self):
        print('Dog is running...')

class Cat(Animal):
    def run(self):
        print('Cat is running...')
When we run it again, the output is as follows:
Dog is running...
Cat is running...
When both the subclass and the parent class have the same run() method, we say that the subclass's run() overrides the parent's run(). During code execution, the subclass's run() is always called. This gives us another benefit of inheritance: polymorphism.
To understand what polymorphism is, we first need to clarify a bit about data types. When we define a class, we are actually defining a data type. The data types we define are no different from the built-in data types in Python, such as str, list, and dict:
a = list() # a is of list type
b = Animal() # b is of Animal type
c = Dog() # c is of Dog type
To check whether a variable is of a certain type, we can use isinstance():
>>> isinstance(a, list)
True
>>> isinstance(b, Animal)
True
>>> isinstance(c, Dog)
True
It seems that a, b, and c indeed correspond to the types list, Animal, and Dog respectively.
But wait, let’s try this:
>>> isinstance(c, Animal)
True
It seems that c is not just a Dog; c is also an Animal!
But upon careful thought, this makes sense because Dog is derived from Animal. When we create an instance of Dog, we can say that c is of type Dog, but it is also correct to say that c is of type Animal, since Dog is a kind of Animal!
So, in an inheritance relationship, if an instance's data type is a subclass, it can also be viewed as the parent class. However, the reverse is not true:
>>> b = Animal()
>>> isinstance(b, Dog)
False
Dog can be seen as an Animal, but Animal cannot be seen as a Dog.
To understand the benefits of polymorphism, we need to write a function that accepts a variable of type Animal:
def run_twice(animal):
    animal.run()
    animal.run()
When we pass an instance of Animal, run_twice() prints:
>>> run_twice(Animal())
Animal is running...
Animal is running...
When we pass an instance of Dog, run_twice() prints:
>>> run_twice(Dog())
Dog is running...
Dog is running...
When we pass an instance of Cat, run_twice() prints:
>>> run_twice(Cat())
Cat is running...
Cat is running...
It seems unremarkable, but think about it: now, if we define a Tortoise type that also derives from Animal:
class Tortoise(Animal):
    def run(self):
        print('Tortoise is running slowly...')
When we call run_twice() and pass in an instance of Tortoise:
>>> run_twice(Tortoise())
Tortoise is running slowly...
Tortoise is running slowly...
You will find that adding a new subclass of Animal does not require any modifications to run_twice(). In fact, any function or method that takes an Animal as a parameter can run normally without modification, and the reason lies in polymorphism.
The benefit of polymorphism is that when we need to pass in Dog, Cat, Tortoise, etc., we only need to accept the Animal type, because Dog, Cat, Tortoise, etc., are all of type Animal, and then we can operate according to the Animal type. Since the Animal type has a run() method, any type passed in, as long as it is an instance of Animal or its subclasses, will automatically call the actual type's run() method. This is the essence of polymorphism:
For a variable, we only need to know it is of type Animal, without needing to know its exact subtype, and we can safely call the run() method. The specific run() method that is called depends on the exact type of the object at runtime, whether it is an Animal, Dog, Cat, or Tortoise object. This is the true power of polymorphism: the caller only needs to call, without worrying about the details, and when we add a new subclass of Animal, as long as we ensure the run() method is correctly implemented, we do not need to care about how the original code calls it. This is the famous Open/Closed Principle:
Open for extension: allowing new subclasses of Animal;
Closed for modification: no need to modify functions like run_twice() that depend on the Animal type.
Inheritance can also be inherited level by level, just like the relationship from grandfather to father to son. Any class can ultimately trace back to the root class object, and these inheritance relationships look like an inverted tree.
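The chain of ancestors can be inspected through a class's __mro__ attribute, which lists the classes Python searches when looking up a method. The sketch below assumes a three-level hierarchy Animal, Dog, Husky purely for illustration.

```python
class Animal(object):
    def run(self):
        print('Animal is running...')


class Dog(Animal):
    pass


class Husky(Dog):
    pass


# __mro__ runs from the most specific class up to the root class object
print([cls.__name__ for cls in Husky.__mro__])  # ['Husky', 'Dog', 'Animal', 'object']
Husky().run()  # run() is found two levels up, on Animal
```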
Static Language vs Dynamic Language
For static languages (like Java), if you need to pass in an Animal type, the object passed must be of Animal type or its subclasses; otherwise, the run() method cannot be called.
For dynamic languages like Python, it is not necessary to pass in an Animal type. We only need to ensure that the object passed has a run() method:
class Timer(object):
    def run(self):
        print('Start...')
This is the "duck typing" of dynamic languages, which does not require a strict inheritance hierarchy. An object is considered a duck as long as it "looks like a duck and walks like a duck."
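A short sketch of this idea: the run_twice() function from the polymorphism discussion accepts a Timer even though Timer does not inherit from Animal. The classes are repeated here so the example runs on its own.

```python
class Animal(object):
    def run(self):
        print('Animal is running...')


class Timer(object):
    # Not an Animal, but it has a run() method
    def run(self):
        print('Start...')


def run_twice(animal):
    animal.run()
    animal.run()


run_twice(Timer())                  # prints 'Start...' twice
print(isinstance(Timer(), Animal))  # False
```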
Python's "file-like object" is an example of duck typing. For a true file object, it has a read() method that returns its content. However, many objects that have a read() method are also considered "file-like objects." Many functions accept parameters that are "file-like objects," and you do not necessarily have to pass in a true file object; you can pass in any object that implements the read() method.
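As a hedged illustration, the function below relies only on read(). It works with io.StringIO from the standard library and with a hand-written class that merely provides read(). The names count_words and FakeFile are hypothetical and exist only for this sketch.

```python
import io


def count_words(f):
    # Only relies on f.read(); the concrete type of f is never checked
    return len(f.read().split())


class FakeFile(object):
    # Hypothetical class: not a file, but it offers read()
    def read(self):
        return 'duck typing in action'


print(count_words(io.StringIO('hello file-like world')))  # 3
print(count_words(FakeFile()))                            # 4
```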
Summary
- Inheritance allows subclasses to directly inherit all functionalities of the parent class, so there is no need to start from scratch; subclasses only need to add their unique methods and can override methods that are not suitable from the parent class.
- The duck typing feature of dynamic languages means that inheritance is not as mandatory as in static languages.
Getting Object Information
When we have a reference to an object, how do we know what type this object is and what methods it has?
Using type()
We can use the type() function to determine the type of an object:
Basic types can all be checked with type():
>>> type(123)
<class 'int'>
>>> type('str')
<class 'str'>
>>> type(None)
<class 'NoneType'>
If a variable points to a function or class, we can also check it with type():
>>> type(abs)
<class 'builtin_function_or_method'>
>>> type(a)
<class '__main__.Animal'>
But what type does the type() function return? It returns the corresponding Class type.
If we want to check in an if statement, we need to compare whether the types of two variables are the same:
>>> type(123)==type(456)
True
>>> type(123)==int
True
>>> type('abc')==type('123')
True
>>> type('abc')==str
True
>>> type('abc')==type(123)
False
For basic data types, we can write int, str, etc. directly, but what if we want to check whether an object is a function? We can use constants defined in the types module:
>>> import types
>>> def fn():
...     pass
...
>>> type(fn)==types.FunctionType
True
>>> type(abs)==types.BuiltinFunctionType
True
>>> type(lambda x: x)==types.LambdaType
True
>>> type((x for x in range(10)))==types.GeneratorType
True
Using isinstance()
For class inheritance relationships, using type() can be inconvenient. To check whether an object belongs to a class anywhere in an inheritance chain, we can use the isinstance() function.
Let’s revisit the previous example. If the inheritance relationship is:
object -> Animal -> Dog -> Husky
Then, isinstance() can tell us whether an object is of a certain type. First, create objects of three types:
>>> a = Animal()
>>> d = Dog()
>>> h = Husky()
Then check:
>>> isinstance(h, Husky)
True
No problem, because the h variable points to a Husky object.
Now check:
>>> isinstance(h, Dog)
True
Although h is of type Husky, since Husky inherits from Dog, h is also of type Dog. In other words, isinstance() checks whether an object is an instance of the given type or of one of its subclasses.
Thus, we can be sure that h is also of type Animal:
>>> isinstance(h, Animal)
True
Similarly, the actual type of d, which is Dog, is also of type Animal:
>>> isinstance(d, Dog) and isinstance(d, Animal)
True
However, d is not of type Husky:
>>> isinstance(d, Husky)
False
Basic types that can be checked with type() can also be checked with isinstance():
>>> isinstance('a', str)
True
>>> isinstance(123, int)
True
>>> isinstance(b'a', bytes)
True
And it can also check whether a variable is one of several types, for example, the following code can check whether it is a list or a tuple:
>>> isinstance([1, 2, 3], (list, tuple))
True
>>> isinstance((1, 2, 3), (list, tuple))
True
Always prefer using isinstance() to check types, as it can capture both the specified type and its subclasses.
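The difference is easy to see in a small comparison. The sketch below assumes the Animal, Dog, Husky hierarchy described above, with empty class bodies for brevity.

```python
class Animal(object):
    pass


class Dog(Animal):
    pass


class Husky(Dog):
    pass


h = Husky()
print(type(h) == Husky)    # True: exact type match
print(type(h) == Dog)      # False: type() ignores inheritance
print(isinstance(h, Dog))  # True: isinstance() follows the inheritance chain
```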
Using dir()
If you want to get all attributes and methods of an object, you can use the dir() function, which returns a list of strings. For example, to get all attributes and methods of a str object:
>>> dir('ABC')
['__add__', '__class__',..., '__subclasshook__', 'capitalize', 'casefold',..., 'zfill']
Attributes and methods that look like __xxx__ have special purposes in Python. For example, the __len__ method returns the length. If you call the len() function to get the length of an object, it actually calls the object's __len__() method internally, so the following two calls are equivalent:
>>> len('ABC')
3
>>> 'ABC'.__len__()
3
If we want our own class to also work with len(myObj), we need to define a __len__() method:
>>> class MyDog(object):
...     def __len__(self):
...         return 100
...
>>> dog = MyDog()
>>> len(dog)
100
The rest are ordinary attributes or methods, such as lower() which returns a lowercase string:
>>> 'ABC'.lower()
'abc'
Simply listing attributes and methods is not enough; combined with getattr(), setattr(), and hasattr(), we can directly manipulate the state of an object:
>>> class MyObject(object):
...     def __init__(self):
...         self.x = 9
...     def power(self):
...         return self.x * self.x
...
>>> obj = MyObject()
Next, we can test the object's attributes:
>>> hasattr(obj, 'x') # Does it have the attribute 'x'?
True
>>> obj.x
9
>>> hasattr(obj, 'y') # Does it have the attribute 'y'?
False
>>> setattr(obj, 'y', 19) # Set an attribute 'y'
>>> hasattr(obj, 'y') # Does it have the attribute 'y'?
True
>>> getattr(obj, 'y') # Get the attribute 'y'
19
>>> obj.y # Get the attribute 'y'
19
If you try to get a non-existent attribute, it will raise an AttributeError:
>>> getattr(obj, 'z') # Get the attribute 'z'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyObject' object has no attribute 'z'
You can pass a default parameter, which will return the default value if the attribute does not exist:
>>> getattr(obj, 'z', 404) # Get the attribute 'z', return default value 404 if it doesn't exist
404
You can also get the object's methods:
>>> hasattr(obj, 'power') # Does it have the attribute 'power'?
True
>>> getattr(obj, 'power') # Get the attribute 'power'
<bound method MyObject.power of <__main__.MyObject object at 0x10077a6a0>>
>>> fn = getattr(obj, 'power') # Get the attribute 'power' and assign it to variable fn
>>> fn # fn points to obj.power
<bound method MyObject.power of <__main__.MyObject object at 0x10077a6a0>>
>>> fn() # Calling fn() is the same as calling obj.power()
81
Summary
- Through a series of built-in functions, we can analyze any Python object and access its internal data. It is important to note that we only use these functions when we do not know the object's information. If we can write directly:
sum = obj.x + obj.y
We should not write:
sum = getattr(obj, 'x') + getattr(obj, 'y')
A correct usage example is as follows:
def readImage(fp):
    if hasattr(fp, 'read'):
        return readData(fp)
    return None
Suppose we want to read an image from the file stream fp, we first check whether the fp object has a read method. If it does, then the object is a stream; if not, it cannot be read. hasattr() comes in handy here.
Please note that in dynamic languages like Python, having a read() method does not mean that the fp object is a file stream; it could also be a network stream or a byte stream in memory. However, as long as the read() method returns valid image data, it does not affect the functionality of reading the image.
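The sketch below exercises the same check with an in-memory byte stream from the standard io module. Since readData is not defined in this chapter, a hypothetical read_data helper that simply returns the raw bytes stands in for it, and the byte string is fake data rather than a real image.

```python
import io


def read_data(fp):
    # Hypothetical helper standing in for readData(): just returns the raw bytes
    return fp.read()


def read_image(fp):
    # Accept anything that offers read(), whatever its actual type
    if hasattr(fp, 'read'):
        return read_data(fp)
    return None


fake_png = io.BytesIO(b'\x89PNG fake image bytes')  # in-memory stream, not a real file
print(read_image(fake_png))
print(read_image('not a stream'))  # None, because str has no read() method
```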
Instance Attributes and Class Attributes
Since Python is a dynamic language, attributes can be bound to instances created from classes.
The way to bind attributes to an instance is through instance variables or through the self variable:
class Student(object):
    def __init__(self, name):
        self.name = name

s = Student('Bob')
s.score = 90
But what if the Student class itself needs to bind an attribute? We can directly define the attribute in the class, which is a class attribute that belongs to the Student class:
class Student(object):
    name = 'Student'
Once we define a class attribute, although this attribute belongs to the class, all instances of the class can access it. Let's test it:
>>> class Student(object):
...     name = 'Student'
...
>>> s = Student() # Create instance s
>>> print(s.name) # Print name attribute, since the instance does not have a name attribute, it will continue to look for the class's name attribute
Student
>>> print(Student.name) # Print class's name attribute
Student
>>> s.name = 'Michael' # Bind name attribute to the instance
>>> print(s.name) # Since instance attributes take precedence over class attributes, it will mask the class's name attribute
Michael
>>> print(Student.name) # However, the class attribute has not disappeared, and we can still access it with Student.name
Student
>>> del s.name # If we delete the instance's name attribute
>>> print(s.name) # When we call s.name again, since the instance's name attribute is not found, the class's name attribute will be displayed
Student
When writing programs, be careful not to use the same name for instance attributes and class attributes, as the same name for instance attributes will mask class attributes, but when you delete the instance attribute and use the same name again, you will access the class attribute.
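A common legitimate use of a class attribute is state shared by all instances. The sketch below keeps a running count of created students; the count attribute is a hypothetical addition for this example.

```python
class Student(object):
    count = 0  # class attribute, shared by all instances

    def __init__(self, name):
        self.name = name    # instance attribute, one per instance
        Student.count += 1  # update the attribute on the class itself


a = Student('Bart')
b = Student('Lisa')
print(Student.count)     # 2
print(a.count, b.count)  # 2 2, instances read the shared class attribute
```

Writing self.count += 1 instead would create an instance attribute that masks the class attribute, which is exactly the pitfall described above.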
Summary
- Instance attributes belong to each instance and are independent of each other;
- Class attributes belong to the class, and all instances share a single attribute;
- Do not use the same names for instance attributes and class attributes, as this can lead to hard-to-detect errors.
Advanced Object-Oriented Programming
Multiple Inheritance, Custom Classes, Metaclasses
Using __slots__
Normally, when we define a class and create an instance of that class, we can bind any attributes and methods to that instance, which is the flexibility of dynamic languages. First, define the class:
class Student(object):
    pass
Then, try to bind an attribute to the instance:
>>> s = Student()
>>> s.name = 'Michael' # Dynamically bind an attribute to the instance
>>> print(s.name)
Michael
You can also try to bind a method to the instance:
>>> def set_age(self, age): # Define a function as an instance method
...     self.age = age
...
>>> from types import MethodType
>>> s.set_age = MethodType(set_age, s) # Bind a method to the instance
>>> s.set_age(25) # Call the instance method
>>> s.age # Test result
25
However, binding a method to one instance does not affect another instance:
>>> s2 = Student() # Create a new instance
>>> s2.set_age(25) # Try to call the method
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Student' object has no attribute 'set_age'
To bind methods to all instances, we can bind methods to the class:
>>> def set_score(self, score):
...     self.score = score
...
>>> Student.set_score = set_score
Once a method is bound to the class, all instances can call it:
>>> s.set_score(100)
>>> s.score
100
>>> s2.set_score(99)
>>> s2.score
99
In most cases, the set_score method can be directly defined within the class, but dynamic binding allows us to dynamically add functionality to the class during runtime, which is difficult to achieve in static languages.
Using __slots__
However, what if we want to restrict the attributes of an instance? For example, only allow adding name and age attributes to Student instances.
To achieve this restriction, Python allows you to define a special __slots__ variable when defining a class, which limits the attributes that can be added to instances of that class:
class Student(object):
    __slots__ = ('name', 'age') # Use a tuple to define the allowed attribute names
Now, let’s try:
>>> s = Student() # Create a new instance
>>> s.name = 'Michael' # Bind the 'name' attribute
>>> s.age = 25 # Bind the 'age' attribute
>>> s.score = 99 # Try to bind the 'score' attribute
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Student' object has no attribute 'score'
Since 'score' was not included in __slots__, it cannot be bound, and attempting to bind it raises an AttributeError.
It is important to note that the attributes defined in __slots__ only apply to instances of the current class and do not apply to inherited subclasses:
>>> class GraduateStudent(Student):
...     pass
...
>>> g = GraduateStudent()
>>> g.score = 9999
Unless the subclass also defines __slots__, in which case the allowed attributes for the subclass instance will be its own __slots__ plus the parent class's __slots__.
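A minimal sketch of that last case, in which the subclass declares its own __slots__. The gender attribute is a hypothetical name used only to trigger the error.

```python
class Student(object):
    __slots__ = ('name', 'age')


class GraduateStudent(Student):
    __slots__ = ('score',)  # allowed attributes become name, age and score


g = GraduateStudent()
g.name = 'Michael'  # allowed by the parent class's __slots__
g.score = 99        # allowed by the subclass's own __slots__

try:
    g.gender = 'M'  # in neither __slots__, so this is rejected
except AttributeError as e:
    print('AttributeError:', e)
```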
Using @property
When exposing attributes directly, although it is simple, it does not allow for parameter checks, leading to arbitrary changes to scores:
s = Student()
s.score = 9999
This is clearly illogical. To restrict the score range, we can use a set_score() method to set the score and a get_score() method to retrieve it, allowing us to check parameters in the set_score() method:
class Student(object):
    def get_score(self):
        return self._score

    def set_score(self, value):
        if not isinstance(value, int):
            raise ValueError('score must be an integer!')
        if value < 0 or value > 100:
            raise ValueError('score must be between 0 ~ 100!')
        self._score = value
Now, when operating on any Student instance, we cannot set the score arbitrarily:
>>> s = Student()
>>> s.set_score(60) # ok!
>>> s.get_score()
60
>>> s.set_score(9999)
Traceback (most recent call last):
...
ValueError: score must be between 0 ~ 100!
However, calling methods this way is more cumbersome than plain attribute access.
Is there a way to validate the value and still access it as simply as an attribute? For Python programmers pursuing perfection, this is a must!
Remember that decorators can dynamically add functionality to functions? The same works for methods defined in a class. The built-in @property decorator turns a method into a property that is accessed like an attribute:
class Student(object):
    @property
    def score(self):
        return self._score
    @score.setter
    def score(self, value):
        if not isinstance(value, int):
            raise ValueError('score must be an integer!')
        if value < 0 or value > 100:
            raise ValueError('score must be between 0 ~ 100!')
        self._score = value
The implementation of @property is relatively complex, but we will first examine how to use it. To turn a getter method into a property, simply add @property. At this point, @property itself creates another decorator @score.setter, which is responsible for turning a setter method into a property assignment, thus giving us controllable property operations:
>>> s = Student()
>>> s.score = 60 # OK, this invokes the method decorated with @score.setter
>>> s.score # OK, this invokes the getter decorated with @property
60
>>> s.score = 9999
Traceback (most recent call last):
...
ValueError: score must be between 0 ~ 100!
Notice this magical @property: when we operate on an instance attribute, it may well not be exposed directly but implemented through getter and setter methods.
We can also define read-only properties by only defining the getter method and not defining the setter method:
class Student(object):
    @property
    def birth(self):
        return self._birth
    @birth.setter
    def birth(self, value):
        self._birth = value
    @property
    def age(self):
        return 2015 - self._birth
In the above code, birth is a read-write property, while age is a read-only property because age can be calculated from birth and the current year (hard-coded as 2015 in this example).
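The sketch below is one way to avoid hard-coding the year and to show what "read-only" means in practice. It assumes that the year from datetime.date.today() is an acceptable stand-in for "the current time". Assigning to age fails because no setter was defined:

```python
from datetime import date

class Student(object):
    @property
    def birth(self):
        return self._birth

    @birth.setter
    def birth(self, value):
        self._birth = value

    @property
    def age(self):
        # Computed on every access, so there is nothing to store or set
        return date.today().year - self._birth

s = Student()
s.birth = 2000
print(s.age)

try:
    s.age = 30  # no @age.setter exists, so this raises AttributeError
except AttributeError as e:
    print('AttributeError:', e)
    age_is_read_only = True
```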
It is particularly important to note: the method name for properties should not conflict with instance variable names. For example, the following code is incorrect:
class Student(object):
    # Method name and instance variable both are birth:
    @property
    def birth(self):
        return self.birth
This is because accessing s.birth is first converted to a method call, and when that method executes return self.birth, it accesses the same property on self, which is converted to a method call again. The result is infinite recursion, which Python eventually stops by raising a RecursionError.
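A common way to avoid the clash, shown here as a sketch rather than the only option, is to store the value under a different name such as _birth (the leading underscore is a naming convention, not an enforced rule):

```python
class Student(object):
    def __init__(self, birth):
        self._birth = birth  # stored under a name that differs from the property

    @property
    def birth(self):
        return self._birth   # reads the plain attribute, so no recursion

s = Student(1990)
print(s.birth)
```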
Summary
@property is widely used in class definitions, allowing callers to write concise code while ensuring necessary parameter checks, thus reducing the likelihood of errors during program execution.
Multiple Inheritance
The class hierarchy is still designed according to mammals and birds:
class Animal(object):
    pass
# Major categories:
class Mammal(Animal):
    pass
class Bird(Animal):
    pass
# Various animals:
class Dog(Mammal):
    pass
class Bat(Mammal):
    pass
class Parrot(Bird):
    pass
class Ostrich(Bird):
    pass
Now, if we want to add Runnable and Flyable functionalities to animals, we just need to define the Runnable and Flyable classes:
class Runnable(object):
    def run(self):
        print('Running...')
class Flyable(object):
    def fly(self):
        print('Flying...')
For animals that need Runnable functionality, we can inherit from Runnable, such as Dog:
class Dog(Mammal, Runnable):
    pass
For animals that need Flyable functionality, we can inherit from Flyable, such as Bat:
class Bat(Mammal, Flyable):
    pass
Through multiple inheritance, a subclass can simultaneously inherit all functionalities from multiple parent classes.
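To see this in action, the following self-contained sketch repeats the minimal definitions above and calls the inherited methods. The __mro__ attribute shows the order in which Python searches the parent classes:

```python
class Animal(object):
    pass

class Mammal(Animal):
    pass

class Runnable(object):
    def run(self):
        print('Running...')

class Flyable(object):
    def fly(self):
        print('Flying...')

class Dog(Mammal, Runnable):
    pass

class Bat(Mammal, Flyable):
    pass

Dog().run()   # inherited from Runnable
Bat().fly()   # inherited from Flyable

# Method resolution order: the classes searched when looking up an attribute
print([cls.__name__ for cls in Dog.__mro__])
```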
MixIn
When designing class inheritance relationships, the main line is usually derived from single inheritance. For example, Ostrich inherits from Bird. However, if we need to "mix in" additional functionalities, we can achieve this through multiple inheritance, such as allowing Ostrich to inherit from Bird while also inheriting Runnable. This design is commonly referred to as MixIn.
To better illustrate the inheritance relationship, we can rename Runnable and Flyable to RunnableMixIn and FlyableMixIn. Similarly, you can define carnivorous CarnivorousMixIn and herbivorous HerbivoresMixIn, allowing an animal to have several MixIns:
class Dog(Mammal, RunnableMixIn, CarnivorousMixIn):
    pass
The purpose of MixIn is to add multiple functionalities to a class, so when designing classes, we prioritize combining multiple MixIn functionalities through multiple inheritance rather than designing complex multi-level inheritance relationships.
Many libraries in Python also use MixIn. For example, Python's socketserver module provides the TCPServer and UDPServer classes for network services. To serve multiple users simultaneously, we must use either a multi-process or a multi-threaded model, which is provided by ForkingMixIn and ThreadingMixIn. Through combination, we can create suitable services.
For instance, to write a multi-process TCP service, we define it as follows (the MixIn is listed first so that its methods override those of the server class):
class MyTCPServer(ForkingMixIn, TCPServer):
    pass
To write a multi-threaded UDP service, we define it as follows:
class MyUDPServer(ThreadingMixIn, UDPServer):
    pass
If you plan to implement a more advanced coroutine model, you can define a CoroutineMixIn:
class MyTCPServer(CoroutineMixIn, TCPServer):
    pass
This way, we do not need a complex and large inheritance chain; we can quickly construct the required subclass by selecting and combining the functionalities of different classes.
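As a sketch using the standard socketserver module, the following checks which process_request() implementation each combined class ends up with, without opening any socket. ThreadingMixIn is used because ForkingMixIn is only available on platforms that support fork(). The socketserver documentation lists the mix-in first, since it overrides a method defined in the server class. The class names here are illustrative:

```python
from socketserver import TCPServer, UDPServer, ThreadingMixIn

# Mix-in listed first: its process_request() takes precedence in the MRO
class MyThreadedTCPServer(ThreadingMixIn, TCPServer):
    pass

class MyThreadedUDPServer(ThreadingMixIn, UDPServer):
    pass

# Shown for comparison: with the server class first, the server's own
# process_request() is found before the mix-in's version
class ServerFirstTCPServer(TCPServer, ThreadingMixIn):
    pass

mixin_wins = MyThreadedTCPServer.process_request is ThreadingMixIn.process_request
server_wins = ServerFirstTCPServer.process_request is not ThreadingMixIn.process_request
print(mixin_wins, server_wins)
```

When a mix-in overrides behavior (rather than only adding new methods), its position in the list of base classes therefore matters.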
Summary
- Since Python allows multiple inheritance, MixIn is a common design.
- Languages that only allow single inheritance of classes (like Java) cannot use this multiple-inheritance style of MixIn design.
Customizing Classes
Python classes also have many special-purpose functions that can help us customize classes.
__str__
First, we define a Student class and print an instance:
>>> class Student(object):
...     def __init__(self, name):
...         self.name = name
...
>>> print(Student('Michael'))
<__main__.Student object at 0x109afb190>
This prints <__main__.Student object at 0x109afb190>, which is not very readable.
How can we make it look better? We just need to define the __str__() method to return a nicely formatted string:
>>> class Student(object):
...     def __init__(self, name):
...         self.name = name
...     def __str__(self):
...         return 'Student object (name: %s)' % self.name
...
>>> print(Student('Michael'))
Student object (name: Michael)
Now, when we print the instance, it not only looks good but also clearly shows important internal data of the instance.
However, careful observers will notice that directly typing the variable without print still results in an unattractive output:
>>> s = Student('Michael')
>>> s
<__main__.Student object at 0x109afb310>
This is because displaying a variable directly calls __repr__() rather than __str__(). The difference is that __str__() returns a string for users, while __repr__() returns a string for developers, meaning __repr__() is for debugging purposes.
The solution is to define a __repr__() method as well. However, since the code for __str__() and __repr__() is often the same, we can use a shortcut:
class Student(object):
    def __init__(self, name):
        self.name = name
    def __str__(self):
        return 'Student object (name=%s)' % self.name
    __repr__ = __str__
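When the two representations should differ, they can be defined separately. In this sketch, the exact format strings are illustrative choices: __str__() is what print() shows, while __repr__() is what the interactive prompt and repr() show:

```python
class Student(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Friendly text for end users
        return 'Student object (name: %s)' % self.name

    def __repr__(self):
        # Unambiguous text for developers, resembling the constructor call
        return 'Student(%r)' % self.name

s = Student('Michael')
print(str(s))
print(repr(s))
```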
__iter__
If a class wants to be used in a for ... in loop, like list or tuple, it must implement an __iter__() method that returns an iterator. Python's for loop then repeatedly calls the iterator's __next__() method to get the next value until a StopIteration exception ends the loop.
Using the Fibonacci sequence as an example, we can write a Fib class that can be used in a for loop:
class Fib(object):
    def __init__(self):
        self.a, self.b = 0, 1 # Initialize two counters a, b
    def __iter__(self):
        return self # The instance itself is the iterator, so return itself
    def __next__(self):
        self.a, self.b = self.b, self.a + self.b # Calculate the next value
        if self.a > 100000: # Condition to exit the loop
            raise StopIteration()
        return self.a # Return the next value
Now, let’s try using the Fib instance in a for loop:
>>> for n in Fib():
...     print(n)
...
1
1
2
3
5
...
46368
75025
__getitem__
Although the Fib instance can be used in a for loop and looks somewhat like a list, it still cannot be used like a list, for example, to access the element at index 5:
>>> Fib()[5]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Fib' object does not support indexing
To behave like a list and allow access by index, we need to implement the __getitem__() method:
class Fib(object):
    def __getitem__(self, n):
        a, b = 1, 1
        for x in range(n):
            a, b = b, a + b
        return a
Now, we can access any item in the sequence by index:
>>> f = Fib()
>>> f[0]
1
>>> f[1]
1
>>> f[2]
2
>>> f[3]
3
>>> f[10]
89
>>> f[100]
573147844013817084101
However, lists have a magical slicing method:
>>> list(range(100))[5:10]
[5, 6, 7, 8, 9]
But for Fib, slicing raises an error. The reason is that __getitem__() can receive either an int or a slice object, so we need to handle both:
class Fib(object):
    def __getitem__(self, n):
        if isinstance(n, int): # n is an index
            a, b = 1, 1
            for x in range(n):
                a, b = b, a + b
            return a
        if isinstance(n, slice): # n is a slice
            start = n.start
            stop = n.stop
            if start is None:
                start = 0
            a, b = 1, 1
            L = []
            for x in range(stop):
                if x >= start:
                    L.append(a)
                a, b = b, a + b
            return L
Now let’s try slicing Fib:
>>> f = Fib()
>>> f[0:5]
[1, 1, 2, 3, 5]
>>> f[:10]
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
However, we have not handled the step parameter:
>>> f[:10:2]
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
The step of 2 was ignored. We also have not handled negative indices, so implementing __getitem__() correctly requires a lot of work.
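One possible way to support the step parameter, sketched here with itertools.islice over a helper generator (the name _numbers is hypothetical), is shown below. It deliberately rejects negative indices and open-ended slices instead of trying to emulate every list behavior:

```python
from itertools import islice

class Fib(object):
    def _numbers(self):
        # Endless generator of Fibonacci numbers: 1, 1, 2, 3, 5, ...
        a, b = 1, 1
        while True:
            yield a
            a, b = b, a + b

    def __getitem__(self, n):
        if isinstance(n, int):
            if n < 0:
                raise IndexError('negative indices are not supported')
            return next(islice(self._numbers(), n, None))
        if isinstance(n, slice):
            if n.stop is None:
                raise ValueError('an open-ended slice would never finish')
            # islice handles start, stop and step (non-negative values only)
            return list(islice(self._numbers(), n.start, n.stop, n.step))
        raise TypeError('index must be an int or a slice')

f = Fib()
print(f[10])
print(f[:10:2])
```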
Additionally, if we treat the object as a dict, the parameter of __getitem__() could also be an object that can serve as a key, such as a string.
Correspondingly, there is the __setitem__() method, which allows us to assign values to the object as if it were a list or dict. Finally, there is also a __delitem__() method for deleting an element.
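As an illustration of these three methods working together, here is a small sketch of a dict-like container. The class name Scores and its internal _data dict are hypothetical:

```python
class Scores(object):
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]      # called for obj[key]

    def __setitem__(self, key, value):
        self._data[key] = value     # called for obj[key] = value

    def __delitem__(self, key):
        del self._data[key]         # called for del obj[key]

scores = Scores()
scores['Michael'] = 98
scores['Bob'] = 81
print(scores['Michael'])
del scores['Bob']
```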
In summary, through the above methods, our custom classes can behave similarly to Python's built-in list, tuple, and dict, thanks to the dynamic language's "duck typing," which does not require enforcing a specific interface.
__getattr__
Normally, when we access an attribute or call a method that does not exist, Python raises an error. For example, defining the Student class:
class Student(object):
    def __init__(self):
        self.name = 'Michael'
Calling the name attribute works fine, but calling a non-existent score attribute results in an error:
>>> s = Student()
>>> print(s.name)
Michael
>>> print(s.score)
Traceback (most recent call last):
...
AttributeError: 'Student' object has no attribute 'score'
The error message clearly tells us that the score attribute was not found.
To avoid this error, in addition to adding a score attribute, Python has another mechanism: writing a __getattr__() method to dynamically return an attribute. Modify it as follows:
class Student(object):
    def __init__(self):
        self.name = 'Michael'
    def __getattr__(self, attr):
        if attr == 'score':
            return 99
When accessing a non-existent attribute, such as score, the Python interpreter calls __getattr__(self, 'score') to try to obtain the attribute, giving us the opportunity to return the value of score:
>>> s = Student()
>>> s.name
'Michael'
>>> s.score
99
Returning a function is also completely acceptable:
class Student(object):
    def __getattr__(self, attr):
        if attr == 'age':
            return lambda: 25
However, the calling method must change:
>>> s.age()
25
Note that __getattr__ is only called when the attribute is not found through normal lookup; existing attributes, such as name, are never looked up in __getattr__.
Additionally, note that with this definition any other access such as s.abc returns None, because __getattr__ falls through without an explicit return value. To ensure that the class only responds to specific attributes, we should raise an AttributeError:
class Student(object):
    def __getattr__(self, attr):
        if attr == 'age':
            return lambda: 25
        raise AttributeError('\'Student\' object has no attribute \'%s\'' % attr)
This lets us handle all attribute and method access on a class dynamically, without any other special mechanism.
What is the practical use of such fully dynamic access? It helps in situations where the set of attribute names cannot be known in advance.
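One illustrative use, sketched below under the assumption of a REST-style API whose URLs are built from path segments, is to turn chained attribute access into a path string. The class name Chain and the example path are hypothetical:

```python
class Chain(object):
    def __init__(self, path=''):
        self._path = path

    def __getattr__(self, name):
        # Only reached for names that are not found through normal lookup
        if name.startswith('_'):
            raise AttributeError(name)
        return Chain('%s/%s' % (self._path, name))

    def __str__(self):
        return self._path

    __repr__ = __str__

print(Chain().status.user.timeline.list)
```

No method has to be written per endpoint: each attribute access simply returns a new Chain with one more path segment appended.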
__call__
An object instance can have its own attributes and methods. When we call an instance method, we use instance.method(). Can we call the instance itself directly? In Python, the answer is yes.
Any class can define a __call__() method, allowing its instances to be called directly. Here’s an example:
class Student(object):
    def __init__(self, name):
        self.name = name
    def __call__(self):
        print('My name is %s.' % self.name)
The calling method is as follows:
>>> s = Student('Michael')
>>> s() # Do not pass the self parameter
My name is Michael.
The __call__() method can also define parameters. Calling an instance directly is akin to calling a function, so you can treat objects as functions and functions as objects, as there is fundamentally no difference between the two.
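For example, an instance can carry configuration and then be called with arguments like a function. The class name Multiplier is a hypothetical illustration:

```python
class Multiplier(object):
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Invoked when the instance is called like a function
        return value * self.factor

double = Multiplier(2)
print(double(21))
print(list(map(double, [1, 2, 3])))
```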
If you treat an object as a function, then the function itself can also be dynamically created at runtime, because instances of classes are created at runtime, thus blurring the lines between objects and functions.
So, how do we determine whether a variable is an object or a function? In fact, more often than not, we need to check whether an object is callable. Callable objects are those that can be called, such as functions and instances of classes with a __call__() method:
>>> callable(Student('Michael'))
True
>>> callable(max)
True
>>> callable([1, 2, 3])
False
>>> callable(None)
False
>>> callable('str')
False
By using the callable() function, we can check whether an object is a "callable" object.
Summary
Python classes allow defining many customization methods, making it very convenient to generate specific classes.
Using Enum Classes
When we need to define constants, one way is to use uppercase variables with integers, such as months:
JAN = 1
FEB = 2
MAR = 3
...
NOV = 11
DEC = 12
The advantage is simplicity, but the downside is that the type is int and each name is still a variable that can be reassigned.
A better way is to define an enumeration type as a class type, where each constant is a unique instance of the class. Python provides the Enum class to implement this functionality:
from enum import Enum
Month = Enum('Month', ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'))
This gives us an enumeration class of type Month, and we can directly use Month.Jan to reference a constant or enumerate all its members:
for name, member in Month.__members__.items():
    print(name, '=>', member, ',', member.value)
The value attribute is automatically assigned to members as an int constant, starting from 1 by default.
If we need more precise control over the enumeration type, we can derive a custom class from Enum:
from enum import Enum, unique
@unique
class Weekday(Enum):
    Sun = 0 # Sun's value is set to 0
    Mon = 1
    Tue = 2
    Wed = 3
    Thu = 4
    Fri = 5
    Sat = 6
The @unique decorator helps us check for duplicate values.
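To illustrate what @unique guards against: without it, a member that repeats an existing value silently becomes an alias of the first one; with it, the class definition fails. The class names Color and StrictColor are hypothetical examples:

```python
from enum import Enum, unique

class Color(Enum):
    RED = 1
    CRIMSON = 1  # same value: becomes an alias of RED

print(Color.CRIMSON is Color.RED)

try:
    @unique
    class StrictColor(Enum):
        RED = 1
        CRIMSON = 1  # duplicate value is rejected by @unique
except ValueError as e:
    print('ValueError:', e)
    duplicate_rejected = True
```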
There are several ways to access these enumeration types:
>>> day1 = Weekday.Mon
>>> print(day1)
Weekday.Mon
>>> print(Weekday.Tue)
Weekday.Tue
>>> print(Weekday['Tue'])
Weekday.Tue
>>> print(Weekday.Tue.value)
2
>>> print(day1 == Weekday.Mon)
True
>>> print(day1 == Weekday.Tue)
False
>>> print(Weekday(1))
Weekday.Mon
>>> print(day1 == Weekday(1))
True
>>> Weekday(7)
Traceback (most recent call last):
...
ValueError: 7 is not a valid Weekday
>>> for name, member in Weekday.__members__.items():
...     print(name, '=>', member)
...
Sun => Weekday.Sun
Mon => Weekday.Mon
Tue => Weekday.Tue
Wed => Weekday.Wed
Thu => Weekday.Thu
Fri => Weekday.Fri
Sat => Weekday.Sat
As we can see, we can reference enumeration constants using member names or directly obtain enumeration constants based on their value.
Using Metaclasses
type()
The biggest difference between dynamic languages and static languages is that the definition of functions and classes is not done at compile time but is dynamically created at runtime.
For example, if we want to define a Hello class, we write a hello.py module:
class Hello(object):
    def hello(self, name='world'):
        print('Hello, %s.' % name)
When the Python interpreter loads the hello module, it executes all the statements in the module sequentially, resulting in the dynamic creation of a Hello class object. Testing it as follows:
>>> from hello import Hello
>>> h = Hello()
>>> h.hello()
Hello, world.
>>> print(type(Hello))
<class 'type'>
>>> print(type(h))
<class 'hello.Hello'>
The type() function can check the type of a type or variable. Hello is a class, and its type is type, while h is an instance, and its type is class Hello.
We say that the definition of a class is dynamically created at runtime, and the method to create a class is using the type() function.
The type() function can return the type of an object and also create new types. For example, we can create the Hello class using the type() function without defining it with class Hello(object)...:
>>> def fn(self, name='world'): # First define a function
...     print('Hello, %s.' % name)
...
>>> Hello = type('Hello', (object,), dict(hello=fn)) # Create Hello class
>>> h = Hello()
>>> h.hello()
Hello, world.
>>> print(type(Hello))
<class 'type'>
>>> print(type(h))
<class '__main__.Hello'>
To create a class object, the type() function takes three parameters in order:
- The name of the class;
- The collection of parent classes to inherit from. Note that Python supports multiple inheritance; if there is only one parent class, do not forget the tuple's single-element syntax;
- The method names and function bindings for the class. Here, we bind the function fn to the method name hello.
Classes created using the type() function are completely the same as those defined directly with class, because when the Python interpreter encounters a class definition, it merely scans the syntax of the class definition and then calls the type() function to create the class.
Under normal circumstances, we use class Xxx... to define classes, but the type() function also allows us to dynamically create classes, meaning that dynamic languages themselves support the dynamic creation of classes at runtime, which is a significant difference from static languages. In static languages, creating classes at runtime requires constructing source code strings and calling a compiler, or using some tools to generate bytecode, which essentially involves dynamic compilation and is very complex.
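As a small sketch of what runtime class creation makes possible, the helper below builds classes from plain data. The function name make_greeter_class and the greeting strings are hypothetical:

```python
def make_greeter_class(class_name, greeting):
    # The function becomes a method once it is placed in the class namespace
    def hello(self, name='world'):
        return '%s, %s.' % (greeting, name)
    # type(name, bases, namespace) returns a brand-new class
    return type(class_name, (object,), dict(hello=hello))

Hello = make_greeter_class('Hello', 'Hello')
Hi = make_greeter_class('Hi', 'Hi')

print(Hello().hello())
print(Hi().hello('Python'))
print(type(Hello))
```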
Metaclass
In addition to using type() to dynamically create classes, we can also use metaclasses to control class creation behavior.
The idea of a metaclass can be explained simply as follows:
Once we define a class, we can create instances based on that class. So: first define a class, then create instances.
But what if we want to create a class? Then we must create the class based on the metaclass, so: first define a metaclass, then create a class.
Putting it together: first define a metaclass, then create a class, and finally create instances.
Thus, metaclasses allow you to create or modify classes. In other words, you can think of classes as "instances" created by metaclasses.
Metaclasses are the most difficult concept to understand and use in Python's object-oriented programming. Under normal circumstances, you will not need metaclasses, so it is fine if the following content is hard to follow: you are unlikely to use it.