
python study 2

Three: Functions

Calling Functions

Python has many built-in useful functions that we can call directly. To call a function, you need to know the function name and parameters, such as the absolute value function abs, which takes only one parameter. You can view the documentation directly from Python's official website:
http://docs.python.org/3/library/functions.html#abs
You can also check the help information for the abs function in the interactive command line using help(abs).
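For reference, this is roughly what help(abs) prints in a Python 3 interactive session (output abbreviated; the exact wording may vary slightly between Python versions):

>>> help(abs)
Help on built-in function abs in module builtins:

abs(x, /)
    Return the absolute value of the argument.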

When calling a function, if the number of parameters passed is incorrect, a TypeError will be raised, and Python will clearly tell you: abs() has exactly one parameter, but two were given:

>>> abs(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: abs() takes exactly one argument (2 given)

If the number of parameters passed is correct, but the parameter type cannot be accepted by the function, a TypeError will also be raised, with an error message indicating that str is an incorrect parameter type:

>>> abs('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad operand type for abs(): 'str'

The max function max() can accept any number of parameters and returns the largest one:

>>> max(1, 2)
2
>>> max(2, 3, 1, -5)
3

Data Type Conversion
Python's built-in common functions also include data type conversion functions, such as the int() function, which can convert other data types to integers:

>>> int('123')
123
>>> int(12.34)
12
>>> float('12.34')
12.34
>>> str(1.23)
'1.23'
>>> str(100)
'100'
>>> bool(1)
True
>>> bool('')
False

The function name is actually a reference to a function object, and you can assign the function name to a variable, effectively giving the function an "alias":

>>> a = abs # Variable a points to the abs function
>>> a(-1) # So you can also call the abs function through a
1

Defining Functions
In Python, a function is defined with the def statement: write def, the function name, parentheses containing the parameters, and a colon :, then write the function body in an indented block and return the result with the return statement.

For example, to define a custom absolute value function my_abs:

def my_abs(x):
    if x >= 0:
        return x
    else:
        return -x

Note that once the return statement is executed in the function body, the function is completed, and the result is returned. Therefore, very complex logic can be implemented inside the function through conditional statements and loops.

If there is no return statement, the function will also return a result after execution, but the result will be None. return None can be simplified to return.
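For instance (a small illustrative example, not from the original text), a function whose body only prints something returns None:

>>> def greet(name):
...     print('Hello,', name)
...
>>> r = greet('Python')
Hello, Python
>>> print(r)
None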

When defining a function in the Python interactive environment, note that Python will show a ... prompt. After the function definition is complete, you need to press Enter twice to return to the >>> prompt:

>>> def my_abs(x):
...     if x >= 0:
...         return x
...     else:
...         return -x
...
>>> my_abs(-9)
9
>>>

If you have already saved the my_abs() function definition in a file named abstest.py, you can start the Python interpreter in the current directory of that file and import the my_abs() function using from abstest import my_abs, noting that abstest is the filename (without the .py extension):

>>> from abstest import my_abs 
>>> my_abs(-9) 
9  

Empty Functions
If you want to define a function that does nothing, you can use the pass statement:

def nop():
    pass

The pass statement does nothing, so what is its use? In fact, pass can be used as a placeholder. For example, if you haven't figured out how to write the function's code yet, you can put a pass there to allow the code to run.

pass can also be used in other statements, for example:

if age >= 18:
    pass

Without pass, the code will have a syntax error.

Parameter Checking
When calling a function, if the number of parameters is incorrect, the Python interpreter will automatically check and raise a TypeError:

>>> my_abs(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: my_abs() takes 1 positional argument but 2 were given

However, if the parameter type is incorrect, the Python interpreter cannot help us check. Let's try the difference between my_abs and the built-in function abs:

>>> my_abs('A')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in my_abs
TypeError: unorderable types: str() >= int()
>>> abs('A')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad operand type for abs(): 'str'

When inappropriate parameters are passed, the built-in function abs will check for parameter errors, while our defined my_abs lacks parameter checking, leading to an error in the if statement, with an error message different from abs. Therefore, this function definition is not complete.

Let's modify the definition of my_abs to check the parameter type, allowing only integer and float types as parameters. Data type checking can be implemented using the built-in function isinstance():

def my_abs(x):
    if not isinstance(x, (int, float)):
        raise TypeError('bad operand type')
    if x >= 0:
        return x
    else:
        return -x

After adding parameter checking, if an incorrect parameter type is passed, the function can raise an error:

>>> my_abs('A')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in my_abs
TypeError: bad operand type

Returning Multiple Values
Functions can return multiple values.

For example, in games it is often necessary to move from one point to another: given the current coordinates, a displacement, and an angle, compute the new coordinates:

import math

def move(x, y, step, angle=0):
    nx = x + step * math.cos(angle)
    ny = y - step * math.sin(angle)
    return nx, ny

The import math statement imports the math module, allowing subsequent code to reference functions such as math.sin and math.cos.
Then, we can obtain return values simultaneously:

>>> x, y = move(100, 100, 60, math.pi / 6)
>>> print(x, y)
151.96152422706632 70.0

But this is actually an illusion; the value returned by the Python function is still a single value:

>>> r = move(100, 100, 60, math.pi / 6)
>>> print(r)
(151.96152422706632, 70.0)

The return value is a tuple! Syntactically, though, the parentheses can be omitted when returning a tuple, and multiple variables can receive a tuple at once, with values assigned by position. So a Python function that returns multiple values actually returns a single tuple; the syntax just makes it more convenient to write.

Function Parameters
For the caller of the function, it is sufficient to know how to pass the correct parameters and what kind of values the function will return; the complex logic inside the function is encapsulated, and the caller does not need to understand it.

Python's function definitions are very simple, but the flexibility is very large. In addition to the normally defined required parameters, you can also use default parameters, variable parameters, and keyword parameters, allowing the function interface to handle complex parameters while simplifying the caller's code.

Positional Parameters
Let's first write a function to calculate x^2:

def power(x):
    return x * x

For the power(x) function, the parameter x is a positional parameter.

When we call the power function, we must pass exactly one parameter x:

>>> power(5)
25

Now, if we want to calculate x^3, what should we do? We could define another power3 function, but what if we want to calculate x^4, x^5, etc.? We cannot define an infinite number of functions.

You might think of modifying power(x) to power(x, n) to calculate x^n:

def power(x, n):
    s = 1
    while n > 0:
        n = n - 1
        s = s * x
    return s

This modified power(x, n) function can calculate x raised to any power n:

>>> power(5, 2)
25

The modified power(x, n) function has two parameters: x and n, both of which are positional parameters. When calling the function, the two values passed are assigned to parameters x and n in order.

Default Parameters
The new power(x, n) definition is fine, but the old calling code breaks: because we added a parameter, the old one-argument call no longer works and fails with a missing-argument error:

>>> power(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: power() missing 1 required positional argument: 'n'

Python's error message is very clear: calling the function power() is missing a positional parameter n.

At this point, default parameters come into play. Since we often calculate x^2, we can set the default value of the second parameter n to 2:

def power(x, n=2):
    s = 1
    while n > 0:
        n = n - 1
        s = s * x
    return s

Thus, when we call power(5), it is equivalent to calling power(5, 2):

>>> power(5)
25
>>> power(5, 2)
25

For cases where n > 2, you must explicitly pass n, such as power(5, 3).
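A quick check of the explicit call (the result follows directly from the definition above):

>>> power(5, 3)
125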

From the above example, we can see that default parameters can simplify function calls. When setting default parameters, there are a few points to note:

  1. Required parameters come first, and default parameters come later; otherwise, Python's interpreter will raise an error (see the example after this list).

  2. How to set default parameters. When a function has multiple parameters, place the parameters that vary more in front and the parameters that vary less behind. The less variable parameters can be set as default parameters.
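
As an illustration of point 1 (a minimal sketch with a made-up function; the exact error wording varies across Python versions), defining a default parameter before a required one is rejected as soon as the function is defined:

def add(x=1, y):     # SyntaxError: a default parameter cannot come before a required one
    return x + y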

What are the benefits of using default parameters? The biggest benefit is that it can reduce the difficulty of calling the function.

For example, let's write a function for registering first-grade students, which requires passing in two parameters: name and gender:

def enroll(name, gender):
    print('name:', name)
    print('gender:', gender)

Thus, calling the enroll() function only requires passing in two parameters:

>>> enroll('Sarah', 'F')
name: Sarah
gender: F

If we need to continue passing in age, city, and other information, how do we do that? This would greatly increase the complexity of calling the function.

We can set age and city as default parameters:

def enroll(name, gender, age=6, city='Beijing'):
    print('name:', name)
    print('gender:', gender)
    print('age:', age)
    print('city:', city)

Thus, most students registering do not need to provide age and city, only the two required parameters:

>>> enroll('Sarah', 'F')
name: Sarah
gender: F
age: 6
city: Beijing

Only students whose information does not match the default parameters need to provide additional information:

enroll('Bob', 'M', 7)
enroll('Adam', 'M', city='Tianjin')

It can be seen that default parameters reduce the difficulty of function calls, and when more complex calls are needed, more parameters can be passed to achieve that. Whether for simple or complex calls, the function only needs to be defined once.

When there are multiple default parameters, you can provide default parameters in order, such as calling enroll('Bob', 'M', 7), meaning that besides the two parameters name and gender, the last parameter applies to age, while the city parameter still uses the default value since it was not provided.

You can also provide some default parameters out of order. When providing some default parameters out of order, you need to specify the parameter names. For example, calling enroll('Adam', 'M', city='Tianjin') means that the city parameter uses the provided value, while other default parameters continue to use their default values.

Default parameters are very useful, but if used improperly, you can fall into pitfalls. The biggest pitfall of default parameters is demonstrated as follows:

First, define a function that takes a list, adds an END, and returns it:

def add_end(L=[]):
    L.append('END')
    return L

When you call it normally, the result seems fine:

>>> add_end([1, 2, 3])
[1, 2, 3, 'END']
>>> add_end(['x', 'y', 'z'])
['x', 'y', 'z', 'END']

When you call it using the default parameter, the first result is also correct:

>>> add_end()
['END']

However, when you call add_end() again, the result becomes incorrect:

>>> add_end()
['END', 'END']
>>> add_end()
['END', 'END', 'END']

Many beginners are confused; the default parameter is [], but the function seems to "remember" the list after adding 'END' each time.

The explanation is as follows:
When the function is defined, the default value of L is evaluated at that moment, producing the empty list []. The parameter L is just a variable pointing to that one [] object, so if a call changes the contents of that object, the default value seen by the next call is no longer the [] written in the definition.

When defining default parameters, remember: Default parameters must point to immutable objects!

To modify the above example, we can use None, an immutable object:

def add_end(L=None):
    if L is None:
        L = []
    L.append('END')
    return L

Now, no matter how many times you call it, there will be no problem:

>>> add_end()
['END']
>>> add_end()
['END']

Why design immutable objects like str and None? Because once an immutable object is created, the internal data of the object cannot be modified, which reduces errors caused by modifying data. Additionally, since the object is immutable, in a multi-tasking environment, reading the object simultaneously does not require locking, and there is no problem reading it.

Variable Parameters
In Python functions, you can also define variable parameters. As the name suggests, variable parameters mean that the number of parameters passed can vary; it can be 1, 2, or any number, even 0.

Let's take a mathematical example: given a set of numbers a, b, c, ..., please calculate a^2 + b^2 + c^2 + ....

To define this function, we must determine the input parameters. Since the number of parameters is uncertain, we first think of passing a, b, c, ... as a list or tuple, so the function can be defined as follows:

def calc(numbers):
    sum = 0
    for n in numbers:
        sum = sum + n * n
    return sum

However, when calling, we need to first assemble a list or tuple:

>>> calc([1, 2, 3])
14
>>> calc((1, 3, 5, 7))
84

If we use variable parameters, the way to call the function can be simplified like this:

>>> calc(1, 2, 3)
14
>>> calc(1, 3, 5, 7)
84

So, we change the function parameters to variable parameters:

def calc(*numbers):
    sum = 0
    for n in numbers:
        sum = sum + n * n
    return sum

Defining variable parameters is simply done by adding an asterisk * before the parameter. Inside the function, the parameter numbers receives a tuple, so the function code remains unchanged. However, when calling the function, you can pass any number of parameters, including 0 parameters:

>>> calc(1, 2)
5
>>> calc()
0

If you already have a list or tuple and want to call a variable parameter, you can do it like this:

>>> nums = [1, 2, 3]
>>> calc(nums[0], nums[1], nums[2])
14

This way is certainly feasible, but it is too cumbersome, so Python allows you to add an asterisk * before the list or tuple to pass the elements of the list or tuple as variable parameters:

>>> nums = [1, 2, 3]
>>> calc(*nums)
14

*nums means passing all elements of the nums list as variable parameters. This way of writing is very useful and common.

Keyword Parameters
Variable parameters allow you to pass 0 or any number of parameters, which are automatically assembled into a tuple during the function call. Keyword parameters allow you to pass 0 or any number of named parameters, which are automatically assembled into a dict inside the function.

def person(name, age, **kw):
    print('name:', name, 'age:', age, 'other:', kw)

The person function accepts keyword parameters kw in addition to the required parameters name and age. When calling this function, you can pass only the required parameters:

>>> person('Michael', 30)
name: Michael age: 30 other: {}

You can also pass any number of keyword parameters:

>>> person('Bob', 35, city='Beijing')
name: Bob age: 35 other: {'city': 'Beijing'}
>>> person('Adam', 45, gender='M', job='Engineer')
name: Adam age: 45 other: {'gender': 'M', 'job': 'Engineer'}

What is the use of keyword parameters? They can extend the functionality of the function. For example, in the person function, we ensure that we can receive the two parameters name and age, but if the caller is willing to provide more parameters, we can also receive them. Imagine you are implementing a user registration function, where the username and age are required fields, while others are optional. Using keyword parameters to define this function can meet the registration requirements.

Similar to variable parameters, you can also assemble a dict and convert that dict into keyword parameters to pass in:

>>> extra = {'city': 'Beijing', 'job': 'Engineer'}
>>> person('Jack', 24, city=extra['city'], job=extra['job'])
name: Jack age: 24 other: {'city': 'Beijing', 'job': 'Engineer'}

Of course, the above complex call can be simplified:

>>> extra = {'city': 'Beijing', 'job': 'Engineer'}
>>> person('Jack', 24, **extra)
name: Jack age: 24 other: {'city': 'Beijing', 'job': 'Engineer'}

Named Keyword Parameters
For keyword parameters, the caller can pass any unrestricted keyword parameters. As for which parameters were passed, it needs to be checked in the function body through kw.

Still using the person() function as an example, we want to check if there are city and job parameters:

def person(name, age, **kw):
    if 'city' in kw:
        # There is a city parameter
        pass
    if 'job' in kw:
        # There is a job parameter
        pass
    print('name:', name, 'age:', age, 'other:', kw)

However, the caller can still pass unrestricted keyword parameters:

>>> person('Jack', 24, city='Beijing', addr='Chaoyang', zipcode=123456)

If you want to restrict the names of keyword parameters, you can use named keyword parameters, for example, only accepting city and job as keyword parameters. The function defined in this way is as follows:

def person(name, age, *, city, job):
    print(name, age, city, job)

Unlike keyword parameters kw, named keyword parameters require a special separator *, and the parameters after * are treated as named keyword parameters.

The calling method is as follows:

>>> person('Jack', 24, city='Beijing', job='Engineer')
Jack 24 Beijing Engineer

If the function definition already has a variable parameter, the named keyword parameters that follow do not require a special separator *:

def person(name, age, *args, city, job):
    print(name, age, args, city, job)
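
With this definition, a call might look like the following (the extra positional values here are made up for illustration):

>>> person('Jack', 24, 'extra1', 'extra2', city='Beijing', job='Engineer')
Jack 24 ('extra1', 'extra2') Beijing Engineer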

Named keyword parameters must be passed with parameter names, which is different from positional parameters. If the parameter names are not passed, the call will raise an error:

>>> person('Jack', 24, 'Beijing', 'Engineer')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: person() missing 2 required keyword-only arguments: 'city' and 'job'

Since the call lacks the parameter names city and job, the Python interpreter treats the first two parameters as positional parameters, while the last two parameters are passed to *args, but the lack of named keyword parameters leads to an error.

Named keyword parameters can have default values, simplifying the call:

def person(name, age, *, city='Beijing', job):
    print(name, age, city, job)

Since the named keyword parameter city has a default value, it can be omitted when calling:

>>> person('Jack', 24, job='Engineer')
Jack 24 Beijing Engineer

When using named keyword parameters, be particularly careful: if there are no variable parameters, you must add a * as a special separator. If you omit the *, the Python interpreter will not be able to distinguish between positional parameters and named keyword parameters.

def person(name, age, city, job):
    # Missing *, city and job are treated as positional parameters
    pass

Parameter Combinations
In Python, you can define functions using required parameters, default parameters, variable parameters, named keyword parameters, and keyword parameters. These five types of parameters can be combined. However, please note that the order of parameter definitions must be: required parameters, default parameters, variable parameters, named keyword parameters, and keyword parameters.

For example, defining a function that includes several of the above parameters:

def f1(a, b, c=0, *args, **kw):
    print('a =', a, 'b =', b, 'c =', c, 'args =', args, 'kw =', kw)

def f2(a, b, c=0, *, d, **kw):
    print('a =', a, 'b =', b, 'c =', c, 'd =', d, 'kw =', kw)

When calling the function, the Python interpreter automatically passes the corresponding parameters according to their position and name.

>>> f1(1, 2)
a = 1 b = 2 c = 0 args = () kw = {}
>>> f1(1, 2, c=3)
a = 1 b = 2 c = 3 args = () kw = {}
>>> f1(1, 2, 3, 'a', 'b')
a = 1 b = 2 c = 3 args = ('a', 'b') kw = {}
>>> f1(1, 2, 3, 'a', 'b', x=99)
a = 1 b = 2 c = 3 args = ('a', 'b') kw = {'x': 99}
>>> f2(1, 2, d=99, ext=None)
a = 1 b = 2 c = 0 d = 99 kw = {'ext': None}

The most amazing thing is that you can call the above functions using a tuple and a dict:

>>> args = (1, 2, 3, 4)
>>> kw = {'d': 99, 'x': '#'}
>>> f1(*args, **kw)
a = 1 b = 2 c = 3 args = (4,) kw = {'d': 99, 'x': '#'}
>>> args = (1, 2, 3)
>>> kw = {'d': 88, 'x': '#'}
>>> f2(*args, **kw)
a = 1 b = 2 c = 3 d = 88 kw = {'x': '#'}

Thus, for any function, you can call it in the form of func(*args, **kw), regardless of how its parameters are defined.

Although you can combine up to five types of parameters, do not use too many combinations at once, as it will make the function interface difficult to understand.

Summary

  1. Python functions have very flexible parameter forms, allowing for both simple calls and the passing of very complex parameters.

  2. Default parameters must use immutable objects; if they are mutable objects, there will be logical errors during program execution!

  3. Pay attention to the syntax for defining variable parameters and keyword parameters:

    • *args is for variable parameters, and args receives a tuple;
    • **kw is for keyword parameters, and kw receives a dict.
  4. Also, note the syntax for passing variable parameters and keyword parameters when calling functions:

    • Variable parameters can be passed directly: func(1, 2, 3), or assembled into a list or tuple and passed using *args: func(*(1, 2, 3));
    • Keyword parameters can be passed directly: func(a=1, b=2), or assembled into a dict and passed using **kw: func(**{'a': 1, 'b': 2}).

Using *args and **kw is a Pythonic convention, but you can use other parameter names; however, it is best to use conventional names.

  5. Named keyword parameters are used to restrict the names of parameters that the caller can pass while also providing default values.

  6. When defining named keyword parameters without variable parameters, do not forget to write the separator *; otherwise, the parameters will be treated as positional parameters.

Recursive Functions
Within a function, you can call other functions. If a function calls itself internally, that function is a recursive function.

For example, to calculate the factorial n! = 1 x 2 x 3 x ... x n, we can represent it with the function fact(n), which can be seen as:

fact(n) = n! = 1 × 2 × 3 × ... × (n−1) × n = (n−1)! × n = fact(n−1) × n

Thus, fact(n) can be expressed as n x fact(n-1), with special handling needed when n=1.

So, the recursive way to write fact(n) is:

def fact(n):
    if n == 1:
        return 1
    return n * fact(n - 1)

The above is a recursive function. You can try it:

>>> fact(1)
1
>>> fact(5)
120

If we calculate fact(5), we can see the calculation process according to the function definition as follows:

===> fact(5)
===> 5 * fact(4)
===> 5 * (4 * fact(3))
===> 5 * (4 * (3 * fact(2)))
===> 5 * (4 * (3 * (2 * fact(1))))
===> 5 * (4 * (3 * (2 * 1)))
===> 5 * (4 * (3 * 2))
===> 5 * (4 * 6)
===> 5 * 24
===> 120

The advantage of recursive functions is that they are simple to define and have clear logic. In theory, all recursive functions can be written in a loop, but the logic of loops is not as clear as that of recursion.

When using recursive functions, be careful to prevent stack overflow. In computers, function calls are implemented using a stack data structure. Each time a function call is made, a stack frame is added, and each time a function returns, a stack frame is removed. Since the stack size is not infinite, excessive recursive calls can lead to stack overflow. You can try fact(1000):

>>> fact(1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in fact
  ...
  File "<stdin>", line 4, in fact
RuntimeError: maximum recursion depth exceeded in comparison

The method to solve stack overflow caused by recursive calls is through tail recursion optimization. In fact, tail recursion and loops are equivalent, so viewing loops as a special case of tail recursion is also acceptable.

Tail recursion means that the function calls itself as its very last action, and the return statement contains nothing but that call, with no additional expression around it. A compiler or interpreter can then optimize the tail call so that the recursion occupies only one stack frame no matter how many times it recurses, which prevents stack overflow.

The above fact(n) function is not tail recursive because return n * fact(n - 1) introduces a multiplication expression. To change it to a tail recursive form, a bit more code is needed, mainly to pass the product at each step into the recursive function:

def fact(n):
    return fact_iter(n, 1)

def fact_iter(num, product):
    if num == 1:
        return product
    return fact_iter(num - 1, num * product)

You can see that return fact_iter(num - 1, num * product) only returns the recursive function itself, and num - 1 and num * product will be calculated before the function call, not affecting the function call.

The call corresponding to fact(5) is fact_iter(5, 1):

===> fact_iter(5, 1)
===> fact_iter(4, 5)
===> fact_iter(3, 20)
===> fact_iter(2, 60)
===> fact_iter(1, 120)
===> 120

When tail recursion is called, if optimized, the stack will not grow, so no matter how many times it is called, it will not lead to stack overflow.

Unfortunately, most programming languages do not optimize for tail recursion, and the Python interpreter does not do so either. Therefore, even if the above fact(n) function is changed to tail recursion, it will still lead to stack overflow.
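As a side note (a hedged sketch, not part of the original text): CPython's recursion limit can be inspected and, with care, raised via the sys module. This does not add tail-call optimization; it only postpones the overflow, and setting the limit too high can crash the interpreter by exhausting the real C stack.

import sys

print(sys.getrecursionlimit())   # typically 1000 by default
sys.setrecursionlimit(5000)      # allow deeper recursion, e.g. for fact(1000)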

Summary

  1. The advantage of using recursive functions is that the logic is simple and clear, while the disadvantage is that deep calls can lead to stack overflow.

  2. Languages that optimize for tail recursion can prevent stack overflow through tail recursion. Tail recursion is actually equivalent to loops; programming languages without loop statements can only implement loops through tail recursion.

  3. The standard Python interpreter does not optimize for tail recursion, so any recursive function has the potential for stack overflow.

Four: Advanced Features

The less code, the higher the development efficiency.
Slicing
Taking a portion of elements from a list or tuple is a very common operation. For example, a list is as follows:

>>> L = ['Michael', 'Sarah', 'Tracy', 'Bob', 'Jack']

How do you take the first 3 elements?

The clumsy way:

>>> [L[0], L[1], L[2]]
['Michael', 'Sarah', 'Tracy']

The reason it is clumsy is that if you extend it to take the first N elements, it becomes difficult.

To take the first N elements, which are the elements with indices 0 to (N-1), you can use a loop:

>>> r = []
>>> n = 3
>>> for i in range(n):
...     r.append(L[i])
... 
>>> r
['Michael', 'Sarah', 'Tracy']

For such frequent operations of taking specified index ranges, using loops is very cumbersome. Therefore, Python provides the slicing (Slice) operator, which can greatly simplify this operation.

For the above problem, to take the first 3 elements, you can complete the slice in one line of code:

>>> L[0:3]
['Michael', 'Sarah', 'Tracy']

L[0:3] means to take from index 0 up to but not including index 3, which means indices 0, 1, and 2, exactly 3 elements.

If the first index is 0, it can also be omitted:

>>> L[:3]
['Michael', 'Sarah', 'Tracy']

You can also start from index 1 and take out 2 elements:

>>> L[1:3]
['Sarah', 'Tracy']

Similarly, since Python supports L[-1] to take the last element, it also supports negative slicing:

>>> L[-2:]
['Bob', 'Jack']
>>> L[-2:-1]
['Bob']

Remember that the index of the last element is -1.

Slicing operations are very useful. Let's first create a list from 0 to 99:

>>> L = list(range(100))
>>> L
[0, 1, 2, 3, ..., 99]

You can easily take out a segment of the sequence through slicing. For example, the first 10 numbers:

>>> L[:10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The last 10 numbers:

>>> L[-10:]
[90, 91, 92, 93, 94, 95, 96, 97, 98, 99]

The numbers from 11 to 20:

>>> L[10:20]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

The first 10 numbers, taking every second one:

>>> L[:10:2]
[0, 2, 4, 6, 8]

All numbers, taking every fifth one:

>>> L[::5]
[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]

You can even write nothing and just write [:] to copy a list as is:

>>> L[:]
[0, 1, 2, 3, ..., 99]

A tuple is also a type of list, with the only difference being that tuples are immutable. Therefore, tuples can also use slicing operations, but the result of the operation will still be a tuple:

>>> (0, 1, 2, 3, 4, 5)[:3]
(0, 1, 2)

The string 'xxx' can also be seen as a type of list, where each element is a character. Therefore, strings can also use slicing operations, but the result will still be a string:

>>> 'ABCDEFG'[:3]
'ABC'
>>> 'ABCDEFG'[::2]
'ACEG'

In many programming languages, there are various substring functions for strings, but the purpose is to slice strings. Python does not have specific substring functions; a simple slice operation can accomplish this, making it very simple.

Iteration
If given a list or tuple, we can traverse this list or tuple using a for loop. This traversal is called iteration.

In Python, iteration is accomplished through for ... in.
In many languages, such as C, iterating through a list is done using indices, like this C code:

for (i=0; i<length; i++) {
    n = list[i];
}

It can be seen that Python's for loop is at a higher level of abstraction than C's for loop because Python's for loop can be applied not only to lists or tuples but also to other iterable objects.

Although the list data type has indices, many other data types do not have indices. However, as long as it is an iterable object, whether it has indices or not, it can be iterated. For example, a dict can be iterated:

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> for key in d:
...     print(key)
...
a
c
b

Since the storage of dict is not arranged in the order of a list, the order of the results obtained through iteration may differ.

By default, dict iterates over keys. If you want to iterate over values, you can use for value in d.values(). If you want to iterate over both keys and values simultaneously, you can use for k, v in d.items().
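For example (a small illustrative snippet; as noted above, the iteration order may vary with the Python version):

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> for value in d.values():
...     print(value)
...
1
2
3
>>> for k, v in d.items():
...     print(k, v)
...
a 1
b 2
c 3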

Since strings are also iterable objects, they can also be used in for loops:

>>> for ch in 'ABC':
...     print(ch)
...
A
B
C

Therefore, when we use a for loop, as long as it operates on an iterable object, the for loop can run normally, and we do not need to worry about whether the object is a list or another data type.

So, how do we determine if an object is an iterable object? The method is to check using the Iterable type from the collections.abc module:

>>> from collections.abc import Iterable
>>> isinstance('abc', Iterable) # Check if str is iterable
True
>>> isinstance([1,2,3], Iterable) # Check if list is iterable
True
>>> isinstance(123, Iterable) # Check if integer is iterable
False

The last question: what if you want an index-based loop over a list, like in Java? Python's built-in enumerate function turns a list into index-element pairs, so the for loop can iterate over both the index and the element itself:

>>> for i, value in enumerate(['A', 'B', 'C']):
...     print(i, value)
...
0 A
1 B
2 C

In the above for loop, two variables are referenced simultaneously, which is very common in Python. For example, the following code:

>>> for x, y in [(1, 1), (2, 4), (3, 9)]:
...     print(x, y)
...
1 1
2 4
3 9

Any iterable object can be used in a for loop, including custom data types, as long as they meet the iteration conditions.
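
As a minimal sketch (not from the original text) of such a custom data type: any class that implements __iter__, here by delegating to a generator, can be used directly in a for loop.

class Countdown:
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yielding values makes this method return an iterator,
        # which is exactly what the for loop asks for.
        n = self.start
        while n > 0:
            yield n
            n = n - 1

for i in Countdown(3):
    print(i)   # prints 3, then 2, then 1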

List Comprehensions
List comprehensions are a very simple yet powerful built-in feature in Python that can be used to create lists.

For example, to generate the list [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], you can use list(range(1, 11)).

But how do you generate [1x1, 2x2, 3x3, ..., 10x10]? One method is to use a loop:

>>> L = []
>>> for x in range(1, 11):
...    L.append(x * x)
...
>>> L
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

However, loops are cumbersome, and list comprehensions can replace the loop with a single line of code to generate the above list:

>>> [x * x for x in range(1, 11)]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

When writing list comprehensions, place the element to be generated x * x in front, followed by the for loop, and you can create the list.

You can also add an if condition after the for loop, allowing you to filter out only the squares of even numbers:

>>> [x * x for x in range(1, 11) if x % 2 == 0]
[4, 16, 36, 64, 100]

You can even use two nested loops to generate permutations:

>>> [m + n for m in 'ABC' for n in 'XYZ']
['AX', 'AY', 'AZ', 'BX', 'BY', 'BZ', 'CX', 'CY', 'CZ']

Three or more nested loops are rarely used.

Using list comprehensions, you can write very concise code. For example, to list all file and directory names in the current directory, you can achieve it with one line of code:

>>> import os # Import the os module, which will be discussed later
>>> [d for d in os.listdir('.')] # os.listdir can list files and directories
['.emacs.d', '.ssh', '.Trash', 'Adlm', 'Applications', 'Desktop', 'Documents', 'Downloads', 'Library', 'Movies', 'Music', 'Pictures', 'Public', 'VirtualBox VMs', 'Workspace', 'XCode']

A for loop can actually use two or even more variables at the same time; for example, a dict's items() lets you iterate over keys and values together:

>>> d = {'x': 'A', 'y': 'B', 'z': 'C' }
>>> for k, v in d.items():
...     print(k, '=', v)
...
y = B
x = A
z = C

Thus, list comprehensions can also use two variables to generate lists:

>>> d = {'x': 'A', 'y': 'B', 'z': 'C' }
>>> [k + '=' + v for k, v in d.items()]
['y=B', 'x=A', 'z=C']

Finally, to convert all strings in a list to lowercase:

>>> L = ['Hello', 'World', 'IBM', 'Apple']
>>> [s.lower() for s in L]
['hello', 'world', 'ibm', 'apple']

if ... else
When using list comprehensions, some learners often get confused about the use of if...else.

For example, the following code outputs even numbers normally:

>>> [x for x in range(1, 11) if x % 2 == 0]
[2, 4, 6, 8, 10]

However, we cannot add else after the last if:

>>> [x for x in range(1, 11) if x % 2 == 0 else 0]
  File "<stdin>", line 1
    [x for x in range(1, 11) if x % 2 == 0 else 0]
                                              ^
SyntaxError: invalid syntax

This is because the if following the for is a filtering condition and cannot have an else; otherwise, how would it filter?

If you want to write if before for, you must add else, or it will raise an error:

>>> [x if x % 2 == 0 for x in range(1, 11)]
  File "<stdin>", line 1
    [x if x % 2 == 0 for x in range(1, 11)]
                       ^
SyntaxError: invalid syntax

This is because the part before for is an expression that must produce a result for every x. The expression x if x % 2 == 0 cannot produce a result for odd x because it lacks an else, which must be added:

>>> [x if x % 2 == 0 else -x for x in range(1, 11)]
[-1, 2, -3, 4, -5, 6, -7, 8, -9, 10]

In the above, the expression x if x % 2 == 0 else -x can determine a definite result based on x.

It can be seen that in a list comprehension, the if ... else before for is an expression, while the if after for is a filtering condition, and cannot have else.

Generators
Through list comprehensions, we can directly create a list. However, due to memory limitations, the capacity of lists is certainly limited. Moreover, creating a list containing one million elements not only occupies a large amount of storage space, but if we only need to access a few elements at the front, the vast majority of elements at the back occupy space unnecessarily.

So, if the elements of a list can be computed by some algorithm, can we compute subsequent elements on the fly during the loop? Then we would not need to create the complete list, saving a great deal of space. In Python, this mechanism of computing while looping is called a generator.

There are many ways to create a generator. The first method is very simple; just change the [] of a list comprehension to (), and you create a generator:

>>> L = [x * x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10))
>>> g
<generator object <genexpr> at 0x1022ef630>

The only difference between creating L and g is the outer brackets [] and (). L is a list, while g is a generator.

We can directly print each element of the list, but how do we print each element of the generator?

To print one by one, you can use the next() function to get the next return value of the generator:

>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
25
>>> next(g)
36
>>> next(g)
49
>>> next(g)
64
>>> next(g)
81
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

The generator saves the algorithm, and each time next(g) is called, it calculates the next element's value of g until it reaches the last element. When there are no more elements to calculate, it raises a StopIteration error.

Of course, this method of continuously calling next(g) is quite cumbersome; the correct method is to use a for loop because generators are also iterable objects:

>>> g = (x * x for x in range(10))
>>> for n in g:
...     print(n)
... 
0
1
4
9
16
25
36
49
64
81

Thus, after creating a generator, we generally never call next(); instead, we iterate it directly using a for loop, and we do not need to worry about the StopIteration error.

Generators are very powerful. If the algorithm for calculating the elements is complex and cannot be implemented with a list comprehension, you can also use functions to implement it.

For example, the famous Fibonacci sequence, where every number after the first two can be obtained by adding the two preceding numbers:
1, 1, 2, 3, 5, 8, 13, 21, 34, ...
The Fibonacci sequence cannot be written with a list comprehension, but it can be easily printed using a function:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        print(b)
        a, b = b, a + b
        n = n + 1
    return 'done'

Note the assignment statement:

a, b = b, a + b

This is equivalent to:

t = (b, a + b) # t is a tuple
a = t[0]
b = t[1]

But it does not need to explicitly write out the temporary variable t.

The above function can output the first N numbers of the Fibonacci sequence:

>>> fib(6)
1
1
2
3
5
8
'done'

Upon careful observation, it can be seen that the fib function actually defines the calculation rules for the Fibonacci sequence, starting from the first element and calculating any subsequent elements. This logic is very similar to a generator.

In other words, the above function and the generator are only one step apart. To turn the fib function into a generator function, simply change print(b) to yield b:

def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'

This is another way to define a generator. If a function definition contains the yield keyword, then that function is no longer a normal function but a generator function. Calling a generator function will return a generator:

>>> f = fib(6)
>>> f
<generator object fib at 0x104feaaa0>

Here, the most difficult thing to understand is that the execution flow of a generator function differs from that of a normal function. A normal function executes sequentially and returns when it reaches a return statement or the last line of the function body. A generator function, by contrast, executes only when next() is called: it runs until it hits a yield, returns that value, and on the next call resumes from right after the yield it last returned from.

For example, define a generator function that returns the numbers 1, 3, and 5 in turn:

def odd():
    print('step 1')
    yield 1
    print('step 2')
    yield(3)
    print('step 3')
    yield(5)

When calling this generator function, you first need to create a generator object, and then use the next() function to continuously obtain the next return value:

>>> o = odd()
>>> next(o)
step 1
1
>>> next(o)
step 2
3
>>> next(o)
step 3
5
>>> next(o)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, odd is not a normal function but a generator function. During execution, it pauses at yield and continues execution the next time. After three yield calls, there are no more yield statements to execute, so the fourth call to next(o) raises an error.

Please note: Calling a generator function creates a generator object. Calling the generator function multiple times will create multiple independent generators.
You may find that calling next() each time returns 1:

>>> next(odd())
step 1
1
>>> next(odd())
step 1
1
>>> next(odd())
step 1
1

The reason is that odd() creates a new generator object each time. The above code actually creates three completely independent generators, and calling next() on each will return the first value.

The correct way is to create a generator object and then continuously call next() on that one generator object:

>>> g = odd()
>>> next(g)
step 1
1
>>> next(g)
step 2
3
>>> next(g)
step 3
5

Returning to the fib example, we continuously call yield during the loop, which will keep interrupting. Of course, we need to set a condition to exit the loop; otherwise, it will produce an infinite sequence.
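For instance (a hypothetical variant not in the original text), a Fibonacci generator written without a max argument yields values forever, so the caller has to decide when to stop, for example with break:

def fib_forever():
    a, b = 0, 1
    while True:        # no exit condition: an infinite sequence
        yield b
        a, b = b, a + b

for n in fib_forever():
    if n > 100:
        break          # the caller ends the otherwise infinite loop
    print(n)           # 1 1 2 3 5 8 13 21 34 55 89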

Similarly, after changing the function to a generator function, we generally never use next() to get the next return value; instead, we directly use a for loop to iterate:

>>> for n in fib(6):
...     print(n)
...
1
1
2
3
5
8

However, when calling a generator with a for loop, we find that we cannot obtain the return value of the generator. If you want to get the return value, you must catch the StopIteration error, as the return value is contained in the value of StopIteration:

>>> g = fib(6)
>>> while True:
...     try:
...         x = next(g)
...         print('g:', x)
...     except StopIteration as e:
...         print('Generator return value:', e.value)
...         break
...
g: 1
g: 1
g: 2
g: 3
g: 5
g: 8
Generator return value: done

Summary

  1. Generators are very powerful tools. In Python, you can simply convert a list comprehension into a generator, and you can also implement generators with complex logic as functions.

  2. To understand how a generator works: during a for loop it keeps computing the next element and ends the loop at the appropriate condition. For a generator function, hitting a return statement or running off the last line of the function body is the signal that the generator is finished, and the for loop ends accordingly.

  3. Be sure to distinguish between normal functions and generator functions. A normal function call returns a result directly:

>>> r = abs(6)
>>> r
6

Calling a generator function actually returns a generator object:

>>> g = fib(6)
>>> g
<generator object fib at 0x1022ef948>

Iterators
It is already known that the types of objects that can be directly used in for loops include the following:

One category is collection data types, such as lists, tuples, dicts, sets, strings, etc.;

Another category is generators, including generator expressions and generator functions with yield.

These objects that can be directly used in for loops are collectively referred to as iterable objects: Iterable.

You can use isinstance() to check whether an object is an Iterable object:

>>> from collections.abc import Iterable
>>> isinstance([], Iterable)
True
>>> isinstance({}, Iterable)
True
>>> isinstance('abc', Iterable)
True
>>> isinstance((x for x in range(10)), Iterable)
True
>>> isinstance(100, Iterable)
False

Generators can not only be used in for loops but can also be continuously called by the next() function to return the next value until the StopIteration error is raised, indicating that no further values can be returned.

Objects that can be called by the next() function and continuously return the next value are called iterators: Iterator.

You can use isinstance() to check whether an object is an Iterator object:

>>> from collections.abc import Iterator
>>> isinstance((x for x in range(10)), Iterator)
True
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False

Generators are Iterator objects, but lists, dicts, and strings are Iterable but not Iterator.

You can convert Iterable types like lists, dicts, and strings into Iterators using the iter() function:

>>> isinstance(iter([]), Iterator)
True
>>> isinstance(iter('abc'), Iterator)
True

You may wonder why lists, dicts, and strings are not Iterators.

This is because Python's Iterator objects represent a data stream. Iterator objects can be called by the next() function and continuously return the next data until they throw a StopIteration error. You can think of this data stream as an ordered sequence, but we cannot know the length of the sequence in advance; we can only calculate the next data on demand through the next() function. Therefore, the calculation of Iterators is lazy, and it only calculates when the next data needs to be returned.

Iterators can even represent an infinitely large data stream, such as all natural numbers. However, it is impossible to store all natural numbers using a list.
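A minimal sketch (assuming itertools is acceptable here): itertools.count(0) is an Iterator over all natural numbers, producing values lazily on demand rather than storing them in a list.

>>> from itertools import count
>>> naturals = count(0)      # 0, 1, 2, 3, ... without ever building a list
>>> next(naturals)
0
>>> next(naturals)
1
>>> next(naturals)
2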

Summary

  1. Any object that can be used in a for loop is of type Iterable;

  2. Any object that can be used in the next() function is of type Iterator, representing a lazily computed sequence;

  3. Collection data types such as lists, dicts, and strings are Iterable but not Iterator, but you can obtain an Iterator object through the iter() function.

The essence of Python's for loop is implemented by continuously calling the next() function, for example:

for x in [1, 2, 3, 4, 5]:
    pass

This is actually completely equivalent to:

# First obtain the Iterator object:
it = iter([1, 2, 3, 4, 5])
# Loop:
while True:
    try:
        # Get the next value:
        x = next(it)
    except StopIteration:
        # Exit the loop when encountering StopIteration
        break