This post goes over some interesting modernization issues that came up during the Python 2.* to 3 migration.
MRO Algorithm Changed From DLR to C3 Linearization
Method Resolution Order (MRO) is the logical path for a child class to follow to resolve an invoked method or an attribute. Having a deterministic order is essential to produce predictable and reproducible class inheritance behaviors.
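In Python 2, old-style (classic) classes are resolved with the “Depth-first and Left-to-Right” (DLR) algorithm: the lookup climbs from the class to its top-most ancestor along the first base, and only then moves left-to-right to the next base at each level on the way back down. (New-style classes that inherit from object already used C3 in Python 2.3 and later.) In Python 3, every class is resolved with the C3 linearization algorithm, which guarantees that a class always comes before its parents and that the order of the bases in each class definition is preserved. Instead of striving to reach the top-most (root) class first, a shared ancestor is visited only after all of the classes that inherit from it.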
Linear Inheritance
Here’s the commonly used linear inheritance pattern, which results in an identical resolution order in both versions.
Invoking B.method() executes A.method(), since method() is not defined in class B. Invoking B.no_method() raises AttributeError, as expected, since no_method() is defined in neither B nor A.
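A minimal sketch of the linear case in Python 3. The class names A and B come from the text above; the body of method() is just a placeholder I made up for illustration.

```python
class A:
    def method(self):
        # Placeholder body, only here so the lookup has something to find.
        return "A.method"


class B(A):
    # B defines nothing itself, so every lookup falls through to A.
    pass


b = B()
result = b.method()  # resolved on A because B does not define method()

try:
    b.no_method()
except AttributeError:
    missing = True   # neither B nor A defines no_method()
else:
    missing = False

# In Python 3 every class ultimately inherits from object.
mro_names = [cls.__name__ for cls in B.__mro__]
print(result, missing, mro_names)
```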
Diamond Inheritance
MRO from B(A) or C(A)
Python 2: B -> A
Python 3: C -> A
MRO from D(B, C)
Python 2: D -> B -> A -> C
Python 3: D -> B -> C -> A
MRO from D(C, B)
Python 2: D -> C -> A -> B
Python 3: D -> C -> B -> A
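To make the difference concrete, here is a small sketch that runs under Python 3 and compares the two orders for the D(B, C) diamond. The helper dlr_order is a hypothetical function of my own that imitates the classic depth-first, left-to-right walk; it is not how the interpreter is implemented. The C3 order is read straight from the built-in __mro__ attribute, with object filtered out for easier comparison.

```python
class A:
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


def dlr_order(cls):
    """Imitate the classic depth-first, left-to-right lookup order."""
    order = []

    def visit(current):
        # Classic classes had no implicit object base, and a class that was
        # already visited keeps its first position.
        if current is object or current in order:
            return
        order.append(current)
        for base in current.__bases__:
            visit(base)

    visit(cls)
    return [c.__name__ for c in order]


def c3_order(cls):
    """Read the C3 order computed by Python 3, leaving out object."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


print(" -> ".join(dlr_order(D)))
print(" -> ".join(c3_order(D)))
```

With DLR, A is reached through B before C is ever considered; with C3, A waits until both of its children have been visited.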
“Eight” Inheritance
MRO from F(D)
Python 2: F -> D -> B -> A -> C
Python 3: F -> D -> B -> C -> A
“Mix-in” Inheritance
MRO from G(D, E)
Python 2: G -> D -> A -> E -> B
Python 3: G -> D -> A -> E -> B
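A short Python 3 sketch of a hierarchy that produces this order. The shape used here, D(A) and E(B) with no shared ancestor, is my assumption inferred from the MRO listed above; with two independent branches there is nothing for C3 to reorder, which would explain why both versions agree.

```python
class A:
    pass


class B:
    pass


class D(A):   # assumed: D inherits only from A
    pass


class E(B):   # assumed: E inherits only from B
    pass


class G(D, E):
    pass


# Drop object so the result lines up with the order listed above.
mixin_order = [cls.__name__ for cls in G.__mro__ if cls is not object]
print(" -> ".join(mixin_order))
```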
Moving from the Python 2 interpreter to Python 3 therefore changes how inheritance is resolved. If your code is structured around multiple inheritance and deep hierarchies, it is a good time to double-check that the new resolution order does not break any existing expectations.
Bytes vs. Strings
One of the biggest pain points is dealing with strings and bytes, especially when parsing network packets and reading files. In Python 2, operators could be applied to str and bytes interchangeably, because bytes was simply an alias for str.
An equality comparison between a str and a bytes value holding the same characters was True in Python 2; it now yields False in Python 3. Dealing with binary data now requires .encode() and .decode() to convert from one type to another.
And a couple more relationships.
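A few of these relationships, sketched for Python 3. The sample values are arbitrary and chosen by me for illustration.

```python
text = "café"
data = text.encode("utf-8")        # str -> bytes
round_trip = data.decode("utf-8")  # bytes -> str

# A str never compares equal to bytes in Python 3, even for plain ASCII.
same = ("abc" == b"abc")

# Lengths differ once non-ASCII characters are involved:
# 4 characters, but 5 bytes in UTF-8.
lengths = (len(text), len(data))

# Indexing bytes yields an int, not a one-character string.
first_byte = b"abc"[0]

# Mixing the two types in an operator raises TypeError instead of
# silently concatenating.
try:
    "abc" + b"abc"
except TypeError:
    concat_error = True
else:
    concat_error = False

print(same, lengths, first_byte, concat_error)
```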
Memory Allocation
I wrote a small routine below that calculates memory offsets and direction.
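A minimal sketch of such a routine, assuming CPython, where id() returns an object's memory address. The function name describe_offsets and the row layout are my assumptions, modeled on the output tables in this section; the exact addresses and offsets depend on the interpreter build.

```python
def describe_offsets(values):
    """Return (address, value, offset, direction) rows for consecutive objects."""
    rows = []
    for current, following in zip(values, values[1:]):
        # CPython detail: id() is the address of the object in memory.
        offset = id(current) - id(following)
        # A positive offset means the next object sits at a lower address.
        direction = "high -> low" if offset > 0 else "low -> high"
        rows.append((hex(id(current)), current, offset, direction))
    return rows


rows = describe_offsets(list(range(-2, 258)))
for address, value, offset, direction in rows[:5]:
    print("{} | {} | {} B | {}".format(address, value, offset, direction))
```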
Python 2
a = -2 (0x55eaaa39bd20)
b = -1 (0x55eaaa39bd08)
a,b offset = 0x18
mem_addr | int | offset | direction |
---|---|---|---|
0x55eaaa39bd08 | -1 | 24 B | high -> low |
0x55eaaa39bcf0 | 0 | 24 B | high -> low |
0x55eaaa39bcd8 | 1 | 24 B | high -> low |
0x55eaaa39bcc0 | 2 | 24 B | high -> low |
0x55eaaa39bca8 | 3 | 24 B | high -> low |
0x55eaaa39bc90 | 4 | 24 B | high -> low |
0x55eaaa39bc78 | 5 | 24 B | high -> low |
0x55eaaa39bc60 | 6 | 24 B | high -> low |
0x55eaaa39bc48 | 7 | 24 B | high -> low |
0x55eaaa39bc30 | 8 | 24 B | high -> low |
0x55eaaa39bc18 | 9 | 24 B | high -> low |
0x55eaaa39b9c0 | 34 | 24 B | high -> low |
0x55eaaa39b9a8 | 35 | -1968 B | low -> high |
0x55eaaa39cd58 | 240 | -1968 B | low -> high |
0x55eaaa39d508 | 241 | 24 B | high -> low |
0x55eaaa39d3b8 | 255 | 24 B | high -> low |
0x55eaaa39d3a0 | 256 | -3585288 B | low -> high |
0x55eaaa7086e0 | 257 | -240 B | low -> high |
Python 3
a = -2 (0x953de0)
b = -1 (0x953e00)
a,b offset = -0x20
mem_addr | int | offset | direction |
---|---|---|---|
0x953e00 | -1 | -32 B | low -> high |
0x953e20 | 0 | -32 B | low -> high |
0x953e40 | 1 | -32 B | low -> high |
0x953e60 | 2 | -32 B | low -> high |
0x953e80 | 3 | -32 B | low -> high |
0x953ea0 | 4 | -32 B | low -> high |
0x953ec0 | 5 | -32 B | low -> high |
0x953ee0 | 6 | -32 B | low -> high |
0x953f00 | 7 | -32 B | low -> high |
0x953f20 | 8 | -32 B | low -> high |
0x953f40 | 9 | -32 B | low -> high |
0x955e00 | 255 | -32 B | low -> high |
0x955e20 | 256 | -139864651158032 B | low -> high |
0x7f34c76eb0d0 | 257 | -64 B | low -> high |
print Statement vs. Function
In Python 2, print can be written as a statement; in Python 3 it is available only as a function.
Finding the statement form in existing code is easy with a simple grep.
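The same kind of search can be sketched in Python. The pattern below is a heuristic of my own rather than a specific grep expression: it flags lines where print is followed by whitespace and something other than an opening parenthesis. It will miss a bare print and can be fooled by strings or comments, so treat the hits as candidates to review.

```python
import re

# Heuristic: "print" at the start of a statement, then whitespace, then any
# character that is not whitespace or an opening parenthesis.
PRINT_STATEMENT = re.compile(r"^\s*print\s+[^\s(]")


def find_print_statements(source):
    """Return 1-based line numbers that look like Python 2 print statements."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if PRINT_STATEMENT.match(line)
    ]


sample = 'print "hello"\nprint("hello")\nvalue = 1\n    print value\nprinter = 2\n'
print(find_print_statements(sample))
```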
Different Division Behaviors
In Python 2, the division operator “/” returns a floored value when both operands are integers.
To explicitly get a floating-point result, either the numerator or the denominator must be a float.
In Python 3, the default behavior of the division operator is to return a float.
But you can still floor the value by using the double-slash operator “//”.
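A quick sketch of the Python 3 behavior, with operands picked arbitrarily. Note that “//” floors toward negative infinity rather than truncating toward zero, which matters for negative numbers. In Python 2, the same semantics can be opted into per module with from __future__ import division.

```python
true_quotient = 7 / 2       # "/" always performs true division in Python 3
floor_quotient = 7 // 2     # "//" floors the result and keeps it an int
negative_floor = -7 // 2    # floors toward negative infinity, not toward zero
float_floor = 7.0 // 2      # floored, but the result stays a float

print(true_quotient, floor_quotient, negative_floor, float_floor)
```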
map, reduce, filter, range
map(), reduce(), and filter() were my go-tos in Python 2, as they abstracted for loops away into a simple func(func, iterable) call. In Python 2, all three were evaluated immediately and returned the result.
>>> map(lambda e: e+1, range(10))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
In Python 3, however, map() and filter() simply return a lazy iterator object.
>>> map(lambda x: x+1, range(10))
<map object at 0x7fa1fc2032b0>
And reduce was removed from the built-ins; it now lives in the functools module.
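A small Python 3 sketch of how I would get the old behavior back: wrap the lazy iterators in list() when an actual list is needed, and import reduce from functools. The lambdas and the input range are arbitrary examples.

```python
from functools import reduce  # no longer a built-in in Python 3

numbers = range(10)

# map() and filter() are lazy; list() forces evaluation.
incremented = list(map(lambda x: x + 1, numbers))
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce() still evaluates eagerly and returns a single value.
total = reduce(lambda acc, x: acc + x, numbers, 0)

# A lazy iterator can also be consumed one item at a time.
lazy = map(lambda x: x * x, numbers)
first_square = next(lazy)

print(incremented, evens, total, first_square)
```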
Long and int
In Python 2, there were two integer types: long and int. A long could grow as large as system memory allowed, while an int was constrained by the size of a C integer (32 or 64 bits), among other differences.
In Python 3, the two were merged into a single int type.
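A brief Python 3 sketch showing that values well past the machine word size are still plain int objects. The specific values are arbitrary; sys.maxsize is used only as a convenient word-sized boundary.

```python
import sys

big = 2 ** 100                  # far beyond 64 bits
beyond_word = sys.maxsize + 1   # one past the largest word-sized index

# Both are ordinary int objects; there is no separate long type any more.
print(type(big).__name__, type(beyond_word).__name__)
print(big)
```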