This blog post goes over some interesting (and some common) modernization issues that came up during the Python 2 to 3 migration at work. The migration was prompted by the end-of-life (EOL) date for Python 2.7, which was set to January 1, 2020.

MRO Algorithm Changed From DLR to C3 Linearization

Method Resolution Order (MRO) is the order in which a child (derived) class searches itself and its parent (super) classes to resolve an invoked method or attribute. A predictable order is essential for reproducible class inheritance behavior.

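In Python 2, the “Depth-first and Left-to-Right” (DLR) algorithm is used to resolve multi-level inheritance for old-style classes, that is, classes that do not inherit from object. (New-style classes, which do inherit from object, have used C3 since Python 2.3.) In DLR, the lookup climbs from a class to its top-most (root) ancestor along the first parent before moving left-to-right to the next parent at each level on the way back down.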

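In Python 3, on the other hand, every class is new-style, so the C3 linearization algorithm is always used. C3 guarantees that a class appears before its parents and that parents keep the order in which they are listed, so a shared ancestor is visited only after all of its children. The result can look BFS-like on a simple diamond, but it is not a breadth-first search.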
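To make the C3 rule more concrete, here is a minimal sketch of the merge step written in plain Python. The helper names c3_merge and c3_linearize are hypothetical, the diamond classes A to D are only an assumed example hierarchy, and the real interpreter does this work in C; the sketch merely reproduces the same order from __bases__.

```python
def c3_merge(sequences):
    """Merge parent linearizations the way C3 does (illustrative sketch)."""
    sequences = [list(s) for s in sequences if s]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A head is acceptable only if it is not in the tail of any sequence.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy, no valid C3 order")
        result.append(head)
        # The chosen class only ever appears as a head here, so drop it everywhere.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result


def c3_linearize(cls):
    """Return the C3 order for cls, computed from __bases__ only."""
    bases = list(cls.__bases__)
    parent_orders = [c3_linearize(base) for base in bases]
    return [cls] + c3_merge(parent_orders + [bases])


# Assumed example hierarchy: a simple diamond.
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([k.__name__ for k in c3_linearize(D)])  # ['D', 'B', 'C', 'A', 'object']
```

The result can be compared against D.__mro__, which is what the interpreter itself computed.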

For example, here’s a commonly-used linear inheritance pattern.

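class A(object):  # a bare 'class A:' would be an old-style class in Python 2

    @staticmethod
    def method():  # no '-> None' annotation, so this also parses in Python 2
        print(A)

class B(A):
    pass  # '...' as a class body is a syntax error in Python 2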

In this example, both Python 2 and 3 produce identical MROs.

Invoking B.method() executes the inherited A.method() implementation since method() is not defined in B. But if we try to invoke an undefined attribute, for instance B.no_method(), it raises AttributeError because it is defined in none of B, A, or object.
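The lookup described above can be checked directly. This is a small sketch in Python 3 that mirrors the linear example; method() returns a string instead of printing so the result is easy to inspect.

```python
class A(object):
    @staticmethod
    def method():
        return "A.method"


class B(A):
    pass


# B does not define method(), so the lookup walks B -> A -> object.
print(B.__mro__)
result = B.method()

try:
    B.no_method()
except AttributeError as exc:
    # None of B, A, or object defines no_method.
    error = str(exc)

print(result)
print(error)
```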

Here’s another (diamond) pattern that’s used somewhat commonly as well.

The MROs of classes B and C are nearly identical, except that C starts its resolution from C instead of B.

But from class D, the MRO diverges between the Python 2 and Python 3 implementations. In Python 2, the MRO would look like (for explanation, please refer to the description above):

And in Python 3 (for explanation, please refer to the description above):

As we can see, in Python 2 class D traverses depth-first to the top-most class and then, at each level of descent, traverses in left-to-right direction. Incidentally, if we reverse the order in which D lists B and C as parents, the traversal becomes counter-clockwise but still follows the same concept.

Python 2:

Python 3:

This is what happens if we resolve an “eight”-shaped inheritance pattern from F.

Python 2:

Python 3:

This is what happens if we resolve a “mixin”-style inheritance pattern from G.

Python 2 and Python 3:

This is what happens if we resolve a multi-inheritance pattern from K.

Python 2:

Python 3:

In summary, moving from the Python 2 to the Python 3 interpreter can also change how class inheritance is resolved. This means that if your OOP design is structured around multiple inheritance and deep hierarchies, it is a good time to double check that the new resolution order does not break any existing expectations.
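One way to do that double check is to compare a depth-first, left-to-right walk with the MRO the interpreter actually uses. The sketch below is only an approximation of the old-style lookup: legacy_dlr_order and mro_changed are hypothetical helper names, object is ignored because old-style classes did not inherit from it, and the diamond classes are an assumed example.

```python
def legacy_dlr_order(cls, seen=None):
    """Approximate the old-style depth-first, left-to-right lookup order."""
    seen = [] if seen is None else seen
    if cls is object or cls in seen:
        return seen
    seen.append(cls)
    for base in cls.__bases__:
        legacy_dlr_order(base, seen)
    return seen


def mro_changed(cls):
    """True if the C3 order differs from the depth-first order."""
    modern = [k for k in cls.__mro__ if k is not object]
    return legacy_dlr_order(cls) != modern


# Assumed example hierarchy: a simple diamond.
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([k.__name__ for k in legacy_dlr_order(D)])  # ['D', 'B', 'A', 'C']
print([k.__name__ for k in D.__mro__])            # ['D', 'B', 'C', 'A', 'object']
print(mro_changed(D), mro_changed(B))             # True False
```

Classes for which mro_changed() returns True are the ones worth reviewing by hand.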

Memory Allocation

I thought this was interesting. I wrote a small routine that prints the memory address of consecutive integers along with the offset and direction between neighboring addresses. The results differ between Python 2 and Python 3.

Python 2:

a = -2 (0x55eaaa39bd20)
b = -1 (0x55eaaa39bd08)
a,b offset = 0x18
+---------------------------------------+-----+------------+-------------+
|                mem_addr               | int |   offset   |  direction  |
+---------------------------------------+-----+------------+-------------+
| 0x55eaaa39bd08 (-0.0000000000000568%) |  -1 |    24 B    | high -> low |
|  0x55eaaa39bcf0 (0.0000000000000000%) |  0  |    24 B    | high -> low |
|  0x55eaaa39bcd8 (0.0000000000000568%) |  1  |    24 B    | high -> low |
|  0x55eaaa39bcc0 (0.0000000000001137%) |  2  |    24 B    | high -> low |
|  0x55eaaa39bca8 (0.0000000000001705%) |  3  |    24 B    | high -> low |
|  0x55eaaa39bc90 (0.0000000000002274%) |  4  |    24 B    | high -> low |
|  0x55eaaa39bc78 (0.0000000000002842%) |  5  |    24 B    | high -> low |
|  0x55eaaa39bc60 (0.0000000000003411%) |  6  |    24 B    | high -> low |
|  0x55eaaa39bc48 (0.0000000000003979%) |  7  |    24 B    | high -> low |
|  0x55eaaa39bc30 (0.0000000000004547%) |  8  |    24 B    | high -> low |
|  0x55eaaa39bc18 (0.0000000000005116%) |  9  |    24 B    | high -> low |
|  0x55eaaa39bc00 (0.0000000000005684%) |  10 |    24 B    | high -> low |
|  0x55eaaa39bbe8 (0.0000000000006253%) |  11 |    24 B    | high -> low |
|  0x55eaaa39bbd0 (0.0000000000006821%) |  12 |    24 B    | high -> low |
|  0x55eaaa39bbb8 (0.0000000000007390%) |  13 |    24 B    | high -> low |
|  0x55eaaa39bba0 (0.0000000000007958%) |  14 |    24 B    | high -> low |
|  0x55eaaa39bb88 (0.0000000000008527%) |  15 |    24 B    | high -> low |
|  0x55eaaa39bb70 (0.0000000000009095%) |  16 |    24 B    | high -> low |
|  0x55eaaa39bb58 (0.0000000000009663%) |  17 |    24 B    | high -> low |
|  0x55eaaa39bb40 (0.0000000000010232%) |  18 |    24 B    | high -> low |
|  0x55eaaa39bb28 (0.0000000000010800%) |  19 |    24 B    | high -> low |
|  0x55eaaa39bb10 (0.0000000000011369%) |  20 |    24 B    | high -> low |
|  0x55eaaa39baf8 (0.0000000000011937%) |  21 |    24 B    | high -> low |
|  0x55eaaa39bae0 (0.0000000000012506%) |  22 |    24 B    | high -> low |
|  0x55eaaa39bac8 (0.0000000000013074%) |  23 |    24 B    | high -> low |
|  0x55eaaa39bab0 (0.0000000000013642%) |  24 |    24 B    | high -> low |
|  0x55eaaa39ba98 (0.0000000000014211%) |  25 |    24 B    | high -> low |
|  0x55eaaa39ba80 (0.0000000000014779%) |  26 |    24 B    | high -> low |
|  0x55eaaa39ba68 (0.0000000000015348%) |  27 |    24 B    | high -> low |
|  0x55eaaa39ba50 (0.0000000000015916%) |  28 |    24 B    | high -> low |
|  0x55eaaa39ba38 (0.0000000000016485%) |  29 |    24 B    | high -> low |
|  0x55eaaa39ba20 (0.0000000000017053%) |  30 |    24 B    | high -> low |
|  0x55eaaa39ba08 (0.0000000000017621%) |  31 |    24 B    | high -> low |
|  0x55eaaa39b9f0 (0.0000000000018190%) |  32 |    24 B    | high -> low |
|  0x55eaaa39b9d8 (0.0000000000018758%) |  33 |    24 B    | high -> low |
|  0x55eaaa39b9c0 (0.0000000000019327%) |  34 |    24 B    | high -> low |
|  0x55eaaa39b9a8 (0.0000000000019895%) |  35 |  -1968 B   | low -> high |
...
|  0x55eaaa39cd58 (0.0000000000136424%) | 240 |  -1968 B   | low -> high |
|  0x55eaaa39d508 (0.0000000000136993%) | 241 |    24 B    | high -> low |
|  0x55eaaa39d4f0 (0.0000000000137561%) | 242 |    24 B    | high -> low |
|  0x55eaaa39d4d8 (0.0000000000138130%) | 243 |    24 B    | high -> low |
|  0x55eaaa39d4c0 (0.0000000000138698%) | 244 |    24 B    | high -> low |
|  0x55eaaa39d4a8 (0.0000000000139266%) | 245 |    24 B    | high -> low |
|  0x55eaaa39d490 (0.0000000000139835%) | 246 |    24 B    | high -> low |
|  0x55eaaa39d478 (0.0000000000140403%) | 247 |    24 B    | high -> low |
|  0x55eaaa39d460 (0.0000000000140972%) | 248 |    24 B    | high -> low |
|  0x55eaaa39d448 (0.0000000000141540%) | 249 |    24 B    | high -> low |
|  0x55eaaa39d430 (0.0000000000142109%) | 250 |    24 B    | high -> low |
|  0x55eaaa39d418 (0.0000000000142677%) | 251 |    24 B    | high -> low |
|  0x55eaaa39d400 (0.0000000000143245%) | 252 |    24 B    | high -> low |
|  0x55eaaa39d3e8 (0.0000000000143814%) | 253 |    24 B    | high -> low |
|  0x55eaaa39d3d0 (0.0000000000144382%) | 254 |    24 B    | high -> low |
|  0x55eaaa39d3b8 (0.0000000000144951%) | 255 |    24 B    | high -> low |
|  0x55eaaa39d3a0 (0.0000000000145519%) | 256 | -3585288 B | low -> high |
|  0x55eaaa7086e0 (0.0000000000146088%) | 257 |   -240 B   | low -> high |
+---------------------------------------+-----+------------+-------------+

Python 3:

a = -2 (0x953de0)
b = -1 (0x953e00)
a,b offset = -0x20
+--------------------------------------+-----+--------------------+-------------+
|               mem_addr               | int |       offset       |  direction  |
+--------------------------------------+-----+--------------------+-------------+
|   0x953e00 (-0.0000000000000568%)    |  -1 |       -32 B        | low -> high |
|    0x953e20 (0.0000000000000000%)    |  0  |       -32 B        | low -> high |
|    0x953e40 (0.0000000000000568%)    |  1  |       -32 B        | low -> high |
|    0x953e60 (0.0000000000001137%)    |  2  |       -32 B        | low -> high |
|    0x953e80 (0.0000000000001705%)    |  3  |       -32 B        | low -> high |
|    0x953ea0 (0.0000000000002274%)    |  4  |       -32 B        | low -> high |
|    0x953ec0 (0.0000000000002842%)    |  5  |       -32 B        | low -> high |
|    0x953ee0 (0.0000000000003411%)    |  6  |       -32 B        | low -> high |
|    0x953f00 (0.0000000000003979%)    |  7  |       -32 B        | low -> high |
|    0x953f20 (0.0000000000004547%)    |  8  |       -32 B        | low -> high |
|    0x953f40 (0.0000000000005116%)    |  9  |       -32 B        | low -> high |
|    0x953f60 (0.0000000000005684%)    |  10 |       -32 B        | low -> high |
|    0x953f80 (0.0000000000006253%)    |  11 |       -32 B        | low -> high |
|    0x953fa0 (0.0000000000006821%)    |  12 |       -32 B        | low -> high |
|    0x953fc0 (0.0000000000007390%)    |  13 |       -32 B        | low -> high |
|    0x953fe0 (0.0000000000007958%)    |  14 |       -32 B        | low -> high |
|    0x954000 (0.0000000000008527%)    |  15 |       -32 B        | low -> high |
|    0x954020 (0.0000000000009095%)    |  16 |       -32 B        | low -> high |
|    0x954040 (0.0000000000009663%)    |  17 |       -32 B        | low -> high |
|    0x954060 (0.0000000000010232%)    |  18 |       -32 B        | low -> high |
|    0x954080 (0.0000000000010800%)    |  19 |       -32 B        | low -> high |
|    0x9540a0 (0.0000000000011369%)    |  20 |       -32 B        | low -> high |
|    0x9540c0 (0.0000000000011937%)    |  21 |       -32 B        | low -> high |
|    0x9540e0 (0.0000000000012506%)    |  22 |       -32 B        | low -> high |
|    0x954100 (0.0000000000013074%)    |  23 |       -32 B        | low -> high |
|    0x954120 (0.0000000000013642%)    |  24 |       -32 B        | low -> high |
|    0x954140 (0.0000000000014211%)    |  25 |       -32 B        | low -> high |
|    0x954160 (0.0000000000014779%)    |  26 |       -32 B        | low -> high |
|    0x954180 (0.0000000000015348%)    |  27 |       -32 B        | low -> high |
|    0x9541a0 (0.0000000000015916%)    |  28 |       -32 B        | low -> high |
|    0x9541c0 (0.0000000000016485%)    |  29 |       -32 B        | low -> high |
|    0x9541e0 (0.0000000000017053%)    |  30 |       -32 B        | low -> high |
|    0x954200 (0.0000000000017621%)    |  31 |       -32 B        | low -> high |
|    0x954220 (0.0000000000018190%)    |  32 |       -32 B        | low -> high |
|    0x954240 (0.0000000000018758%)    |  33 |       -32 B        | low -> high |
|    0x954260 (0.0000000000019327%)    |  34 |       -32 B        | low -> high |
|    0x954280 (0.0000000000019895%)    |  35 |       -32 B        | low -> high |
|    0x9542a0 (0.0000000000020464%)    |  36 |       -32 B        | low -> high |
...
|    0x955c20 (0.0000000000136424%)    | 240 |       -32 B        | low -> high |
|    0x955c40 (0.0000000000136993%)    | 241 |       -32 B        | low -> high |
|    0x955c60 (0.0000000000137561%)    | 242 |       -32 B        | low -> high |
|    0x955c80 (0.0000000000138130%)    | 243 |       -32 B        | low -> high |
|    0x955ca0 (0.0000000000138698%)    | 244 |       -32 B        | low -> high |
|    0x955cc0 (0.0000000000139266%)    | 245 |       -32 B        | low -> high |
|    0x955ce0 (0.0000000000139835%)    | 246 |       -32 B        | low -> high |
|    0x955d00 (0.0000000000140403%)    | 247 |       -32 B        | low -> high |
|    0x955d20 (0.0000000000140972%)    | 248 |       -32 B        | low -> high |
|    0x955d40 (0.0000000000141540%)    | 249 |       -32 B        | low -> high |
|    0x955d60 (0.0000000000142109%)    | 250 |       -32 B        | low -> high |
|    0x955d80 (0.0000000000142677%)    | 251 |       -32 B        | low -> high |
|    0x955da0 (0.0000000000143245%)    | 252 |       -32 B        | low -> high |
|    0x955dc0 (0.0000000000143814%)    | 253 |       -32 B        | low -> high |
|    0x955de0 (0.0000000000144382%)    | 254 |       -32 B        | low -> high |
|    0x955e00 (0.0000000000144951%)    | 255 |       -32 B        | low -> high |
|    0x955e20 (0.0000000000145519%)    | 256 | -139864651158032 B | low -> high |
| 0x7f34c76eb0d0 (0.0000000000146088%) | 257 |       -64 B        | low -> high |
+--------------------------------------+-----+--------------------+-------------+

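This is a CPython implementation detail rather than something that affects program behavior (maybe a tiny performance effect?), but it is an interesting observation: Python 2 lays these integers out 24 bytes apart at descending addresses, while Python 3 lays them out 32 bytes apart at ascending addresses. In both runs there is a large jump after 256, which coincides with the upper bound of CPython's small-integer cache (-5 through 256).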
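A minimal sketch of how such a measurement could be taken, assuming CPython (where id() returns an object's memory address). The function name and the offset convention (current address minus the next one) are assumptions inferred from the tables, not the original routine.

```python
def int_address_offsets(start=-1, stop=257):
    """Return (value, address, offset_to_next, direction) rows for start..stop."""
    # Keep every integer alive in a list so its id() stays valid while we compare.
    values = list(range(start, stop + 2))
    rows = []
    for current, following in zip(values, values[1:]):
        offset = id(current) - id(following)
        direction = "high -> low" if offset > 0 else "low -> high"
        rows.append((current, hex(id(current)), offset, direction))
    return rows


for value, address, offset, direction in int_address_offsets(-1, 5):
    print(value, address, offset, direction)
```

The concrete addresses and offsets depend on the interpreter build and platform, so no output is shown here.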

Bytes vs. Strings

One of the biggest pain points was having to deal with strings and bytes separately, especially around network packets and files. In Python 2, bytes is simply an alias for str, so operators could be applied to both interchangeably:

b'a' * 100 == 'a' * 100

This comparison yields True in Python 2 and False in Python 3, where str holds Unicode text and bytes holds raw binary data. The simplicity of treating binary data the same as strings was replaced by the requirement to convert explicitly with .encode() (str to bytes) and .decode() (bytes to str).

And a couple more relationships for convenience.
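As a rough illustration of how the two types relate in Python 3 (the sample text and the UTF-8 encoding are assumptions chosen for the example):

```python
text = "h\u00e9llo"            # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: the encoded binary form

print(data.decode("utf-8") == text)  # True, the round trip is lossless
print(len(text), len(data))          # 5 6, since the accented letter takes two bytes
print(b"a" * 100 == "a" * 100)       # False, bytes never compare equal to str
print(b"abc"[0])                     # 97, indexing bytes yields an int

try:
    b"abc" + "def"
except TypeError as exc:
    # Mixing the two types in one operation is rejected outright.
    mix_error = exc
    print(type(exc).__name__)        # TypeError
```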

Easier Type Hints

In Python 2, I personally disliked that it didn’t have type notations like Java or C. This led me to catch simple type mistakes too late, only when the buggy line was executed (Python is dynamically typed, so type errors surface at runtime rather than at a compile step).

Although you could use tools like mypy with separate *.pyi stub files to define types in Python 2, Python 3 allows for a more convenient inline syntax:

from typing import (
    Any,
    List,
    Tuple,
)

def f(*a: Any, **kw: Any) -> List[Any]:
    # *a always arrives as a tuple of arbitrary length
    b: Tuple[Any, ...] = a
    return [*b, *b[::-1]]
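One caveat worth illustrating: the interpreter records annotations but does not enforce them, so a checker such as mypy is still needed to catch mistakes before runtime. A small sketch, with double_all as a hypothetical function:

```python
from typing import List, get_type_hints


def double_all(values: List[int]) -> List[int]:
    return [v * 2 for v in values]


# The annotations are stored on the function and can be inspected.
hints = get_type_hints(double_all)
print(hints)

# Passing the "wrong" type is not rejected at runtime; str * 2 simply repeats.
unchecked = double_all(["ab"])
print(unchecked)  # ['abab']
```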

Print()

In Python 2, print is a statement; in Python 3, it exists only as a function. The parenthesized form with a single argument happens to work in both.

print "Hello"   # Python 2
print("Hello")  # Python 2 and Python 3

Finding this code pattern is easy with a simple grep.

fgrep -Hr 'print ' .
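Once the statements are converted, the function form also accepts keyword arguments that the statement never had. A short sketch in Python 3; for code that must still run on Python 2.6 or 2.7 during the transition, the same calls work after placing from __future__ import print_function at the top of the module.

```python
import io

# print() takes sep, end, and file as keyword arguments.
buffer = io.StringIO()
print("Hello", "world", sep=", ", end="!\n", file=buffer)

captured = buffer.getvalue()
print(repr(captured))  # 'Hello, world!\n'
```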

Different Division Behaviors

In Python 2, the division operator “/” returns a floored value when both operands are integers.

>>> 1/2  # 0

To explicitly get a floating point result, either the numerator or the denominator must be a float.

>>> 1/2.  # 0.5

In Python 3, the default behavior of the division operator is to return a float.

>>> 1/2  # 0.5

But you can floor the value by using the floor division operator “//”.

>>> 1//2  # 0
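A few more cases in Python 3 that are easy to trip over when converting old integer division, shown as a small sketch (the variable names are only for illustration):

```python
true_div = 1 / 2        # 0.5, "/" always performs true division
exact_div = 4 / 2       # 2.0, still a float even when the result is whole
floor_div = 1 // 2      # 0
neg_floor = -1 // 2     # -1, floor division rounds toward negative infinity
truncated = int(-1 / 2) # 0, int() truncates toward zero instead
quot_rem = divmod(7, 2) # (3, 1)

print(true_div, exact_div, floor_div, neg_floor, truncated, quot_rem)
```

Note that “//” is the closest match for the old Python 2 integer “/”, since both floor the result.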

Pickle Serializations

In some places of our codebase, there were a few static references to pickle protocol version explicitly being set to 2. The reasoning was probably to maintain explicit consistency in other codebases, but it also meant with Python 3,

map, reduce, filter, and range

The three functions map(), reduce(), and filter() were my go-tos in Python 2 as they abstracted away for loops into a simple func(func, iterable). In Python 2, all three evaluated immediately: map() and filter() returned a list, and reduce() returned the accumulated value.

>>> map(lambda _: _+1, range(10))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In Python 3, however, map() and filter() return a lazy iterator to be consumed later (and range() likewise returns a lazy range object instead of a list).

>>> map(lambda x: x+1, range(10))
<map object at 0x7fa1fc2032b0>

And reduce was moved out of the built-ins into functools.reduce.
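A short sketch of the Python 3 equivalents, including the one-shot behavior of the iterators that tends to surprise code ported from Python 2:

```python
from functools import reduce

# Wrap in list() to get the eager Python 2 behavior back.
incremented = list(map(lambda x: x + 1, range(10)))
evens = list(filter(lambda x: x % 2 == 0, range(10)))
total = reduce(lambda acc, x: acc + x, range(10), 0)

# A map object can only be consumed once.
lazy = map(lambda x: x + 1, range(3))
first_pass = list(lazy)   # [1, 2, 3]
second_pass = list(lazy)  # [], the iterator is already exhausted

# A range object, in contrast, can be iterated repeatedly.
numbers = range(3)
range_twice = (list(numbers), list(numbers))

print(incremented, evens, total, first_pass, second_pass, range_twice)
```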

Long and int

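In Python 2, there were two types of integers: int and long. A long can grow as large as system memory allows. An int was constrained by the size of a C long (32 or 64 bits, exposed as sys.maxint), and results that overflowed were promoted to long automatically. There are other differences as well, such as the L suffix on long literals and in their repr.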

In Python 3, these two were merged into a single int type.
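A quick sketch of the merged type in Python 3. Note that sys.maxint no longer exists; sys.maxsize remains, but it describes the platform's maximum container size rather than a limit on int.

```python
import sys

big = 2 ** 64
beyond = sys.maxsize + 1

print(type(big))          # <class 'int'>
print(big)                # 18446744073709551616, no trailing L
print(big > sys.maxsize)  # True
print(type(beyond) is int)  # True, there is no separate long to promote to
```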