Commit d1122b9

80% of not to migrate chapter + 20% of import chapter
1 parent 5fa0197 commit d1122b9

6 files changed

+170
-6
lines changed

chapters/how_not_to_migrate.tex

+94
@@ -0,0 +1,94 @@
1+
\chapter{How not to migrate}
2+
3+
The first Python 3 release was in 2008. It's fair to say that among people reading this book, published in 2019, some postponed the migration as much as they could.
4+
5+
Maybe they didn't have the time or the resources.
6+
7+
Maybe they didn't have the knowledge.
8+
9+
Maybe they didn't see a good cost/benefit ratio in this endeavor.
10+
11+
So what about just not doing it?
12+
13+
\section{The cost of migrating}
14+
15+
How expensive is porting Python 2 code? If it's just a bunch of small scripts here and there, every .py file can be ported in a few minutes. The price then rises with each line of code and each external dependency, especially C extensions. The worst-case scenario is when your Python code is a plugin for an interpreter embedded in another program (QGIS, Blender, etc.): you are tied to the software vendor.
16+
17+
Anyway, there is no sure way to assess the time, energy and skill you'll need.
18+
19+
A collection of data analysis scripts may be a matter of a few hours with the proper knowledge. A fair-sized Django website can take a few days. A GUI app may very well take a few weeks. Complicated libraries like NumPy or Twisted took months. I know of at least one project with one million lines of code that started on Python 2.5 with plenty of customizations to the interpreter: it will never get ported.
20+
21+
Of course, porting means working on something that doesn't produce immediate value. And it sometimes also implies porting or replacing dependencies, which is an additional expense with zero benefit.
22+
23+
Consider as well that porting anything with more than 5000 lines of code without some kind of automatic testing is not realistic. Projects that waited more than a decade to upgrade could also lack this kind of infrastructure. In this case, the cost doubles: you need to add testing first. A good idea in any case, but not a free one.
24+
25+
If you decide that you need hybrid Python 2/3 support, the transition will be even more costly: it's longer, harder, and makes the code more complicated.
26+
27+
Finally, performance may suffer for a release or two. If you have a lot of pure Python text/numeric calculation, you may very well take a permanent performance hit.
28+
29+
All good reasons to decide not to undergo the migration at all. Actually, some projects, faced with this decision, decided to rewrite everything in another language entirely.
30+
31+
However, despite all those gloomy predictions, most code bases I encountered in my professional life could be ported in under 2 weeks, and without too much hassle. It was boring, repetitive work, not hard work. It's easy to picture the task as bigger than it really is, yet we are not all Google-scale corporations.
32+
33+
After the port, most of the knowledge gained from the initial version, the fixed bugs and the documentation were still relevant in the new one. Not to mention team training and tooling. In the end, it's not only a lower price than expected, but also precious assets preserved.
34+
35+
And of course, there are benefits you get after the migration.
36+
37+
On the other hand, staying on Python 2.7 means paying some price as well.
38+
39+
\section{The cost of not migrating}
40+
41+
The usual line for pushing people to migrate is about all the things you'd gain once on Python 3: sane I/O support, easier maintenance and debugging, cleaner syntax, nice features and goodies... Indeed, I can vouch for that: coding in Python 3, especially in 3.6 and later, is a vastly superior experience. It's not about one huge thing that makes it all better, but rather the collection of a thousand things that add up in the long run. You don't see it at first, but once you go back to Python 2, it hurts!
42+
43+
Then again, given the cost of migrating, a carrot alone is not enough. What's the stick, then?
44+
45+
First, Python 2.7 will obviously lose all official support, meaning no bug fixes and no security patches. This also means it won't be easily installable on future systems. E.g., Python 3.4 is the last version that can be installed on Windows XP. Ubuntu 20.04 plans to be Python 3 only. Future operating systems may very well not be able to run CPython 2.7 out of the box.
46+
47+
But the real problem is the dependencies. Many popular libraries, like Django and NumPy, already dropped support for 2.7 in their latest releases. New tools, such as the recently very popular code formatter "black", only support modern Python versions.
48+
49+
Not migrating means that, little by little, you cut yourself off from the rest of the Python ecosystem. And it's one of the biggest strengths of the language.
50+
51+
Nevertheless, let's assume you made the decision not to migrate. What are your options?
52+
53+
\section{The Python 2.7 executable will keep working}
54+
55+
As you may have guessed, the Python 2.7 executable is not going to self-destruct on January 1st, 2020. In fact, we can expect all binaries and source code to remain available for download from python.org, since it already lists legacy ones down to Python 2.0.1!
56+
57+
So not only will your program keep working, but you will still be able to install it on other machines.
58+
59+
The \gls{EOL} simply means the official Python development team will stop working on it. Any new bug or security flaw found after that date will never be addressed by the core devs.
60+
61+
Even in the unlikely event that you can't use or get the legacy CPython 2.7 executable, there are still alternatives. First, Python being free software, the sources are and will always be available, meaning you can compile it. This ensures that not only can you always produce an executable for your current platform, but you can potentially adapt it to a new platform. Tools like \href{https://github.com/pyenv/pyenv}{pyenv} make this solution even easier.
62+
63+
If you don't rely on many C extensions, another alternative is to use \href{https://pypy.org}{PyPy}, a Python implementation written... in Python. It's slow to start up, but very fast after a few seconds of warm-up, thanks to its Just-In-Time compiler. While it only supports a limited set of C extensions, it's highly compatible with Python 2.7 and 3.6, and the authors say they'll support the 2.7 version indefinitely. There is even a \href{https://github.com/squeaky-pl/portable-pypy}{portable version for Linux} that doesn't require any installation.
64+
65+
Speaking of Linux, many distributions have LTS releases, supported for 5 years, that include Python 2.7, and hence will have to support it for the coming years. E.g., Ubuntu 18.04 includes Python 2.7 and will be supported until 2023.
66+
67+
Most Macs also come with Python 2.7 preinstalled, and it's not going to be stripped from currently deployed systems.
68+
69+
I also assume community-managed repositories such as \href{https://www.linuxtricks.fr/wiki/centos-rhel-ajouter-des-depots-supplementaires}{CentOS EPEL} or the \href{https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa}{Ubuntu deadsnakes PPA} will keep old versions around.
70+
71+
\section{Distributing}
72+
73+
Distributing your software is going to be trickier. You can play cat and mouse with the interpreter, but your users may not have that patience. And there is also the problem of dependencies.
74+
75+
You may want to list all the dependencies of your project (\lstinline{pip freeze > requirements.txt} in your virtualenv will do the job) with the precise versions that work with your project. You can then download them individually (\lstinline{pip download -r requirements.txt} will dump them in the current directory).
76+
77+
Indeed, nobody can know if they will all still be around past 2020: you are at the mercy of the numerous lib authors.
78+
79+
Of course, having so many files to distribute is not very convenient. To solve this problem, you have three choices:
80+
81+
\begin{itemize}
82+
83+
\item If you want to keep the traditional \lstinline{pip install} workflow, you can install your own instance of \href{https://github.com/pypa/warehouse}{warehouse}. This is the open source software running pypi.org, the source pip uses to download packages. You can set up your own with just the packages you need and use \lstinline{pip install --index-url <your_warehouse_domain>} from now on.
84+
\item You can bundle your entire virtualenv into one big zip and ship that. \href{https://github.com/pantsbuild/pex}{pex} is a utility built by Twitter that does exactly this (but only for Linux). Finding the proper incantation to make it work can be a challenge. I just copy and paste this again and again: \lstinline{pex . -r requirements.txt -o entry_point_of_your_project.pex --python python2.7 -c entry_point_of_your_project.py -f dist --disable-cache}. The resulting file will contain your project and all its dependencies. It still requires the Python executable, but otherwise runs like a standalone script.
85+
\item You can compile your entire project into a totally standalone executable that embeds not only all the dependencies, but also the entire Python runtime. This is a very nice way to provide a Python project to non-technical users. Several tools exist to do this: cx\_Freeze, py2exe, py2app, etc. Of all of them, I can warmly recommend the excellent \href{https://nuitka.net}{Nuitka}: it's very robust, provides a nice performance boost and works reliably on Windows, Linux and Mac. It's a bit hard to set up, as you'll need to install a compiler and the Python headers. E.g., clang on Mac (if you have Homebrew, it's \lstinline{brew install --with-toolchain llvm}), gcc on Linux (Ubuntu will need \lstinline{apt install build-essential python-dev}, CentOS will need \lstinline{yum groupinstall "Development Tools"} and \lstinline{yum install python-devel}) and \href{https://sourceforge.net/projects/mingw-w64/}{MinGW-w64} on Windows. In the simplest case, \lstinline{python -m nuitka your_project_entry_point.py --standalone} will work. For the other cases, you'll have to read the documentation.
86+
\end{itemize}
87+
88+
Finally, if you distribute your software and its main component, Python 2.7, is not supported anymore, you may have legal and business issues. For this particular scenario, you want to look for commercial support. It is highly likely that some companies will be happy to charge you for the extended support you need for Python 2.7. As I write this book, I haven't encountered such an offer in the wild, but I'd knock on the door of institutions like continuum.io if I were shopping for one. They have been around long enough, with a solid custom distribution, to be the first I'd call. And no, I have no relationship with them whatsoever.
89+
90+
\section{Make a plan}
91+
92+
<read the article about not needing to migrate from 2.7>
93+
94+
Yes, avoiding work is, in the end, still work.

chapters/imports.tex

+32
@@ -0,0 +1,32 @@
1+
2+
\chapter{Imports}
3+
4+
5+
\section{Name and location changes}
6+
7+
In the process of cleaning up the standard library, several modules and objects have been moved around or renamed. This is straightforward, although boring, to deal with during the migration, because the changes are well documented, easy to compensate for, and the tooling has excellent support. Therefore, while I will demonstrate manual fixes for all of those since I can't assume anything about your constraints, I would even more strongly recommend tooling for this issue in particular.
8+
9+
The main manual technique to solve this mismatch is to replace the old import with the new Python 3 location/name or, if 2/3 support is desired, to perform a conditional import:
10+
11+
\begin{py2and3}
12+
try:
13+
    import ConfigParser as configparser  # old name or place (Python 2); example module
14+
except ImportError:
15+
    import configparser  # new name or place (Python 3)
16+
\end{py2and3}
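
As a concrete illustration of the conditional import, here is a minimal sketch that runs unchanged on Python 2.7 and Python 3. The two renames it uses (the \lstinline{Queue} module becoming \lstinline{queue}, and \lstinline{urllib.quote} moving to \lstinline{urllib.parse.quote}) were picked as examples only; your own code base will have its own list.

```python
# Conditional imports: try the Python 2 name first, fall back to Python 3.

try:
    import Queue as queue  # Python 2: the module is called "Queue"
except ImportError:
    import queue  # Python 3: renamed to lowercase "queue"

try:
    from urllib import quote  # Python 2: the function lives in urllib
except ImportError:
    from urllib.parse import quote  # Python 3: moved to urllib.parse

# From here on, the rest of the code uses a single name on both versions.
tasks = queue.Queue()
tasks.put(quote("a b"))
encoded = tasks.get()
```

Aliasing the old module to the new name (\lstinline{as queue}) keeps the rest of the file written in Python 3 style, which makes it easier to drop the \lstinline{try}/\lstinline{except} the day Python 2 support is no longer needed.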
17+
18+
19+
20+
21+
22+
\section{Import semantics}
23+
24+
https://docs.python.org/3.0/whatsnew/3.0.html
25+
26+
27+
28+
%https://python-future.org/compatible_idioms.html#standard-library
29+
30+
31+
32+

chapters/io.tex

+11
@@ -5,6 +5,17 @@ \chapter{Migrating your strings}
55

66
If you jumped to this chapter, remember we assume you perfectly understand what was in the previous one. Most people don't, so you may want to read it.
77

8+
\section{Summary}
9+
10+
\begin{labeling}{summary}
11+
\item [Code file encoding]: set your editor to UTF-8, use \lstinline{# coding: utf8}
12+
\item [Properly using bytes(), str() and unicode()]: use \lstinline{__future__}, then mark bytes manually
13+
\item [Opening files]: use \lstinline{io.open()}, specify \lstinline{encoding} for text and \lstinline{b} mode for the rest.
14+
\item [Fun with file paths]: migrate code and on-disk files to utf8, use surrogateescape
15+
\item [Formatting]: use Python 3.5 or later if you format a lot of bytes
16+
\item [from buffer() to memoryview()]: replace \lstinline{buffer()} by \lstinline{memoryview()} but make sure you only pass \lstinline{bytes()}
17+
\end{labeling}
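
To make the file-related items of this summary more tangible, here is a small sketch that behaves the same on Python 2.7 and Python 3. The file name and its content are made up for the demonstration, and the explicit \lstinline{u""} and \lstinline{b""} prefixes are a choice made for this snippet so that it stays self-contained.

```python
import io
import os
import tempfile

# Hypothetical scratch file, only for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text: state the encoding explicitly; you write and read unicode strings.
with io.open(path, "w", encoding="utf8") as f:
    f.write(u"jalape\u00f1o")

with io.open(path, encoding="utf8") as f:
    text = f.read()

# Everything else: open in "b" mode; you get raw bytes, nothing is decoded.
with io.open(path, "rb") as f:
    raw = f.read()

# memoryview() replaces buffer(); only hand it bytes, never text.
head = memoryview(raw)[:6].tobytes()
```

Here \lstinline{text} is a unicode string of 8 characters while \lstinline{raw} holds 9 bytes, because the \lstinline{ñ} takes two bytes in UTF-8.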
18+
819
\section{Code file encoding}
920

1021
First, you should make sure you use the best practices in your Python 2 code before moving to Python 3. To do so, ensure you know the encoding of all the code files in your project. Ideally, it should be UTF-8.

glossaries.tex

+6
@@ -31,6 +31,12 @@
3131
description={In Python, a dunder (for \textquote{double underscore}) variable, function or module is one with a name surrounded by two pairs of underscores. E.g., \lstinline{__doc__} is a dunder variable, \lstinline{__call__()} is a dunder method and \lstinline{__init__.py} is a dunder module. They signal that there is an automated behavior behind this object, and because each behavior is different, that you must read the documentation to understand it. E.g., \lstinline{__doc__} is automatically created and filled with a module docstring, \lstinline{__call__()} is automatically used when calling a function or instantiating a class and \lstinline{__init__.py} is automatically executed when importing the package that contains it. The benefit of this syntax is that it stands out in the code, so that it's very easy to spot where the magic is used. Also, it makes sure developers have a very low probability of using a conflicting name by mistake}
3232
}
3333

34+
\newglossaryentry{EOL}
35+
{
36+
name=EOL,
37+
description={The EOL, for End Of Life, is the date after which a piece of software is not officially supported anymore. E.g., for Python 2.7, the EOL has been scheduled for January 1st, 2020.}
38+
}
39+
3440
\newglossaryentry{generator}
3541
{
3642
name=generator,

main.tex

+2-6
@@ -108,7 +108,7 @@
108108

109109
\hypersetup{pageanchor=true}
110110

111-
\part{The differences between Python 2 and 3}
111+
\include{chapters/how_not_to_migrate}
112112

113113
\include{chapters/which_python}
114114

@@ -120,11 +120,7 @@ \part{The differences between Python 2 and 3}
120120

121121
\include{chapters/io}
122122

123-
\chapter{Imports}
124-
125-
Name changes, absolute imports
126-
127-
%https://python-future.org/compatible_idioms.html#standard-library
123+
\include{chapters/imports}
128124

129125
\chapter{Object Oriented Programming}\label{chap:oop}
130126

todo.txt

+25
@@ -45,7 +45,32 @@
4545

4646
>>> f = open("jalape\xf1o.txt")Traceback (most recent call last):...IOError: [Errno 2] No such file or directory: 'jalapeño.txt'>>>
4747

48+
https://morepypy.blogspot.com/2019/10/pypy-v72-released.html
4849

4950

5051

5152
In early versions of Python 3, such names were silently discarded (made invisible). Yikes
53+
54+
- ('re', 'ASCII','stat', 'ST_MODE'),
55+
56+
- add the non-ascii-bytes-literals pylint check and look at the other settings
57+
58+
59+
Cleanup of the sys module: removed sys.exitfunc(), sys.exc_clear(), sys.exc_type, sys.exc_value, sys.exc_traceback.
60+
61+
Cleanup of the array.array type: the read() and write() methods are gone; use fromfile() and tofile() instead. Also, the 'c' typecode for array is gone – use either 'b' for bytes or 'u' for Unicode characters.
62+
63+
Cleanup of the operator module: removed sequenceIncludes() and isCallable().
64+
Cleanup of the thread module: acquire_lock() and release_lock() are gone; use acquire() and release() instead.
65+
Cleanup of the random module: removed the jumpahead() API.
66+
The new module is gone.
67+
The functions os.tmpnam(), os.tempnam() and os.tmpfile() have been removed in favor of the tempfile module.
68+
The tokenize module has been changed to work with bytes. The main entry point is now tokenize.tokenize(), instead of generate_tokens.
69+
string.letters and its friends (string.lowercase and string.uppercase) are gone. Use string.ascii_letters etc. instead. (The reason for the removal is that string.letters and friends had locale-specific behavior, which is a bad idea for such attractively-named global “constants”.)
70+
Renamed module __builtin__ to builtins (removing the underscores, adding an ‘s’). The __builtins__ variable found in most global namespaces is unchanged. To modify a builtin, you should use builtins, not __builtins__!
71+
72+
73+
- https://github.com/squeaky-pl/portable-pypy
74+
75+
76+
There is a catastrophic bug in the Python curses module in 3.4.0: http://bugs.python.org/issue21088
