Commit 5fa0197

committed
Chapter 5 ok
1 parent 5300631 commit 5fa0197

2 files changed

+62
-16
lines changed

chapters/io.tex

+61-15
Original file line number | Diff line number | Diff line change
@@ -187,6 +187,20 @@ \section{Opening files}
187187

188188
This does nothing in Python 3 and provides you with the \lstinline{open()} from Python 3 in Python 2.
189189

190+
As a side note, \lstinline{file()} doesn't exist in Python 3, and you really should use \lstinline{open()} instead. If you used it for an \lstinline{isinstance()} check, open your files with \lstinline{io.open()} and check against \lstinline{io.IOBase} instead. If you have to support both Python 2 and 3, and may receive a \lstinline{file()} object from some code you don't control, you can always use a conditional:
191+
192+
\begin{py2and3}
193+
import sys
194+
import io
195+
196+
if sys.version_info.major < 3:
197+
file_types = (io.IOBase, file)
198+
else:
199+
file_types = (io.IOBase,)
200+
\end{py2and3}
201+
202+
And call \lstinline{isinstance()} on that.
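As a minimal sketch of that check, the tuple can be wrapped in a small helper. The name \lstinline{is_file_like()} is hypothetical and only there for illustration:

```python
import io
import sys

# Python 2 has the built-in file type; Python 3 only has the io classes.
if sys.version_info.major < 3:
    file_types = (io.IOBase, file)  # noqa: F821 (only evaluated on Python 2)
else:
    file_types = (io.IOBase,)


def is_file_like(obj):
    """Return True if obj is a file object on either Python version."""
    return isinstance(obj, file_types)


in_memory = io.StringIO(u"some text")
print(is_file_like(in_memory))    # an io.StringIO is an io.IOBase
print(is_file_like(u"a string"))  # a plain string is not
```

Keep in mind that this only recognizes real file objects: a duck-typed object that merely has a \lstinline{read()} method but doesn't inherit from \lstinline{io.IOBase} will not pass the check.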
203+
190204
\section{Fun with file paths}
191205

192206
File paths are one of those features that just work 99\% of the time, until they don't. One reason is that Python is a cross-platform language, but different operating systems may treat paths differently.
@@ -316,6 +330,7 @@ \section{Fun with file paths}
316330
317331
I also know it is tempting to use the excellent \lstinline{pathlib} at this point, especially since there is a backport on PyPI. But in my opinion, it would add complexity to the migration. Better to keep it for new Python-3-only projects.
318332
333+
319334
\section{Formatting}
320335
321336
In Python 2, you could call \lstinline{.format()} on both \lstinline{str()} and \lstinline{unicode()}. Python 3 removed this ability for bytes: only text can be formatted this way, or with the newer f-strings.
@@ -324,22 +339,62 @@ \section{Formatting}
324339
325340
If your bytes are text, you should decode them anyway, so you'll be able to format all you want. If your bytes need to be manipulated as-is, then you must either target Python 3.5 or later, which brought back percent-style formatting for bytes, or replace all your byte formatting code with something you do manually. Given the work the latter option represents, I would advise just targeting 3.5 if you have a lot of bytes to format.
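The three options can be sketched side by side as follows. This is an illustration for Python 3.5 or later, and the header being built and the \lstinline{payload_size} value are made up for the example:

```python
payload_size = 42  # hypothetical value, just for the example

# In Python 3, bytes objects have no .format() method at all.
print(hasattr(b"", "format"))

# Option 1: the bytes are really text, so decode, format, then encode back.
template = b"Content-Length: {}\r\n"
as_text = template.decode("ascii").format(payload_size)
header_from_text = as_text.encode("ascii")

# Option 2: percent-style formatting directly on bytes (Python 3.5+).
header_from_percent = b"Content-Length: %d\r\n" % payload_size

# Option 3: build the bytes by hand, which works on any Python 3 version.
header_by_hand = (
    b"Content-Length: " + str(payload_size).encode("ascii") + b"\r\n"
)

print(header_from_text == header_from_percent == header_by_hand)
```

Option 3 is the one that gets tedious quickly, which is why targeting 3.5 is attractive when a code base formats a lot of bytes.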
326341
327-
\section{Wait, there is I/O}
342+
One other thing to remember is that the behavior of string literals using the \lstinline{r} prefix changed from Python 2 to Python 3. Normally, Python treats any \lstinline{\\} (backslash) in a string \gls{literal} in a special way: if you use \lstinline{\\n} it will insert a line break, if you use \lstinline{\\t} it will insert a tab, if you use \lstinline{\\uxxxx} it will insert the Unicode character matching the code point \textquote{xxxx}, etc. This does not happen when the string is created in another way: only with string literals, so only with hardcoded strings.
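A small sketch of that last point: the same two characters, a backslash followed by \lstinline{n}, are only turned into a line break when they are written in a literal. The variable names are arbitrary:

```python
# Written as a literal: the escape sequence becomes one line-break character.
from_literal = "a\nb"

# Built at run time from separate characters: chr(92) is the backslash,
# and nothing reinterprets the backslash + "n" pair afterwards.
built = "a" + chr(92) + "n" + "b"

print(len(from_literal))  # 3
print(len(built))         # 4
print(built)              # shows the backslash and the n as-is
```

The same holds for strings read from a file or typed by a user: they contain exactly the characters they were given.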
328343
344+
Sometimes, however, this feature must be disabled so you can actually insert a backslash. The typical example is the use of hard-coded Windows file paths (although many Windows \glspl{API} now accept forward slashes as well):
329345
346+
\begin{py3}
347+
>>> print("C:\Program Files\Nope")
348+
File "<stdin>", line 1
349+
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 16-17: malformed \N character escape
350+
>>> print("C:\Program Files\nope")
351+
C:\Program Files
352+
ope
353+
\end{py3}
330354
355+
One simple solution is to escape the backslash... with another one:
356+
357+
\begin{py3}
358+
>>> print("C:\\Program Files\\Nope")
359+
C:\Program Files\Nope
360+
>>> print("C:\\Program Files\\nope")
361+
C:\Program Files\nope
362+
\end{py3}
331363
332-
%All backslashes in raw string literals are interpreted literally. This means that '\U' and '\u' escapes in raw strings are not treated specially. For example, r'\u20ac' is a string of 6 characters in Python 3.0, whereas in 2.6, ur'\u20ac' was the single “euro” character. (Of course, this change only affects raw string literals; the euro character is '\u20ac' in Python 3.0.)
364+
Since it is annoying to do so, Python allows you to use the \lstinline{r} prefix to disable any special interpretation of \lstinline{\\}:
333365
366+
\begin{py3}
367+
>>> print(r"C:\Program Files\Nope")
368+
C:\Program Files\Nope
369+
>>> print(r"C:\Program Files\nope")
370+
C:\Program Files\nope
371+
\end{py3}
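To check that both spellings really produce the same string, here is a short sketch. It also shows one limit of raw literals that is easy to hit with paths: a raw literal cannot end with a single backslash, so the trailing one has to be added separately. The variable names are arbitrary:

```python
escaped = "C:\\Program Files\\Nope"
raw = r"C:\Program Files\Nope"

# Both literals describe exactly the same characters.
print(escaped == raw)

# r"C:\Program Files\" would be a syntax error, because the final
# backslash would protect the closing quote. One workaround is to
# concatenate the trailing backslash from a regular literal.
with_trailing = r"C:\Program Files" + "\\"
print(with_trailing)
```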
334372
335-
\section{file()}
373+
But Python 2 still interpreted Unicode escapes in raw unicode strings:
336374
337375
\begin{py2}
338-
from io import IOBase
339-
340-
if isinstance(someobj, IOBase):
376+
>>> print(ur'\u20ac')
€
377+
341378
\end{py2}
342379
380+
Not only does Python 3 not interpret them, it doesn't accept mixing \lstinline{u} and \lstinline{r} at all:
381+
382+
\begin{py3}
383+
>>> print(ur'\u20ac')
384+
File "<stdin>", line 1
385+
print(ur'\u20ac')
386+
^
387+
SyntaxError: invalid syntax
388+
\end{py3}
389+
390+
So use \lstinline{from __future__ import unicode_literals}, and go back to manually escaping backslashes if you must use Unicode escape codes. I would advise against it if you can avoid it: just save the file as UTF-8 and type the literal character. This way you don't have to bother, and it works either way:
391+
392+
\begin{py2and3}
393+
# coding: utf8
394+
from __future__ import unicode_literals
395+
print(r'\€/') # Happy euro sign :)
396+
\end{py2and3}
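To see the Python 3 side of this difference in isolation, here is a short sketch (Python 3 only) comparing a raw literal with a regular one. \lstinline{chr(0x20AC)} is used to build the euro sign so the snippet doesn't depend on the file encoding:

```python
raw = r'\u20ac'    # raw: backslash, 'u', '2', '0', 'a', 'c'
cooked = '\u20ac'  # regular literal: the escape is interpreted
euro = chr(0x20AC)

print(len(raw))        # 6
print(len(cooked))     # 1
print(cooked == euro)  # True
print(raw == euro)     # False
```

Under Python 2, the equivalent \lstinline{ur'\\u20ac'} literal would have been a single character, which is exactly the kind of silent change to watch for when porting.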
397+
343398
\section{from buffer() to memoryview()}
344399
345400
Those two functions (in fact, they are more like classes) respectively create objects of the same name - a \lstinline{buffer} and a \lstinline{memoryview}. Both are a way to get a subset of something without copying it:
@@ -396,12 +451,3 @@ \section{from buffer() to memoryview()}
396451
send_those_bytes(buffer(tps_reports, i, step))
397452
398453
\end{py2}
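For comparison, here is a minimal Python 3 sketch of the same chunked loop using \lstinline{memoryview}. The definitions of \lstinline{tps_reports}, \lstinline{step} and \lstinline{send_those_bytes()} below are stand-ins invented for illustration, since the real ones depend on your code. The idea is that \lstinline{buffer(obj, offset, size)} roughly corresponds to slicing a \lstinline{memoryview}:

```python
# Stand-in data and consumer, invented for this example.
tps_reports = bytes(range(10))
step = 4
sent = []


def send_those_bytes(chunk):
    """Pretend to send the chunk; here we just record a copy of it."""
    sent.append(bytes(chunk))


# Wrapping the data once does not copy it.
view = memoryview(tps_reports)

for i in range(0, len(view), step):
    # Slicing a memoryview gives another view on the same memory,
    # where buffer(tps_reports, i, step) was used in Python 2.
    send_those_bytes(view[i:i + step])

print(sent)
```

The only copies made here are the ones in the stand-in \lstinline{send_those_bytes()}; the slices themselves share memory with \lstinline{tps_reports}.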
399-
400-
401-
402-
403-
404-
405-
buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
406-
407-
unfortunately, changing sys.stdout to accept only unicode breaks a lot of libraries that expect it to accept encoded bytestrings. – nosklo Dec 4 '09 at 19:14

glossaries.tex

+1-1
Original file line number | Diff line number | Diff line change
@@ -54,7 +54,7 @@
5454
\newglossaryentry{literal}
5555
{
5656
name=literal,
57-
description={A litteral is a datastructure that be be created using syntax - the litteral notation - instead of using a constructor. E.G: you can use the notation \lstinline{[]} instead of calling \lstinline{list()} to create a list, hence, lists are litterals. In Python, strings, bytes, integers, floats, complexes, lists, tuples, sets and dictionaries are literals, while all other objects are not}
57+
description={A literal is a data structure that can be created using syntax - the literal notation - instead of using a constructor. E.g., you can use the notation \lstinline{[]} instead of calling \lstinline{list()} to create a list; hence, lists are literals. In Python, strings, bytes, integers, floats, complex numbers, lists, tuples, sets and dictionaries are literals, while all other objects are not}
5858
}
5959

6060
\newglossaryentry{list comprehension}
