chapters/io.tex
+61-15
Original file line number
Diff line number
Diff line change
@@ -187,6 +187,20 @@ \section{Opening files}
187
187
188
188
This does nothing in Python 3 and provides you with the \lstinline{open()} from Python 3 in Python 2.
189
189
190
+
As a side note, \lstinline{file()} doesn't exist in Python 3, and you really should use \lstinline{open()} instead. If you used it for an \lstinline{isinstance()} check, use \lstinline{io.open()}, then check against \lstinline{io.IOBase} instead. If you have to support both Python 2 and 3, and may receive a \lstinline{file()} from some code you don't have control over, you can always use a conditional:
191
+
192
+
\begin{py2and3}
193
+
import sys
194
+
import io
195
+
196
+
if sys.version_info.major < 3:
197
+
file_types = (io.IOBase, file)
198
+
else:
199
+
file_types = (io.IOBase,)
200
+
\end{py2and3}
201
+
202
+
Then pass \lstinline{file_types} as the second argument to \lstinline{isinstance()}.
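The check described above can be sketched as follows. This is only an illustration: the helper name \lstinline{is_file_like} is hypothetical and not part of any library, and \lstinline{sys.version_info[0]} is used in place of \lstinline{.major} on the assumption that you may also need to run on Python 2.6.

```python
import io
import sys

# Build the tuple of accepted types once, depending on the interpreter.
if sys.version_info[0] < 3:
    # The `file` builtin only exists in Python 2.
    file_types = (io.IOBase, file)  # noqa: F821
else:
    file_types = (io.IOBase,)


def is_file_like(obj):
    """Hypothetical helper: True if obj is one of the known file types."""
    return isinstance(obj, file_types)


print(is_file_like(io.StringIO(u"hello")))  # an in-memory text stream
print(is_file_like("hello"))                # a plain string
```

Objects created with \lstinline{io.open()} pass the check on both versions, while objects created with the old \lstinline{file()} or the Python 2 built-in \lstinline{open()} are only recognized thanks to the extra entry in the tuple.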
203
+
190
204
\section{Fun with file paths}
191
205
192
206
File paths are one of those features that just work 99\% of the time, until they don't. One reason is that Python is a cross-platform language, but different operating systems may treat paths differently.
@@ -316,6 +330,7 @@ \section{Fun with file paths}
316
330
317
331
I also know it is tempting to use the excellent \lstinline{pathlib} at this point, especially since there is a backport on PyPI. But in my opinion, it would add complexity to the migration. Better to keep it for new Python-3-only projects.
318
332
333
+
319
334
\section{Formatting}
320
335
321
336
In Python 2, you could call \lstinline{.format()} on both \lstinline{str()} and \lstinline{unicode()}. In Python 3, bytes have lost this ability: only text can be formatted this way, or with the newer f-strings.
@@ -324,22 +339,62 @@ \section{Formatting}
324
339
325
340
If your bytes are text, you should decode them anyway, so you'll be able to format all you want. If your bytes need to be manipulated as-is, then you must either target Python 3.5, or replace all your bytes formatting code with something you do manually. Given the work the latter option represents, I would advise just targeting 3.5 if you have a lot of bytes to format.
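To make the two options above concrete, here is a minimal sketch. The values \lstinline{name} and \lstinline{number} are made up for the illustration, and the sketch assumes that the bytes hold ASCII data. The first form relies on the \lstinline{%} operator for bytes, which Python 3 supports from 3.5 onward; the second one decodes, formats as text, then encodes back.

```python
# Made-up values, for illustration only.
name = b"report"
number = 7

# Option 1: %-formatting directly on bytes.
# Works on Python 2 and on Python 3.5+, fails on Python 3.0 to 3.4.
raw_header = b"%s-%d.bin" % (name, number)

# Option 2: decode to text, format, then encode back to bytes.
# Works everywhere, assuming the bytes really are ASCII text.
text_header = u"{}-{}.bin".format(name.decode("ascii"), number)
encoded_header = text_header.encode("ascii")

print(raw_header == encoded_header)
```

Both forms should produce the same bytes here. The second is more verbose, but it makes the encoding explicit.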
326
341
327
-
\section{Wait, there is I/O}
342
+
One other thing to remember is that the behavior of string literals using the \lstinline{r} prefix changed from Python 2 to Python 3. Normally, Python treats any \lstinline{\\} (backslash) in a string \gls{literal} in a special way: if you use \lstinline{\\n} it will insert a line break, if you use \lstinline{\\t} it will insert a tab, if you use \lstinline{\\uxxxx} it will insert the unicode character matching the code point \textquote{xxxx}, etc. This only happens with string literals, that is, with hardcoded strings: strings created in any other way are left untouched.
328
343
344
+
Sometimes, however, this feature must be disabled so you can actually insert a backslash. The typical example is the use of hard-coded Windows file paths (although many Windows \gls{API} now accept forward slashes as well):
329
345
346
+
\begin{py3}
347
+
>>> print("C:\Program Files\Nope")
348
+
File "<stdin>", line 1
349
+
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 16-17: malformed \N character escape
350
+
>>> print("C:\Program Files\nope")
351
+
C:\Program Files
352
+
ope
353
+
\end{py3}
330
354
355
+
One simple solution is to escape the backslash... with another one:
356
+
357
+
\begin{py3}
358
+
>>> print("C:\\Program Files\\Nope")
359
+
C:\Program Files\Nope
360
+
>>> print("C:\\Program Files\\nope")
361
+
C:\Program Files\nope
362
+
\end{py3}
331
363
332
-
%All backslashes in raw string literals are interpreted literally. This means that '\U' and '\u' escapes in raw strings are not treated specially. For example, r'\u20ac' is a string of 6 characters in Python 3.0, whereas in 2.6, ur'\u20ac' was the single “euro” character. (Of course, this change only affects raw string literals; the euro character is '\u20ac' in Python 3.0.)
364
+
Since it is annoying to do so, Python allows you to use the \lstinline{r} prefix to just disable any special interpretation of \lstinline{\\}:
333
365
366
+
\begin{py3}
367
+
>>> print(r"C:\Program Files\Nope")
368
+
C:\Program Files\Nope
369
+
>>> print(r"C:\Program Files\nope")
370
+
C:\Program Files\nope
371
+
\end{py3}
334
372
335
-
\section{file()}
373
+
But Python 2 still interpreted unicode escapes in raw unicode strings:
336
374
337
375
\begin{py2}
338
-
from io import IOBase
339
-
340
-
if isinstance(someobj, IOBase):
376
+
>>> print(ur'\u20ac')
377
+
€
341
378
\end{py2}
342
379
380
+
Not only does Python 3 not interpret them, it doesn't accept mixing the \lstinline{u} and \lstinline{r} prefixes at all:
381
+
382
+
\begin{py3}
383
+
>>> print(ur'\u20ac')
384
+
File "<stdin>", line 1
385
+
print(ur'\u20ac')
386
+
^
387
+
SyntaxError: invalid syntax
388
+
\end{py3}
389
+
390
+
So use \lstinline{from __future__ import unicode_literals}, and go back to manually escaping backslashes if you must use unicode escape codes. I would advise against it if you can avoid it: just save the file as UTF-8 and insert the literal character. This way you don't have to bother, and it works either way:
391
+
392
+
\begin{py2and3}
393
+
# coding: utf8
394
+
from __future__ import unicode_literals
395
+
print(r'\€/') # Happy euro sign :)
396
+
\end{py2and3}
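For comparison, the manual escaping route mentioned above could look like the sketch below. It is only an illustration: the variable names are made up, and the \lstinline{u} prefix (accepted by Python 2 and by Python 3.3+) is used instead of the \lstinline{__future__} import so that the snippet stays valid wherever it is pasted.

```python
# -*- coding: utf-8 -*-

# Manual escaping: the backslash is doubled so that it stays literal,
# while \u20ac is still interpreted as the euro sign.
euro_escaped = u'\\\u20ac/'

# Same text with the euro sign typed directly in a UTF-8 source file.
# The backslash is doubled here too, because this is not a raw string.
euro_literal = u'\\€/'

print(euro_escaped == euro_literal)
print(len(euro_escaped))  # backslash, euro sign, slash
```

Both spellings give the same three-character string on Python 2 and Python 3. The escaped form is harder to read, which is the reason to prefer the literal character when the file encoding allows it.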
397
+
343
398
\section{from buffer() to memoryview()}
344
399
345
400
Those two functions (in fact, they are more like classes) respectively create objects of the same name - a \lstinline{buffer} and a \lstinline{memoryview}. Both are a way to get a subset of something without copying it:
@@ -396,12 +451,3 @@ \section{from buffer() to memoryview()}
396
451
send_those_bytes(buffer(tps_reports, i, step))
397
452
398
453
\end{py2}
399
-
400
-
401
-
402
-
403
-
404
-
405
-
buffer. For example, to write bytes to stdout, use sys.stdout.buffer.write(b'abc').
406
-
407
-
unfortunately, changing sys.stdout to accept only unicode breaks a lot of libraries that expect it to accept encoded bytestrings. – nosklo Dec 4 '09 at 19:14
glossaries.tex
+1-1
Original file line number
Diff line number
Diff line change
@@ -54,7 +54,7 @@
54
54
\newglossaryentry{literal}
55
55
{
56
56
name=literal,
57
-
description={A litteral is a datastructure that be be created using syntax - the litteral notation - instead of using a constructor. E.G: you can use the notation \lstinline{[]} instead of calling \lstinline{list()} to create a list, hence, lists are litterals. In Python, strings, bytes, integers, floats, complexes, lists, tuples, sets and dictionaries are literals, while all other objects are not}
57
+
description={A literal is a data structure that can be created using syntax - the literal notation - instead of using a constructor. E.g., you can use the notation \lstinline{[]} instead of calling \lstinline{list()} to create a list, hence, lists are literals. In Python, strings, bytes, integers, floats, complex numbers, lists, tuples, sets and dictionaries are literals, while all other objects are not}