Created using Colab
sualeh committed May 7, 2024
1 parent 53faf08 commit a7ffd94
Showing 1 changed file with 188 additions and 152 deletions.
340 changes: 188 additions & 152 deletions Notebooks/4_javascript_encoding.ipynb
@@ -1,154 +1,190 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Encoding"
]
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/sualeh/What-a-Character/blob/go/Notebooks/4_javascript_encoding.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uMlWvwmHes_l"
},
"source": [
"# Encoding"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v92DTDtSes_m"
},
"source": [
"## Converting to Bytes\n",
"\n",
"Always specify encoding to avoid cross-platform surprises."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "2IJXeRfpes_m"
},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"let original = \"Aß東𐐀\";\n",
"\n",
"let utf8Bytes = new TextEncoder().encode(original);\n",
"let roundTrip = new TextDecoder().decode(utf8Bytes);\n",
"\n",
"console.log(roundTrip);"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "U-Sw2yUnes_n"
},
"source": [
"> **Bad decoding**\n",
"\n",
"If an incorrect encoding is specified, no exception may be thrown even though the data gets corrupted."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Je4RcX6-es_n"
},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"original = \"Aß東𐐀\";\n",
"\n",
"utf8Bytes = new TextEncoder(\"utf-8\").encode(original);\n",
"roundTrip = new TextDecoder(\"utf-16\").decode(utf8Bytes);\n",
"\n",
"// NOTE: No encoding errors are reported!\n",
"console.log(roundTrip);"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oFJbdbfTes_n"
},
"source": [
"## Writing Files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "k7Uk8WTUes_o"
},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"let original = \"Aß東𐐀\";\n",
"\n",
"fs.writeFile('test.txt', original, 'utf-8', function (err) {\n",
" // Ignore errors\n",
"});"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VbC5rzFqes_o"
},
"source": [
"## Reading Files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "E2Vq8ho_es_o"
},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"fs.readFile('test.txt', 'utf-8', function (err, data) {\n",
" // Ignore errors\n",
" // Write what was read\n",
" console.log(data);\n",
"});"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IK_Yzxm5es_o"
},
"source": [
"If you specify an incorrect encoding when reading a file, you can get gibberish."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "0rYqgrJbes_o"
},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"fs.readFile('test.txt', 'utf-16le', function (err, data) {\n",
" // Reads unexpected characters\n",
" console.log(data);\n",
"});"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
},
"colab": {
"provenance": [],
"include_colab_link": true
}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Converting to Bytes\n",
"\n",
"Always specify encoding to avoid cross-platform surprises."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%script node \n",
"\n",
"let original = \"Aß東𐐀\";\n",
"\n",
"let utf8Bytes = new TextEncoder().encode(original);\n",
"let roundTrip = new TextDecoder().decode(utf8Bytes);\n",
"\n",
"console.log(roundTrip);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **Bad decoding**\n",
"\n",
"If an incorrect encoding is specified, no exception may be thrown even though the data gets corrupted."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%script node \n",
"\n",
"original = \"Aß東𐐀\";\n",
"\n",
"utf8Bytes = new TextEncoder(\"utf-8\").encode(original);\n",
"roundTrip = new TextDecoder(\"utf-16\").decode(utf8Bytes);\n",
"\n",
"// NOTE: No encoding errors are reported!\n",
"console.log(roundTrip);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing Files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"let original = \"Aß東𐐀\";\n",
"\n",
"fs.writeFile('test.txt', original, 'utf-8', function (err) {\n",
" // Ignore errors\n",
"});"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading Files"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"fs.readFile('test.txt', 'utf-8', function (err, data) {\n",
" // Ignore errors\n",
" // Write what was read\n",
" console.log(data);\n",
"});"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you specify an incorrect encoding when reading a file, you can get gibberish."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%script node\n",
"\n",
"const fs = require('fs');\n",
"\n",
"fs.readFile('test.txt', 'utf-16le', function (err, data) {\n",
" // Reads unexpected characters\n",
" console.log(data);\n",
"});"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
"nbformat": 4,
"nbformat_minor": 0
}
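
The "Bad decoding" cell in this notebook decodes UTF-8 bytes as UTF-16 and gets no error. The sketch below is not part of the commit; it is a minimal illustration, assuming a Node.js build with full ICU (the default for official releases), of why that happens and when a strict decoder can help. The variable names `misread` and `strictError` are hypothetical, and `{ fatal: true }` is the standard `TextDecoder` option that makes invalid input throw instead of being replaced.

```javascript
// Illustrative sketch only; names are not taken from the notebook.
const original = "Aß東𐐀";

// UTF-8 needs 1, 2, 3 and 4 bytes for these four characters.
const utf8Bytes = new TextEncoder().encode(original);
console.log(utf8Bytes.length); // 10

// Read the UTF-8 bytes as UTF-16LE. Each byte pair happens to form a
// valid UTF-16 code unit, so even a strict decoder accepts the input
// and silently returns the wrong text.
const strictUtf16 = new TextDecoder("utf-16le", { fatal: true });
const misread = strictUtf16.decode(utf8Bytes);
console.log(misread === original); // false

// The opposite mistake is detectable: UTF-16LE bytes are not valid
// UTF-8, so a strict UTF-8 decoder throws a TypeError.
const utf16Bytes = Buffer.from(original, "utf16le");
let strictError = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(utf16Bytes);
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof TypeError); // true
```

Strict decoding therefore catches only byte sequences that are invalid in the target encoding. It cannot detect a mismatch when the bytes happen to be valid in both encodings, which is why the encoding should be recorded alongside the data rather than guessed.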
