Remove code that is in the notebooks
sualeh committed May 5, 2024
1 parent 9a2e09c commit 153c013
Showing 3 changed files with 0 additions and 402 deletions.
111 changes: 0 additions & 111 deletions Slides/part1/what-a-character-unicode-support-in-java.md
@@ -43,117 +43,6 @@ So,
- Java 5 APIs allow for `int` code points instead of surrogate pairs


## Valid Character Literals

```java
char ch1 = 'a';
char ch2 = ''; // (Not an ASCII character!)
```

However,
```java
char ch3 = '𐐀'; // (Not a BMP character!)
```
is a compile-time error, since 𐐀 (U+10400, DESERET CAPITAL LETTER LONG I) lies outside the BMP and needs a surrogate pair, which does not fit in a single `char`.


## Unicode Notation in Strings

- `\uHHHH` - where each H is a hexadecimal digit (case-insensitive)
- A single escape only covers the Basic Multilingual Plane
- Supplementary characters are represented as **surrogate pairs**

```java
char ch5 = '\u00EA'; // ‘ê’
String str1 = "a\u00ea\u00f1\u00fcc"; // “aêñüc”
String str2 = "A\u00EA\u00F1\u00FCC"; // “AêñüC”
```


## Literal Surrogate Pairs

- Supplementary characters are written as **surrogate pairs**

```java
// Character outside the BMP
String str3 = "\uD801\uDC00"; // ‘𐐀’
```

**Result:**

`str3.length()` is 2
`str3.codePointCount(0, str3.length())` is 1
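
To make the origin of those two `char` values concrete, here is a minimal sketch that asks `Character.toChars` for the UTF-16 code units of a code point. The class name `SurrogatePairDemo` and the helper `utf16Units` are illustrative names, not part of the original slides.

```java
class SurrogatePairDemo {
    // Lists the UTF-16 code units of one code point as hex, separated by spaces.
    // (Hypothetical helper, introduced only for this illustration.)
    static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 is outside the BMP, so it needs two code units.
        String str3 = new String(Character.toChars(0x10400));
        System.out.println(utf16Units(0x10400));                     // D801 DC00
        System.out.println(str3.length());                           // 2
        System.out.println(str3.codePointCount(0, str3.length()));   // 1
    }
}
```

A BMP code point such as U+00EA yields a single unit (`00EA`), which is why it fits in one `char`.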


## Unicode Code Point Literals

- `0xHHHHHH` - where each H is a hexadecimal digit (case-insensitive)
- The `int` value encodes both the plane and the position within it
- No surrogate pairs needed
- Not all Java APIs support `int` code points


## Unicode Code Point Literals

```java
// Character outside the BMP
int cp1 = 0x010400; // 𐐀
String str4 = new StringBuffer()
    .appendCodePoint(cp1).toString();
String str5 = Character.toString(cp1); // int overload needs Java 11+
```

**Result:**

`str4.length()` is 2
`str4.codePointCount(0, str4.length())` is 1
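
Because `length()` counts `char` values rather than characters, code that walks a string is usually safer iterating over code points. The following is a small sketch of that idea using `String.codePoints()` (Java 8+); the class name `CodePointIteration` and the helper `countSupplementary` are made up for this example.

```java
class CodePointIteration {
    // Counts the code points in s that lie outside the BMP (above U+FFFF).
    // (Hypothetical helper, introduced only for this illustration.)
    static long countSupplementary(String s) {
        return s.codePoints()
                .filter(Character::isSupplementaryCodePoint)
                .count();
    }

    public static void main(String[] args) {
        // "a", then U+10400, then "b": three characters, four char values.
        String s = "a" + new String(Character.toChars(0x10400)) + "b";
        System.out.println(s.length());                         // 4
        System.out.println(s.codePointCount(0, s.length()));    // 3
        System.out.println(countSupplementary(s));              // 1
    }
}
```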


## Escape Sequences

Built-in escape sequences

![w:700](escape-sequences.png "Escape Sequences")


## Unicode Cases

```java
String greekWord = "ΣΚΎΛΟΣ"; // dog
String greekLower = greekWord.toLowerCase();
```

**Result:**
`greekLower` is "σκύλος"
(Notice that the first and last letters are both capital sigma, Σ, but lowercase to σ and the final form ς respectively)


## Unicode Cases

```java
String germanWord = "straße"; // street
String germanUpper = germanWord.toUpperCase();
```

**Result:**
`germanUpper` is "STRASSE"
(Notice that the string lengths differ: 6 characters become 7, because ß uppercases to SS)
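
The no-argument `toUpperCase()` uses the JVM's default locale, so a sketch that wants reproducible results might pass a locale explicitly. The example below assumes `Locale.ROOT` is an acceptable choice; the class name `CaseLengthDemo` and the helper `upperRoot` are illustrative only.

```java
import java.util.Locale;

class CaseLengthDemo {
    // Uppercases with a fixed, locale-neutral rule set.
    // (Hypothetical helper, introduced only for this illustration.)
    static String upperRoot(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String germanWord = "stra\u00DFe"; // the word from the slide, with sharp s escaped
        String germanUpper = upperRoot(germanWord);
        System.out.println(germanUpper);           // STRASSE
        System.out.println(germanWord.length());   // 6
        System.out.println(germanUpper.length());  // 7
    }
}
```

Code that allocates buffers or computes offsets from the original length should therefore measure the string again after a case conversion.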


## Unicode Integer Parsing

```java
String hindiNumber = "१२३४५६७८९०";
int number = Integer.parseInt(hindiNumber);
```

**Result:**

`number` is 1234567890
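
`Integer.parseInt` accepts these digits because it relies on `Character.digit`, which recognizes decimal digits from any script. As a rough sketch of the same mechanism, the hypothetical helper `toAsciiDigits` below rewrites such digits as ASCII; the class and method names are assumptions made for this example.

```java
class UnicodeDigits {
    // Replaces every Unicode decimal digit with its ASCII equivalent and
    // leaves all other code points unchanged.
    // (Hypothetical helper, introduced only for this illustration.)
    static String toAsciiDigits(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            int value = Character.digit(cp, 10); // -1 if cp is not a decimal digit
            if (value >= 0) {
                sb.append((char) ('0' + value));
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Devanagari digits one, two, three, written as escapes.
        String devanagari = "\u0967\u0968\u0969";
        System.out.println(toAsciiDigits(devanagari));     // 123
        System.out.println(Integer.parseInt(devanagari));  // 123
    }
}
```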




## Use the Character Class

```java
145 changes: 0 additions & 145 deletions Slides/part1/what-a-character-unicode-support-in-javascript.md

This file was deleted.
