From 4bd7186820acaeb8d12fff14ab44a96c5e76da71 Mon Sep 17 00:00:00 2001 From: Hyeseong Kim Date: Tue, 19 Nov 2024 01:35:34 +0900 Subject: [PATCH 1/5] draft RFC for int semantics --- text/0000-int.md | 124 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 text/0000-int.md diff --git a/text/0000-int.md b/text/0000-int.md new file mode 100644 index 0000000..f173550 --- /dev/null +++ b/text/0000-int.md @@ -0,0 +1,124 @@ +--- +Feature Name: rescript-integers +Start Date: 2024-11-19 +RFC PR: (leave this empty) +ReScript Issue: (leave this empty) +--- + +## Summary + +Semantics deifinition of the ReScript's `int` type and integer primitives. + +## Motivation + +ReScript has three numeric primitive types, `int`, `float` and `bigint`. + +The semantics of `float` and `bigint` completely match JavaScript's ones, but `int` is unique to ReScript and originally came from OCaml's `int` type. + +`int` stands for 32-bit signed integers. It's a bit unusual for a language to have int32 only and no other precision — mostly for historical reasons, and it isn't very clear due to differences in behavior with JavaScript. + +This RFC describes its semantics and chosen trade-offs as precisely as possible. + +## Definition + +TBD + +```res +type int + +let n = 100 +``` + +Using unbounded integer literals may result in compile-time errors with messages such as `"Integer literal exceeds the range of representable integers of type int."` + +## Primitives + +Let `max_value` be $2^{31}-1$ and `min_value` be $-2^{31}$. + +### `fromNumber(x: number)` + +1. If `x` is JavaScript's `Infinity`, return `max_value`. +2. If `x` is JavaScript's `-Infinity`, return `min_value`. +3. Let `int32` be [`ToInt32`]`(x)`, return `int32`. + +The actions 1 and 2 are intended to reduce confusion when converting from an infinate value. (e.g. 
https://github.com/rescript-lang/rescript/issues/6737) However, it can be omitted if it is obvious that the `x` is not `Infinity` or `-Infinity`. + +The [`ToInt32`] behavior follows the definition in ECMA-262 as is. In action, the ReScript compiler uses `bitwiseOR(number, 0)`. This is what appears in the output as `number | 0`. And this removes all special numbers defined in IEEE-754. `int` never contain the following values: + +- `NaN` +- `Infinity` and `-Infinity` +- `-0` + +`fromNumber(x)` must be idempotent. + +### `add(x: int, y: int)` + +1. Let `number` be mathmatically $x + y$. +2. Let `int32` be `fromNumber(number)`, return `int32`. + +### `subtract(x, y)` + +1. Let `number` be mathmatically $x - y$. +2. Let `int32` be `fromNumber(number)`, return `int32`. + +### `multiply(x, y)` + +1. Let `number` be mathmatically $x * y$. +2. Let `int32` be `fromNumber(number)`, return `int32`. + +### `exponentiate(x, y)` + +1. Let `number` be mathmatically $x ^ y$. +2. Let `int32` be `fromNumber(number)`, return `int32`. + +`exponentiate(x, y)` must match the result of `multiply` accumulated `y` times. + +```js +function exponentiate(x, y) { + let int32 = 1; + for (let i = 0; i < y; i++) { + int32 *= x; + } + return int32 | 0; +} +``` + +### `divide(x, y)` + +1. If `y` equals `0`, raise `Divide_by_zero`. +2. Let `number` be mathmatically $x / y$. +3. Let `int32` be `fromNumber(number)`, return `int32`. + +### `remainder(x, y)` + +1. If `y` equals `0`, raise `Divide_by_zero`. + +### `abs(x)` + +1. If `x` is `min_value`, raise `Overflow_value`. + +## API consideration + +## Questions + +### Why do we even use `int`? + +The use of `int` is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases. + +Additionally, using `int` is beneficial for JavaScript programs since major JavaScript engines treat integers differently. 
+ +Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine in Chromium) employs an internal element kind called "SMI" (Small integers). This provides an efficient memory representation for signed 32-bit integers and enhances runtime performance by avoiding heap allocation. + +At compile time, the compiler ensures that certain operations are restricted to using only `int` types. This increases the likelihood of utilizing the optimized execution paths for SMIs and reduces the potential for runtime de-optimization caused by element kind transitions. + +### Why do we truncate values instead of bounds-checking? + +It is also for backward compatibility. Bounds-checking and failure early may be more useful for fast feedback loop, but we don't want to break any programs that (accidentally) worked before. + +The `number | 0` is actually the most concise output form we can consistently use. Introducing any other runtime codes universally would lead to significant code bloat in the output. + +## Future posibilities + +Guaranteeing the use of int32 types may offer additional advantages in the future when targeting WebAssembly or alternative native backends. 
+ +[`ToInt32`]: https://262.ecma-international.org/#sec-toint32 From 6b58c9769ab3d0bec3f26ff4e2e736cb0bafdecf Mon Sep 17 00:00:00 2001 From: Hyeseong Kim Date: Tue, 19 Nov 2024 01:38:30 +0900 Subject: [PATCH 2/5] assign a PR number --- text/{0000-int.md => 0001-int.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-int.md => 0001-int.md} (98%) diff --git a/text/0000-int.md b/text/0001-int.md similarity index 98% rename from text/0000-int.md rename to text/0001-int.md index f173550..defb90f 100644 --- a/text/0000-int.md +++ b/text/0001-int.md @@ -1,7 +1,7 @@ --- Feature Name: rescript-integers Start Date: 2024-11-19 -RFC PR: (leave this empty) +RFC PR: https://github.com/rescript-lang/rfcs/pull/1 ReScript Issue: (leave this empty) --- From 5534179afca641e370c8c903702198b73041f27b Mon Sep 17 00:00:00 2001 From: Hyeseong Kim Date: Thu, 21 Nov 2024 10:36:04 +0900 Subject: [PATCH 3/5] add restriction to avoid `Math.imul` --- text/0001-int.md | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/text/0001-int.md b/text/0001-int.md index defb90f..dc41523 100644 --- a/text/0001-int.md +++ b/text/0001-int.md @@ -66,20 +66,38 @@ The [`ToInt32`] behavior follows the definition in ECMA-262 as is. In action, th 1. Let `number` be mathmatically $x * y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. +The `multiply(x, y)` must produce the same result as `add(x)` accumulated `y` times. + +```res +let multiply = (x, y) => { + let id = 0 + let rec multiply = (x, y, acc) => { + switch y { + | 0 => acc + | n => multiply(x, n - 1, add(x, acc)) + } + } + multiply(x, y, id) +} +``` + ### `exponentiate(x, y)` 1. Let `number` be mathmatically $x ^ y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. -`exponentiate(x, y)` must match the result of `multiply` accumulated `y` times. +The `exponentiate(x, y)` must produce the same result as `multiply(x)` accumulated `y` times. 
-```js -function exponentiate(x, y) { - let int32 = 1; - for (let i = 0; i < y; i++) { - int32 *= x; +```res +let exponentiate = (x, y) => { + let id = 1 + let rec exponentiate = (x, y, acc) => { + switch y { + | 0 => acc + | n => exponentiate(x, n - 1, multiply(x, acc)) + } } - return int32 | 0; + exponentiate(x, y, id) } ``` From a3aa732588d5d139fe120cf4c1b3da73550ae4a8 Mon Sep 17 00:00:00 2001 From: Hyeseong Kim Date: Fri, 22 Nov 2024 01:21:39 +0900 Subject: [PATCH 4/5] add a future consideration --- text/0001-int.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/0001-int.md b/text/0001-int.md index dc41523..2fd0295 100644 --- a/text/0001-int.md +++ b/text/0001-int.md @@ -117,6 +117,8 @@ let exponentiate = (x, y) => { ## API consideration +TBD + ## Questions ### Why do we even use `int`? @@ -135,6 +137,12 @@ It is also for backward compatibility. Bounds-checking and failure early may be The `number | 0` is actually the most concise output form we can consistently use. Introducing any other runtime codes universally would lead to significant code bloat in the output. +### Can we somehow make it match JavaScript's `number`? + +Perhaps in the future, we can make our number literals actually match JavaScript's number semantics. We could also rename `int` to `int32` and assign another literal like `0l`, as it was in OCaml syntax. + +However, this is not something that will happen in the near future. It won't occur until we are confident in our migration strategy to avoid breaking existing codebases. If done incorrectly, it could completely break compatibility with existing code. + ## Future posibilities Guaranteeing the use of int32 types may offer additional advantages in the future when targeting WebAssembly or alternative native backends. 
From a559ca67c836562b010dae224e7b9c55986f0d09 Mon Sep 17 00:00:00 2001 From: Hyeseong Kim Date: Sat, 22 Mar 2025 16:53:23 +0900 Subject: [PATCH 5/5] Update proposal * fix typos * more explanation --- text/0001-int.md | 75 ++++++++++++++++++++++++++++++------------------ 1 file changed, 47 insertions(+), 28 deletions(-) diff --git a/text/0001-int.md b/text/0001-int.md index 2fd0295..51a4aa7 100644 --- a/text/0001-int.md +++ b/text/0001-int.md @@ -7,11 +7,11 @@ ReScript Issue: (leave this empty) ## Summary -Semantics deifinition of the ReScript's `int` type and integer primitives. +Semantics definition of the ReScript's `int` type and integer primitives. ## Motivation -ReScript has three numeric primitive types, `int`, `float` and `bigint`. +ReScript has three numeric primitive types, `int`, `float`, and `bigint`. The semantics of `float` and `bigint` completely match JavaScript's ones, but `int` is unique to ReScript and originally came from OCaml's `int` type. @@ -21,49 +21,59 @@ This RFC describes its semantics and chosen trade-offs as precisely as possible. ## Definition -TBD +`int` is a built-in type. ```res type int +``` + +A numeric literal with only an integer part has type `int`. +```res let n = 100 ``` -Using unbounded integer literals may result in compile-time errors with messages such as `"Integer literal exceeds the range of representable integers of type int."` +The valid range of an integer literal is limited to the range of signed 32-bit integers $[-2^{31} .. 2^{31}-1]$. + +Using unbounded numbers in literals may result in compile-time errors with messages such as `"Integer literal exceeds the range of representable integers of type int."` ## Primitives -Let `max_value` be $2^{31}-1$ and `min_value` be $-2^{31}$. +Let `min_value` be $-2^{31}$ and `max_value` be $2^{31}-1$ -### `fromNumber(x: number)` +### `fromNumber: (x: number) => int` 1. If `x` is JavaScript's `Infinity`, return `max_value`. 2. 
If `x` is JavaScript's `-Infinity`, return `min_value`. 3. Let `int32` be [`ToInt32`]`(x)`, return `int32`. -The actions 1 and 2 are intended to reduce confusion when converting from an infinate value. (e.g. https://github.com/rescript-lang/rescript/issues/6737) However, it can be omitted if it is obvious that the `x` is not `Infinity` or `-Infinity`. +Actions 1 and 2 are intended to relax confusion when converting from infinite value directly. (e.g. https://github.com/rescript-lang/rescript/issues/6737) However, it can be omitted if the input is obviously not `Infinity` or `-Infinity`. + +The [`ToInt32`] behavior follows the definition in ECMA-262 as is. ReScript compiler uses `bitwiseOR(number, 0)` in action. This is what appears in the output as `number | 0`, which truncates all special numbers defined in IEEE-754. -The [`ToInt32`] behavior follows the definition in ECMA-262 as is. In action, the ReScript compiler uses `bitwiseOR(number, 0)`. This is what appears in the output as `number | 0`. And this removes all special numbers defined in IEEE-754. `int` never contain the following values: +`int` never contains the following values: +- `-0` - `NaN` - `Infinity` and `-Infinity` -- `-0` +- $x < $`min_value` +- $x > $`max_value` `fromNumber(x)` must be idempotent. -### `add(x: int, y: int)` +### `add: (x: int, y: int) => int` -1. Let `number` be mathmatically $x + y$. +1. Let `number` be mathematically $x + y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. -### `subtract(x, y)` +### `subtract: (x: int, y: int) => int` -1. Let `number` be mathmatically $x - y$. +1. Let `number` be mathematically $x - y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. -### `multiply(x, y)` +### `multiply: (x: int, y: int) => int` -1. Let `number` be mathmatically $x * y$. +1. Let `number` be mathematically $x * y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. The `multiply(x, y)` must produce the same result as `add(x)` accumulated `y` times. 
@@ -81,9 +91,9 @@ let multiply = (x, y) => { } ``` -### `exponentiate(x, y)` +### `exponentiate: (x: int, y: int) => int` -1. Let `number` be mathmatically $x ^ y$. +1. Let `number` be mathematically $x ^ y$. 2. Let `int32` be `fromNumber(number)`, return `int32`. The `exponentiate(x, y)` must produce the same result as `multiply(x)` accumulated `y` times. @@ -101,45 +111,54 @@ let exponentiate = (x, y) => { } ``` -### `divide(x, y)` +### `divide: (x: int, y: int) => int` 1. If `y` equals `0`, raise `Divide_by_zero`. -2. Let `number` be mathmatically $x / y$. +2. Let `number` be mathematically $x / y$. 3. Let `int32` be `fromNumber(number)`, return `int32`. -### `remainder(x, y)` +### `remainder: (x: int, y: int) => int` 1. If `y` equals `0`, raise `Divide_by_zero`. +2. Let `number` be mathematically $x / y$. -### `abs(x)` +### `abs: (x: int) => int` 1. If `x` is `min_value`, raise `Overflow_value`. ## API consideration +These primitive operations for `int` often don't work as intended by the user due to the `fromNumber` truncation. + +APIs that use this should make it safer by providing appropriate errors with standard types. + +### Standard error types + TBD ## Questions ### Why do we even use `int`? -The use of `int` is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases. +Using `int` is primarily for backward compatibility — not with OCaml, but with all existing ReScript codebases. -Additionally, using `int` is beneficial for JavaScript programs since major JavaScript engines treat integers differently. +Additionally, using `int` benefits JavaScript programs since major JavaScript engines treat integers differently. -Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine in Chromium) employs an internal element kind called "SMI" (Small integers). 
This provides an efficient memory representation for signed 32-bit integers and enhances runtime performance by avoiding heap allocation. +Depending on the implementation, integer values (especially 32-bit integers) may have a distinct memory representation compared to floating-point numbers. For example, V8 (the JavaScript engine for Chromium and Node.js) employs an internal element kind called "SMI" (Small integers). This provides an efficient memory representation for signed 32-bit integers and enhances runtime performance by avoiding heap allocation. -At compile time, the compiler ensures that certain operations are restricted to using only `int` types. This increases the likelihood of utilizing the optimized execution paths for SMIs and reduces the potential for runtime de-optimization caused by element kind transitions. +At compile time, the compiler ensures that certain operations are restricted to using only `int` types. This increases the likelihood of utilizing the optimized execution paths for SMIs and reduces the potential for runtime de-optimization caused by element-kind transitions. ### Why do we truncate values instead of bounds-checking? -It is also for backward compatibility. Bounds-checking and failure early may be more useful for fast feedback loop, but we don't want to break any programs that (accidentally) worked before. +It is also for backward compatibility. + +Bounds-checking and failure early may be more useful for a fast feedback loop, but we don't want to break any programs that (accidentally) worked before. -The `number | 0` is actually the most concise output form we can consistently use. Introducing any other runtime codes universally would lead to significant code bloat in the output. +The `number | 0` is the most concise output form we can consistently use. Introducing any other runtime codes universally would lead to significant code bloat in the output. ### Can we somehow make it match JavaScript's `number`? 
-Perhaps in the future, we can make our number literals actually match JavaScript's number semantics. We could also rename `int` to `int32` and assign another literal like `0l`, as it was in OCaml syntax. +Perhaps, we can make our number literals match JavaScript's number semantics. We could also rename `int` to `int32` and assign another literal like `0l`, as it was in OCaml syntax. However, this is not something that will happen in the near future. It won't occur until we are confident in our migration strategy to avoid breaking existing codebases. If done incorrectly, it could completely break compatibility with existing code.
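
The `fromNumber`, `add`, and `divide` steps specified in this RFC can be sketched in plain JavaScript as below. This is a minimal illustration under stated assumptions, not the compiler's actual runtime. The constant names `MIN_VALUE` and `MAX_VALUE` and the use of a plain `Error` in place of the `Divide_by_zero` exception are hypothetical choices made for the example.

```javascript
// Illustrative sketch of the RFC's primitive steps; not the ReScript runtime.
// MIN_VALUE / MAX_VALUE are hypothetical names for the RFC's min_value / max_value.
const MIN_VALUE = -(2 ** 31); // -2147483648
const MAX_VALUE = 2 ** 31 - 1; // 2147483647

// fromNumber: (x: number) => int
function fromNumber(x) {
  if (x === Infinity) return MAX_VALUE; // action 1
  if (x === -Infinity) return MIN_VALUE; // action 2
  return x | 0; // action 3: ToInt32 via bitwise OR with 0
}

// add: (x: int, y: int) => int
function add(x, y) {
  return fromNumber(x + y);
}

// divide: (x: int, y: int) => int
function divide(x, y) {
  if (y === 0) {
    // Stand-in for the RFC's Divide_by_zero exception (assumption).
    throw new Error("Divide_by_zero");
  }
  return fromNumber(x / y);
}

console.log(fromNumber(NaN)); // 0
console.log(fromNumber(1.9)); // 1
console.log(add(MAX_VALUE, 1)); // -2147483648 (wraps around)
console.log(divide(7, 2)); // 3
```

Because `x | 0` always yields a value that is already a 32-bit integer, applying `fromNumber` a second time changes nothing, which is the idempotence the RFC requires. The wrap-around in `add(MAX_VALUE, 1)` is the truncation behavior discussed under "Why do we truncate values instead of bounds-checking?".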