View this file with results and syntax highlighting here.
This portion of the spec is still potentially subject to major changes.
The •
symbol is used to access values other than primitives provided by BQN.
All system values described in the BQN specification are optional: an implementation does not have to include any of them. However, if a system value with one of the names given below is included, then it must have the specified behavior. For namespaces this rule applies to individual fields as well: a namespace may be provided with only some of the fields, but a field with one of the given names must behave as specified.
Section | Contents |
---|---|
Execution | •BQN , •ReBQN , •primitives |
Control | •_while_ |
Scripts | •Import , •args , •Exit , … |
Files (paths, metadata, access, opened) | •file , •FChars , •FLines , •FBytes |
Input and output (terminal) | •Out , •Show , •Repr , •Fmt , …, •term |
Interface (FFI) | •SH , •FFI |
Platform | •platform |
Operation properties | •Type , •Glyph , •Source , •Decompose |
Namespaces | •ns |
Data structures | •HashMap |
Time | •UnixTime , •Delay , •_timed , … |
Math | •math |
Random generation | •rand , •MakeRand |
Bitwise operations | •bit |
Name | Summary |
---|---|
•BQN |
Evaluate the argument string in an isolated scope |
•ReBQN |
Create a BQN-like evaluation function with options 𝕩 |
•primitives |
List of primitives as glyph-value pairs |
The left argument to •BQN
or the result of •ReBQN
, if given, is a list of up to three elements, giving a prefix of •state
(see next section) during evaluations of that function. Thus ⟨"","xyz"⟩•BQN"•name"
returns "xyz"
.
•ReBQN
accepts a namespace 𝕩
. The following options are specified if supported:
Option | Values (default first) |
---|---|
repl |
"none" , "strict" , "loose" |
primitives |
List of glyph-value pairs; default •primitives |
system |
"all" , "none" , "safe" or list of names |
scope |
"none" , "read" , "modify" or list of name-setting pairs |
The option repl
indicates how variables are retained across calls: with "none" they are not saved; with "strict", they are saved and can't be redefined; and with "loose" they may be redefined. Each element in primitives
gives the glyph and value for a primitive to be made available. The value must have an operation type and its type determines the primitive's role. system
in general gives the list of system values to be made available, with shorthand values to indicate all currently-available ones, none of them, or only a subset that cannot be used to interact with anything outside of the execution context. scope
indicates allowed interaction with the scope in which •ReBQN
is called (not loaded): with "read" variables may be read and with "modify" they may be read or modified.
Name | Summary |
---|---|
•_while_ |
Run 𝔽 as long as 𝔾 returns 1 |
The 2-modifier •_while_
repeatedly checks 𝕨𝔾v
, where v
is initially 𝕩
. The result must be 0
or 1
, or this causes an error. If 0
, it returns v
, and if 1
it updates v
to 𝕨𝔽v
and starts again. An equivalent implementation minus the boolean check is {𝕨(𝕨𝕊𝔽)⍟𝔾𝕩}
. However, this code consumes stack space with every call in a BQN implementation without tail call optimization. •_while_
should use only a constant amount of memory.
Name | Summary |
---|---|
•Import |
Load a script file |
•state |
⟨•path, •name, •args⟩ |
•args |
Arguments passed to current file |
•path |
Current file's path |
•name |
Current filename |
•wdpath |
Shell's working directory path |
•Exit |
Leave the top-level running program |
•Import
loads another BQN script. The script is evaluated in its own isolated scope. Like a block, its result is either the result of the last line, or a namespace if it exports with ⇐
at the top level.
The right argument is a filename, which may be relative or absolute. Relative paths are taken relative to the source file where this instance of •Import
was written. The left argument, if given, is the list of arguments that should be passed through to the file as •args
. If no left argument is given then ⟨⟩
is used for •args
. However, the behavior is different in this case. The same file will only be loaded once in a given BQN program by •Import
calls with no left argument: the first such call saves the returned value, even if it is mutable, and subsequent calls return this saved value. To avoid this and reload the file, pass a left argument of ⟨⟩
.
•args
is the arguments passed as the file was invoked, either from the command line or •Import
. For command line calls it is a list of strings.
•path
simply gives the path of the file in which it appears. It includes a trailing slash but not the name of the file itself.
•name
gives the name, including the extension, of the file in which it appears. It doesn't include the path.
•wdpath
returns the path of the current working directory, like the Unix pwd
command, but including a trailing slash.
•Exit
immediately terminates the running BQN process. If the argument is a valid return code (on Unix, an integer), it is returned; otherwise, the default return code (the one returned when the end of the program is reached) is used.
The system namespace value •file
deals with file operations. For the purposes of •file
, paths in the filesystem are always strings. As with •Import
, file paths may be relative or absolute, and relative paths are relative to •path
, except in •file.At
which allows 𝕨
to specify an alternate base directory. The value •path
used for a particular instance of •file
is determined by the file that contains that instance.
Possible fields of •file
are given in the subsections below.
The following functions manipulate paths and don't access files. Each takes a relative or absolute path 𝕩
, and At
may also take a base directory 𝕨
.
Name | Summary |
---|---|
path |
Path of this source file, that is, •path |
At |
Join base path 𝕨 , or •path if not given, to 𝕩 |
Name |
File name including extension |
Parent |
Path of the containing directory, with trailing backslash |
BaseName |
File name, with dot and extension removed |
Extension |
File extension, including leading dot |
Parts |
List of parent, base name, and extension |
The path returned by At
is not simplified to remove .
or ..
: it contains all components of the two joined paths. If 𝕩
is an absolute path, it returns 𝕩
. If 𝕩
is empty, it returns 𝕨
unchanged.
Metadata functions may query information about a file or directory but do not read to or write from it. Each takes a path 𝕩
, and some functions also allow new data in 𝕨
. The returned data in any case is the specified property.
Name | Summary |
---|---|
Exists |
1 if the file exists and 0 otherwise |
Type |
A character indicating the file's type |
Created |
Time created |
Accessed |
Time of last access |
Modified |
Time of last modification |
Size |
Total size in bytes |
Permissions |
Query or set file permissions |
Owner |
Query or set owner user ID and group ID number |
RealPath |
Canonical path of file |
Times are Unix timestamps, that is, non-leap seconds since the Unix epoch, as used by time system values. File permissions on Unix are a three-element list of numbers giving the permissions for the owner, group, and other users. The file type is one of the following characters for the POSIX file types, matching Unix ls -l
with 'f'
instead of '-'
.
'f'
: File'd'
: Directory'l'
: Symlink'p'
: Pipe (FIFO)'s'
: Socket'b'
: Block device'c'
: Character device
RealPath
converts the argument file to an equivalent one without any components that are .
or ..
or symlinks. This requires the directories along the path to check if they are symlinks.
File access functions read or write files, either by manipulating files as a whole or interacting with the contents. Whole-file functions cannot overwrite target files: that is, Rename
and Copy
must give an error if a file exists at 𝕨
, and CreateDir
if a file exists at 𝕩
, while Chars
, Lines
, and Bytes
can overwrite the contents of an existing file 𝕨
. However, these three functions must give an error if 𝕨
exists and is a directory.
Name | Summary |
---|---|
Open |
Return an open file object based on 𝕩 |
Rename |
Rename file 𝕩 with path 𝕨 |
Copy |
Copy file 𝕩 to path 𝕨 |
CreateDir |
Create a directory at path 𝕩 |
Remove |
Delete file 𝕩 |
RemoveDir |
Recursively delete directory 𝕩 and all contents |
List |
Return names of all files in directory 𝕩 |
Chars |
Read from or write to entire file, as characters |
Lines |
Read from or write to entire file, as lines |
Bytes |
Read from or write to entire file, as bytes |
MapBytes |
Create a memory-mapped array aliasing the file |
Rename
, Copy
, and CreateDir
return the path of the new file. Remove
and RemoveDir
return 1
to indicate successful removal (and error otherwise).
List
returns filenames only, without full paths. It lists all files and directories including hidden ones, but not the current and parent directory names .
and ..
.
Functions Chars
, Lines
, and Bytes
are all ambivalent. If only 𝕩
is given, then it is a filename, and the result is the contents of the file in the appropriate format. If there are two arguments, then 𝕨
is the filename and 𝕩
is the desired contents. These are written to the file, overwriting its contents, and the absolute filename 𝕨
is returned. The three formats are:
- Chars: List of BQN characters, effectively UTF-32. The file is assumed to be UTF-8 encoded.
- Lines: List of BQN strings. The file is decoded as with chars, then split into lines by CR, LR, or CRLF line endings.
- Bytes: List of single-byte values, stored as BQN characters from
@
to@+255
.
The MapBytes
function only takes one argument, a filename, and returns an array matching the result of Bytes
. However, the array data should be memory-mapped allowing it to be loaded into memory on use. Undefined behavior, including a crash or writing arbitrary memory, may result if the underlying file is modified after MapBytes
is called, and the result or slices of it are still used.
The following short names can also be provided for file access. If so, they use the definitions from above even if •file
is not provided.
Name | Equivalent |
---|---|
•FChars |
•file.Chars |
•FLines |
•file.Lines |
•FBytes |
•file.Bytes |
Not yet specified.
Name | Summary |
---|---|
•Out |
Print argument string |
•Show |
Print argument value |
•Repr |
String representation of 𝕩 , if possible |
•Fmt |
Format value for printing |
•ParseFloat |
Convert from string to floating point number |
•Out
prints a string to stdout, with a trailing newline. •Show
displays a BQN value to the programmer (the representation is not specified, and does not need to be plain text). •Fmt
returns a string (not a character table: lines are separated by linefeeds) indicating how 𝕩
would be printed by the interactive environment. Both •Show
and •Fmt
may take a left argument configuring how the value should be formatted.
•Repr
attempts to return a string so that •BQN •Repr 𝕩
matches 𝕩
. If 𝕩
contains any mutable values (operations or namespaces), this is not possible. However, if such a values is stateless, in the sense that they don't access variables outside of their own scopes, it is permissible for •Repr
to return source code that would create a value with identical behavior.
•ParseFloat
returns the numeric value given by a string 𝕩
in integer, decimal, or scientific notation. The whole of 𝕩
must match the regular expression -?(\.[0-9]+|[0-9]+\.?[0-9]*)([eE][-+]?[0-9]+)?
or an error is given. This format is similar to BQN's numeric literals but with many differences. Only -
(not ¯
) can be used for a negative sign, and a positive exponent may be optionally preceded by +
. A dot .
indicates the decimal regardless of locale, and digits can be absent either before or after it, but not both. The function should make an effort to return the nearest possible value to the exact one represented, but is not required to round perfectly in all cases.
The system namespace •term
gives fine-grained control of input and output when running in a terminal emulator or similar text-based interface.
Name | Summary |
---|---|
Flush |
Flush output buffer |
RawMode |
Set raw mode (no input processing) if 1 , unset if 0 |
CharB |
Read a character, blocking until one is available |
CharN |
Read a character if available but return 0 if not |
The function •SH
allows BQN to call other programs, as an operating system shell would. •FFI
allows it to call functions compiled by C or compatible languages—these are stored in files that traditionally have names like lib*.so
in Unix. In both cases the callee can run unrestricted code, so only trusted programs and functions should be called this way.
Name | Summary |
---|---|
•SH |
Execute shell command and return exitcode‿stdout‿stderr |
•FFI |
Load a native function from a shared object file |
The argument to •SH
is a list of strings giving the command and its arguments (for example "mv"‿"old"‿"new"
). The command is executed synchronously, and the result is a list of three elements: the command's exit code, text written to stdout, and text written to stderr. In both cases the text is a plain string containing all text emitted by the program. Text is interpreted as UTF-8, with an error if it's not valid UTF-8.
The arguments to •FFI
are a file path for 𝕨
(interpreted relative to •path
if necessary, like •file
functions), and a function descriptor for 𝕩
, which gives the function name, argument and result types, and information about how to convert these values. The format of 𝕩
is described in the next section. The result is a BQN function that calls the specified function. This call can crash, mutate values, or invoke other unexpected behavior if the function interferes with memory used by BQN.
In a call to •FFI
, 𝕩
follows the pattern "result"‿"fn"‿"arg0"‿"arg1"‿...
, that is, a string for the result type, one for the function name, and any number of strings indicating argument types. 𝕩
must always be a list.
The function name is an arbitrary string. In order to look up the appropriate function in shared object file 𝕨
, it's encoded as UTF-8.
Types are to be interpreted according to the C ABI appropriate for the platform used. The grammar for a result or argument type is given below, using BNF as in the BQN grammar. Quoted values here are single characters: the type isn't tokenized and can't contain spaces. A •FFI
implementation does not need to support all combinations of types.
conv = type ( ":" bqn )?
type = ( "i" | "u" | "f" ) nat # number
| "a" # BQN object
| ( "*" | "&" ) type? # pointer
| "[" nat "]" type # array
| "{" ( conv ( "," conv )* )? "}" # struct
bqn = ( "i" | "u" | "f" | "c" ) nat
nat = digit+
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
By default, the returned function takes a list of arguments 𝕩
, requires 𝕨
to be an empty list if present, and returns a value corresponding to the C result. Some argument-specific rules can change this:
- The result type may also be the empty string
""
, indicating a void or ignored result, or"&"
, indicating an ignored result, using a mutable argument for the BQN result, as discussed below. It can't contain the character"&"
otherwise. - An argument type may be preceded by up to one
>
, and up to one𝕨
or𝕩
, in any order. Arguments with𝕨
are taken from𝕨
in order, and the others from𝕩
. If no arguments come from𝕨
, the BQN function may be called monadically. If an argument type contains>
, it must be the only value in its BQN argument (𝕨
or𝕩
), and that argument will be treated not as a list but as an entire value.
Beginning with the type declarations themselves, a number such as f32
corresponds to a C type with the given quality (i
for signed integer, u
for unsigned, f
for floating-point) and width in bits. The corresponding BQN value is a number, and should be converted exactly for integers and with rounding for decreasing-type conversions. For conversions to or from an integer type, attempting to convert a value to a type that can't contain it, or one outside of the exactly representable integer range (-2⋆53
to 2⋆53
for IEEE doubles), results in an error.
A pointer such as *u8
comes from a BQN list (or a pointer object, as discussed in the next paragraph). If the symbol &
is used rather than *
, the pointer is called mutable and its contents after the function call completes are also returned as an element of the result. If there is any mutable pointer, the result is a list, unless the result type is "&"
, in which case there must be exactly one mutable pointer and the result is its value alone. These prefixes can only be used in arguments, meaning that a BQN value is provided, and this value determines the length of both the input and the mutable result.
When the result contains a pointer type, that value will be returned as a new pointer object with the given type. Furthermore, a pointer in an argument accepts a pointer object whose element type is compatible with the specified one. Types are compared recursively for compatibility: they're compatible if all contained types (such as fields of a struct) are compatible, and all other aspects (such as array length) are identical, or if one is an untyped pointer *
and the other is any pointer. This means a bare pointer type *
or &
can't accept a list, but allows any pointer. When used with &
, the object is included in the result unchanged.
The letter a
indicates that a BQN value is to be passed directly, interpreted in whatever way makes sense for the implementation. The array and struct types indicate C structs and arrays, and correspond to BQN lists.
The bqn
value in a conv
term indicates a BQN element type to be used. It can be appear after the whole type, or any member of a struct, and applies to the final component (that is, type
term) of the type and one preceding *
, &
, or [n]
if present (if a type ends in **
, it applies to both *
s). This portion of the type corresponds to a BQN list of the given element type, interpreted much like bitwise conversion •bit._cast
. The C type is treated as pure data, a stream of bits. For a prefix *
or &
, the data in question is the region of memory pointed to.
A pointer object encapsulates a pointer into memory, and can be used in FFI functions. It is either untyped (represented by an empty type string ""
), or has an element type and a stride in bytes.
Name | Summary |
---|---|
Read |
Read value at offset 𝕩 |
Write |
Write value 𝕩 at offset 𝕨⊣0 |
Add |
Return a new pointer offset by 𝕩 |
Sub |
Offset by -𝕩 , or return the offset from 𝕩 to this pointer |
Cast |
Return a new pointer to the same location with type 𝕩 |
Field |
Return a new pointer to field number 𝕩 , maintaining stride |
An untyped pointer gives an error on any Read
, Write
, Add
, or Sub
. Offsets must be integers, and are multiplied by the pointer stride, giving a displacement in bytes, in order to add it to the location. A BQN value to be written is converted in the same way as an FFI argument of the pointer's element type, and a value to be read is converted like an FFI result. This relies on the C ABI for layout details, and could be modelled as an FFI interface to a C function that writes to, or reads from, a pointer.
Add
and Sub
, with an integer argument, return a new pointer, maintaining the argument type and stride. When the argument to Sub
is a pointer, it must be typed, and the two pointers must have compatible types and the same stride. The result is the difference in bytes between the pointers, divided by the common stride. If the difference wasn't an even multiple of the stride, Sub
gives an error.
The argument to Cast
is an element type, or ""
, and the result is accordingly a typed or untyped pointer. If typed, the returned pointer's stride is the width of its element type; that is, the original pointer's stride is ignored.
Field
gives an error unless the pointer's element type is a struct or array. It must be a natural number less than the length of that type. The result is a pointer that points to the field with that index, with the type of that field and the stride of the original pointer.
The platform is the set of mechanisms responsible for evaluating the current instance of BQN, including CPU, operating system, host language, and the BQN implementation itself, when applicable. System namespace •platform
contains information about it.
Name | Summary |
---|---|
os |
Operating system: "linux", "windows" |
environment |
What evaluates the implementation: "native", "wasm", "jvm" |
cpu.arch |
Instruction set architecture of the CPU: "x86-64", "aarch64" |
bqn.impl |
Name of the BQN implementation: "CBQN", "JS BQN" |
bqn.implVersion |
Implementation-specific version numbering |
os
, environment
, and cpu.arch
are reported in all lowercase. The value "unknown"
is used if that aspect of the platform can't be determined, and "none"
if it's not present.
The operating system is the tool that manages system resources such as processes, files, and devices. When there are multiple OS layers, for example when using a VM or compatibility layer, the one that directly interfaces with BQN (the lowest or "most virtual") should be chosen.
The environment is the system that evaluates the BQN implementation, generally either the hardware itself ("native") or a language runtime such as Javascript, the JVM, WebAssembly (Wasm), or Julia.
Namespace •platform.cpu
reports CPU information, with arch
giving its base instruction set architecture. When run on an emulated CPU the emulated architecture may be reported, that is, native-compiled BQN implementations may simply report their compilation target.
Namespace •platform.bqn
reports information about the implementation of BQN in use. The format of its name and version numbering can be freely chosen by the implementation. Taken together, these should uniquely identify the source code of the running implementation if possible.
Name | Summary |
---|---|
•Type |
Return a number indicating type |
•Glyph |
Return the glyph for a primitive |
•Source |
Return the source of a block, as a string |
•Decompose |
Show the parts of a compound function |
Each function in this section is monadic.
•Type
gives its argument's type, as a number from the table below:
Number | Type |
---|---|
0 | Array |
1 | Number |
2 | Character |
3 | Function |
4 | 1-modifier |
5 | 2-modifier |
6 | Namespace |
•Glyph
gives the glyph corresponding to a primitive as a single character, for example returning '+'
given an argument matching +
. It causes an error if the argument is not a primitive.
•Source
gives a string containing a block's source, including the enclosing braces {}
. It causes an error if the argument is not a block. In contrast to •Glyph
, this function does not give full information about 𝕩
because the result cannot convey environment or mutable identity.
•Decompose
breaks down one level of a compound function, returning a list with a code giving what kind of structure it has followed by each of its components. Possible codes are listed in the table below according to the rule that forms the function—train or modifier application. Non-function values, and some functions, can't be broken down. They are still classified with a code: -1 for a non-operation, 0 for a primitive, and 1 for other operations. "Other" includes blocks and system operations. The result is thus a list of length 2 to 4, and •Decompose
cannot cause an error.
Kind | Code | Components |
---|---|---|
Non-operation | -1 | 𝕩 |
Primitive | 0 | 𝕩 |
Other | 1 | 𝕩 |
2-train | 2 | g,h |
3-train | 3 | f,g,h |
1-mod | 4 | 𝕗,𝕣 |
2-mod | 5 | 𝕗,𝕣,𝕘 |
The system namespace •ns
contains functionality for working with namespaces.
Name | Summary |
---|---|
Keys |
List of normalized field names in 𝕩 |
Has |
1 if 𝕨 contains a field named 𝕩 and 0 otherwise |
Get |
Current value in 𝕨 of the field named 𝕩 |
Keys
returns field names as strings, normalized in the sense that all underscores are removed and alphabetic characters are converted to lowercase. The order of the names is unspecified. Has
and Get
accept names with any spelling. Get
causes an error if 𝕩
isn't the name of a field of 𝕨
, while Has
causes an error only if it isn't a string, returning 0
for any string that isn't a valid name.
The system function •HashMap
creates a mutable object from the list of initial keys 𝕨
and values 𝕩
that maintains an association mapping keys to values. It has the following fields:
Name | Summary |
---|---|
Count |
return the number of keys present |
Keys |
return all keys as a list in the order they were set |
Values |
return all values corresponding to the keys |
Has |
return 1 if key 𝕩 is present and 0 otherwise |
Get |
return value for key 𝕩 , and 𝕨 or error if not found |
Set |
set value for key 𝕨 to 𝕩 , possibly replacing an existing value |
Delete |
remove the entry for key 𝕩 |
Fields Count
, Keys
, and Values
take one argument and ignore it. Set
and Delete
return the map object itself.
Any BQN value can be used as a key, and two keys are considered the same if they match (as in ≡
). The initial key list 𝕨
passed to •HashMap
can't contain duplicates, that is, the function errors if ¬∧´∊𝕨
. This key list defines the initial ordering used by Keys
, which may then be modified by Set
and Delete
. When Set
adds a key that's not currently present, that key is added to the end of the ordering. When Delete
removes a key, it's removed from the ordering, and will be added to the end if set again.
Name | Summary |
---|---|
•UnixTime |
Time between Unix epoch and function call |
•MonoTime |
Monotonically-increasing time counter for relative measurement |
•Delay |
Wait at least 𝕩 seconds, and return the actual wait time |
•_timed |
Call 𝔽 on 𝕩 𝕨⊣1 times, and return the average duration |
•_maxTime_ |
Call 𝔽 on the arguments, but fail if it takes over 𝕨𝔾𝕩 seconds |
All times are measured in seconds.
The Unix epoch is 1970-01-01 00:00:00 UTC, and Unix time is the number of seconds since this epoch, with adjustments for leap seconds. •UnixTime
is intended for absolute time measurement and should use the source most accurate reflects Unix time when it's called. •MonoTime
is intended for relative measurement and should use the method that gives the most precise time differences over the course of the program. Its return value must never decrease between calls.
•_timed
returns the total time taken divided by the number of function calls (𝕨
if provided and 1 otherwise), including the overhead required for the outer loop that counts iterations (which will typically be negligible in comparison to the BQN code).
More accurately the modifier •_maxTime_
may fail if execution of 𝔽
takes over 𝕨𝔾𝕩
seconds, and should fail as quickly as it is practically able to. The most likely way to implement this modifier is to interrupt execution at the given time. If 𝔽
completes before the interrupt there is no need to measure the amount of time it actually took.
System namespace •math
contains mathematical utilities that are not easily implemented with basic arithmetic, analogous to C's math.h
.
Correctly-rounded arithmetic functions: monadic Cbrt⇐3⊸√
, Log2⇐2⋆⁼⊢
, Log10⇐10⋆⁼⊢
, Log1p⇐⋆⁼1⊸+
, Expm1⇐1-˜⋆
; dyadic Hypot⇐+⌾(ט)
.
Standard trigonometric functions Sin
, Cos
, Tan
, Sinh
, Cosh
, Tanh
, with inverses preceded by a
(ASin
, etc.) and accessable with ⁼
. Additionally, the dyadic function ATan2
giving the angle of vector 𝕨‿𝕩
relative to 1‿0
. All trig functions measure angles in radians.
Special functions Fact
and LogFact
giving the factorial and its natural logarithm, possibly generalized to reals as the gamma function Γ(1+𝕩), and Comb
giving the binomial function "𝕨
choose 𝕩
". Also the error function Erf
and its complement ErfC
. The implementations LogFact ← ⋆⁼Fact
and ErfC ← 1-Erf
are mathematically correct but these two functions should support greater precision for a large argument.
The greatest common divison GCD
and least common multiple LCM
of two numbers. Behavior for arguments other than natural numbers is not yet specified.
•MakeRand
initializes a deterministic pseudorandom number generator with seed value 𝕩
. •rand
, if it exists, is a globally accessible generator initialized at first use; this initialization should use randomness from an outside source if available. These random generators aren't required to be cryptographically secure and should always be treated as insecure. A random generator has the following member functions:
Name | Summary |
---|---|
Range |
A number, or array of shape 𝕨 , selected from ↕𝕩 |
Deal |
A simple random sample of 𝕨⊣𝕩 elements of ↕𝕩 |
Subset |
A sorted SRS of ↕𝕩 , with 𝕨 elements if given |
For each of these functions, 𝕩
is a natural number. For Range
, 𝕨
must be a valid shape (natural number, or list or unit array of natural numbers) if given, and for Deal
and Subset
it's a natural number less than or equal to 𝕩
. All selections are made uniformly at random, that is, each possible result is equally likely. A simple random sample (SRS) of k
elements from list s
is a list of k
distinct elements of s
in any order. Both the choice of elements and their ordering must be uniformly random. Recommended algorithms for SRS selection are variants of a partial Knuth shuffle.
When 𝕨
isn't given, Deal
's result contains all elements of ↕𝕩
, making it a random shuffle of those values, or random permutation. In Subset
, a random choice is made uniformly from the 2⋆𝕩
subsets of ↕𝕩
, so that a subset of any length may be returned.
In Range
, 𝕩
may be 0
. In this case the result consists of floating-point numbers in the unit interval from 0 to 1. The numbers should have an overall uniform distribution, but their precision and whether the endpoints 0 and 1 are possible may depend on the implementation.
Ranges up to 2⋆32
must be supported (that is, a maximum integer result of (2⋆32)-1
) if the number system accommodates it. In implementations based on double-precision floats it's preferable but not required to support ranges up to 2⋆53
.
The system namespace •bit
gives functions for efficiently applying bitwise and two's complement integer operations to arrays of data. These functions should compute result values with native CPU instructions, preferably SIMD if available.
Name | Args | Type | Behavior |
---|---|---|---|
_not |
1 | bit | ¬ |
_and |
2 | bit | ∧ |
_or |
2 | bit | ∨ |
_xor |
2 | bit | ≠ |
_neg |
1 | integer | - |
_add |
2 | integer | + |
_sub |
2 | integer | - |
_mul |
2 | integer | × |
_cast |
1 | any | identity |
The _cast
modifier is special and not considered an operation; see below. Each operation is exposed as a 1-modifier that takes up to four numbers for its operand.
- Operation width
- Result element width
- Right argument element width
- Left argument element width
The operand is extended to length 3 for monadic operations, and 4 for dyadic operations, by repeating the last element (like
»⟜(4⥊⊢´)
). It must be a number, or array of numbers with rank at most 1.
An example call is a 16‿1•bit._add b
, to perform 16-bit additions on two boolean arrays with a boolean result.
To apply a bitwise operation, each argument is represented as a stream of bits based on the width given for it, then split into units whose width is the operation width. The operation is applied to these units. The result is again treated as a stream of bits and split according to the result width, with each unit forming a result element.
The operation width, along with the "Type" column above, determines what operation is performed. For bit operations it has no effect, except to constrain the argument sizes according to the shape rules below. Integer operations support widths of 8 and above, and should support higher values such as 128 if available. For all of them, there is no difference between wrapping signed and unsigned operations given that the argument and result widths are the same.
Argument and result widths correspond to little-endian binary representations according to the following table (operation widths don't—see the "type" field in the table above). Here "boolean" indicates value 0 or 1, "signed integer" indicates two's complement representation, and "character" is a code point in an unsigned representation. Either type may be used for an argument, but the result will always use a primary type.
Width | Primary type | Also |
---|---|---|
1 | Boolean | |
8 | Signed integer | Character |
16 | Signed integer | Character |
32 | Signed integer | Character |
64 | IEEE 754 double |
An argument must be an array of numbers or an array of characters. Its elements must fit into the appropriate type. The "cell size" is the length in bits of a 1-cell, that is, widthׯ1⊑1∾≢arg
, and must be a multiple of the operation width. The "leading shape" is ¯1↓≢arg
. For two-argument functions one argument can be scalar-extended if it has rank 1 and cell size equal to the operation width. Otherwise both arguments must have the same cell size, and the same leading shape. The result shape is the leading shape of any non-extended argument followed by its cell size divided by the result element width. As a scalar-extended argument indicates a single operation input, it's reused every time the operation is applied.
Another tool is provided for performing direct conversions, with no operation applied. The 1-modifier •bit._cast
takes a two-element operand and one argument, for example ⟨8,16‿'c'⟩•bit._cast ints
to convert each pair of numbers in ints
into a 2-byte character. Each element of 𝕗
is a number or number-character pair giving width and type. The argument is encoded according to the first and decoded according to the second. The possible type characters and typical supported widths are given below.
Character | Widths | Quality |
---|---|---|
'u' |
1, 8, 16, 32 | Unsigned integer, or boolean |
'i' |
8, 16, 32 | Signed integer |
'c' |
8, 16, 32 | Character |
'f' |
64 | IEEE float |