diff --git a/documentation/concept/array.md b/documentation/concept/array.md
index 0c5edb00..6ecb5393 100644
--- a/documentation/concept/array.md
+++ b/documentation/concept/array.md
@@ -103,21 +103,6 @@ QuestDB always stores arrays in vanilla form. If you transform an array's shape
 and then store it to the database, QuestDB will physically rearrange the
 elements, and store the new array in vanilla shape.
 
-## Array and NULL/Nan/Infinity values
-
-QuestDB does not support `NULL` values in arrays. In a scalar `DOUBLE` column,
-if the value is `NaN`, it is treated as `NULL` in calculations. However, when it
-appears inside an array, it is treated as such, and not `NULL`. If it appears in
-an array and you take it out using the array access expression, the resulting
-scalar value will again be treated as `NULL`, however whole-array operations
-like `array_sum` will treat it according to its floating-point number semantics.
-
-QuestDB currently has an inconsistency in treating floating-point infinities.
-They are sometimes treated as `NULL`, and sometimes not. Infinity values
-currently produce unspecified behavior as scalar `DOUBLE` values, but inside an
-array, while performing whole-array operations, they are consistently treated as
-infinity.
-
 ## The ARRAY literal
 
 You can create an array from scalar values using the `ARRAY[...]` syntax, as
diff --git a/documentation/concept/replication.md b/documentation/concept/replication.md
index 92bbcebd..820f2034 100644
--- a/documentation/concept/replication.md
+++ b/documentation/concept/replication.md
@@ -87,8 +87,8 @@ timeline
     Currently : AWS S3
               : Azure Blob Store
               : NFS Filesystem
+              : Google Cloud Storage
     Next-up : HDFS
-    Later on : Google Cloud Storage
 ```
 
 Something missing? Want to see it sooner? [Contact us](/enterprise/contact)!
@@ -109,6 +109,12 @@ is:
 An example of a replication object store configuration using NFS is:
 
 `replication.object.store=fs::root=/mnt/nfs_replication/final;atomic_write_dir=/mnt/nfs_replication/scratch;`
 
+An example of a replication object store configuration using GCS is:
+
+`replication.object.store=gcs::bucket=;root=/;credential=;`
+
+For `GCS`, you can also use `credential_path` to set a key-file location.
+
 See the [Replication setup guide](/docs/operations/replication) for direct
 examples.
diff --git a/documentation/operations/replication.md b/documentation/operations/replication.md
index 9f2d3289..c56379e0 100644
--- a/documentation/operations/replication.md
+++ b/documentation/operations/replication.md
@@ -201,6 +201,34 @@ With your values, skip to the
 [Setup database replication](/docs/operations/replication/#setup-database-replication)
 section.
 
+### Google GCP GCS
+
+First, create a new Google Cloud Storage (GCS) bucket, most likely with `Public access: Not public`.
+
+Then create a new service account, and give it read-write permissions for the bucket. The simplest
+role is `Storage Admin`, but you may set up more granular permissions as needed.
+
+Create a new private key for this user and download it in `JSON` format. Then encode this key as `Base64`.
+
+If you are on Linux, you can `cat` the file and pass it to `base64`:
+
+```
+cat .json | base64
+```
+
+Then construct the connection string:
+
+```
+replication.object.store=gcs::bucket=;root=/;credential=;
+```
+
+If you do not want to put the credentials directly in the connection string, you can swap the `credential`
+key for `credential_path`, and give it a path to the key-file.
+
+With your values, continue to the
+[Setup database replication](/docs/operations/replication/#setup-database-replication)
+section.
+
 ## Setup database replication
 
 Set the following changes in their respective `server.conf` files:
diff --git a/documentation/reference/function/row-generator.md b/documentation/reference/function/row-generator.md
index fdf43edd..067680b6 100644
--- a/documentation/reference/function/row-generator.md
+++ b/documentation/reference/function/row-generator.md
@@ -94,3 +94,213 @@ use the same seed in long_sequence.
 | ------------------ |
 | 0.8251337821991485 |
 | 0.2714941145110299 |
+
+## generate_series
+
+Rather than generating a fixed number of entries, `generate_series` can instead be used
+to generate entries within a bounded range.
+
+This function supports `LONG` and `DOUBLE` generation. There is also a `TIMESTAMP` generating version,
+which can be found [here](/documentation/reference/function/timestamp-generator.md)
+
+The `start` and `end` values are interchangeable, and a negative `step` value can be used to
+obtain the series in reverse order.
+
+The series is inclusive on both ends.
+
+**Arguments:**
+
+`generate_series(start_long, end_long, step_long)` - generates a series of longs.
+
+`generate_series(start_double, end_double, step_double)` - generates a series of doubles.
+
+
+**Return value:**
+
+Return value type is `times---
+title: Row generator
+sidebar_label: Row generator
+description: Row generator function reference documentation.
+---
+
+The `long_sequence()` function may be used as a row generator to create table
+data for testing. Basic usage of this function involves providing the number of
+iterations required. Deterministic pseudo-random behavior can be achieved by
+providing seed values when calling the function.
+
+This function is commonly used in combination with
+[random generator functions](/docs/reference/function/random-value-generator/)
+to produce mock data.
+
+## long_sequence
+
+- `long_sequence(iterations)` - generates rows
+- `long_sequence(iterations, seed1, seed2)` - generates rows deterministically
+
+**Arguments:**
+
+- `iterations` is a `long` representing the number of rows to generate.
+- `seed1` and `seed2` are `long64` values representing both parts of a `long128` seed.
+
+### Row generation
+
+The `long_sequence()` function can be used to generate very large datasets for
+testing, e.g. billions of rows.
+
+`long_sequence(iterations)` is used to:
+
+- Generate a number of rows defined by `iterations`.
+- Generate a column `x:long` of monotonically increasing long integers starting
+  from 1, which can be accessed for queries.
+
+### Random number seed
+
+When `long_sequence` is used together with
+[random generators](/docs/reference/function/random-value-generator/), the
+generated values are random by default. The function accepts a seed so that
+the results are deterministic.
+
+:::note
+
+Deterministic procedural generation makes it easy to test on vast amounts of
+data without actually moving large files around across machines. Using the same
+seed on any machine at any time will consistently produce the same results for
+all random functions.
+
+:::
+
+**Examples:**
+
+```questdb-sql title="Generating multiple rows"
+SELECT x, rnd_double()
+FROM long_sequence(5);
+```
+
+| x | rnd_double |
+| --- | ------------ |
+| 1 | 0.3279246687 |
+| 2 | 0.8341038236 |
+| 3 | 0.1023834675 |
+| 4 | 0.9130602021 |
+| 5 | 0.718276777 |
+
+```questdb-sql title="Accessing row_number using the x column"
+SELECT x, x*x
+FROM long_sequence(5);
+```
+
+| x | x\*x |
+| --- | ---- |
+| 1 | 1 |
+| 2 | 4 |
+| 3 | 9 |
+| 4 | 16 |
+| 5 | 25 |
+
+```questdb-sql title="Using with a seed"
+SELECT rnd_double()
+FROM long_sequence(2,128349234,4327897);
+```
+
+:::note
+
+The results below will be the same on any machine at any time as long as they
+use the same seed in long_sequence.
+
+:::
+
+| rnd_double |
+| ------------------ |
+| 0.8251337821991485 |
+| 0.2714941145110299 |
+
+## generate_series
+
+Rather than generating a fixed number of entries, `generate_series` can instead be used
+to generate entries within a bounded range.
+
+This function supports `LONG` and `DOUBLE` generation. There is also a `TIMESTAMP` generating version,
+which can be found [here](/documentation/reference/function/timestamp-generator.md)
+
+The `start` and `end` values are interchangeable, and a negative `step` value can be used to
+obtain the series in reverse order.
+
+The series is inclusive on both ends.
+
+The final argument is optional, and defaults to `1`.
+
+As a pseudo-table, the function can be called in isolation (`generate_series()`), or as
+part of a select (`SELECT * FROM generate_series()`).
+
+**Arguments:**
+
+`generate_series(start_long, end_long, step_long)` - generates a series of longs.
+
+`generate_series(start_double, end_double, step_double)` - generates a series of doubles.
+
+
+**Return value:**
+
+Return value type is `long` or `double`.
+
+**Examples:**
+
+```questdb-sql title="fwd long generation" demo
+generate_series(-3, 3, 1);
+-- or
+generate_series(-3, 3);
+```
+
+| generate_series |
+| --------------- |
+| -3 |
+| -2 |
+| -1 |
+| 0 |
+| 1 |
+| 2 |
+| 3 |
+
+```questdb-sql title="bwd long generation" demo
+generate_series(3, -3, -1);
+```
+
+| generate_series |
+| --------------- |
+| 3 |
+| 2 |
+| 1 |
+| 0 |
+| -1 |
+| -2 |
+| -3 |
+
+```questdb-sql title="fwd double generation" demo
+generate_series(-3d, 3d, 1d);
+-- or
+generate_series(-3d, 3d);
+```
+
+| generate_series |
+| --------------- |
+| -3.0 |
+| -2.0 |
+| -1.0 |
+| 0.0 |
+| 1.0 |
+| 2.0 |
+| 3.0 |
+
+```questdb-sql title="bwd double generation" demo
+generate_series(-3d, 3d, -1d);
+```
+
+| generate_series |
+| --------------- |
+| 3.0 |
+| 2.0 |
+| 1.0 |
+| 0.0 |
+| -1.0 |
+| -2.0 |
+| -3.0 |
\ No newline at end of file
diff --git a/documentation/reference/function/timestamp-generator.md b/documentation/reference/function/timestamp-generator.md
index abb1dd6c..329a72be 100644
--- a/documentation/reference/function/timestamp-generator.md
+++ b/documentation/reference/function/timestamp-generator.md
@@ -10,6 +10,7 @@ create data for testing. Pseudo-random steps can be achieved by providing a
 `step` argument. A `seed` value may be provided to a random function if the
 randomly-generated `step` should be deterministic.
 
+
 ## timestamp_sequence
 
 - `timestamp_sequence(startTimestamp, step)` generates a sequence of `timestamp`
@@ -64,3 +65,95 @@ FROM long_sequence(5);
 | 3 | 2019-10-17T00:00:00.600000Z |
 | 4 | 2019-10-17T00:00:00.900000Z |
 | 5 | 2019-10-17T00:00:01.300000Z |
+
+
+## generate_series
+
+Rather than generating a fixed number of timestamps, you can instead generate timestamps in a range,
+using `generate_series`.
+
+The step can be provided in either microseconds, or in a period string, similar to `SAMPLE BY`.
+
+The `start` and `end` values are interchangeable, and a negative `step` value can be used to
+obtain the series in reverse order.
+
+The series is inclusive on both ends.
+
+**Arguments:**
+
+There are two timestamp-generating variants of `generate_series`:
+
+- `generate_series(start, end, step_period)` - generates a series of timestamps between `start` and `end`,
+  in periodic steps.
+- `generate_series(start, end, step_micros)` - generates a series of timestamps between `start` and `end`,
+  in microsecond steps.
+
+
+**Return value:**
+
+Return value type is `timestamp`.
+
+**Examples:**
+
+```questdb-sql title="fwd series with period" demo
+generate_series('2025-01-01', '2025-02-01', '5d');
+```
+
+| generate_series |
+| --------------------------- |
+| 2025-01-01T00:00:00.000000Z |
+| 2025-01-06T00:00:00.000000Z |
+| 2025-01-11T00:00:00.000000Z |
+| 2025-01-16T00:00:00.000000Z |
+| 2025-01-21T00:00:00.000000Z |
+| 2025-01-26T00:00:00.000000Z |
+| 2025-01-31T00:00:00.000000Z |
+
+```questdb-sql title="bwd series with period" demo
+generate_series('2025-01-01', '2025-02-01', '-5d');
+```
+
+| generate_series |
+| --------------------------- |
+| 2025-02-01T00:00:00.000000Z |
+| 2025-01-27T00:00:00.000000Z |
+| 2025-01-22T00:00:00.000000Z |
+| 2025-01-17T00:00:00.000000Z |
+| 2025-01-12T00:00:00.000000Z |
+| 2025-01-07T00:00:00.000000Z |
+| 2025-01-02T00:00:00.000000Z |
+
+```questdb-sql title="fwd series with micro step" demo
+generate_series(
+    '2025-01-01T00:00:00Z'::timestamp,
+    '2025-01-01T00:05:00Z'::timestamp,
+    60_000_000
+);
+```
+
+| generate_series |
+| --------------------------- |
+| 2025-01-01T00:00:00.000000Z |
+| 2025-01-01T00:01:00.000000Z |
+| 2025-01-01T00:02:00.000000Z |
+| 2025-01-01T00:03:00.000000Z |
+| 2025-01-01T00:04:00.000000Z |
+| 2025-01-01T00:05:00.000000Z |
+
+
+```questdb-sql title="bwd series with micro step" demo
+generate_series(
+    '2025-01-01T00:00:00Z'::timestamp,
+    '2025-01-01T00:05:00Z'::timestamp,
+    -60_000_000
+);
+```
+
+| generate_series |
+| --------------------------- |
+| 2025-01-01T00:05:00.000000Z |
+| 2025-01-01T00:04:00.000000Z |
+| 2025-01-01T00:03:00.000000Z |
+| 2025-01-01T00:02:00.000000Z |
+| 2025-01-01T00:01:00.000000Z |
+| 2025-01-01T00:00:00.000000Z |
\ No newline at end of file
diff --git a/documentation/reference/sql/datatypes.md b/documentation/reference/sql/datatypes.md
index 2953efbe..204d7fe6 100644
--- a/documentation/reference/sql/datatypes.md
+++ b/documentation/reference/sql/datatypes.md
@@ -73,8 +73,8 @@ Many nullable types reserve a value that marks them `NULL`:
 
 | Type Name | Null value | Description |
 | ---------------- | -------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
-| `float` | `NaN` | As defined by IEEE 754 (`java.lang.Float.NaN`). |
-| `double` | `NaN` | As defined by IEEE 754 (`java.lang.Double.NaN`). |
+| `float` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Float.NaN`, etc.). |
+| `double` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Double.NaN`, etc.). |
 | `long256` | `0x8000000000000000800000000000000080000000000000008000000000000000` | The value equals four consecutive `long` null literals. |
 | `long` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. |
 | `date` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. |