Skip to content

Conversation

@sim1984
Copy link
Contributor

@sim1984 sim1984 commented Oct 30, 2025

GENERATE_SERIES function

The GENERATE_SERIES function creates a series of numbers within a specified interval.
The interval and the step between series values ​​are defined by the user.

Syntax

<generate_series_function> ::=
    GENERATE_SERIES(<start>, <finish> [, <step>]) [AS] <correlation name> [ ( <derived column name> ) ]

Arguments

  • start - The first value in the interval. start is specified as a variable, a literal, or a scalar expression of type
    smallint, integer, bigint or numeric(18, x).

  • finish - The last value in the interval. finish is specified as a variable, a literal, or a scalar expression of
    type smallint, integer, bigint or numeric(18, x). The series stops once the last generated step value exceeds
    the finish value.

  • step - Indicates the number of values to increment or decrement between steps in the series. step is an expression
    of type smallint, integer, bigint or numeric(18, x). step can be either negative or positive, but can't be zero (0). This
    argument is optional. The default value for step is 1.

Returning type

The function GENERATE_SERIES returns a set with BIGINT or NUMERIC(18, x) column, where the scale is
determined by the maximum of the scales of the function arguments.

Rules

  • If start > finish and a negative step value is specified, an empty set is returned.

  • If start < finish and a positive step value is specified, an empty set is returned.

  • If the step argument is zero, an error is thrown.

Examples

SELECT n
FROM GENERATE_SERIES(1, 3) AS S(n);

SELECT n
FROM GENERATE_SERIES(3, 1, -1) AS S(n);

SELECT n
FROM GENERATE_SERIES(0, 9.9, 0.1) AS S(n);

SELECT 
  DATEADD(n MINUTE TO timestamp '2025-01-01 12:00') AS START_TIME,
  DATEADD(n MINUTE TO timestamp '2025-01-01 12:00:59.9999') AS FINISH_TIME
FROM GENERATE_SERIES(0, 59) AS S(n);

A similar function exists in PostgreSQL: https://www.postgresql.org/docs/current/functions-srf.html and in MS SQL 2022: https://learn.microsoft.com/en-us/sql/t-sql/functions/generate-series-transact-sql?view=sql-server-ver17

PostgreSQL has the ability to generate date and time sequences, but we don't have interval types, so this isn't implemented. However, by generating a sequence of numbers, we can generate any sequence.

@sim1984
Copy link
Contributor Author

sim1984 commented Oct 30, 2025

PSQL procedure vs built-in function performance test

CREATE EXCEPTION E_INVALID_STEP 'Procedure SP_GENERATE_SERIES required values of parameter STEP_VALUE non equal zero';

SET TERM ^;

CREATE OR ALTER PROCEDURE SP_GENERATE_SERIES (
  START_VALUE  BIGINT,
  FINISH_VALUE BIGINT,
  STEP_VALUE   BIGINT DEFAULT 1)
RETURNS (
  CURRENT_VALUE BIGINT)
AS
BEGIN
  IF (STEP_VALUE = 0) THEN EXCEPTION E_INVALID_STEP;
  IF (STEP_VALUE > 0) THEN
    FOR CURRENT_VALUE = START_VALUE TO FINISH_VALUE BY STEP_VALUE DO
      SUSPEND;
  IF (STEP_VALUE < 0) THEN
    FOR CURRENT_VALUE = START_VALUE DOWNTO FINISH_VALUE BY ABS(STEP_VALUE) DO
      SUSPEND;
END^

SET TERM ;^

1000000 records

SELECT 
  COUNT(*), MIN(CURRENT_VALUE), MAX(CURRENT_VALUE)
FROM SP_GENERATE_SERIES(1, 1000000);

                COUNT                   MIN                   MAX
===================== ===================== =====================
              1000000                     1               1000000

Current memory = 19793200
Delta memory = 240
Max memory = 19869584
Elapsed time = 0.294 sec
Buffers = 2048
Reads = 0
Writes = 0
Fetches = 0
SELECT 
  COUNT(*), MIN(N), MAX(N)
FROM GENERATE_SERIES(1, 1000000) S(N);

                COUNT                   MIN                   MAX
===================== ===================== =====================
              1000000                     1               1000000

Current memory = 19809936
Delta memory = 208
Max memory = 19869616
Elapsed time = 0.047 sec
Buffers = 2048
Reads = 0
Writes = 0
Fetches = 0

10000000 records

SELECT 
  COUNT(*), MIN(CURRENT_VALUE), MAX(CURRENT_VALUE)
FROM SP_GENERATE_SERIES(1, 10000000);

                COUNT                   MIN                   MAX
===================== ===================== =====================
             10000000                     1              10000000

Current memory = 19828352
Delta memory = 240
Max memory = 36589008
Elapsed time = 2.867 sec
Buffers = 2048
Reads = 0
Writes = 0
Fetches = 0
SELECT 
  COUNT(*), MIN(N), MAX(N)
FROM GENERATE_SERIES(1, 10000000) S(N);
                COUNT                   MIN                   MAX
===================== ===================== =====================
             10000000                     1              10000000

Current memory = 19844848
Delta memory = 208
Max memory = 19869616
Elapsed time = 0.453 sec
Buffers = 2048
Reads = 0
Writes = 0
Fetches = 0

@aafemt
Copy link
Contributor

aafemt commented Oct 30, 2025

It is a system selectable procedure with fixed number of parameters and output. Why did you create a new BLR verb instead of using existing BLR for calling it by name?

@sim1984
Copy link
Contributor Author

sim1984 commented Oct 30, 2025

It is a system selectable procedure with fixed number of parameters and output. Why did you create a new BLR verb instead of using existing BLR for calling it by name?

  1. We don't have any system stored procedures, except for those in system packages.
  2. The output type takes the maximum scale of the arguments.

@dyemanov
Copy link
Member

It is a system selectable procedure with fixed number of parameters and output. Why did you create a new BLR verb instead of using existing BLR for calling it by name?

This PR correctly extends blr_table_value_fun with a new subcode, it does not introduce a new BLR code. This was added exactly to implement new table-valued functions (and they may have non-procedure-like syntax).

@aafemt
Copy link
Contributor

aafemt commented Oct 30, 2025

2. The output type takes the maximum scale of the arguments.

It can be handled at DSQL stage.

This PR correctly extends blr_table_value_fun with a new subcode, it does not introduce a new BLR code.

Ugh, I see.

Copy link
Member

@hvlad hvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few little notes.

## Arguments

* `start` - The first value in the interval. `start` is specified as a variable, a literal, or a scalar expression of type
`smallint`, `integer`, `bigint` or `numeric(18, x)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about INT128?

}
;

%type <valueExprNode> step_opt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call it comma_value_opt.

DsqlDescMaker::fromNode(dsqlScratch, &stepDesc, stepItem, true);

// common scale
const auto scale = MIN(MIN(startDesc.dsc_scale, finishDesc.dsc_scale), stepDesc.dsc_scale);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's documented to be the max, but here you use min.


if (!field)
{
const auto newField = FB_NEW_POOL(dsqlScratch->getPool()) dsql_fld(dsqlScratch->getPool());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use field directly without that extra newField.

}
};

class GenSeriesFunctionSourceNode : public TableValueFunctionSourceNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it's better to declare the class as final instead of each method. Methods should be declared with override.

class GenSeriesFunctionSourceNode : public TableValueFunctionSourceNode
{
public:
explicit GenSeriesFunctionSourceNode(MemoryPool& pool) : TableValueFunctionSourceNode(pool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the common convention to put : ... in the same line as the constructor.


class GenSeriesFunctionScan final : public TableValueFunctionScan
{
enum GenSeriesTypeItemIndex : unsigned
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unsigned instead of UCHAR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't enum class a much more elegant thing than use common prefix for legacy enums?

GenSeriesFunctionScan(CompilerScratch* csb, StreamType stream, const Firebird::string& alias,
ValueListNode* list);

protected:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use override in methods.

}

// common scale
const auto scale = MIN(MIN(startDesc->dsc_scale, finishDesc->dsc_scale), stepDesc->dsc_scale);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIN vs MAX


rpb->rpb_number.increment();

do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't go to suggest better thing, but for me this code and m_recordExists is very confusing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants