The specifications described in previous pages are linear. They are a series of Data Field units and SET units which are performed in sequence for each and every record.
But specs also allows some structure in specifications, including loops, conditional execution, run-in and run-out cycles. This page describes those structures.
The Arithmetic-Logical Unit (ALU) page described expressions and used them in PRINT data fields.
A specific type of expression is the condition, which is an expression that evaluates to either 1 (true) or 0 (false). The result of such expression is often called boolean. In fact any expression can be used as a condition. In that case any zero value, empty string or NaN is treated as the boolean false, and anything else is treated as the boolean true.
Examples of condition expressions:
a < 2len(s) > 5#0 = 0sqrt(#4)=7#3 % 2-- #3 is odd!(#3 % 2)-- #3 is evensqrt(a)-- Is the square root ofaa positive number? That is equivalent toa>0, because for values ofasmaller than zerosqrtreturnsNaN#4 -= 1-- returns the value of#4after subtracting1, and will be treated as true until #4 goes to zero.
This is the if-statement from many programming languages. The general structure is as follows:
IF (condition) THEN
unit-sequence
ELSEIF (condition) THEN
unit-sequence
ELSEIF (condition) THEN
unit-sequence
ELSE
unit-sequence
ENDIF
The THEN keywords are necessary. The ELSEIF (condition) THEN blocks are optional and you can have as many as you would like. The ELSE (condition) block is also optional, while the ENDIFis required. NOTE: A conditional execution structure without ENDIF will sometimes work, but this should not be relied on, and it can make nested conditional statements ambiguous.
In any conditional execution statement, at most one of the unit sequences is executed. In a conditional execution statement that includes an ELSE clause, exactly one unit-sequence is executed.
A simple, one-line example:
specs a: w1 1 /is/ nextword IF "a%2" THEN /odd/ nextword ELSE /even/ nextword ENDIF
| Input | Output |
|---|---|
| 0 | 0 is even |
| 1 | 1 is odd |
| 25 | 25 is odd |
| 404 | 404 is even |
| 2.5 | 2.5 is even |
| 3.5 | 3.5 is odd |
| hello | hello is even |
As usual with computers, the GIGO (garbage in -- garbage out) rule applies. The first four inputs are numbers and the output is correct. Attempting to use the remainder operator on 2.5 forces specs to treat that value as an integer, which evaluates to 2. Similarly, 3.5 evaluates to 3. All strings that don't look like numbers evaluate to zero when forced to be treated like integers, so hello turns out to be even.
Loops take many forms in different programming languages: while-loops, for-loops, repeat-until, or even the dreaded conditional goto. In specs there is only one kind of loop: the while-loop:
WHILE (condition) DO
unit-sequence
DONE
As long as the condition evaluates to true, the unit sequence will be repeated over and over. Consider the following simple loop:
specs a: w1 1 SET "#1:=a" WHILE "#1>0" DO /*/ n SET "#1-=1" DONE
| Input | Output |
|---|---|
| -1 | -1 |
| 0 | 0 |
| 1 | 1* |
| 25 | 25************************* |
| 1.5 | 1.5** |
| hello | hello |
Note: In the Unix command-line you will have to escape the star with a backslash, otherwise it evaluates to a list of files at the root directory of the file system.
The above specification can be naïvely simplified some:
specs a: w1 1 SET "#1:=a" WHILE "#1-=1" DO /*/ n DONE
But that doesn't work correctly. Suppose the input was 3. So #1 is set to 3. In the first iteration it is reduced to 2 and a star is printed; in the second iteration it is reduced to 1 and a star is printed. But in the third iteration it is reduced to zero and the loop exists having printed only two stars. This is known as an off-by-one error. We can fix it like this:
specs a: w1 1 SET "#1:=a+1" WHILE "#1-=1" DO /*/ n DONE
This now returns the correct values for all the integers from zero and up. But it tends to get into endless loops. Plug in inputs like -1, 1.2, or hello and specs hangs. Version 0.9.5 introduced the while-guard feature that makes specs exit with an error in a case like this, but it is up to the specification to avoid endless loops. Do not use a loop condition such as this unless it is certain that #1 is a positive integer when entering the loop for the first time.
To help with summaries and initialization, specs provides ways to execute certain units only at the Run-In cycle or only at the Run-Out cycle.
Run-In is just another name for the first iteration of the specification. You check for it with the boolean function first():
# Print all the records that match the first record's first word.
specs if "first()" then
set "#0:=word(1)"
endif
if "#0==word(1)" then
1-* 1
endif
The Run-Out cycle runs after the last record is processed. It is only run if it has something to do. There are two ways to do things on the run-out cycles:
- Using the boolean function
eof(). We call this a forced run-out cycle - Using the
EOFkeyword.
The following enhancement of the run-in example will demonstrate both:
# Print all the records that match the first record's first word.
specs if "first()" then
set "#0:=word(1)"
endif
if "#0==word(1)" then
1-* 1
set "#1+=1"
endif
if "eof()" then
/In the run-out cycle./ 1
endif
EOF
/There were/ nw
print "#1" nw
/records that began with/ nw
print "#0" nw
Or a more practical example, from the CMS Pipelines book:
# Summing
specs
1-* 1
a: word 1 .
set #0+=a
eof
/Total:/ 1
print #0 NEXTWORD
| Input | Output |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| Total: 10 |
Here's an alternate implementation with a forced run-out cycle:
# Summing
specs
IF "eof()" THEN
/Total:/ 1
print #0 NEXTWORD
ELSE
a: WORD 1 1
SET #0+=a
ENDIF
Field identifiers can be used for conditional execution when their value changes from record to record. Consider the following example CSV file containing personnel records:
Payroll,Eddy,Eck
Payroll,Janel,Polen
Finance,Leonard,Cockett
Finance,Dorie,Lugo
Finance,Wiley,Cheung
Finance,Carmelo,Reitz
Finance,Donetta,Rybak
R&D,Jamaal,Mcgillis
R&D,Jonna,Scheffer
R&D,Shawnna,Driskell
R&D,Maybell,Ditmore
R&D,Ami,Fentress
R&D,Randee,Tarkington
R&D,Jerica,Jimenez
Sales,Kristopher,Lind
Sales,Margret,Picone
Sales,Damien,Daniel
Support,Deann,Rushton
Support,Spencer,Marse
Support,Devora,Fortier
Suppose we want to print this out, but suppress repetition of the department field:
specs
FIELDSEPARATOR ,
c: FIELD 1 .
FIELD 3 10
/,/ NEXT
FIELD 2 NEXTWORD
BREAK c
ID c 1
Result:
Payroll Eck, Eddy
Polen, Janel
Finance Cockett, Leonard
Lugo, Dorie
Cheung, Wiley
Reitz, Carmelo
Rybak, Donetta
R&D Mcgillis, Jamaal
Scheffer, Jonna
Driskell, Shawnna
Ditmore, Maybell
Fentress, Ami
Tarkington, Randee
Jimenez, Jerica
Sales Lind, Kristopher
Picone, Margret
Daniel, Damien
Support Rushton, Deann
Marse, Spencer
Fortier, Devora
So what happened here? For every record the field identifier c is calculated. Whenever the value of c in the new record is different from its value in the previous record, this sets the break level to c. What does that mean? It means that any specification units following a BREAK c, BREAK b, or BREAK a will get executed, but units following BREAK d will not. Another way of expressing this is that the break levels a, b, and c are established, while break level d is not.
A specification can have multiple field identifiers and so multiple break levels. The highest field identifier alphabetically (capital letters are considered higher than lower-case) sets the break level for the iteration. This break level and all break levels below it are considered established.
The keyword BREAK is one way of using conditional execution with control breaks. Another is the pseudo-function break(). This is a pseudo-function because the only argument it is allowed to have is a field identifier. Example:
specs
FIELDSEPARATOR ,
a: FIELD 1 .
IF break(a) THEN
ID a 1
/Department:/ NW
WRITE
ENDIF
FIELD 3 10
, N
FIELD 2 NW
Result:
Payroll Department:
Eck, Eddy
Polen, Janel
Finance Department:
Cockett, Leonard
Lugo, Dorie
Cheung, Wiley
Reitz, Carmelo
Rybak, Donetta
R&D Department:
Mcgillis, Jamaal
Scheffer, Jonna
Driskell, Shawnna
Ditmore, Maybell
Fentress, Ami
Tarkington, Randee
Jimenez, Jerica
Sales Department:
Lind, Kristopher
Picone, Margret
Daniel, Damien
Support Department:
Rushton, Deann
Marse, Spencer
Fortier, Devora
ABEND units are used to unconditionally halt the running of the specification with an error message. They only make sense within a conditional statement. When an ABEND unit is executed, the output record being prepared is not printed out, the reading of input records is halted, and no further records are processed. The error message is output to standard error.
Here's a trivial example:
specs
/Name:/ 1
WORD 1-2 NW
/Employee ID:/ 25
32-40 NW
/Age:/ 36
a: 55-57 NW
IF 'a>120' THEN
ABEND /Age in record exceeds limit/
ENDIF
ASSERT units allow for sanity checks on internal state or input. An ASSERT token is followed by a condition similar to IF, ELSEIF, or WHILE. When the specs program processes an assertion, it evaluates the condition, and if that condition evaluates to false, the program aborts with a message that contains the text of the failed condition.
Unlike ABENDs, assertions make sense as sanity checks within the normal flow of the specification. As an example, here's a re-write of the above specification:
specs
/Name:/ 1
WORD 1-2 NW
/Employee ID:/ 25
32-40 NW
/Age:/ 36
a: 55-57 NW
ASSERT 'a<=120'
CONTINUE units cause control to move to the next cycle. Any output generated already will be dumped as the output record of this cycle. If no output has been generated then no record will be written. This is useful in IF blocks:
# Our own way to filter out comment lines
specs if "word(1)=='#' THEN CONTINUE ENDIF 1-* 1"
The SKIP-WHILE and SKIP-UNTIL units cause the specification to skip input lines while a condition is true, or until if becomes false. More precisely, the specification stops and goes to the next record, similar to the CONTINUE unit, but once the condition for passing occurs once, the condition is never evalu ated again and future records are passed unchecked. It usually only makes sense to place SKIP-WHILE and SKIP-UNTIL units at the beginning of a specification. Example:
# Print all records with date after July 1st, 2020
w1 a: # The date in yyyymmdd format
SKIP-WHILE 'a<20200701'
1-* 1