Defining your parser
Csly uses a parser combinator strategy to generate strongly typed parser instances through the ParserBuilder<IN, OUT>.BuildParser(...) method. ParserBuilder takes two generic arguments: one contains the expression tokens you define and the other is the expected parser output type. More formally, a generated parser object of type Parser<IN, OUT> uses your expression parser (visitor) methods and the generic IN and OUT types, where:
- IN is an enum type with regex or generic [Lexeme] decorators that represent all the tokens (symbols) your language accepts,
- OUT is the expected output type of the generated parser instance's Parse(...) method once it is invoked.
The OUT type can be a value or reference type. More information on expression tokens is available in the Lexer page. In the sample parser discussed in the Getting started section, the output type is an integer which happens to contain the result of the arithmetic expression passed to the Parser.Parse(...) method.
You can examine the generated syntax tree by debugging or dumping the parse result object's SyntaxTree property. The tree structure is made up of SyntaxNode objects containing the visitor methods in RuntimeMethodInfo types, and encodes an Abstract Syntax Tree (AST).
The generic types given when ParserBuilder is instantiated are checked when ParserBuilder.BuildParser(...) is called. These types are used for syntax tree generation and traversal (see below). If building the parser fails, no exception is thrown. Instead, check the IsError flag on the object returned by BuildParser(...) or, optionally, whether its Errors list is null. A non-null list holds the error messages, each with line and column indicators and the reason for the failure. TODO: link to error message documentation. Additional typing rules are described below.
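The build-then-check flow described above can be sketched as follows. This is a minimal sketch modeled on the Getting started sample, not the library's reference code: the ExpressionToken values, the ExpressionParser rules, and the ExpressionDemo.Evaluate helper are hypothetical names chosen for illustration. It assumes the sly.lexer, sly.parser, and sly.parser.generator namespaces, and that BuildParser(...) returns a result object exposing IsError, Errors, and Result, so verify these against the csly version you use.

```csharp
using System;
using sly.lexer;
using sly.parser;
using sly.parser.generator;

// IN: every token the language accepts (illustrative token set).
public enum ExpressionToken
{
    [Lexeme("[0-9]+")] INT = 1,
    [Lexeme("\\+")] PLUS = 2,
    [Lexeme("[ \\t]+", true)] WS = 3 // second argument marks the token as skippable
}

// Visitor: each method implements one grammar rule and returns OUT (int here).
public class ExpressionParser
{
    [Production("expression : primary PLUS expression")]
    public int Add(int left, Token<ExpressionToken> plus, int right) => left + right;

    [Production("expression : primary")]
    public int Single(int value) => value;

    [Production("primary : INT")]
    public int Primary(Token<ExpressionToken> intToken) => intToken.IntValue;
}

public static class ExpressionDemo
{
    // Hypothetical helper: returns the evaluated value,
    // or null when building the parser or parsing the input failed.
    public static int? Evaluate(string source)
    {
        var builder = new ParserBuilder<ExpressionToken, int>();
        var build = builder.BuildParser(new ExpressionParser(),
                                        ParserType.LL_RECURSIVE_DESCENT,
                                        "expression");

        // No exception is thrown on a failed build, so inspect the flag.
        if (build.IsError)
        {
            Console.Error.WriteLine($"Parser build failed with {build.Errors?.Count ?? 0} error(s).");
            return null;
        }

        Parser<ExpressionToken, int> parser = build.Result;
        var result = parser.Parse(source);

        // Parsing reports failure the same way, through IsError and Errors.
        if (result.IsError)
        {
            Console.Error.WriteLine($"Parse failed with {result.Errors?.Count ?? 0} error(s).");
            return null;
        }

        return result.Result;
    }
}
```

Returning a nullable int keeps the two failure paths visible to the caller without relying on exceptions, which mirrors how csly itself reports errors.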
Your expression parser implementation (the visitor) defines the methods that csly uses to implement the terminal (leaf) and non-terminal (branch) nodes of the syntax tree. Since the output type is specified in the OUT generic parameter of both ParserBuilder and Parser, every syntax-tree visitor method in your custom parser must return a value of type OUT. The parameters of each visitor method must in turn match the IN and OUT generic types given to ParserBuilder, according to the clauses of the rule the method implements. See, for example, the visitor methods defined in the sample implementation of ExpressionParser. Clause rules can be encoded in both IN and OUT types.
Clause rules can be classified by their pattern, and each pattern corresponds to a type built from IN and OUT:
- Token<IN> (rule : MyToken): for a terminal clause,
- OUT (rule : nonTerminal): for a non-terminal clause,
- List<Token<IN>> (rule : MyToken+): for a repeated terminal clause,
- List<OUT> (rule : nonTerminal*): for a repeated non-terminal clause,
- Group<IN,OUT> (rule : (MyToken nonTerminal)): for a group/sub-rule clause,
- ValueOption<OUT> (rule : nonTerminal?): for an optional non-terminal clause,
- ValueOption<Group<IN,OUT>> (rule : (MyToken nonTerminal)?): for an optional group clause,
- List<Group<IN,OUT>> (rule : (MyToken nonTerminal)*): for a repeated group clause.
TODO: verify these are accurate and link to examples.
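As an illustration of the first three patterns in the list (terminal, non-terminal, and repeated terminal), here is a small sketch of how they might appear as visitor method parameters. The NumberToken enum, the NumberParser rules, and the NumberDemo.Evaluate helper are hypothetical. The sketch also assumes that repetition operators such as + require the ParserType.EBNF_LL_RECURSIVE_DESCENT parser type, so check this against your csly version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using sly.lexer;
using sly.parser;
using sly.parser.generator;

// IN: illustrative token set for whitespace-separated integers.
public enum NumberToken
{
    [Lexeme("[0-9]+")] INT = 1,
    [Lexeme("\\-")] MINUS = 2,
    [Lexeme("[ \\t]+", true)] WS = 3 // skippable whitespace
}

public class NumberParser
{
    // Terminal clause (MINUS) -> Token<IN>; non-terminal clause (total) -> OUT.
    [Production("signed : MINUS total")]
    public int Negated(Token<NumberToken> minus, int total) => -total;

    // Single non-terminal clause -> OUT.
    [Production("signed : total")]
    public int Plain(int total) => total;

    // Repeated terminal clause (INT+) -> List<Token<IN>>.
    [Production("total : INT+")]
    public int Total(List<Token<NumberToken>> numbers) => numbers.Sum(t => t.IntValue);
}

public static class NumberDemo
{
    // Hypothetical helper: sums the integers in the input, negating the sum
    // when the input starts with a minus sign.
    public static int Evaluate(string source)
    {
        var builder = new ParserBuilder<NumberToken, int>();
        var build = builder.BuildParser(new NumberParser(),
                                        ParserType.EBNF_LL_RECURSIVE_DESCENT,
                                        "signed");
        if (build.IsError)
            throw new InvalidOperationException("The parser could not be built.");

        var result = build.Result.Parse(source);
        if (result.IsError)
            throw new FormatException("The input could not be parsed.");

        return result.Result;
    }
}
```

Every visitor method returns int because OUT is int, while the parameter types vary with the clause pattern of the rule each method implements.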
Clause rules can also be classified as piped or non-piped. Alternate, or piped (|), choice rules are themselves terminal or non-terminal, so the same typing rules apply to them as to single (non-piped) rules.
More information is provided in the next section, Implementing a BNF Parser.