dts-tree-sitter generates TypeScript .d.ts files for interacting with the AST produced by a given tree-sitter grammar.
npm i @asgerf/dts-tree-sitter
npx @asgerf/dts-tree-sitter INPUT > OUTPUT.d.ts
Alternatively, if you prefer to run without npx:
node ./node_modules/@asgerf/dts-tree-sitter/build/src/index.js INPUT > OUTPUT.d.ts
where INPUT is used to locate a node-types.json file in one of the following locations:
- ${INPUT}
- ${INPUT}/node-types.json
- ${INPUT}/src/node-types.json
- node_modules/${INPUT}/src/node-types.json
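The lookup described above can be pictured with a minimal sketch. The helper names `candidatePaths` and `resolveNodeTypes` are hypothetical, and trying the locations in the listed order is an assumption; the tool's actual implementation may differ.

```typescript
// Hypothetical helper: lists the candidate locations named above.
// The order is assumed to match the order in which they are listed.
function candidatePaths(input: string): string[] {
  return [
    input,
    `${input}/node-types.json`,
    `${input}/src/node-types.json`,
    `node_modules/${input}/src/node-types.json`,
  ];
}

// Returns the first candidate for which `exists` reports true,
// or undefined when none of the locations match.
function resolveNodeTypes(
  input: string,
  exists: (path: string) => boolean
): string | undefined {
  for (const candidate of candidatePaths(input)) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  return undefined;
}
```

The `exists` callback is injected so the sketch stays independent of any file-system API; in a real script it could wrap a file-existence check.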
For example, the type declarations for the tree-sitter-javascript grammar can be generated like this:
npm i tree-sitter-javascript
npx @asgerf/dts-tree-sitter tree-sitter-javascript > generated.d.ts
Alternatively, if you prefer to run without npx:
node ./node_modules/@asgerf/dts-tree-sitter/build/src/index.js tree-sitter-javascript > generated.d.ts
In the resulting .d.ts file, two of the node types look like this:
    export interface ClassDeclarationNode extends SyntaxNodeBase {
        type: SyntaxType.ClassDeclaration;
        bodyNode: ClassBodyNode;
        decoratorNodes?: DecoratorNode[];
        nameNode: IdentifierNode;
    }
    export interface ClassBodyNode extends SyntaxNodeBase {
        type: SyntaxType.ClassBody;
        memberNodes?: (MethodDefinitionNode | PublicFieldDefinitionNode)[];
    }
This can be used like this (see full example):
    import * as g from "./generated";

    function getMemberNames(node: g.ClassDeclarationNode): string[] {
        const result: string[] = [];
        // memberNodes is optional, so fall back to an empty list.
        for (const member of node.bodyNode.memberNodes ?? []) {
            if (member.type === g.SyntaxType.MethodDefinition) {
                result.push(member.nameNode.text);
            } else {
                result.push(member.propertyNode.text);
            }
        }
        return result;
    }
Observe TypeScript do its magic: the type check in the if condition narrows the type of member to MethodDefinitionNode in the 'then' branch, and to PublicFieldDefinitionNode in the 'else' branch.
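The narrowing can be tried in isolation with a self-contained sketch. The types below are simplified stand-ins written by hand for illustration; they only mimic the shape of the generated declarations and are not the generated file itself.

```typescript
// Simplified stand-ins for the generated types (assumed shapes, for illustration only).
enum SyntaxType {
  MethodDefinition = "method_definition",
  PublicFieldDefinition = "public_field_definition",
}

interface TextNode {
  text: string;
}

interface MethodDefinitionNode {
  type: SyntaxType.MethodDefinition;
  nameNode: TextNode;
}

interface PublicFieldDefinitionNode {
  type: SyntaxType.PublicFieldDefinition;
  propertyNode: TextNode;
}

type Member = MethodDefinitionNode | PublicFieldDefinitionNode;

function memberName(member: Member): string {
  if (member.type === SyntaxType.MethodDefinition) {
    // Here `member` is a MethodDefinitionNode, so `nameNode` is available.
    return member.nameNode.text;
  }
  // Here `member` can only be a PublicFieldDefinitionNode.
  return member.propertyNode.text;
}
```

Because each interface declares `type` as a single enum member, the union is discriminated on `type`, which is what lets the compiler pick the right branch type.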
Tree-sitter's TreeCursor allows fast traversal of an AST, and has two properties with correlated types: nodeType and currentNode.
Once you've checked nodeType, it's annoying to have to cast currentNode to the corresponding type right afterwards:
    if (cursor.nodeType === g.SyntaxType.Function) {
        let node = cursor.currentNode as g.FunctionNode; // annoying cast
    }
There's another way, which is handy in large switches: cast the cursor itself to a TypedTreeCursor before switching on nodeType.
Then the guarded use of currentNode has the expected type. For example:
    function printDeclaredNames() {
        let cursor = tree.walk();
        do {
            const c = cursor as g.TypedTreeCursor;
            switch (c.nodeType) {
                case g.SyntaxType.ClassDeclaration:
                case g.SyntaxType.FunctionDeclaration:
                case g.SyntaxType.VariableDeclarator: {
                    let node = c.currentNode;
                    console.log(node.nameNode.text);
                    break;
                }
            }
        } while (gotoPreorderSucc(cursor));
    }
- node gets the type ClassDeclarationNode | FunctionDeclarationNode | VariableDeclaratorNode.
- This allows safe access to node.nameNode, since each of those types has a name field.
- We don't pay the cost of invoking currentNode for other types of nodes.
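The example above calls gotoPreorderSucc without showing it. One plausible way to write such a helper is sketched below, relying only on the cursor methods gotoFirstChild, gotoNextSibling and gotoParent. `CursorLike`, `ToyNode`, `ToyCursor` and `preorderNames` are hypothetical names introduced here so the sketch runs without a parser; the helper in the full example may be written differently.

```typescript
// The subset of cursor methods this sketch relies on.
interface CursorLike {
  gotoFirstChild(): boolean;
  gotoNextSibling(): boolean;
  gotoParent(): boolean;
}

// Moves the cursor to the next node in preorder.
// Returns false when the traversal is finished.
function gotoPreorderSucc(cursor: CursorLike): boolean {
  if (cursor.gotoFirstChild()) return true;
  while (!cursor.gotoNextSibling()) {
    if (!cursor.gotoParent()) return false;
  }
  return true;
}

// Hypothetical in-memory tree and cursor, used only to exercise the helper.
interface ToyNode {
  name: string;
  children: ToyNode[];
}

class ToyCursor implements CursorLike {
  // Path from the root to the current node, with each node's sibling index.
  private path: { node: ToyNode; index: number }[];

  constructor(root: ToyNode) {
    this.path = [{ node: root, index: 0 }];
  }

  current(): ToyNode {
    return this.path[this.path.length - 1].node;
  }

  gotoFirstChild(): boolean {
    const kids = this.current().children;
    if (kids.length === 0) return false;
    this.path.push({ node: kids[0], index: 0 });
    return true;
  }

  gotoNextSibling(): boolean {
    if (this.path.length < 2) return false;
    const top = this.path[this.path.length - 1];
    const parent = this.path[this.path.length - 2].node;
    const next = top.index + 1;
    if (next >= parent.children.length) return false;
    this.path[this.path.length - 1] = { node: parent.children[next], index: next };
    return true;
  }

  gotoParent(): boolean {
    if (this.path.length < 2) return false;
    this.path.pop();
    return true;
  }
}

// Collects node names in preorder using the same do/while shape as above.
function preorderNames(root: ToyNode): string[] {
  const cursor = new ToyCursor(root);
  const names: string[] = [];
  do {
    names.push(cursor.current().name);
  } while (gotoPreorderSucc(cursor));
  return names;
}
```

Descending first, then trying siblings, then climbing until a sibling exists visits every node exactly once without materializing nodes that the caller never inspects.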
This happens if you compare types from the general tree-sitter.d.ts file with those from the generated .d.ts file.
Every type from tree-sitter.d.ts has a stronger version in the generated file; make sure you don't mix and match.
This can happen if the grammar contains rules and literals with the same name. For example, this grammar rule,

    func: $ => seq('func', $.name, $.body)

will produce a named node with type func, while the 'func' literal will produce an unnamed node with type func as well.
This means a check like node.type === 'func' is not an exact type check, and the type of node will only be restricted to FuncNode | UnnamedNode<'func'>. This is not a bug in the generated .d.ts file: there really are two kinds of nodes you need to handle after that check.
Some possible solutions are:
- Change the grammar to avoid rules with the same name as a keyword.
- Write the check as node.isNamed && node.type === 'func'.
- Change the declared type of node from SyntaxNode to NamedNode.
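The second solution can be illustrated with a small self-contained sketch. The interfaces below are hand-written stand-ins (with hypothetical `nameText` and `text` fields) that only imitate how a named and an unnamed node can share the type 'func'; they are not the generated declarations.

```typescript
// Hand-written stand-ins (assumed shapes, for illustration only).
interface FuncNode {
  type: "func";
  isNamed: true;
  nameText: string; // hypothetical field standing in for the rule's name child
}

interface UnnamedFuncNode {
  type: "func";
  isNamed: false;
  text: string;
}

interface BodyNode {
  type: "body";
  isNamed: true;
}

type AnyNode = FuncNode | UnnamedFuncNode | BodyNode;

function describe(node: AnyNode): string {
  // Checking `type` alone would leave FuncNode | UnnamedFuncNode;
  // adding `isNamed` narrows to FuncNode only.
  if (node.isNamed && node.type === "func") {
    return `rule func ${node.nameText}`;
  }
  if (!node.isNamed) {
    // Only the unnamed 'func' literal remains in this branch.
    return `keyword ${node.text}`;
  }
  return "other";
}
```

In this sketch `isNamed` acts as a second discriminant, which is why combining both checks singles out the node produced by the grammar rule.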