The goal of this project is to create a library of data structures and algorithms that can be used to build a compiler in Swift.
Since this project is under active development, it's very likely that the following lists are incomplete.

Data structures:

- Classes of Unicode scalars (`ScalarClass`).
- Regular expression (`RegularExpression`).
- Nondeterministic finite automata (`NFA`).
- Deterministic finite automata (`DFA`).
- Tokenizer (`Tokenizer`).
- Grammar (`Grammar`).
- LL parser (`LLParser`).
- SLR parser (`LRParser`).
- LALR parser (`LALRParser`).

Algorithms:

- Match a Unicode scalar against a `ScalarClass`.
- Derive an `NFA` from a `RegularExpression`.
- Derive a `DFA` from an `NFA`.
- Minimize a `DFA`.
- Match a string against an `NFA` or `DFA` (i.e., execute the finite state machine).
- Create a matcher that takes pairs of `RegularExpression`s and tokens and returns the correct token for a string based on the match.
- Create a tokenizer from pairs of `RegularExpression`s and tokens, plus a `RegularExpression` describing the trivia between tokens. The tokenizer takes a string and breaks it into individual tokens, skipping the trivia between them.
- Eliminate left recursion from a grammar.
- Perform left factoring to eliminate backtracking.
- Check if a grammar is backtracking-free.
- Generate a table-driven LL(1) parser from a backtracking-free grammar, which reports whether an input was accepted or rejected.
- Generate a DFA-backed SLR parser from a grammar, which reports whether an input was accepted or rejected.
- Construct a DFA-backed LALR parser from a grammar using the DeRemer and Pennello algorithm, which reports whether an input was accepted or rejected.
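
To make the left-recursion item in the list above more concrete, here is a minimal sketch of removing immediate (direct) left recursion from a single nonterminal. It is not the library's implementation: the `Symbol` and `Production` aliases, the dictionary-based grammar, and the `eliminateImmediateLeftRecursion` function are hypothetical stand-ins for illustration, and the sketch assumes the fresh name (the nonterminal followed by `'`) is not already used in the grammar.

```swift
// Hypothetical, simplified grammar representation; NOT the library's `Grammar` type.
typealias Symbol = String
typealias Production = [Symbol]   // a right-hand side; [] stands for epsilon

/// Removes immediate left recursion from the productions of one nonterminal.
///   A -> A a | b    becomes    A -> b A'    and    A' -> a A' | epsilon
/// Assumes the name A' is not already taken in `rules`.
func eliminateImmediateLeftRecursion(
    of nonterminal: Symbol,
    in rules: [Symbol: [Production]]
) -> [Symbol: [Production]] {
    guard let productions = rules[nonterminal] else { return rules }

    // Split into left-recursive (A -> A alpha) and remaining (A -> beta) productions.
    let recursive = productions.filter { $0.first == nonterminal }
    let others = productions.filter { $0.first != nonterminal }
    guard !recursive.isEmpty else { return rules }

    let fresh = nonterminal + "'"
    var result = rules

    // A -> beta A'
    result[nonterminal] = others.map { $0 + [fresh] }

    // A' -> alpha A' | epsilon
    var tail: [Production] = recursive.map { Array($0.dropFirst()) + [fresh] }
    tail.append([])
    result[fresh] = tail

    return result
}

// Usage: Expr -> Expr + Term | Term
let leftRecursive: [Symbol: [Production]] = [
    "Expr": [["Expr", "+", "Term"], ["Term"]]
]
let rewritten = eliminateImmediateLeftRecursion(of: "Expr", in: leftRecursive)
// rewritten["Expr"]  is  [["Term", "Expr'"]]
// rewritten["Expr'"] is  [["+", "Term", "Expr'"], []]
```

The sketch only handles direct recursion on one nonterminal. Removing indirect left recursion additionally requires ordering the nonterminals and substituting earlier ones into later ones before applying this step.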

```swift
enum Token {
    case integer
    case decimal
    case identifier
    case unknown
}

let scanner: [(RegularExpression, Token)] = [
    (.digit + .digit*, .integer),
    (.digit + .digit* + "." + .digit + .digit*, .decimal),
    (.alpha + .alphanum*, .identifier),
]

let nfa = NFA(scanner: scanner, nonAcceptingValue: .unknown)
let dfa = nfa.dfa
let minimizedDfa = dfa.minimized

minimizedDfa.match("134")    // .integer
minimizedDfa.match("61.613") // .decimal
minimizedDfa.match("x1")     // .identifier
minimizedDfa.match("1xy")    // .unknown
```

See the test suite for more usage examples.
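
For readers curious what executing a finite state machine involves, the following is a minimal, hand-built sketch of a table-driven DFA for the same three token kinds as the example above. It does not use the library: `CharClass`, `Edge`, `classify`, the state numbering, and the free function `match` are all assumptions made for illustration, and the classifier is ASCII-only for brevity.

```swift
enum Token { case integer, decimal, identifier, unknown }

// Input symbols of the hand-built DFA (hypothetical; ASCII-only for brevity).
enum CharClass { case digit, alpha, dot, other }

func classify(_ c: Character) -> CharClass {
    if c.isASCII && c.isNumber { return .digit }
    if c.isASCII && c.isLetter { return .alpha }
    if c == "." { return .dot }
    return .other
}

// A transition is keyed by (current state, input class).
struct Edge: Hashable {
    let state: Int
    let input: CharClass
}

// States: 0 start, 1 integer, 2 seen "digits.", 3 decimal, 4 identifier.
let transitions: [Edge: Int] = [
    Edge(state: 0, input: .digit): 1,
    Edge(state: 0, input: .alpha): 4,
    Edge(state: 1, input: .digit): 1,
    Edge(state: 1, input: .dot): 2,
    Edge(state: 2, input: .digit): 3,
    Edge(state: 3, input: .digit): 3,
    Edge(state: 4, input: .alpha): 4,
    Edge(state: 4, input: .digit): 4,
]

// Accepting states and the token each one yields.
let accepting: [Int: Token] = [1: .integer, 3: .decimal, 4: .identifier]

/// Runs the DFA over the whole string; a missing transition or a
/// non-accepting final state yields `.unknown`.
func match(_ input: String) -> Token {
    var state = 0
    for character in input {
        let edge = Edge(state: state, input: classify(character))
        guard let next = transitions[edge] else { return .unknown }
        state = next
    }
    return accepting[state] ?? .unknown
}

print(match("134"), match("61.613"), match("x1"), match("1xy"))
```

Keying the table on a small `Hashable` struct keeps the sketch short. A dense two-dimensional array indexed by state and character class is a common alternative when lookup speed matters.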

- Engineering a Compiler, 2nd ed., by Keith Cooper and Linda Torczon.
- Algorithms, 4th ed., by Robert Sedgewick and Kevin Wayne.
- Stanford's Compilers Course by Alex Aiken.
- Compilers: Principles, Techniques, and Tools by Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman.
- Efficient Computation of LALR(1) Look-Ahead Sets by Frank DeRemer and Thomas Pennello.
- Modern Compiler Implementation in C by Maia Ginsburg and Andrew W. Appel.

- slox - Hand-written scanner, recursive descent parser, and tree-walking interpreter in Swift. See it for a demonstration of using Swift's algebraic data types (`enum`s and `struct`s) to represent and render code. Implements the lox programming language. Ported from Java.
- bslox - Very early work in progress on what will eventually be a bytecode compiler and virtual machine for lox. It will be ported from C.
- FlyingMonkey - Hand-written scanner and Pratt parser for the monkey programming language. Ported from Go.
- Sift - Hand-written scanner and parser for a subset of Scheme. Ported from Haskell.
- sparrow - Hand-written scanner modeled on the one in the official Swift compiler, ported from C++ to Swift. See it for an example of a complex scanner/lexer with support for rewinding to arbitrary points in the input.

MIT