A Mini Java Compiler Front-End built in Python using PLY (Python Lex-Yacc).
This project performs multiple compiler front-end phases:
Source Code → Lexical Analysis → Syntax Analysis → Abstract Syntax Tree (AST) Generation → Semantic Analysis → Error Reporting
The goal of this project is to understand how real compilers process source code by implementing key compiler-design concepts from scratch.
- Tokenization using PLY Lex
- Recognition of keywords, identifiers, literals, operators, and delimiters
- LALR(1) parser implemented using PLY Yacc
- Grammar-based validation of Mini Java constructs
- Error recovery with multiple syntax error reporting
- Generates an AST for valid source code
- Pretty-prints the AST in a tree-like structure
- Preserves source line information for diagnostics
- Symbol table implementation with nested scopes
- Semantic validation after parsing
- Separate semantic error reporting
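The token categories listed above (keywords, identifiers, literals, operators, delimiters) can be illustrated with a small tokenizer. The project itself uses PLY Lex; the sketch below swaps in the standard-library `re` module so it runs without dependencies. The token names, the `KEYWORDS` subset, and the `tokenize` function are assumptions made for illustration, not the contents of `lexer.py`.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    value: str
    line: int


# Assumed keyword subset, taken from the Java snippets in this README.
KEYWORDS = {"class", "enum", "public", "int", "var", "return", "try", "catch"}

# Order matters: earlier alternatives win, so "->" is tried before "-".
TOKEN_SPEC = [
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("STRING", r'"[^"\n]*"'),
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"->|==|[=+\-*/<>]"),
    ("DELIM", r"[{}()\[\];,.]"),
    ("ERROR", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens with their line numbers; whitespace is dropped."""
    line = 1
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "SKIP":
            continue
        elif kind == "ID" and text in KEYWORDS:
            # Keywords match the identifier pattern and are reclassified here.
            yield Token("KEYWORD", text, line)
        else:
            yield Token(kind, text, line)
```

Reclassifying keywords after matching the identifier pattern keeps the regular expressions short; PLY's documentation describes a similar reserved-word lookup for its lexers.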
enum Color {
    RED,
    GREEN,
    BLUE
}

public class Container<T> {
}

public int add(int a, int b) {
    return a;
}

public int add(int a) {
    return a;
}
public int add(int a) {
    return a;
}

Duplicate definitions are detected and reported.
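One way to tell a legal overload from a duplicate definition is to key each method on its name plus its parameter types. The sketch below assumes that approach; how `semantic_analyzer.py` actually distinguishes the two is not stated here, and `find_duplicate_methods` and its message format are hypothetical.

```python
from typing import Dict, List, Tuple

# A signature is (name, parameter types). The return type is ignored,
# as it is when Java decides whether two methods clash.
Signature = Tuple[str, Tuple[str, ...]]


def find_duplicate_methods(methods: List[Tuple[str, List[str], int]]) -> List[str]:
    """Take (name, param_types, line) entries and report repeated signatures."""
    first_seen: Dict[Signature, int] = {}
    errors: List[str] = []
    for name, param_types, line in methods:
        signature = (name, tuple(param_types))
        if signature in first_seen:
            errors.append(
                f"line {line}: duplicate method '{name}({', '.join(param_types)})' "
                f"(first defined on line {first_seen[signature]})"
            )
        else:
            first_seen[signature] = line
    return errors


# add(int, int) and add(int) are overloads; the second add(int) is a duplicate.
methods = [("add", ["int", "int"], 1), ("add", ["int"], 4), ("add", ["int"], 7)]
for message in find_duplicate_methods(methods):
    print(message)
```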
var handler = (int x) -> {
    return;
};

try {
    count = 10;
}
catch(Exception e) {
    return;
}

The semantic analyzer checks for duplicate declarations, tracks nested scopes, and detects undeclared usage:
- Duplicate class names
- Duplicate enum names
- Duplicate parameter names
- Duplicate variable declarations
- Global scope
- Class scope
- Method scope
- Lambda scope
- Try-Catch scope
- Undeclared variable usage
- Assignment to undeclared variables
- Calls to undefined methods
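The checks above fit naturally onto a stack of scopes: entering a class, method, lambda, or try-catch block pushes a scope, and leaving it pops one. The following is a minimal sketch under that assumption. The `SymbolTable` class and its method names are hypothetical; only the `E-SEM-007` message format is taken from the sample output in this README, and the duplicate-declaration message carries no code because its code is not shown here.

```python
from typing import Dict, List, Optional


class SymbolTable:
    """Stack of scopes; index 0 is the global scope."""

    def __init__(self) -> None:
        self.scopes: List[Dict[str, str]] = [{}]
        self.errors: List[str] = []

    def enter_scope(self) -> None:
        self.scopes.append({})

    def exit_scope(self) -> None:
        self.scopes.pop()

    def declare(self, name: str, type_name: str) -> None:
        # A duplicate is a redeclaration within the same (innermost) scope.
        current = self.scopes[-1]
        if name in current:
            self.errors.append(f"Duplicate variable declaration: '{name}'")
        else:
            current[name] = type_name

    def lookup(self, name: str) -> Optional[str]:
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

    def check_assignment(self, name: str) -> None:
        if self.lookup(name) is None:
            self.errors.append(f"[E-SEM-007] Assignment to undeclared variable: '{name}'")


table = SymbolTable()
table.declare("count", "int")   # declared in the global scope
table.enter_scope()             # e.g. a method body
table.check_assignment("count") # found in an enclosing scope
table.check_assignment("x")     # never declared
table.exit_scope()
print(table.errors)
```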
Detects type mismatches such as:
int x;
x = "hello";

Warns when an inner-scope variable hides an outer-scope declaration.
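Both checks can be sketched in a few lines. The example below is an illustration only: it handles just `int` and `String` literals, and the function names and message wording are assumptions rather than the analyzer's actual output.

```python
from typing import Dict, List, Optional


def literal_type(literal: str) -> Optional[str]:
    """Classify a literal; only int and String literals are recognized here."""
    if len(literal) >= 2 and literal.startswith('"') and literal.endswith('"'):
        return "String"
    if literal.isdigit():
        return "int"
    return None


def check_assignment(declared: Dict[str, str], name: str, literal: str) -> List[str]:
    """Report a mismatch between the declared type and the literal's type."""
    expected = declared.get(name)
    found = literal_type(literal)
    if expected is not None and found is not None and expected != found:
        return [f"Type mismatch: cannot assign {found} to '{name}' of type {expected}"]
    return []


def shadow_warnings(outer_scopes: List[Dict[str, str]], inner: Dict[str, str]) -> List[str]:
    """Warn for each inner-scope name that also exists in an enclosing scope."""
    return [
        f"Warning: '{name}' shadows an outer-scope declaration"
        for name in inner
        if any(name in scope for scope in outer_scopes)
    ]


print(check_assignment({"x": "int"}, "x", '"hello"'))
print(shadow_warnings([{"count": "int"}], {"count": "int", "e": "Exception"}))
```

Shadowing is reported as a warning rather than an error because the program is still well-formed; the inner declaration simply hides the outer one.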
Source Code
↓
Lexer
↓
Parser
↓
AST Generation
↓
Semantic Analysis
↓
Error Summary
Mini-Java-Syntax-Analyzer/
│
├── ast_nodes.py
├── lexer.py
├── parser.py
├── semantic_analyzer.py
├── main.py
│
├── test_input/
│   ├── Container.java
│   ├── ErrorTest.java
│   └── test_input.txt
│
└── README.md
Install PLY:
pip install ply

python main.py

Enter code and finish with:
END
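The two input modes (typed input terminated by `END`, or a file path argument) could be handled by a helper along these lines. This is a sketch, not the code in `main.py`: `read_source` and its parameters are hypothetical, and it assumes the `END` sentinel must appear on a line of its own.

```python
import sys
from typing import Iterable, List, Optional


def read_source(argv: List[str], stdin: Optional[Iterable[str]] = None) -> str:
    """Return source text from the file named in argv[1], else from stdin up to END."""
    if len(argv) > 1:
        with open(argv[1], encoding="utf-8") as handle:
            return handle.read()
    lines: List[str] = []
    for line in (sys.stdin if stdin is None else stdin):
        if line.strip() == "END":  # sentinel line, not part of the program
            break
        lines.append(line)
    return "".join(lines)
```

Passing the input stream as a parameter keeps the function easy to test without touching the real `sys.stdin`.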
python main.py test_input/Container.java

Program
└── ClassDecl [Test]
    └── MethodDecl [hello]
        ├── ReturnType [void]
        ├── Params
        └── Body
            └── ReturnStmt
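A tree in this style can be produced by a short recursive printer that carries an indentation prefix down to each child. The sketch below assumes a generic `Node` with a label, a source line, and a child list; the real classes in `ast_nodes.py` may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    line: int = 0  # source line, kept for diagnostics
    children: List["Node"] = field(default_factory=list)


def render(node: Node, prefix: str = "", is_root: bool = True, is_last: bool = True) -> List[str]:
    """Return the tree as a list of lines using box-drawing connectors."""
    if is_root:
        lines = [node.label]
        child_prefix = ""
    else:
        lines = [prefix + ("└── " if is_last else "├── ") + node.label]
        # Under a last child the column is blank; otherwise the bar continues.
        child_prefix = prefix + ("    " if is_last else "│   ")
    for index, child in enumerate(node.children):
        last = index == len(node.children) - 1
        lines.extend(render(child, child_prefix, False, last))
    return lines


root = Node("Program", children=[
    Node("ClassDecl [Test]", children=[
        Node("MethodDecl [hello]", children=[
            Node("ReturnType [void]"),
            Node("Params"),
            Node("Body", children=[Node("ReturnStmt")]),
        ]),
    ]),
])
print("\n".join(render(root)))
```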
✓ Semantic analysis passed — no errors or warnings.
[E-SEM-007] Assignment to undeclared variable: 'x'
[E-SEM-006] Call to undefined method: 'foo'
- Lexical Analysis
- Context-Free Grammars
- LALR(1) Parsing
- Abstract Syntax Trees
- Symbol Tables
- Scope Management
- Semantic Analysis
- Compiler Front-End Design
- Error Recovery Techniques
- Local variable declarations
- Expression parsing
- Type inference
- Intermediate Code Generation (Three Address Code)
- Constant Folding Optimization
- Graphviz AST Visualization
- JVM Bytecode Generation
This project is designed as a learning-oriented compiler front-end that demonstrates how programming languages are analyzed before execution. It combines concepts from Compiler Design, Automata & Formal Languages, and Programming Language Theory into a working implementation.