Skip to content
This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

Compiler from a C-subset to LLVM IR & MIPS providing visualisation of the workflow. Includes an advanced testingsystem.

License

Notifications You must be signed in to change notification settings

Are-you-listening/Compilers

Repository files navigation

Compilers - README

CodeFactor

Compiler from a C-subset to LLVM IR & MIPS. A group project for the course Compilers (INFORMAT 1001WETCOP)

By Emil Lambert, Lucas Vanden Abeele, Tibo Verreycken, Kars van Velzen.

Course Info

[Spring, 2024 | Compilers, 1001WETCOP]

The goals off this project for this course consists of (but not limited to): Show expertise in the course material theory, Implementing a compiler

You can find detailed information about the course using this link

Product Requirements

Implementing a compiler for a C-subset to working LLVM-IR and MIPS assembly with at least the Mandatory Functionality, listed below.

Final Product

We developed a compiler for a C-subset to working LLVM-IR and MIPS assembly with a wide range of support. We encourage you to take a look at the provided functionality below. A testing system consisting of 650 tests which will be used for the next year by the teaching assistants to grade the upcoming projects. We received the total grade for this project.

How to run?

Requirements & Dependencies:

  • Install the python packages, see requirements.txt
  • Working gcc compiler on your device
  • For unit tests, a linux Kernel with lli & spim packages installed:
sudo apt install llvm
sudo apt install spim

Video Explanation

Find our demonstration for Mips , LLVM

--unused_var False Disables the optimization to remove unused variables. This might help to better understand the LLVM output

Mandatory Functionality:

(In short we made all the mandatory features) (Links are provided to an example file showing this feature)

Optional Functionality:

(In short we did all the optionals except 'Function overloading' and we did 4 extra optionals ('CFG' and 'array[] automatic size' )', 'Func ptrs', 'calloc en realloc'')

Extra Optional Functionality:

Technical Functionality:

  • README
  • Documentation
  • Unit Tests
  • Start Script
  • PNG Creation from .dot-files

Project Structure:

Compilers
├── example_source_files - Files used in our demo
├── grammar - Directory containing grammar specifications
│   └── grammarC.g4 - ANTLR file containing our grammar specification
├── README.md - This file
├── requirements.txt - Python dependencies
├── script.py - Run all example_source_files
├── src - Directory containing our source doe
│   ├── antlr_files - Grammar files created by ANTLR
│   ├── internal_tools - Directory containing internal tools
│   │   └── IntegrityChecks.py - PreConditions Functions
│   ├── llvm_target - Directory containing code to translate the AST & table data towards LLVM
│   │   ├── AST2LLVM.py - Main visitor to convert to LLVM
│   │   ├── LLVMSingleton.py - Manages the LLVMLite Library to ease the work of our visitor
│   │   ├── MapTable.py - Map variables to a LLVM register
│   │   └── OutputLLVMGenerator.py - Contains methods to convert parts of the tree to LLVM code
│   ├── main - Directory containing the main function
│   │   └──__main__.py - Main file 
│   ├── mips_target - Directory containing files to generated Mips
│   │   ├── AST2MIPS.py - Main visitor to convert to Mips
│   │   ├── MipsLibrary - Our own library, based upon the workings of LLVMLite
│   │   │   ├── Blocks.py - Structure to create blocks
│   │   │   ├── Function.py - Structure to create functions
│   │   │   ├── Instructions - We won't go over each instructions, but these classes are called in the block structure, just like you're used to for LLVMLite.
│   │   │   │   ├── FR
│   │   │   │   │   ├── add_s.py
│   │   │   │   │   ├── c.eq.s.py
│   │   │   │   │   ├── c.le.s.py
│   │   │   │   │   ├── c.lt.s.py
│   │   │   │   │   ├── div_s.py
│   │   │   │   │   ├── FloatMipsInstruction.py
│   │   │   │   │   ├── mul_s.py
│   │   │   │   │   └── sub_s.py
│   │   │   │   ├── I
│   │   │   │   │   ├── addi.py
│   │   │   │   │   ├── addiu.py
│   │   │   │   │   ├── andi.py
│   │   │   │   │   ├── beq.py
│   │   │   │   │   ├── bne.py
│   │   │   │   │   ├── IMipsInstruction.py
│   │   │   │   │   ├── lb.py
│   │   │   │   │   ├── li.py
│   │   │   │   │   ├── lui.py
│   │   │   │   │   ├── lw.py
│   │   │   │   │   ├── mips_not.py
│   │   │   │   │   ├── neg.py
│   │   │   │   │   ├── ori.py
│   │   │   │   │   ├── sb.py
│   │   │   │   │   ├── slti.py
│   │   │   │   │   ├── sltiu.py
│   │   │   │   │   └── sw.py
│   │   │   │   ├── Instruction.py
│   │   │   │   ├── J
│   │   │   │   │   ├── jal.py
│   │   │   │   │   ├── jalr.py
│   │   │   │   │   ├── JMipsInstruction.py
│   │   │   │   │   └── j.py
│   │   │   │   ├── R
│   │   │   │   │   ├── add.py
│   │   │   │   │   ├── addu.py
│   │   │   │   │   ├── div.py
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── jr.py
│   │   │   │   │   ├── mfhi.py
│   │   │   │   │   ├── mflo.py
│   │   │   │   │   ├── mips_and.py
│   │   │   │   │   ├── mips_or.py
│   │   │   │   │   ├── mul.py
│   │   │   │   │   ├── nor.py
│   │   │   │   │   ├── RMipsInstruction.py
│   │   │   │   │   ├── sgt.py
│   │   │   │   │   ├── sll.py
│   │   │   │   │   ├── sllv.py
│   │   │   │   │   ├── slt.py
│   │   │   │   │   ├── srl.py
│   │   │   │   │   ├── srlv.py
│   │   │   │   │   ├── sub.py
│   │   │   │   │   ├── subu.py
│   │   │   │   │   └── xor.py
│   │   │   │   └── S
│   │   │   │       ├── comment.py
│   │   │   │       ├── cvt_s_w.py
│   │   │   │       ├── cvt_w_s.py
│   │   │   │       ├── la.py
│   │   │   │       ├── l_s.py
│   │   │   │       ├── manual_label.py
│   │   │   │       ├── mfc1.py
│   │   │   │       ├── move.py
│   │   │   │       ├── mtc1.py
│   │   │   │       ├── SpecialInstruction.py
│   │   │   │       └── systemCall.py
│   │   │   ├── Memory - Directory containing files related to memory management.
│   │   │   │   ├── Memory.py - Memory object, used for registers or stack locations. Widely used by the registermanager.
│   │   │   │   └── RegisterManager.py - A singleton service with function to retrieve a free registers and spill any data to memory. 
│   │   │   ├── MipsManager.py - Mips Label manager
│   │   │   ├── MipsModule.py - Module manager
│   │   ├── MipsSingleton.py - Similar to LLVMSingleton.py
│   │   ├── OutputMIPSGenerator.py - Similar to it's LLVM counterpart
│   │   ├── PredifinedStructures.py - Predefined functions translated from c to Mips
│   ├── parser - Directory containing code to parse an input file, create & massage the tree
│   │   ├── ArrayCleaner.py - Massage the tree, specificly for arrays
│   │   ├── ArrayPreProcessor.py - Makes sure const values are propagated inside the array size
│   │   ├── ASTCleanerAfter.py - Clean the AST after table creation
│   │   ├── ASTCleaner.py - Clean the AST before table creation
│   │   ├── ASTConversion.py - Reform implicit conversions to explicits and verify semantics
│   │   ├── ASTCreator.py - Create the initial AST from the ANTLR tree
│   │   ├── ASTDereferencer.py - Alters the AST to ease the use of adressess
│   │   ├── ASTIfCleaner.py - Massage If Statements to our format
│   │   ├── ASTLoopCleaner.py - Convert & Clean loop statements to While loops in a desired format
│   │   ├── AST.py - Contains class definitions of our AST Objects & some methods for easy manipulation
│   │   ├── ASTTableCreator.py - Creates symbol tables
│   │   ├── ASTTypedefReplacer.py - Resolves typedefs to their actual type
│   │   ├── ASTVisitor.py - Base class visitor for our AST
│   │   ├── BlacklistVisitor.py - Remove nodes that contain black listed text
│   │   ├── CodeGetter.py - Maps all source code to their respective linenumber & file
│   │   ├── ConstantFoldingVisitor.py - Apply constant folding
│   │   ├── ConstantStatementFolding.py - Apply constant folding for conditional statements
│   │   ├── Constraints - Directory containing semantic checks
│   │   │   ├── AmpersandConstraint.py - Verifies the integrity of the & operator
│   │   │   ├── CheckRvalues.py - Check for invalid Rvalues
│   │   │   ├── CheckUnaryOps.py - Verify consistency of unary operators
│   │   │   ├── CleanGlobalScopeConstraint.py - Verify that all statements in the global scope are allowed
│   │   │   ├── ConstraintChecker.py - Actual visitor that applies all constraints
│   │   │   ├── Constraint.py - Abstract Base Class
│   │   │   ├── FunctionReturnConstraint.py - Verify that every (non-void) function has a return statement
│   │   │   ├── GlobalsConstrained.py - Verify global variable syntax
│   │   │   ├── IOConstraint.py - Check when printf or scanf is used, stdio.h is included
│   │   │   ├── MainFoundConstraint.py - Check if the code to compile, contains a main function
│   │   │   ├── RedefinitionConstrained.py - Check for redefinitions
│   │   │   ├── UndeclaredConstrained.py - Verify if all used variables are declared
│   │   │   ├── UndefinedReferenceConstraint.py - Check for function calls to undefined functions
│   │   │   └── VoidReturnConstraint.py - Check that every void function has no non-empty return statement
│   │   ├── ControlFlow - Directory containing ControlFlow Code
│   │   │   ├── ControlFlowDotVisitor.py - Visualise the CFG
│   │   │   ├── ControlFlowGraph.py - Contains CFG Classes & manipulation methods
│   │   │   └── ControlFlowCreator.py - Creates the actual CFG
│   │   ├── CTypes - 
│   │   │   ├── CFunctionExecuterChar.py - Makes sure that each C type has the right operation behaviour
│   │   │   ├── CFunctionExecuterFloat.py - Makes sure that each C type has the right operation behaviour
│   │   │   ├── CFunctionExecuterInt.py - Makes sure that each C type has the right operation behaviour
│   │   │   ├── CFunctionExecuterPtr.py - Makes sure that each C type has the right operation behaviour
│   │   │   ├── CFunctionExecuter.py - Makes sure that each C type has the right operation behaviour
│   │   │   ├── COperationHandler.py - Simulates operations like it would behave in C
│   │   │   ├── InvalidOperatorFloatError.py - Error used in COperationHandler & -Executer
│   │   │   └── InvalidOperatorPtrError.py - Error used in COperationHandler & -Executer
│   │   ├── DeadCodeRemover.py - Remove unreachable code
│   │   ├── DotVisitor.py - Export the AST to a .dot file
│   │   ├── DynamicAllocation.py - Add dynamic allocation functionality
│   │   ├── EnumConverter.py - Convert enum nodes to our format
│   │   ├── ErrorExporter.py - File containing methods to throw errors
│   │   ├── FileIO.py - - Add FileIO functionality
│   │   ├── FunctionPtrCleaner.py - Cleanup Function Pointer declaration to our declration format
│   │   ├── IdentifierReplacerVisitor.py - Replace variables with known values at compile time
│   │   ├── PointerReformater.py - Reformat the '->' pointer operator towards (* ). operators
│   │   ├── Preproccesing - Directory containing preprocessor code
│   │   │   └── preProcessor.py - Execute preprocessing; remove comments & resolve #directives
│   │   ├── SizeOfTranslater.py - Convert SizeOf expressions, to an integer indicating the size of this type
│   │   ├── StringToArray.py - Make it possible to represent strings as char arrays
│   │   ├── StructCleanerAfter.py - Clean Struct Nodes after Table creation
│   │   ├── StructCleaner.py - Cleanup struct nodes & insert info into the Struct Table
│   │   ├── SwitchConverter.py - Convert switch stamenets to if statements
│   │   ├── Tables - Directory containing tables & type objects
│   │   │   ├── AbstractTable.py - Abstract Base Class for our Tables
│   │   │   ├── FunctionSymbolType.py - Type that holds data about Functions
│   │   │   ├── StructTable.py - Table to map structs info to scopes
│   │   │   ├── SymbolTable.py - Should be clear
│   │   │   ├── SymbolTypeArray.py - Type holding array info
│   │   │   ├── SymbolTypePtr.py - Type holding ptr info
│   │   │   ├── SymbolType.py - Simple type for base types, also used as base class for others
│   │   │   ├── SymbolTypeStruct.py - Type holding struct info
│   │   │   ├── SymbolTypeUnion.py - Type holding union info
│   │   │   ├── TableDotVisitor.py - Exports the symboltable to a .dot file
│   │   │   ├── TypedefTable.py - Table containing info about typedefs
│   │   │   └── TypeNodehandler.py - Tools with regard to creating and retrieving AST TypeNode types
│   │   ├── TypeCleaner.py - Convert 'Type' subtrees to a ASTTypeNode containing the type
│   │   ├── TypeMerger.py - Formats declarations with enums, structs to our format.
│   │   ├── UnarySaVisitor.py - Semantic Analyses for Unary operations
│   │   ├── Utils - Directory containing helper functions
│   │   │   └── ArraySizeReader.py - Help reading array sizes
│   │   ├── ValueAdderVisitor.py - Constant Propagation
│   │   ├── VirtualLineNrVisitor.py - Map code to a virtual linenr (needed instead of linenr by using multiple files)
│   │   ├── VoidReturnAdder.py - Add void returns to void functions
│   │   ├── ScopeCleaner.py - Removes 'Scope' Nodes
│   │   └── UnusedCleaner.py - Removes Code for unused varaibles
└── TestCases - Directory containing our test system
    ├── ABCTests - DIrectory containing the abstract base class test & examples
    │   ├── abcTest.py - Abstract Base Test (from unitTest)
    │   ├── AstLoader.py - Can load & export an AST to .json files
    │   ├── createTests - Directory to create a new test file
    │   │   ├── file_read_json.json - .json upon file read
    │   │   ├── file_read_json_result.json - .json after applied actions
    │   │   ├── read_file_after.dot - .dot file after applied actions
    │   │   ├── read_file_after.png - .png file after applied actions
    │   │   ├── read_file_before.dot - initial AST
    │   │   ├── read_file_before.png - initial AST in png
    │   │   ├── read_file.c - Test input file
    │   │   └── testCaseCreator.py - Action applyer
    │   ├── ReCalibrateTestCases.py - Script to automaticly remake all .json files
    │   ├── temp - Temp directory in which any output files are generated
    │   │   └── temp.txt - File to ensure the directory exists
    │   └── tests - Directory containing test files corresponding to a unittest
    │       ├── error_dict.json - Predefine expected errors for each file
    │       ├── input_dict.json - If exists, predefined input for scanf
    │       ├── test1.c - A test file
    │       ├── test1.json - Generated .json after AST test
    │       └── test1_result.json - Expected .json result
    ├── ASTTests - Directory containing AST unit tests
    └── LLVMTests - Directory containing LLVM unit tests

357 directories, 3512 files

Tests

You can test our example files by running:

python3 script.py

This will run the provided examples from example_source_files

Unit Tests

For our project we have lots of testcases, ... So if for some reason a situation occurs that is unexpected, please consolidate our testcases.

LLVM Tests are tests that run the entire compiler and cross-reference their output (from prints) with the gcc output. This Part of our testEnvironment is the best indicator of our compiler's capabilities. Currently, we have around 650 tests in this directory. Some classification is made, but not every feature has its own testSuite. A lot of testfiles also have a combination of testcase features.

Run our testcases using:

python3 -m unittest discover -v

You can also run the tests in parallel Requires:

pip install unittest-parallel
 unittest-parallel -t . -s TestCases --level=class

About

Compiler from a C-subset to LLVM IR & MIPS providing visualisation of the workflow. Includes an advanced testingsystem.

Topics

Resources

License

Stars

Watchers

Forks

Contributors 4

  •  
  •  
  •  
  •