CS325 2023/24: Assessed Coursework
A Compiler for MiniC
The CS325 module covers both theoretical and practical aspects of designing and building a compiler. The underlying aim of this coursework is to examine a language specification and write a compiler from scratch using compiler development technologies currently prevalent in industry and research.
Goal of the project
You will develop a compiler for a subset of the C programming language, which we will call MiniC. If you are not familiar with C, it is recommended that you familiarise yourself with it, including by experimenting with a C compiler such as gcc or clang. However, as our target language is a subset of C, you do not need to understand every aspect of the C language. MiniC does not include arrays, structs, unions, files, pointers, sets, switch statements, do statements, for loops, or many of the low-level operators. The only data types permitted are bool, int and float. The MiniC language tokens are given in Section 1.1 and the grammar is provided in BNF in Section 1.2. The semantics of MiniC are those of C (C99); where anything is ambiguous when implementing your compiler, C99 behaviour is a good default unless otherwise specified.
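To give a feel for the subset described above, here is a small illustrative sketch of MiniC-style source. It is not taken from the coursework files: the function names are hypothetical, and it assumes that MiniC accepts C-style function definitions, while loops, if statements, line comments and the usual arithmetic and comparison operators. The tokens in Section 1.1 and the grammar in Section 1.2 remain the authority on what is actually legal. The code uses only bool, int and float, and is also valid C++, so it can be checked with clang++.

```cpp
// MiniC-style sketch: only bool, int and float; only while and if/else.
// No arrays, pointers, structs, for loops, switch or do statements.
// Declarations are kept separate from assignments as a conservative
// assumption about what the MiniC grammar allows.

// Iterative factorial using a while loop (MiniC has no for loop).
int factorial(int n) {
  int result;
  result = 1;
  while (n > 1) {
    result = result * n;
    n = n - 1;
  }
  return result;
}

// Returns true when n is even, using only division, multiplication
// and comparison (no assumption is made that '%' is available).
bool is_even(int n) {
  int half;
  half = n / 2;
  if (half * 2 == n) {
    return true;
  } else {
    return false;
  }
}

// Mean of two ints as a float; the int sum is converted implicitly
// on assignment, following C99 semantics.
float average(int a, int b) {
  float sum;
  sum = a + b;
  return sum / 2.0;
}
```

No main function is included; the functions are intended to be called from a separate test harness. A program in this style exercises integer arithmetic, control flow, boolean results and an implicit int-to-float conversion, which are the kinds of behaviour where C99 semantics act as the default.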
Your compiler must be written in C++. You may not have encountered C++ previously as part of your undergraduate curriculum at Warwick. However, you should not be distracted by (or worry about) how proficient you are in the language before you start; the objective is to learn C++ (or the features of C++ you need) as you write your compiler. You will find that you can draw on the knowledge and experience of procedural and object-oriented programming that you gained in your first year with Java and built on in your second year. These skills are applicable regardless of the language used. This scenario closely mirrors what you will encounter in industry or research after graduating from Warwick, where you may well be asked to use a language you have not used before, or indeed a completely new language.
As C++ is one of the most widely used languages in industry for developing compilers, we think that using it for your own compiler construction will provide valuable skills for your future. To build your compiler, we will use clang++, a modern C++ compiler supporting the C++14 standard and beyond. clang++ is installed at /modules/cs325/llvm-17.0.1/bin/clang++ on the DCS machines (including the DCS machines you can access remotely, as detailed at https://warwick.ac.uk/fac/sci/dcs/intranet/user_guide/remote-login/).
Your compiler will make use of tools provided by the LLVM compiler framework (http://llvm.org/) to generate an LLVM intermediate representation (IR) of the code parsed by your compiler. LLVM is a compiler infrastructure used in production by a wide variety of commercial and open source projects, and is also widely used in academic research. The specific version of LLVM you will be using is 17.0.1. It is installed on the DCS undergraduate laboratory workstations at /modules/cs325/llvm-17.0.1/. The code, commands and options to be used for building and generating LLVM IR will be discussed in Section 2. Finally, in Section 2.2 we will generate an executable binary from the LLVM IR and test the results of the code compiled using your compiler on an X86 target machine. You can install LLVM and Clang on your own machine by following the instructions given here, but do make sure you install LLVM version 17.0.1 and Clang (available here).
This coursework will follow techniques discussed in the LLVM tutorial for building a compiler: https://llvm.org/docs/tutorial/. This tutorial develops a compiler for a much smaller language called Kaleidoscope. Throughout the coursework, references will be made to this tutorial as guidance on how to achieve specific tasks. We will also go through parts of this tutorial during laboratory sessions. It is also strongly recommended that you follow good software engineering practices. In particular, the use of version control will be highly beneficial. If you have not used version control before, take a look at git, a highly versatile distributed version control system.
Tutorials to get you set up with a repository and working with git can be found here. There are also resources for students available at github.edu, including the creation of private git repositories, here.
1 Part 1 - Parser and AST
Download the coursework files from the module website. Look in the file named mccomp.cpp. This file contains an implementation of a lexer for MiniC, together with the relevant LLVM header files required for developing your compiler. Part 1 of this coursework will entail creating a parser from a language specification and producing an Abstract Syntax Tree (AST) of the code parse