Due: Tuesday, Dec 9th by 11:59 PM
Update 11/21: A fixed version of CS340_Assign08.zip has been uploaded. The specific files that changed are MiniVM.rb, src/minilang/scope.clj, and src/minilang/analyzer.clj.
Getting Started
Download CS340_Assign08.zip.
If you are using Counterclockwise under Eclipse, you can import the zipfile as an Eclipse project.
You should copy your parser2.clj and astbuilder.clj files from Assignment 7 into the src/minilang directory. Note that you will need to make a few changes to them as described below.
Your Task
The goal of this project is to write a code generator that can take the ASTs produced from parse trees of minilang programs and generate a sequence of MiniVM instructions which carry out the computation specified by the program.
First things first
The updated program has a new module, node.clj, which contains the Node datatype and functions for working with nodes. You will need to modify your parser2.clj and astbuilder.clj files to use this new module.
For parser2.clj:
- Add
(:require [minilang.node :as node])to the namespace (ns) declaration at the top of the file - Comment out the definition of the Node data type (i.e., the
defrecordwhich defines it) - Change all occurrences of
Node.withnode/make-node
For astbuilder.clj:
- Add
(:require [minilang.node :as node])to the namespace (ns) declaration at the top of the file - Change all occurrences of
minilang.parser2.Node.withnode/make-node(there will probably only be one occurrence, in themake-nodefunction)
Augmented ASTs
This assignment introduces augmented ASTs. An augmented AST has the same form as the basic ASTs produced by your build-ast function, but some of the nodes will have properties. A node's properties can be used to store extra information about the node: specifically, information that will be useful during code generation.
The analyzer.clj module is provided to transform a plain AST into an augmented AST: the analyzer/augment-ast function does the transformation. The analyzer adds three kinds of properties to the AST:
:statement_listnodes will have an:nlocalsproperty, specifying how many local variables are used within the statement list (and any nested statement lists in if or while statements):identifiernodes will have a:regnumproperty, specifying an integer which uniquely identifies the variable named by the identifier- the last
:expression_statement,:var_decl_statement,:if_statement, or:while_statementnode in the top-level statement list will have a:lastproperty whose value istrue
The :regnum property is very useful for generating code for assignments and variable references, since it tells you which MiniVM local variable corresponds to a program variable.
The :last property is useful to allow the code generator to emit code to print the result of the last statement in the top-level statement list: see "Code generation" below.
You will probably not need to use the :nlocals property for anything.
The pp/pretty-print function has been updated to print out property values. For example, when the minilang program
var a; a := 4 * 5; a;
is converted to an augmented AST and pretty printed, the output is
:unit
+--:statement_list :nlocals=1
+--:var_decl_statement :regnum=0
| +--:identifier["a"]
+--:expression_statement
| +--:op_assign
| +--:identifier["a"] :regnum=0
| +--:op_mul
| +--:int_literal["4"]
| +--:int_literal["5"]
+--:expression_statement :last=true
+--:identifier["a"] :regnum=0
Here is an example with multiple variables:
var a; var b; var c; b := 6; c := 3; a := b*c;
This produces the following augmented AST:
:unit
+--:statement_list :nlocals=3
+--:var_decl_statement :regnum=0
| +--:identifier["a"]
+--:var_decl_statement :regnum=1
| +--:identifier["b"]
+--:var_decl_statement :regnum=2
| +--:identifier["c"]
+--:expression_statement
| +--:op_assign
| +--:identifier["b"] :regnum=1
| +--:int_literal["6"]
+--:expression_statement
| +--:op_assign
| +--:identifier["c"] :regnum=2
| +--:int_literal["3"]
+--:expression_statement :last=true
+--:op_assign
+--:identifier["a"] :regnum=0
+--:op_mul
+--:identifier["b"] :regnum=1
+--:identifier["c"] :regnum=2
Notice that each variable is assigned a different :regnum property value.
Code generation
The generate-code function takes an augmented AST as a parameter, and should print MiniVM instructions which execute the computation specified by the AST. Note that generate-code does not need to return any specific value: it should just print the MiniVM instructions using the println function.
Here is the basic idea:
- For statement lists, recursively generate code for each child statement
- Nothing is required for var decl statements
- For expression statements, recursively generate code for the expression, then either pop the result value off by emitting a
popinstruction (if the expression statement does not have the:lastproperty), or print it by emittingsyscall $printlnfollowed bypop(if the expression statement does have the:lastproperty) - For binary operators other than
:op_assign, recursively generate code for the two child sub-expressions, and then emit an appropriate arithmetic instruction (e.g.,add,sub,mul, etc.) - For
:op_assign, recursively emit code for the right-hand-side expression, the emit anstlocalinstruction to store the result in a local variable, using the value of the:regnumproperty of the identifier to know which MiniVM local variable to store into; also see the note below in the "Stack management" section - For
:int_literalnodes, emit anldc_iinstruction to load the constant integer onto the operand stack - For
:identifiernodes which appear in expressions, emit anldlocalinstruction, using the value of the:regnumproperty to know which MiniVM local variable to load from
The compile-unit function is provided for you: it takes a complete augmented AST (with a :unit node as its root) and generates a complete MiniVM program for you. This function is necessary because some prologue and epilogue code is required to form a complete MiniVM program: the prologue creates an initial stack frame, and the epilogue causes the MiniVM program to exit cleanly.
When you test your code generator, you should do so by using the compile-unit function, e.g.
(compile-unit aast)to generate code for the testprog test program at the bottom of codegen.clj. I suggest running this in a REPL.
To try out your generated code, copy the generated instructions and save them to a file — e.g., "prog.mvm" — then execute it interactively with the MiniVM.rb program, e.g.:
./MiniVM.rb -x -i prog.mvm
Stack management
You will need to think carefully about how to manage the MiniVM operand stack. The basic idea is that the code generated for each statement should not cause the operand stack either to grow or shrink.
For :op_assign expressions, the value of the right-hand-side expression should be left on the stack. You can use the dup MiniVM instruction to make a copy of it just before you use stlocal to store it in a local variable: this ensures that a copy is left on the stack after the value is stored.
Hints
Note: this section may be updated.
The MiniVM test programs, MiniVM documentation, and MiniVM instruction reference will probably be useful.
The functions in the node.clj module will be useful for working with augmented AST nodes:
node/has-prop?checks whether a node has a particular propertynode/get-propretrieves the value of a property from a nodenode/childrenretrieves the children of a nodenode/num-childrenreturns a count of how many children a node hasnode/get-childreturns a specified child (0 for the first child, etc.)
Each of these functions has a detailed comment explaining how to use it.
You can check the label of a node by applying the :symbol property to the node, e.g.
(:symbol node)would return :expression_statement if node is an expression statement node.
When printing, you can use the str function to concatenate multiple values into a single string. For example,
(println (str "\tldlocal " regnum))would emit the instruction
ldlocal 4
assuming that regnum has the value 4.
You can use the do construct to execute several Clojure expressions. This is useful, for example, if you need to recursively generate code and then print an instruction, e.g.:
(do
(generate-code (node/get-child node 0))
(generate-code (node/get-child node 1))
(println "\tadd"))Examples
Here are some tests programs and example outputs.
For the following minilang program:
var a; a := 4 * 5; a;
A possible MiniVM program is
main:
enter 0, 1
ldc_i 4
ldc_i 5
mul
dup
stlocal 0
pop
ldlocal 0
syscall $println
pop
ldc_i 0
retFor the following minilang program:
var a; var b; var c; b := 6; c := 3; a := b*c;
A possible MiniVM program is
main:
enter 0, 3
ldc_i 6
dup
stlocal 1
pop
ldc_i 3
dup
stlocal 2
pop
ldlocal 1
ldlocal 2
mul
dup
stlocal 0
syscall $println
pop
ldc_i 0
retGrading
- Evaluation of simple expressions with just constants: 20%
- Evaluation of complex expressions with just constants: 20%
- Assignments of constants to variables: 15%
- Evaluating variables in expressions: 15%
- Evaluation of expressions with constants and variables: 20%
- Printing the value of the last expression using
syscall $println: 10%
Insane extra credit
For extra credit (up to 50 points), implement code generation for :if_statement and :while_statement AST nodes.
See Implementing Control Flow for more information.
This is actually not particularly difficult, and allows you to compile and run rather sophisticated programs. And, imagine how cool it will be to mention that you implemented a compiler for a Turing-complete language with full support for variables and arbitrary control flow targeting a stack-based virtual machine in a functional language at the next party you go to.
Submitting
When you are done, submit the lab to the Marmoset server using either of the methods below.
Important: after you submit, log into the submission server and verify that the correct files were uploaded. You are responsible for ensuring that you upload the correct files. I may assign a grade of 0 for an incorrectly submitted assignment.
From Eclipse
If you have the Simple Marmoset Uploader Plugin installed, select the project (CS340_Assign08) in the package explorer and then press the blue up arrow button in the toolbar. Enter your Marmoset username and password when prompted.
From a web browser
Create a zip file containing your completed project. (If you are in Eclipse, you can use File → Export... → General → Archive File.)
Upload the saved zip file to the Marmoset server as assign08. The server URL is
From the command line
From the command line, run the command
make submit
Type your Marmoset username and password when prompted.
