Due: Tuesday, Dec 13th by 11:59 PM

Update 12/6: link to correct version of prettyprint.clj

Getting Started

Download cs340-assign08.zip.

If you are using Counterclockwise under Eclipse, you can import the zipfile as an Eclipse project.

You should copy your parser2.clj and astbuilder.clj files from Assignment 7 into the src/minilang directory. Note that you will need to make a few changes to them as described below.

Update 12/6: The original assignment skeleton included the wrong version of prettyprint.clj. Please download the correct one and copy it into the src/minilang folder:

prettyprint.clj

Your Task

The goal of this project is to write a code generator that can take the ASTs produced from parse trees of minilang programs and generate a sequence of MiniVM instructions which carry out the computation specified by the program.

Augmented ASTs

This assignment introduces augmented ASTs. An augmented AST has the same form as the basic ASTs produced by your build-ast function, but some of the nodes will have properties. A node’s properties can be used to store extra information about the node: specifically, information that will be useful during code generation.

The analyzer.clj module is provided to transform a plain AST into an augmented AST: the analyzer/augment-ast function does the transformation. The analyzer adds three kinds of properties to the AST:

The :regnum property is very useful for generating code for assignments and variable references, since it tells you which MiniVM local variable corresponds to a program variable.

The :last property is useful to allow the code generator to emit code to print the result of the last statement in the top-level statement list: see “Code generation” below.

You will probably not need to use the :nlocals property for anything.

The pp/pretty-print function has been updated to print out property values. For example, when the minilang program

var a; a := 4 * 5; a;

is converted to an augmented AST and pretty printed, the output is

:unit
+--:statement_list :nlocals=1
   +--:var_decl_statement :regnum=0
   |  +--:identifier["a"]
   +--:expression_statement
   |  +--:op_assign
   |     +--:identifier["a"] :regnum=0
   |     +--:op_mul
   |        +--:int_literal["4"]
   |        +--:int_literal["5"]
   +--:expression_statement :last=true
      +--:identifier["a"] :regnum=0

Here is an example with multiple variables:

var a; var b; var c; b := 6; c := 3; a := b*c;

This produces the following augmented AST:

:unit
+--:statement_list :nlocals=3
   +--:var_decl_statement :regnum=0
   |  +--:identifier["a"]
   +--:var_decl_statement :regnum=1
   |  +--:identifier["b"]
   +--:var_decl_statement :regnum=2
   |  +--:identifier["c"]
   +--:expression_statement
   |  +--:op_assign
   |     +--:identifier["b"] :regnum=1
   |     +--:int_literal["6"]
   +--:expression_statement
   |  +--:op_assign
   |     +--:identifier["c"] :regnum=2
   |     +--:int_literal["3"]
   +--:expression_statement :last=true
      +--:op_assign
         +--:identifier["a"] :regnum=0
         +--:op_mul
            +--:identifier["b"] :regnum=1
            +--:identifier["c"] :regnum=2

Notice that each variable is assigned a different :regnum property value.

Code generation

The generate-code function takes an augmented AST as a parameter, and should print MiniVM instructions which execute the computation specified by the AST. Note that generate-code does not need to return any specific value: it should just print the MiniVM instructions using the println function.

Here is the basic idea:

The compile-unit function is provided for you: it takes a complete augmented AST (with a :unit node as its root) and generates a complete MiniVM program for you. This function is necessary because some prologue and epilogue code is required to form a complete MiniVM program: the prologue creates an initial stack frame, and the epilogue causes the MiniVM program to exit cleanly.

When you test your code generator, you should do so by using the compile-unit function, e.g.

(compile-unit aast)

to generate code for the testprog test program at the bottom of codegen.clj. I suggest running this in a REPL.

To try out your generated code, copy the generated instructions and save them to a file — e.g., “prog.mvm” — then execute it interactively with the MiniVM.rb program, e.g.:

./MiniVM.rb -x -i prog.mvm

Stack management

You will need to think carefully about how to manage the MiniVM operand stack. The basic idea is that the code generated for each statement should not cause the operand stack either to grow or shrink.

For :op_assign expressions, the value of the right-hand-side expression should be left on the stack. You can use the dup MiniVM instruction to make a copy of it just before you use stlocal to store it in a local variable: this ensures that a copy is left on the stack after the value is stored.

Hints

The MiniVM test programs, MiniVM documentation, and MiniVM instruction reference will probably be useful.

The functions in the node.clj module will be useful for working with augmented AST nodes:

Each of these functions has a detailed comment explaining how to use it.

You can check the label of a node by applying the :symbol property to the node, e.g.

(:symbol node)

would return :expression_statement if node is an expression statement node.

When printing, you can use the str function to concatenate multiple values into a single string. For example,

(println (str "\tldlocal " regnum))

would emit the instruction

ldlocal 4

assuming that regnum has the value 4.

You can use the do construct to execute several Clojure expressions. This is useful, for example, if you need to recursively generate code and then print an instruction, e.g.:

(do
  (generate-code (node/get-child node 0))
  (generate-code (node/get-child node 1))
  (println "\tadd"))

Examples

Here are some tests programs and example outputs.

For the following minilang program:

var a; a := 4 * 5; a;

A possible MiniVM program is

main:
    enter 0, 1
    ldc_i 4
    ldc_i 5
    mul
    dup
    stlocal 0
    pop
    ldlocal 0
    syscall $println
    pop
    ldc_i 0
    ret

For the following minilang program:

var a; var b; var c; b := 6; c := 3; a := b*c;

A possible MiniVM program is

main:
    enter 0, 3
    ldc_i 6
    dup
    stlocal 1
    pop
    ldc_i 3
    dup
    stlocal 2
    pop
    ldlocal 1
    ldlocal 2
    mul
    dup
    stlocal 0
    syscall $println
    pop
    ldc_i 0
    ret

Grading

Insane extra credit

For extra credit (up to 50 points), implement code generation for :if_statement and :while_statement AST nodes.

See Implementing Control Flow for more information.

This is actually not particularly difficult, and allows you to compile and run rather sophisticated programs. And, imagine how cool it will be to mention that you implemented a compiler for a Turing-complete language with full support for variables and arbitrary control flow targeting a stack-based virtual machine in a functional language at the next party you go to.

Important: If you decide to do the extra credit, the code you submit must be entirely your own work, meaning that you should not talk about the extra credit with anyone else.

Submitting

When you are done, submit the lab to the Marmoset server using either of the methods below.

Important: after you submit, log into the submission server and verify that the correct files were uploaded. You are responsible for ensuring that you upload the correct files. I may assign a grade of 0 for an incorrectly submitted assignment.

From Eclipse

If you have the Simple Marmoset Uploader Plugin installed, select the project (CS340_Assign08) in the package explorer and then press the blue up arrow button in the toolbar. Enter your Marmoset username and password when prompted.

From a web browser

Create a zip file containing your completed project. (If you are in Eclipse, you can use File → Export… → General → Archive File.)

Upload the saved zip file to the Marmoset server as assign08. The server URL is

https://cs.ycp.edu/marmoset/

From the command line

From the command line, run the command

make submit

Type your Marmoset username and password when prompted.