Computer Science 5641
Compiler Design
Project Part 3 - Parser and Abstract Syntax Trees (100 points)
Due November 12, 2002

Introduction

In this part of the project you will construct a parser for the language described below. Use either of the scanners you implemented in part 2 of the project, and use bison to construct the parser. Your parser should build an abstract syntax tree (AST). Your test program should either print a nicely formatted version of the program as understood by the parser (by recursively traversing the AST) or report where a parse error occurs in the program.

The Language

The language you will be implementing has the following syntax:

A program consists of a sequence of global variable declarations and function definitions. A variable declaration takes one of two forms:

where Type is one of the three reserved words char, int, or float, the Identifier is any legal identifier token from the scanner and the Expression is any expression as described below.

A function definition takes the format:

where Type and Identifier are the same as above, and the ParamList and StmtList are described below.

An expression in the language is built using the following rules:

The operators have the following precedence (lowest to highest):

A ParamList consists of zero or more parameter items separated by commas (no comma at the end of the list). A parameter item has the format:

where Type and Identifier are as described previously.

An ArgList consists of zero or more arguments separated by commas (no comma at the end of the list). An argument is any legal Expression as described above.
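
The "zero or more items, separated by commas, no comma at the end" shape shared by ParamList and ArgList can be sketched as a small hand-written routine. This is only an illustration in Python, not the bison grammar the project requires. The names parse_list and parse_param are hypothetical, tokens are assumed to arrive as plain strings, the list is assumed to end at a closing parenthesis, and the "Type Identifier" parameter shape is inferred from the sample program (int x, int y).

```python
def parse_list(tokens, pos, parse_item, closer=")"):
    """Parse zero or more comma-separated items starting at tokens[pos].

    Returns (items, new_pos), where new_pos is the first token after the
    list. The caller is expected to check that this token is the closer.
    """
    items = []
    # Empty list: the closing token appears immediately.
    if pos < len(tokens) and tokens[pos] == closer:
        return items, pos
    while True:
        item, pos = parse_item(tokens, pos)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1  # a comma must be followed by another item
            continue
        return items, pos


def parse_param(tokens, pos):
    """Hypothetical parameter item: a type keyword, then an identifier."""
    if pos + 1 >= len(tokens) or tokens[pos] not in ("char", "int", "float"):
        raise SyntaxError(f"expected a parameter at token {pos}")
    name = tokens[pos + 1]
    if not name.isidentifier():
        raise SyntaxError(f"expected an identifier at token {pos + 1}")
    return (tokens[pos], name), pos + 2
```

Because every comma forces another call to parse_item, a trailing comma such as "int x , )" is rejected without any special case. In a bison grammar the same effect is usually obtained by splitting the list into an empty alternative and a non-empty, left-recursive list rule.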

A StmtList consists of zero or more statements. Legal statements are as follows:

Errors

If an input file contains an error, you should print an error message indicating where the first error occurs in the file. As discussed below, you may implement more complex error messages for extra credit.
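
One way to report "where" an error occurs is as a line and column computed from the position at which parsing failed. The sketch below is a Python illustration only. The helper names locate and format_error, the message layout, and the idea that the parser knows a character offset are all assumptions. In a bison parser the message would normally be printed from yyerror, with the line number supplied by the scanner (for example, flex can maintain yylineno).

```python
def locate(source, offset):
    """Convert a 0-based character offset into a 1-based (line, column)."""
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, which makes the column 1-based.
    last_newline = source.rfind("\n", 0, offset)
    column = offset - last_newline
    return line, column


def format_error(filename, source, offset, message):
    """Build a one-line error message for the first error in the file."""
    line, column = locate(source, offset)
    return f"{filename}:{line}:{column}: parse error: {message}"
```

For example, with the source text "int x;\nint y = ;\n" and an error detected at the second semicolon (offset 15), format_error reports line 2, column 9.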

Output

If an input file is parsed correctly, your program should produce an AST as discussed in class and in the textbook. To show this result, your code should reproduce the input program in a readable format. For example, if the input file were:

int count = 0;

int func1 (int x, int y) {
    if (func1 <= 0) return count; fi;
    count = count + x;
    return func1(x,y - 1); }

int main () {
    int n;
    write("Please enter a number: ");
    read(n);
    readln;

    int result = func1(n,5);

    write("The result is ");
    write(result);
    write("\n");
    return 0; }

The output of your program might be:

int count = 0;
int func1( int x , int y ) {
  if ( ( func1 <= 0 ) )
    return count;
  fi;
  count = ( count + x );
  return func1( x , ( y - 1 ) );
}
int main( ) {
  int n;
  write( "Please enter a number: " );
  read( n );
  readln;
  int result = func1( n , 5 );
  write( "The result is " );
  write( result );
  write( "\n" );
  return 0;
}

Note that you do not need to preserve parentheses from the original program and can simply print out every expression completely parenthesized.
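
Printing every expression fully parenthesized falls out naturally from a recursive traversal of the AST. The sketch below is a Python illustration, not the required implementation. The node classes Num, Var, BinOp, and Call and the function show are hypothetical names, and only the expression forms that appear in the sample program are covered. The spacing follows the sample output above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


@dataclass
class Call:
    name: str
    args: List["Expr"]


Expr = Union[Num, Var, BinOp, Call]


def show(e):
    """Return the expression as text, parenthesizing every binary operation."""
    if isinstance(e, Num):
        return str(e.value)
    if isinstance(e, Var):
        return e.name
    if isinstance(e, BinOp):
        # Parentheses are regenerated from the tree shape, so the
        # parentheses written in the original program need not be stored.
        return f"( {show(e.left)} {e.op} {show(e.right)} )"
    if isinstance(e, Call):
        if not e.args:
            return f"{e.name}( )"
        return f"{e.name}( " + " , ".join(show(a) for a in e.args) + " )"
    raise TypeError(f"unknown node: {e!r}")
```

For instance, the tree for func1(x,y - 1) is Call("func1", [Var("x"), BinOp("-", Var("y"), Num(1))]), and show renders it as func1( x , ( y - 1 ) ). Statements and declarations can be handled the same way, with an indentation level passed down the recursion.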

What To Turn In

Turn in documented versions of all of your code (including test code). Also document your test cases and show the results from your parser on each test file. You should also write a team report on this part of the project. In addition, each member of the team should submit a short individual report.

Extra Credit

Errors - for up to 20 points you may add productions to your code to recognize more error situations and possibly provide feedback about several errors from an input program.

LL(1) Parser - for up to 40 points extra credit you may implement an LL(1) version of the parser (see me if you need to add any extra syntax to the grammar to make it LL(1)). You should still implement the bison version of the parser as well.
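
As a rough idea of what the LL(1) option involves, the sketch below shows a predictive (recursive-descent) parser, one common way to realize an LL(1) grammar, for a deliberately tiny expression subset. It is a Python illustration under stated assumptions: the grammar is Expr -> Term { ("+" | "-") Term } and Term -> number | identifier | "(" Expr ")", the operators are taken only from the sample program, tokens are plain strings, and the names parse_expr and parse_term are hypothetical. The project's actual operators and precedence levels are those given in the language description.

```python
def parse_expr(tokens, pos=0):
    """Expr -> Term { ('+' | '-') Term }, built left-associatively."""
    left, pos = parse_term(tokens, pos)
    # One token of lookahead decides whether the expression continues.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


def parse_term(tokens, pos):
    """Term -> number | identifier | '(' Expr ')'."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError(f"expected ')' at token {pos}")
        return inner, pos + 1
    if tok.isdigit() or tok.isidentifier():
        return tok, pos + 1
    raise SyntaxError(f"unexpected token {tok!r} at position {pos}")
```

A left-recursive rule such as Expr -> Expr "+" Term is fine for bison but cannot be used directly in an LL(1) parser. The loop in parse_expr is the usual replacement, and it still produces a left-associative tree, so a - b + 1 is grouped as (a - b) + 1.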