With this assignment, I want to get everyone on track. I want to be able to congratulate everyone on having built their own compiler at the end of the semester! Note that this assignment needs to be done quickly so we can work on the final two code generator phase assignments. Lots of code is provided, so you just need to fold it in and adapt it to your grammar and token names. The objective is to give all errors and warnings the same format and to keep syntax errors from stopping the syntax analysis phase.

There are several parts to this assignment.

  • Nice yyerror and error count
  • Insert the error tokens into your grammar
  • Insert the yyerrok macros into your grammar

    Nicer Errors

    yyerror catches all the messages that come back from the parser. Here are some examples of syntax error messages that might come into yyerror from the parser if YYERROR_VERBOSE is set:

    syntax error, unexpected '+'
    syntax error, unexpected ELSE
    syntax error, unexpected '=', expecting '('
    syntax error, unexpected '=', expecting WHILE
    syntax error, unexpected ID, expecting '(' 
    syntax error, unexpected ID, expecting WHILE
    syntax error, unexpected '+', expecting ',' or ';'
    syntax error, unexpected '/', expecting BOOL or INT or VOID 
    syntax error, unexpected ID, expecting $end or BOOL or INT or VOID 
    

    We will put in a line number, format the message as in the semantic analysis section, and add some extra info when appropriate. Below is the code that will translate these messages. Use this code, or improve on it if you want, as long as it doesn't affect the output. Note that the code assumes my names for the tokens. You may need to adjust some of your token names if you didn't for earlier assignments! In particular, to save some typing I grouped all of the binary assignment operators under assignop, much like mulop. This reduces the number of error productions you use. It is an easy change to the grammar.

    Here is code for the error message handling. It is free for you to use. Note it sorts the missing token names so they match the expected output!

    yyerror.cpp, yyerror.h: This is code you can use to print out your messages so they conform to a standard style. I say again... it is free code! To integrate this code into your parser you must read and understand yyerror.h. Yes, I mean it. Your code may have different variable names. Remember, this software includes a tiny sort routine that sorts the expected tokens, so the printed message can be compared regardless of the order in which you declare the tokens in your .y file!

    • line is the line number where the parser is working at the time of the error.
    • msg is the extended error message that comes from Bison when the YYERROR_VERBOSE macro is set, as in:
      #define YYERROR_VERBOSE
      
      Use this definition in your .y code! You probably already are.
    • This code makes the brash assumption that yytext is relevant for constants.
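To make the sorting idea concrete, here is a minimal sketch of how the expected tokens in a verbose message could be put into a fixed order. It is not the supplied yyerror.cpp; the function name sortExpecting is hypothetical, and the sketch assumes the message uses the ", expecting A or B" layout shown in the examples above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (NOT the supplied yyerror.cpp): rewrites a verbose
// Bison message so the token names after "expecting" are in sorted order.
std::string sortExpecting(const std::string& msg) {
    const std::string marker = ", expecting ";
    const std::string sep = " or ";

    std::size_t pos = msg.find(marker);
    if (pos == std::string::npos) return msg;  // no expected-token list

    std::string head = msg.substr(0, pos + marker.size());
    std::string tail = msg.substr(pos + marker.size());

    // Split the tail on " or " to recover the individual token names.
    std::vector<std::string> tokens;
    std::size_t start = 0;
    std::size_t next;
    while ((next = tail.find(sep, start)) != std::string::npos) {
        tokens.push_back(tail.substr(start, next - start));
        start = next + sep.size();
    }
    tokens.push_back(tail.substr(start));

    // Plain byte-wise ordering; any fixed order makes output comparable.
    std::sort(tokens.begin(), tokens.end());

    // Rebuild the message with the tokens in sorted order.
    std::string out = head;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (i > 0) out += sep;
        out += tokens[i];
    }
    return out;
}
```

With this sketch, a message ending in "expecting VOID or INT or BOOL" comes back ending in "expecting BOOL or INT or VOID", whatever order the tokens were declared in the .y file.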

    Error Count

    You will continue to have two global variables counting the number of errors and number of warnings just as in the last assignment, but now the count will extend to scanner and syntax errors and warnings. Warnings will not stop the successful compilation of the program. Keep count of both kinds and report at the end of the compile exactly in the order and format as in this example:

    Number of warnings: 0
    Number of errors: 666
    

    I will explain shortly where you get errors and warnings and what happens as a result.
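As an illustration of the counters and the final report, here is a small sketch. The variable names numWarnings and numErrors and the function name summary are assumptions; the assignment only requires two global counters and the two-line report with warnings first.

```cpp
#include <sstream>
#include <string>

// Assumed names: the handout only says there are two global counters.
int numWarnings = 0;
int numErrors = 0;

// Builds the end-of-compile report: warnings first, then errors.
std::string summary() {
    std::ostringstream out;
    out << "Number of warnings: " << numWarnings << "\n"
        << "Number of errors: " << numErrors << "\n";
    return out.str();
}
```

Returning a string instead of printing directly is only a convenience for checking the format; printing the result once at the end of the compile gives the two lines shown in the example.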

    Other Modifications by Phase of the Compiler

    Our compiler now has three phases: lexical analysis (scanner), syntax analysis, and semantic analysis. Each phase can produce its own errors. Here is the description by phase, some of which you have already done.

    • Lexical analysis: All warnings and errors increment the appropriate counter.

    • Syntax analysis: We will be adding error tokens to our bison grammar so the parser can match erroneous input with an error production and keep running. We will also add yyerrok as needed when we have syncing tokens. A list of possible edits is provided below.

      Our new yyerror function (supplied in this assignment) will pretty print the syntax errors.

    • Semantic analysis: Proceed to semantic analysis only if there are no errors in the syntax analysis. Errors found in semantic analysis also increase the global error count.

    • At end of compile: Print the total number of errors and warnings from all phases. For example:
      Number of warnings: 28
      Number of errors: 496
      
      or
      Number of warnings: 0
      Number of errors: 0
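The rule that semantic analysis runs only after an error-free parse can be sketched as a tiny driver. Everything here is hypothetical: runPhases and the two callables stand in for your own parse and semantic analysis entry points, and the error count is passed by reference only to keep the sketch self-contained (in the assignment it is a global).

```cpp
#include <functional>

// Hypothetical driver. 'parse' stands in for the scanner plus parser and
// 'analyze' for semantic analysis; each adds any errors it finds to
// errorCount. Returns true if semantic analysis was run.
bool runPhases(int& errorCount,
               const std::function<void(int&)>& parse,
               const std::function<void(int&)>& analyze) {
    parse(errorCount);  // lexical and syntax errors are counted here

    if (errorCount > 0) {
        return false;   // syntax phase had errors: skip semantic analysis
    }

    analyze(errorCount);  // semantic errors add to the same count
    return true;
}
```

Whichever path is taken, the caller still prints the warning and error totals afterwards, so the report appears even when semantic analysis is skipped.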
      

    Further Information on Inserting Error Tokens

    We want to add error tokens so syntactic analysis continues past errors. We do this to help the user get as much useful information about their program as we can in one compile.

    Here are the error production edits to your grammar. They are in the order in which I find them in my grammar, which is very similar to the standard grammar. In some cases only the beginning of the actions is shown. Sometimes the only addition is the word yyerrok, which indicates that the yyerrok macro should be added somewhere in your actions. Your token names, of course, may be different, but they will be translated into "nice names" below so your implementation names will be hidden. Even some of your productions may be slightly different. If you can make them the same, do so to get the best score by matching.

    The result of adding these error tokens will be that bison now reports many shift/reduce and reduce/reduce conflicts for your grammar.

    You may have to adjust your grammar a little to get these to fit. On the final grading assignment you will be graded on the actual error messages you generate, not where you put your error tokens. Try to come as close as you can to my error messages. The text of the errors should be exactly the same. You may miss 10% of the messages or have some extra and it won't count against you. That is OK. The text of the messages should match.

    Your program must not halt because of a syntax error!

  •