Bison is a parser generator that works as a preprocessor: it translates a grammar file into C or C++ compatible code. It is essentially the same program as yacc with a few changes.
Like Flex, a Bison program has three sections: definitions, rules, and user subroutines. The sections are separated by a pair of percent signs (%%) in the first column. The form is roughly:
definitions
%%
rules
%%
user subroutine section
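The definitions section can start with C code enclosed in %{ and %}, which is copied into the generated parser:
%{
code to copy into the final program
%}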
After this come token declarations, nonterminal type declarations, precedence/associativity rules, and other options.
Tokens are declared by the type of value information they return.
%token <dvalue> NUMBER
%token <varindex> NAME
This declares token NUMBER to return value information of type <dvalue> and token NAME to return value information of type <varindex>.
Nonterminal types must also be declared as in:
%type <dvalue> expression
%type <dvalue> term
%type <dvalue> varornum
This declares expression, and likewise term and varornum, to return type <dvalue>.
A list of equal-precedence operators can be given, preceded by %left if the operators are left associative or %right if they are right associative. The declarations are ordered from lowest precedence to highest. For example, these four lines in this order:
%right implies
%left or xor
%left and
%nonassoc not
mean that implies has the lowest precedence and is right associative; or and xor are of equal precedence, higher than implies, and are left associative; and has higher precedence than or and xor and is also left associative. Because not is declared last it has the highest precedence of all, and %nonassoc means that the not operator does not have any associativity rule.
expression: expression '+' term { $$ = $1 + $3; }
| term { $$ = $1; }
;
Also see the program example to follow.
extern FILE *yyin;
By setting this variable, which is defined by the flex-generated scanner, to a file pointer that you have opened, you can set the source of input. However, it is tricky to change the source of input once you have started reading input by calling yyparse(). This is because the scanner buffers its input. So to do an include you have to swap the scanner's input buffers. If you ever want to do that, check the web for this topic.
Example
Here is a simple numeric calculator program that uses flex to build a scanner and bison to process the syntax. A by-product of the syntax analysis happens to be the calculations. For more complex programs this is not possible.
The bison source code is calc.y
The flex source code is calc.l
Here is a Bourne shell script for our Sun machines to compile a program that uses both Bison and Flex using either C or C++:
#!/bin/sh -x
bison -v -t -d $1.y    # create $1.tab.c and $1.tab.h
flex -d $1.l           # create lex.yy.c
# gcc -g lex.yy.c $1.tab.c -lfl -lm -o $1    # create calc using C
g++ -DCPLUSPLUS -g $1.tab.c lex.yy.c -lfl -lm -o $1    # create calc using C++
For bison the -v option creates a .output file that contains a verbose description of the parser table created, including states and conflicts. This is extremely useful in debugging reduce and shift errors. The -t option loads the debug features so that if the variable yydebug is set to 1, debugging information will be dumped showing every step of parsing. In order to use the yydebug variable you need to declare the variable with extern int yydebug in your bison file. The option -d creates the mandatory .h file for the token definitions that will be used by the flex file.
You can choose either the gcc compiler or the g++ compiler. If you use the g++ compiler you need to set the macro variable CPLUSPLUS as in the above script. This declares yylex and includes the string.h file for the benefit of C++. This can be done as above or the declarations can be made by hand.
Note the inclusion of the flex library with -lfl and the optional math library -lm. If this script is called dobison, then for files calc.y and calc.l you would call dobison calc and the executable file calc would be created.
What are those shift/reduce and reduce/reduce errors?
Bison uses what is known as a shift/reduce parser to build a parse tree. In the process of analyzing the input it can either reduce the symbols on its stack by the use of some production or shift a token onto the stack and continue looking. When it can't tell if it should shift or reduce you get a shift/reduce error and it makes a guess at what you want (by default it shifts). This generally happens as a result of an ambiguity in your grammar. If you get a reduce/reduce error that means it can't tell which of two productions you meant. It will make a guess (by default the production that appears first in the grammar) and move on. If you get these errors, look to resolve ambiguities in your grammar.
A makefile for Bison/Flex
# name of thing to be built goes here
BIN = cb1
CC = g++
# CFLAGS = -g
# CCFLAGS = -DCPLUSPLUS -g   # for use with C++ if file ext is .cc
CFLAGS = -DCPLUSPLUS -g   # for use with C++ if file ext is .c
SRCS = $(BIN).y $(BIN).l
OBJS = lex.yy.o $(BIN).tab.o
LIBS = -lfl -lm
$(BIN): $(OBJS)
	$(CC) $(CCFLAGS) $(OBJS) $(LIBS) -o $(BIN)
$(BIN).tab.h $(BIN).tab.c: $(BIN).y
	bison -v -t -d $(BIN).y
lex.yy.c: $(BIN).l $(BIN).tab.h
	flex -d $(BIN).l  # -d debug
all:
	touch $(SRCS)
	make
clean:
	rm -f $(OBJS) $(BIN) lex.yy.c $(BIN).tab.h $(BIN).tab.c $(BIN).tar
tar:
	tar -cvf $(BIN).tar $(SRCS) makefile
Further Reading
The Bison Manual