We will be doing 4 labs using lex and yacc (which create a bottom-up parser):
1. Scanner Lab
2. Parser Lab
3. Semantics Lab
4. Code Generation Lab
1. The lex code on the web page is submitted to flex, a version of lex, which creates a scanner that recognizes the tokens described there. The scanner is essentially a C program.
2. Then the C compiler (cc) compiles this program to create an executable scanner.
3. This scanner runs on the input program on the right, producing a list of tokens.
You can see this list of tokens at the bottom of the output page. You might have to scroll down in your browser to see them.
Step 1 Click on the button and investigate the output.
Edit the text and click me
Lex Program
%{
#include <stdio.h>
%}
%%
[a-zA-Z][a-zA-Z0-9]*    printf("WORD %s ", yytext);
[a-zA-Z0-9\/.-]+        printf("FILENAME %s ", yytext);
\"                      printf("QUOTE ");
\{                      printf("OBRACE ");
\}                      printf("EBRACE ");
;                       printf("SEMICOLON ");
\n                      printf("\n");
[ \t]+                  /* ignore whitespace */;
%%
int main (void) {yylex(); return 0;}
int yywrap (void) {return 1;}
Scanner Input
logging {
    category lame-servers { null; };
    category cname { null; };
};
zone "." {
    data = 100;
    real = +145.8764;
    type hint;
    file "/etc/bind/db.root";
};
QUESTION 1 Looking at the lex code and the output, what tokens seem to be described there?
Step 2 Change the input to the following and click on the button again:
void input_a() {
    a = b3;
    xyz = a + b + c - p / q;
    a = xyz * ( p + q );
    p = a - xyz - p;
}
QUESTION 2 What tokens didn't get recognized?
Step 3
a. Change the lex file so that when it comes across an equal sign, it prints EQUAL.
b. Now change the lex file so that it recognizes all the tokens in the program in Step 2.
Step 4 Change the lex file so that it also recognizes an integer consisting of 1 or more digits.
Step 5 Using lex in Unix
a. Create a file, lex.a, containing your lex code from Step 4. (You can change it to print nicer output if you wish. In fact, I recommend this; try printing the tokens as ordered pairs, e.g., (identifier, void).)
b. Generate the C program which is the scanner as follows:
$ lex lex.a
You can see that lex has created a C program called lex.yy.c. This is our scanner, but we have to compile it first.
c. Compile the C program which is the scanner:
$ cc lex.yy.c -ll
$ ls
a.out lex.a lex.yy.c
a.out is the executable scanner. Let's try it out!
d. Running the scanner:
$ ./a.out
23
integer
Step 6 Now change your lex file so that it displays REALNUM for a real number. A real number can be defined as a plus or minus sign followed by a number of digits, followed by a dot ".", followed by a number of digits.
You are now ready to do Project, Part 1!
The yacc (bison) program is in the left box, the related lex (flex) code is in the bottom box and the input program for the generated parser is in the right box. When you press the button Edit the text and click me, the following will happen:
1. The yacc/bison code in the left box will be run by bison (yacc)
2. The lex/flex code will be run by flex (lex)
3. The C code created by flex will be compiled by cc, the C compiler
4. The C code created by bison will be compiled by cc, the C compiler, creating an executable parser
5. The input in the right box is executed by this parser, producing a bottom-up parse (the parser prints the left-hand side of productions just like you will in Project, Part 2)
The results will be displayed at the bottom of the page. You might have to scroll down in your browser to see them.
Step 1 Looking at the bison/yacc code in the left box, write the BNF in standard form. Hint: The first production is
lines --> epsilon | lines line
Step 2 It is really hard to see the bottom-up parse from the output, so change the input to contain just the first statement 1 + 1; and press the button again. Now you should be able to create a parse from the output (which is the reverse of a rightmost derivation). Show it as a parse tree.
Step 3 Now, change the yacc/bison grammar in the left box so that it recognizes division. Note that the lex code already recognizes "/", calling the token DIVIDE. (Ask us if you don't see this.) Run an input to show that the division works (if it does not, fix your production) and show us your output. Draw the parse tree from the output.
Step 4 In your written BNF, add productions to perform exponentiation, "^". Show it to us before you go on. Now add the bison code for ^ to the left box, enter a statement using ^ in the right box and click on the button. Note that the lex code for "^" is already there with the token called POWER. Show us when it works.
Step 5 Moving to unix/linux:
Create the lex file, say, lex.2
Run lex.2 through lex (type lex lex.2) to create lex.yy.c
Now create the yacc file, say ly.y, and run ly.y through yacc (type yacc -d ly.y)
This creates a file called y.tab.c, and the "-d" creates a file of definitions called y.tab.h. When these are compiled they produce our parser. Stop and look at these files and show them to us.
Now compile these 2 C programs:
cc y.tab.c lex.yy.c
For now, we will run the Parser from the Command line. Type:
./a.out
2 + 3 * 4
Run a few more programs. Now you are ready to do Project, Part 2.
%{
#include "parsing_lab.h"
%}
%%
[0-9]+      {yylval = atoi(yytext); return NUMBER;}
[ \t\n]     ;
"+"         return(PLUS);
"-"         return(MINUS);
"*"         return(TIMES);
"/"         return(DIVIDE);
"^"         return(POWER);
"("         return(LEFT_PARENTHESIS);
")"         return(RIGHT_PARENTHESIS);
";"         return(END);
%%
Edit the text above, and click on the button to see the result.
The printtree function prints this as: (+ 1 1)
Step 2 Change the yacc/bison grammar so that it recognizes division and add that node to the AST. Run an expression as input to show that the division works.
Step 3 Add the pow() function (^) to compute the power of two numbers; include both the syntax (the yacc productions) and the semantics (the $$ = stuff). Again, run an example to show that power works.
Step 4 Now read (yes, read it first!) and begin Project Part 3. Have fun!
%{
#include "parsing_lab.h"
%}
%%
[0-9]+      {yylval = (int)yytext; return NUMBER;} /* cast pointer to int for compiler warning */
[ \t\n]     ;
"+"         return(PLUS);
"-"         return(MINUS);
"*"         return(TIMES);
"/"         return(DIVIDE);
"^"         return(POWER);
"("         return(LEFT_PARENTHESIS);
")"         return(RIGHT_PARENTHESIS);
";"         return(END);
%%
Run an expression as input to show that power works. Add the pseudo code to run the power function. For simplicity you can assume the machine has an opcode that does power, or you can loop the multiplication.
typedef struct node {
    struct node *left;
    struct node *right;
    int tokcode;
    char *token;
} node;
node *mknode(node *left, node *right, int tokcode, char *token);
void printtree(node *tree);
void generate(node *tree);
#define YYSTYPE struct node *
%}
%start lines