
COMP1127 Introduction to Computing II

2015-2016 Semester II
Assignment
A parser is a piece of software that takes input data (frequently text) and builds a data structure,
often some kind of parse tree or other hierarchical structure, giving a structural representation
of the input and checking for correct syntax in the process. In the case of programming languages, a
parser is a component of a compiler or interpreter, which parses source code to create some form
of internal representation.

This process happens every time you try to execute one of your Python scripts. You must all
fondly remember needing to overcome the generation of frequent syntax errors when first
learning the Python programming language. For this assignment we will create a rudimentary
parser for a rudimentary language that we will create and have it perform the first 2 stages of
parsing: lexical analysis and syntactic analysis.

Lexical analysis involves separating the input string into its constituent tokens; syntactic analysis
checks whether the tokens form an allowable statement or expression according to the rules of
the language.

Our language has a few keywords, operators, data types and delimiters. With these one shall be
able to write procedures, make selections and loop. However, as we will only be implementing
lexical and syntactic analysis, we will not be able to execute the statements and expressions.

The keyword proc indicates the start of a procedure and the keyword end marks its
completion. To display results we will use disp (we are not returning anything, and therefore
there is no return keyword), if-then for selection and do-until for repetition.

It is a strongly typed language and therefore variables and their types must be declared before
they are used. Data types that are allowed are: int, float, char, string, bool.

There are only 3 delimiters: the space, the colon, the line break. Colons are used immediately
after a variable name followed by a data type. For example, num:int indicates a variable
named num of type integer. Line breaks are used at the end of each line to indicate the end of a
statement or expression. Spaces will be used to separate other tokens, for example the end of the
keyword proc and the start of the name of a procedure. Names can only be written using
lowercase letters.

The allowable tokens in our language are given in the tuples below. Copy the tuples given below
into your code.

keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')

An example of a procedure that is syntactically correct in our language would be,

'proc add x:int y:int\nsum:int\nx + y -> sum\ndisp sum\nend\n'

This defines a procedure called add, that would accept two parameters, x and y, both integers,
adds them and places the result of the addition into the variable sum and then displays the value
of sum. Write the code that is required and described below.

1. The table below lists basic functions that will be used by our parser. Write the code of all
of them.

Function     Description

isKeyword    Accepts 1 argument and returns True/False if it is in the tuple of keywords
isOperator   Accepts 1 argument and returns True/False if it is in the tuple of operators
isType       Accepts 1 argument and returns True/False if it is in the tuple of data types
isDelim      Accepts 1 argument and returns True/False if it is in the tuple of delimiters
isLwrCase    Accepts 1 argument and returns True/False if it is a lowercase character
isColon      Accepts 1 argument and returns True/False if it is the colon character i.e. ':'
isSpace      Accepts 1 argument and returns True/False if it is the space character i.e. ' '
isLineBrk    Accepts 1 argument and returns True/False if it is the line break character i.e. '\n'
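
As an illustration only, the functions in the table could be sketched as simple membership and equality tests. This is one possible reading, not a model answer. The '->' and '-' operators and the ' ' delimiter in the tuples are assumed reconstructions of characters that appear to be missing from this copy of the handout.

```python
# Token tuples from the assignment. The '->' and '-' entries and the ' '
# delimiter are ASSUMED reconstructions of characters lost in this copy.
keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')


def isKeyword(token):
    """True if token is one of the language keywords."""
    return token in keywords


def isOperator(token):
    """True if token is one of the language operators."""
    return token in operators


def isType(token):
    """True if token names one of the allowed data types."""
    return token in types


def isDelim(token):
    """True if token is a delimiter (line break, colon or space)."""
    return token in delimiters


def isLwrCase(ch):
    """True if ch is a single lowercase letter from 'a' to 'z'."""
    return len(ch) == 1 and 'a' <= ch <= 'z'


def isColon(ch):
    """True if ch is the colon character."""
    return ch == ':'


def isSpace(ch):
    """True if ch is the space character."""
    return ch == ' '


def isLineBrk(ch):
    """True if ch is the line break character."""
    return ch == '\n'
```

Each function returns a bool directly from the comparison, so no if/else is needed.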


2. Write a recursive function called isValidName which accepts one (1) string argument.
If the string consists of lowercase letters of the alphabet only, it returns True, otherwise it
returns False. For example,
>>> isValidName('proc')
True
>>> isValidName('Proc')
False
>>>
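
A minimal recursive sketch of isValidName, assuming that the empty string does not count as a valid name (the handout does not say either way). isLwrCase is repeated so the sketch runs on its own.

```python
def isLwrCase(ch):
    """True if ch is a single lowercase letter from 'a' to 'z'."""
    return len(ch) == 1 and 'a' <= ch <= 'z'


def isValidName(name):
    """Recursively check that name consists of lowercase letters only."""
    if name == '':
        # ASSUMPTION: an empty string is not a valid name.
        return False
    if len(name) == 1:
        # Base case: a single character must be a lowercase letter.
        return isLwrCase(name)
    # Recursive case: check the first character, then the rest.
    return isLwrCase(name[0]) and isValidName(name[1:])
```

The recursion peels off one character per call, so its depth equals the length of the name.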

3. Write a recursive function called isValidToken which accepts one (1) string
argument. If the string is either a keyword, operator, data type or delimiter, it returns
True, otherwise it returns False. For example,
>>> isValidToken('>')
True
>>> isValidToken('')
False
>>>
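
One way to make a membership test recursive is to walk through the allowed tokens one at a time. The sketch below does this with a second parameter that has a default value, so callers still pass a single argument. The name ALL_TOKENS and the candidates parameter are hypothetical helpers introduced for illustration, and the '->', '-' and ' ' entries in the tuples are assumed reconstructions.

```python
keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')  # '->', '-' assumed
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')  # ' ' assumed

# Hypothetical helper: every allowed token in one tuple.
ALL_TOKENS = keywords + operators + types + delimiters


def isValidToken(token, candidates=ALL_TOKENS):
    """Recursively search candidates for token."""
    if not candidates:
        # Base case: nothing left to compare against.
        return False
    # Match on the first candidate, otherwise recurse on the rest.
    return token == candidates[0] or isValidToken(token, candidates[1:])
```

A locally defined recursive helper would serve equally well if the function signature must have exactly one parameter.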

4. Write a function called getToken which accepts a string as its argument. This function
returns a single token from its given string argument, removes it from the string and
returns the remainder of the string. Tokens are separated by delimiters. Therefore the end
of a token is considered as the character immediately before the next occurring delimiter.
For example,

>>> getToken('proc add x:int y:int\nsum:int\nx + y -> sum\ndisp sum\nend\n')
('proc', ' add x:int y:int\nsum:int\nx + y -> sum\ndisp sum\nend\n')
>>>

The first token encountered is the keyword proc, which is delimited by a space; therefore
getToken returns proc as the token, and the remainder of the string beginning with
the space and ending with the line break. Write getToken using a locally defined,
recursive function called extract which performs the service just described.
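
The behaviour described above can be sketched as follows. Two points are assumptions rather than stated requirements: a delimiter at the very start of the string is returned as a one-character token (this is what lets the spaces and line breaks show up as tokens later), and an empty input yields an empty token. The ' ' delimiter and the '->' in the sample string are assumed reconstructions.

```python
delimiters = ('\n', ':', ' ')  # ' ' is an assumed reconstruction


def isDelim(ch):
    """True if ch is one of the delimiters."""
    return ch in delimiters


def getToken(text):
    """Return (token, remainder) for the first token in text."""

    def extract(token, rest):
        # Stop at the end of the input or just before the next delimiter.
        if rest == '' or isDelim(rest[0]):
            return (token, rest)
        # Otherwise move one character from rest onto the token.
        return extract(token + rest[0], rest[1:])

    # ASSUMPTION: a leading delimiter is itself a one-character token.
    if text != '' and isDelim(text[0]):
        return (text[0], text[1:])
    return extract('', text)
```

For example, getToken(' add x:int') would return (' ', 'add x:int') under the leading-delimiter assumption.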

5. Write a recursive function called tokenize that accepts a string as its argument. It
performs the service of lexical analysis i.e. it separates the string argument into its
constituent tokens. The tokens are returned as a list. It must use getToken. For
example,

>>> tokenize('proc add x:int y:int\nsum:int\nx + y -> sum\ndisp sum\nend\n')
['proc', ' ', 'add', ' ', 'x', ':', 'int', ' ', 'y', ':', 'int', '\n', 'sum',
':', 'int', '\n', 'x', ' ', '+', ' ', 'y', ' ', '->', ' ', 'sum', '\n', 'disp',
' ', 'sum', '\n', 'end', '\n']
>>>
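
A possible recursive tokenize, built on a getToken sketch that treats a leading delimiter as a one-character token (an assumption, as are the ' ' delimiter and the '->' operator used in the usage note). getToken is repeated so the block runs on its own.

```python
delimiters = ('\n', ':', ' ')  # ' ' is an assumed reconstruction


def getToken(text):
    """Return (token, remainder) for the first token in text."""

    def extract(token, rest):
        if rest == '' or rest[0] in delimiters:
            return (token, rest)
        return extract(token + rest[0], rest[1:])

    # ASSUMPTION: a leading delimiter is itself a one-character token.
    if text != '' and text[0] in delimiters:
        return (text[0], text[1:])
    return extract('', text)


def tokenize(text):
    """Recursively split text into a list of tokens, delimiters included."""
    if text == '':
        # Base case: nothing left to split.
        return []
    token, rest = getToken(text)
    # getToken always consumes at least one character of a non-empty
    # string, so the recursion terminates.
    return [token] + tokenize(rest)
```

For instance, tokenize('x + y -> sum\n') would give ['x', ' ', '+', ' ', 'y', ' ', '->', ' ', 'sum', '\n'].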

6. Write a function called canFollow which accepts two strings as its arguments. One
string is the token being analysed, the second is the token that succeeds (comes
immediately after) the token being analysed. canFollow returns True or False based
on whether the successor token can follow the token being analysed, based on the
language rules described earlier. For example,

>>> canFollow('proc', ' ')
True
>>> canFollow('proc', ':')
False
>>>
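
The handout does not spell out a complete follow table, so the sketch below uses an ASSUMED rule set inferred from the language description: keywords and types are followed by a space or line break, operators by a space, a colon by a type, a space or line break by a keyword, operator or name, and a name by any delimiter. Treat it as one plausible reading, not the intended answer. The isValidName shown here is a compact non-recursive stand-in, and the '->', '-' and ' ' tuple entries are assumed reconstructions.

```python
keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')  # '->', '-' assumed
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')  # ' ' assumed


def isValidName(name):
    # Compact stand-in for the recursive isValidName of part 2.
    return name != '' and all('a' <= ch <= 'z' for ch in name)


def canFollow(token, successor):
    """True if successor may come directly after token (ASSUMED rules)."""
    if token in keywords or token in types:
        # Checked before names, because keywords and types are lowercase too.
        return successor in (' ', '\n')
    if token in operators:
        return successor == ' '
    if token == ':':
        return successor in types
    if token in (' ', '\n'):
        # Two delimiters in a row are rejected under this rule set.
        return (successor in keywords or successor in operators
                or isValidName(successor))
    if isValidName(token):
        return successor in delimiters
    return False
```

The order of the checks matters: 'proc' and 'int' would also pass isValidName, so the keyword and type rules are tested first.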

7. Write an iterative function called analyseSyntax which accepts a list as its argument.
It performs the service of syntactic analysis i.e. it checks that the tokens in the argument list form an
allowable statement or expression. This function must use the function canFollow that
you wrote earlier. If there is no syntax error, the function returns the string No syntax
errors found; otherwise it returns the string Syntax error found: along with the token and
its successor that generated the error. For example,

>>> analyseSyntax(['proc', ' ', 'add', ' ', 'x', ':', 'int', ' ', 'y', ':',
'int', '\n', 'sum', ':', 'int', '\n', 'x', ' ', '+', ' ', 'y', ' ', '->', ' ',
'sum', '\n', 'disp', ' ', 'sum', '\n', 'end', '\n'])
'No syntax errors found'
>>> analyseSyntax(['proc', ':', 'add', ' ', 'x', ':', 'int', ' ', 'y', ':',
'int', '\n', 'sum', ':', 'int', '\n', 'x', ' ', '+', ' ', 'y', ' ', '->', ' ',
'sum', '\n', 'disp', ' ', 'sum', '\n', 'end', '\n'])
('Syntax error found:', 'proc', ':')
>>>
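
An iterative analyseSyntax can be sketched as a single pass over adjacent token pairs. The canFollow used here encodes an ASSUMED rule set (the handout does not give a complete one), and the '->', '-' and ' ' tuple entries are assumed reconstructions; only the loop structure is the point of the example.

```python
keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')  # '->', '-' assumed
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')  # ' ' assumed


def isValidName(name):
    # Compact stand-in for the recursive isValidName of part 2.
    return name != '' and all('a' <= ch <= 'z' for ch in name)


def canFollow(token, successor):
    """ASSUMED follow rules; see the language description."""
    if token in keywords or token in types:
        return successor in (' ', '\n')
    if token in operators:
        return successor == ' '
    if token == ':':
        return successor in types
    if token in (' ', '\n'):
        return (successor in keywords or successor in operators
                or isValidName(successor))
    if isValidName(token):
        return successor in delimiters
    return False


def analyseSyntax(tokens):
    """Check each token against its successor; report the first failure."""
    for i in range(len(tokens) - 1):
        if not canFollow(tokens[i], tokens[i + 1]):
            # Return the offending pair, as in the handout's example.
            return ('Syntax error found:', tokens[i], tokens[i + 1])
    return 'No syntax errors found'
```

The loop stops at the second-to-last token because the final token has no successor to check.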


8. Finally, write the function parse. It accepts a string as its sole argument and performs
the first two stages of parsing, lexical analysis and syntactic analysis. It must first print
the input string in its proper format, print the message Checking syntax..., then call
upon the services of analyseSyntax and tokenize that you wrote earlier. For
example,

>>> parse('proc add x:int y:int\nsum:int\nx + y -> sum\ndisp sum\nend\n')
proc add x:int y:int
sum:int
x + y -> sum
disp sum
end

Checking syntax...
'No syntax errors found'
>>>
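
Putting the pieces together, parse could look like the sketch below. Everything it depends on is repeated in compact form so the block runs on its own. The follow rules in canFollow, the treatment of a leading delimiter as a token, and the '->', '-' and ' ' tuple entries are all assumptions. parse returns the result of analyseSyntax, which is why the interactive example shows the message in quotes.

```python
keywords = ('proc', 'disp', 'if', 'then', 'do', 'until', 'end')
operators = ('->', '=', '<', '>', '!=', '+', '-', '/', '*', '^')  # '->', '-' assumed
types = ('int', 'float', 'char', 'string', 'bool')
delimiters = ('\n', ':', ' ')  # ' ' assumed


def isValidName(name):
    # Compact stand-in for the recursive isValidName of part 2.
    return name != '' and all('a' <= ch <= 'z' for ch in name)


def getToken(text):
    def extract(token, rest):
        if rest == '' or rest[0] in delimiters:
            return (token, rest)
        return extract(token + rest[0], rest[1:])
    # ASSUMPTION: a leading delimiter is itself a one-character token.
    if text != '' and text[0] in delimiters:
        return (text[0], text[1:])
    return extract('', text)


def tokenize(text):
    if text == '':
        return []
    token, rest = getToken(text)
    return [token] + tokenize(rest)


def canFollow(token, successor):
    # ASSUMED follow rules inferred from the language description.
    if token in keywords or token in types:
        return successor in (' ', '\n')
    if token in operators:
        return successor == ' '
    if token == ':':
        return successor in types
    if token in (' ', '\n'):
        return (successor in keywords or successor in operators
                or isValidName(successor))
    if isValidName(token):
        return successor in delimiters
    return False


def analyseSyntax(tokens):
    for i in range(len(tokens) - 1):
        if not canFollow(tokens[i], tokens[i + 1]):
            return ('Syntax error found:', tokens[i], tokens[i + 1])
    return 'No syntax errors found'


def parse(text):
    """Print the source, announce the check, then run both stages."""
    print(text)  # the line breaks in text lay the procedure out
    print('Checking syntax...')
    return analyseSyntax(tokenize(text))
```

Because the input ends with a line break, print(text) leaves a blank line before the Checking syntax... message, as in the example above.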

And that's it. We broke our problem into smaller problems, solved those and then solved our
original one by putting the smaller solutions together. I hope you had fun!

This assignment is worth 15% of your coursework mark. It is due on April 15th 2016, at
11pm. Post your submission via OurVLE. Look for the container named Assignment.

You are required to work in pairs, therefore ensure that your code has BOTH members'
ID numbers included as a comment. Only one member of the pair is to submit. Each
person may submit as a member of one (1) programming pair only. No late submissions
will be accepted.

Name your file according to the following convention:
IF the ID numbers of the students are 620000001 and 620000002
THEN the file name should be 620000001_620000002.py
