Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
2 COMPUTER LANGUAGES
1.3
Psycholinguistics
In psycholinguistics, parsing involves not just the assignment of words to categories, but the evaluation of the
meaning of a sentence according to the rules of syntax
drawn by inferences made from each word in the sentence. This normally occurs as words are being heard
or read. Consequently, psycholinguistic models of parsing are of necessity incremental, meaning that they build
up an interpretation as the sentence is being processed,
which is normally expressed in terms of a partial syntactic structure. Creation of initially wrong structures occurs
when interpreting garden path sentences.
2
2.1
Computer languages
Parser
preceded by a separate lexical analyser, which creates tokens from the sequence of input characters; alternatively,
these can be combined in scannerless parsing. Parsers
may be programmed by hand or may be automatically
or semi-automatically generated by a parser generator.
Parsing is complementary to templating, which produces
formatted output. These may be applied to dierent domains, but often appear together, such as the scanf/printf
pair, or the input (front end parsing) and output (back end
code generation) stages of a compiler.
The input to a parser is often text in some computer language, but may also be text in a natural language or less
structured textual data, in which case generally only certain parts of the text are extracted, rather than a parse
tree being constructed. Parsers range from very simple functions such as scanf, to complex programs such
as the frontend of a C++ compiler or the HTML parser
of a web browser. An important class of simple parsing is done using regular expressions, in which a regular expression denes a regular language and a regular
expression engine automatically generating a parser for
that language, allowing pattern matching and extraction
of text. In other contexts regular expressions are instead
used prior to parsing, as the lexing step whose output is
then used by the parser.
The use of parsers varies by input. In the case of data
languages, a parser is often found as the le reading facility of a program, such as reading in HTML or XML
text; these examples are markup languages. In the case
of programming languages, a parser is a component of a
compiler or interpreter, which parses the source code of
a computer programming language to create some form
of internal representation; the parser is a key step in the
compiler frontend. Programming languages tend to be
specied in terms of a deterministic context-free grammar because fast and ecient parsers can be written for
them. For compilers, the parsing itself can be done in
one pass or multiple passes see one-pass compiler and
multi-pass compiler.
The implied disadvantages of a one-pass compiler can
largely be overcome by adding x-ups, where provision
is made for x-ups during the forward pass, and the xups are applied backwards when the current program segment has been recognized as having been completed. An
example where such a x-up mechanism would be useful
would be a forward GOTO statement, where the target of
the GOTO is unknown until the program segment is completed. In this case, the application of the x-up would
be delayed until the target of the GOTO was recognized.
Obviously, a backward GOTO does not require a x-up.
Context-free grammars are limited in the extent to which
they can express all of the requirements of a language. Informally, the reason is that the memory of such a language
is limited. The grammar cannot remember the presence
of a construct over an arbitrarily long input; this is necessary for a language in which, for example, a name must
2.2
Overview of process
2.2
Overview of process
Types of parsers
Lemon
Lex
LuZ
Parboiled
Parsec
XPL
LOOKAHEAD
Ragel
Spirit Parser Framework
Syntax Denition Formalism
SYNTAX
Yacc
5 Lookahead
Lookahead establishes the maximum incoming tokens
that a parser can use to decide which rule it should use.
Lookahead is especially relevant to LL, LR, and LALR
parsers, where it is often explicitly indicated by axing
the lookahead to the algorithm name in parentheses, such
as LALR(1).
Most programming languages, the primary target of
parsers, are carefully dened in such a way that a parser
with limited lookahead, typically one, can parse them, because parsers with limited lookahead are often more efcient. One important change to this trend came in 1990
when Terence Parr created ANTLR for his Ph.D. thesis,
a parser generator for ecient LL(k) parsers, where k is
any xed value.
5
(2*3)). Note that Rule4 above is a semantic rule. It is
possible to rewrite the grammar to incorporate this into
the syntax. However, not all such rules can be translated
into syntax.
Simple non-lookahead parser actions
The parse tree generated is correct and simply more ef7. Shift "*" onto stack from input (in anticipation of cient than non-lookahead parsers. This is the strategy
rule2). Input = [3] Stack = [E,*]
followed in LALR parsers.
8. Shift 3 onto stack from input (in anticipation of
rule3). Input = [] (empty) Stack = [E,*,3]
9. Reduce stack element 3 to expression E based
on rule3. Stack = [E,*,E]
6 See also
Backtracking
Chart parser
Deterministic parsing
Compiler-compiler
Generating strings
Grammar checker
LALR parser
Lexical analysis
Pratt parser
Shallow parsing
Left corner parser
Parsing expression grammar
ASF+SDF Meta Environment
DMS Software Reengineering Toolkit
Program transformation
Source code generation
References
Retrieved 28 November
Further reading
Chapman, Nigel P., LR Parsing: Theory and Practice, Cambridge University Press, 1987. ISBN 0521-30413-X
Grune, Dick; Jacobs, Ceriel J.H., Parsing Techniques - A Practical Guide, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands. Originally
published by Ellis Horwood, Chichester, England,
1990; ISBN 0-13-651431-6
External links
The Lemon LALR Parser Generator
Stanford Parser The Stanford Parser
Turin University Parser Natural language parser for
the Italian, open source, developed in Common Lisp
by Leonardo Lesmo, University of Torino, Italy.
Short history of parser construction
EXTERNAL LINKS
10
10.1
Parsing Source: https://en.wikipedia.org/wiki/Parsing?oldid=752853165 Contributors: Damian Yerrick, Vicki Rosenzweig, The Anome,
K.lee, Michael Hardy, TakuyaMurata, CesarB, Ahoerstemeier, Mac, Glenn, Ralesk, Furrykef, , Vt-aoe, Jaredwf, Shoesfullofdust, Mulukhiyya, Martin Hampl~enwiki, Pps, Tea2min, Giftlite, Tom harrison, Dmb000006, Jason Quinn, Macrakis, Neilc, Gadum, Beland,
MarkSweep, Billposer, Irrelevant, Paulbmann, Zeman, Sylvain Schmitz~enwiki, Spayrard, Liberatus, Richard W.M. Jones, Bobo192,
Slomo~enwiki, Larryv, Zetawoof, Jonsafari, Obradovic Goran, Alansohn, Liao, Seans Potato Business, Chrisjohnson, Pekinensis, Ruud
Koot, Slike2, Knuckles, Wikiklrsc, BD2412, Knudvaneeden, Nekobasu, MarSch, Jimworm, Salix alba, RobertG, Gurch, Quuxplusone,
GreyCat, BradBeattie, Chobot, Hairy Dude, Arado, Epolk, Van der Hoorn, Yrithinnd, Mccready, Jstrater, Mikeblas, Maunus, Cadillac,
Pietdesomere, Modify, TuukkaH, SmackBot, NickyMcLean, Sebesta, Gilliam, Chris the speller, Nbarth, Sephiroth BCR, Fchaumartin,
Nixeagle, Sommers, SundarBot, Flyguy649, Downwards, Andrei Stroe, Derek farn, John, HVS, Ourai, SilkTork, Catapult, IronGargoyle,
Pelzi~enwiki, 16@r, Rkmlai, Alatius, MrDolomite, Clarityend, Paul Foxworthy, RekishiEJ, Vanisaac, FatalError, Ahy1, E-boy, DevinCook, MarsRover, HenkeB, Myasuda, Cydebot, Valodzka, Agentilini, Farzaneh, Msnicki, QTJ, 218, Zalgo, Thijs!bot, JEBrown87544, AntiVandalBot, Seaphoto, FedericoMP, Natelewis, Hermel, JAnDbot, MER-C, Gavia immer, AlmostReadytoFly, Tedickey, Usien6, Aakin,
Cic, Bubba hotep, CapnPrep, Nicsterr, Scruy323, Nathanfunk, Mmh, Dantek, W3stfa11, Magicjazz, Cometstyles, Kjmjds, DorganBot,
Brvman, CardinalDan, Vipinhari, Pulsar.co.nr, Technopat, Sylvia.wei, Lbmarshall, Biasoli, SieBot, YoCmos~enwiki, Timhowardriley, Til
Eulenspiegel, Flyer22 Reborn, Diego Grez-Caete, AlanUS, FghIJklm, Gabilaw, Denisarona, Stokito, Benot Sagot, Thisisnotapipe, DragonBot, Estirabot, Nikolasmorton, Roxy the dog, AngelHaf, Libcub, Addbot, Download, Dougbast, OlEnglish, Jarble, TaBOT-zerem, Wonder, LucidFool, AnomieBOT, Jim1138, , Materialscientist, Citation bot, Xtremejames183, Xqbot, J04n, WordsAndNumbers,
Thehelpfulbot, FrescoBot, Borsotti, OgreBot, Robinlrandall, I dream of horses, HRoestBot, Romek1, Lotje, Callanecc, Ruebencampbell, Spakin, RjwilmsiBot, GDBarry, Gareldnate, J36miles, EmausBot, Oliverlyc, WikitanvirBot, Eekerz, ZxxZxxZ, Hpvpp, Sirthias,
Ashowen1701, Peterh5322, Jeroendv, Donner60, MainFrame, Sebbes333, ChuispastonBot, ClueBot NG, Satellizer, Jeana7, Frietjes, DBSand, MerlIwBot, Leonxlin, Architectual, Claytoncarney, ChrisGualtieri, Steamerandy, Pintoch, Samlanning, JohnZofSydney, Watchforever, YarLucebith, Romavikt, AddWittyNameHere, Akshaynexus, Rooke, Aasasd, Equinox, ProprioMe OW, Wrath abyss, Imadeluz,
Bender the Bot and Anonymous: 241
10.2
Images
10.3
Content license