Johann 'Myrkraverk' Oskarsson <
[email protected]> writes:
My first question is: how are production recursive descent parsers con-
structed? From these two books, and other things I have read in the
past, recursive descent is so simple it's mostly just brushed off with
an example or two, but there seems to be a whole lot more to it. For
instance, [Appel] mentions a parse table but mostly does not explain how >>> to construct one. ...
I think, at this point, I'd like to demonstrate what [Appel] was talking
about. I can't expect every reader of comp.compilers to have the book
at hand.
The demonstration starts with grammar 3.12, page 47.
Z -> d Y -> X -> Y
Z -> X Y Z Y -> c X -> a
where Y goes to epsilon in the centre of the first line.
Then [Appel] shows how FIRST and FOLLOW sets are computed
given this grammar, ending with
nullable FIRST FOLLOW
X yes a c a c d
Y yes c a c d
Z no a c d
where previous steps are also shown[1]. Then figure 3.14 shows a
predictive parsing table, presumably constructed using the FIRST
and FOLLOW sets.
a c d
+-----------------------------------------+
X | X -> a X -> Y X -> Y |
| Y -> Y |
| |
Y | Y -> Y -> Y -> |
| Y -> c |
| |
Z | Z -> X Y Z Z -> X Y Z Z -> d |
| Z -> X Y Z |
+-----------------------------------------+
Where {X,a}, {Y,c}, and {Z,d} show conflicts, so this cannot be a
recursive descent parser; or LL(1) grammar per the book.
I am not clear on how to construct such a table from the description
in the book, and would not presume to be able to do so by hand, using
a more complex grammar.
I'll give the paragraph where [Appel] explains /recursive descent/.
Consider a recursive-descent parser. The parsing function for some nonterminal X has a clause for each X-production; it must choose one
of these clauses based on the next token T of the input. If we can
choose the right production for each (X, T), then we can write the recursive-descent parser. All the information we need can be encoded
as a two-dimensional table of productions, indexed by nonterminals X
and terminals T. This is called a /predictive parsing/ table.
So I was wondering if people who create /recursive descent/ parsers for production compilers use such a table as an intermediary step, or not.
Of course, as I mentioned before, [Holub] mentions Q grammars as an
alternative intermediary step, but only in an off hand manner.
This lack of clear directions led me to ask the first question: how are production compilers using /recursive descent/ constructed? What are
the steps involved? What do people use, or not use?
I don't know Forth, so I haven't done much more than take a cursory look
at the Gray manual; but it certainly wouldn't surprise me if people used
a skeleton generator, from [E]BNF grammar, and then filled in the
details later.
I tried to make the question open-ended, because I am pretty sure people
use methods not described in either book. I have not looked this up in
the dragon book, because currently my copy is in a different country,
and the shipping cost might rival the price of a new copy. Given a
chapter and verse in that book, I can probably read it with a little bit
of help, and I am totally up to get myself further material if it's
useful.
[1] I think overall the steps are better explained in [Holub] but I did
not try to following along with this particular grammar, using [Holub]'s method.
P.S. I'll try to reply to the other threads later on.
--
Johann | email: invalid -> com | www.myrkraverk.com/blog/
I'm not from the Internet, I just work there. | twitter: @myrkraverk
[Appel is showing how you turn a normal grammar into an LL(1) grammar. The algorithm on page 49 tells you how to make the first and follow sets, then
the informal algorithm in the second para on page 51 turns it into the parse table. Then he says gee, the table has more then one rule in some of the
cells so he spends a few more pages turning it into an LL(1) grammar on
pages 53-54. In my experience most people writing RD parsers do this informally, e.g., to deal with left recursion in expressions you know
that you eat the first token, then peek at the next token and eat it
if you're in the correct rule, otherwise return and let the caller
deal with it. If this seems like more trouble than it's worth, now
you know why we use parser generators. -John]
--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)