We now understand the construction of first sets from a context-free grammar. Now, we can illustrate how to construct an LL(1) parse table using these first sets.
In order to outline our table construction algorithm, we must first slightly extend our concept of first sets.
Let's say we have context-free grammar G with productions P = {p1, p2, ..., pn}.
We will define a specialized first set of production pi, FIRSTPROD(i), that captures the set of terminal symbols that can be the first derived starting with the specified production.
Specifically:
Example
S → a
S → b
S → C
C → c
C → D
C → E
D → d
E → e
p4 = C → c
From this, we can simply derive FIRSTPROD(4) = {c}.
p5 = C → D
We can calculate FIRST(D) as {d}. From this, we can determine that FIRSTPROD(5) = {d}. Note that this does not include all terminals from FIRST(C) - only those reachable starting with production 5.
This section illustrates the algorithm for constructing an LL(1) parse table.
Note
The algorithm outlined here is designed to work for CFGs with no ε productions. The handling of such productions will be outlined in a later section.
Algorithm
First, we want to initialize our parse table PT as an empty table with:
Then, we can populate the table as follows from our grammar G:
By running through this process, we can construct the relevant LL(1) parse table and subsequently use it in our parser. Note that we are still using i as a short-form for applyi in our description of the parse table entries.
Example
Given the grammar:
S → A
A → B d
A → D d
B → b
D → d
D → e
We can start by calculating our first sets as follows:
Then, we can calculate the first sets for productions as follows:
From this information, we can simply fill in the parse table as follows:
b | d | e | |
S | 1 | 1 | 1 |
A | 2 | 3 | 3 |
B | 4 | ||
D | 5 | 6 |
Example
Observe the following grammar:
S → ab
S → a
Note how both productions' right-hand side starts with the terminal a.
Constructing the parse table, we end up with:
a | |
S | 1/2 |
From this table, we clearly have a conflict. If the next nonterminal is S and our next input symbol is a, it's not clear whether we take the action of apply1 or apply2.
Due to this ambiguity, we know that the provided grammar is not LL(1). As a result, we would not be able to construct an LL(1) parser for it. Note that the 1/2 notation is simply used to illustrate the conflict - this parse table is not valid.
Now, we have the tools to construct an LL(1) parse table for any LL(1) grammar that does not contain the symbol ε. Next, we will introduce the underlying concepts needed to handle ε transitions.
GitHub Repository: https://github.com/bprollinson/ripal
Copyright © 2017 Brendan Rollinson-Lorimer