We've seen what an LL(1) parse table looks like. Now we can solidify a parsing algorithm using an LL(1) parse table.
For now, we will ignore CFGs with ε productions to keep the setup of this algorithm easier to explain. CFGs with ε productions will be covered later.
To apply the LL(1) parsing algorithm, we will need the following data structures set up:
The LL(1) parse table
As shown in the previous section, we will need a parse table that maps combinations of nonterminal symbols and terminal symbols to the application of a production rule.
The input queue
We place our input string into an input queue, with one input symbol in each slot.
The queue looks like:
(head of queue) | σ1 | σ2 | ... | σ2 |
σi ∈ Σ
Example
Σ = {e, s, t}
s = test
Our queue looks like:
(head of queue) | t | e | s | t |
The operations we need to support on the input queue are:
The parse stack
We use a parse stack to store the symbols we are still in the process of handling.
The stack looks like:
(top of stack) | sym1 | sym2 | ... | symn |
symi ∈ T ∪ N
The operations we need to support on the parse stack are:
Initialize the input queue with the symbols from the input string.
Initialize the parse stack with the grammar's start symbol.
At each step of our parsing process, we pop and observe the top symbol from the parse stack.
If the top symbol is a terminal:
If the top symbol is a nonterminal (nnext):
Outside of an input symbol or table lookup mismatch, the following conditions signal and end of processing:
S → ab
a | b | |
S | 1 |
Example
Parsing input string ab:
Input queue | Parse stack | Action |
---|---|---|
ab | S | Apply rule 1 since S and a correspond to this production in our table |
ab | ab | Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches |
b | b | Remove b from head of input queue and parse stack and continue parsing since terminal symbol matches |
Accept, since input queue and parse stack are both empty |
Example
Parsing input string b:
Input queue | Parse stack | Action |
---|---|---|
b | S | Reject, since S and b correspond to ø in our table |
Example
Parsing input string aa:
Input queue | Parse stack | Action |
---|---|---|
aa | S | Apply rule 1 since S and a correspond to this production in our table |
aa | ab | Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches |
a | b | Reject, since the symbol from the head of the input queue (a) doesn't match the symbol from the top of the parse stack (b) |
Example
Parsing input string a:
Input queue | Parse stack | Action |
---|---|---|
a | S | Apply rule 1 since S and a correspond to this production in our table |
a | ab | Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches |
b | Reject, since input queue is empty but parse stack is not |
Example
Parsing string abc:
Input queue | Parse stack | Note |
---|---|---|
abc | S | Apply rule 1 since S and a correspond to this production in our table |
abc | ab | Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches |
bc | b | Remove b from head of input queue and parse stack and continue parsing since terminal symbol matches |
c | Reject, since parse stack is empty but input queue is not |
When applying the LL(1) parsing algorithm, we are making a selection of a production rule to apply based on the next terminal from the input string and the next nonterminal from the parse stack.
In general, we need the string produced by replacing nonterminals with terminals to match the input string exactly for a parse to be successful. Although an LL(1) parse table is a powerful tool, it is not always powerful enough to parse all context-free languages correctly.
We've now learned the basic rules of LL(1) parsing. Next, we will look at how to interpret the results.
GitHub Repository: https://github.com/bprollinson/ripal
Copyright © 2017 Brendan Rollinson-Lorimer