We've seen a very basic LR(0) production closure calculation.
In this section, we will illustrate a more complex closure calculation.
Example
Observe the following grammar:
S → A
A → B
B → C D
C → c
D → d
This grammar has the following augmented grammar:
S' → S
S → A
A → B
B → C D
C → c
D → d
It has the following LR(0) parse table:
c | d | $ | S | A | B | C | D | S' | |
state1 | shift2 | goto8 | goto7 | goto6 | goto3 | ||||
state2 | reduce5 | reduce5 | reduce5 | ||||||
state3 | shift4 | goto5 | |||||||
state4 | reduce6 | reduce6 | reduce6 | ||||||
state5 | reduce4 | reduce4 | reduce4 | ||||||
state6 | reduce3 | reduce3 | reduce3 | ||||||
state7 | reduce2 | reduce2 | reduce2 | ||||||
state8 | accept |
Parsing input string cd:
Input queue | Parse stack | Action |
---|---|---|
c d | 1 | Apply action of shift2 which corresponds to state1 and c in our parse table |
d | 1 c 2 | Apply action of reduce5 which corresponds to state2 and d in our parse table |
d | 1 C | Apply action of goto3 which corresponds to state1 and C in our parse table |
d | 1 C 3 | Apply action of shift4 which corresponds to state3 and d in our parse table |
1 C 3 d 4 | Apply action of reduce6 which corresponds to state4 and $ in our parse table | |
1 C 3 D | Apply action of goto5 which corresponds to state3 and D in our parse table | |
1 C 3 D 5 | Apply action of reduce4 which corresponds to state5 and $ in our parse table | |
1 B | Apply action of goto6 which corresponds to state1 and B in our parse table | |
1 B 6 | Apply action of reduce3 which corresponds to state6 and $ in our parse table | |
1 A | Apply action of goto7 which correspoinds to state1 and A in our parse table | |
1 A 7 | Apply action of reduce2 which corresponds to state7 and $ in our parse table | |
1 S | Apply action of goto8 which corresponds to state1 and S in our parse table | |
1 S 8 | Accept, since this action corresponds to state8 and $ in our parse table |
Similar to in the previous example, we can see here how the parsing procedure is rooted in state1. In particular, this can be seen with the following parse stack states.
However, before our reduction to nonterminal B, we must first arrive in the following parse stack state:
1 C 3 D 5
Because the production B → C D has two symbols on the right-hand side, we end up pushing two states onto the parse stack before finally reducing to:
1 B
In short, in order to process this production, we leave the previously displayed paradigm of an alternating sequence of goto and reduce actions. This happens because we must first process C before D. Also, C and D both produce a terminal, which requires input.
However, once we reduce to B, this chaining process can proceed. We are able to do this easily because in the reduction to B, we discard all of the additional state information (states 3 and 5) from the parse stack along with nonterminals C and D.
Using our observations here, we can make our fundamental question more flexible in comparison to the previous example as follows:
Given a qualified production as follows:
N → sympre1 sympre2 ... symprenpre · Nnext sympost1 sympost2 ... sympostnpost
The fundamental question to assess in this context is:
How far can we progress through our productions without processing any input symbols OR with shifting and reducing input symbols such that we arrive back in the initial parse state?Starting with the qualified production:
S' → · S $
we can see from above that, after reducing to state S, our parse stack contains:
1 S
This state of the parse stack is reached through our typical chaining of goto and reduce actions with the following sequence of stack states:
These states of parse stack content are reachable without processing any additional input symbols. As a result, we add the following qualified productions to the closure of S' → · S:
Notice how all such nonterminals are reachable with only state1 below it on the parse stack.
The indirect chainNext, let's consider the following sequence of stack states:
Note here how 1 C is eventually reduced to 1 B. Although additional nonterminals and terminals are processed in between those two stack states, all of that information is removed from the stack when 1 C 3 D 5 is reduced to 1 B.
This is a form of indirect chaining. Although additional steps are taken in between stack states 1 C and 1 B, we do eventually arrive back in state 1.
As a result, we add the following qualified production to our closure:
B → · C D
Even though the nonterminal D was not considered as part of this additional closure production, this closure extension can take place. Nonterminal D will be handled in a subsequent parse state which is resolved before the stack state of 1 B is reached.
The terminal productionFinally, because we have the production:
C → c
in our grammar, we add the following to our closure:
C → · c
since no additional symbols are needed to reach this point in the parsing process.
The closureIn conclusion, the closure of the qualified production set:
{S → · S $}
is:
{S → · S $, S → · A, A → · B C, C → · c}
Starting with the qualified production:
B → C · D
we know that nonterminal D appears on the left-hand side of the following production:
D → d
Before processing this production, we are in the following qualified production state:
D → · d
As a result, we can say that the closure of:
{B → C · D}
is
{B → C · D, D → · d}
We've seen a couple of examples of qualified production closure calculations.
Next, we will inspect a closure calculation for a production chain containing an ε production.
GitHub Repository: https://github.com/bprollinson/ripal
Copyright © 2017 Brendan Rollinson-Lorimer