RIPAL: Responsive and Intuitive Parsing for the Analysis of Language

Pages

LR(0) closure motivations

Background

We've seen the motivations for constructing an LR(0) parse table from a context-free grammar. In this section, we will provide motivations for the underlying construct of the production closure.

Recall

In our LR(0) parsing procedure:

  1. When performing a shift action, we add one terminal and one state to the parse stack
  2. When performing a reduce action, we remove a number of terminals and states from the parse stack equal to the number of symbols on the right-hand side of the relevant production

Motivating example

Example

Observe the following grammar:

S → A
A → B
B → C
C → c

This grammar has the following augmented grammar:

S' → S $
S → A
A → B
B → C
C → c

It has the following LR(0) parse table:

c $ S A B C S'
state1 shift2 goto6 goto5 goto4 goto3
state2 reduce5 reduce5
state3 reduce4 reduce4
state4 reduce3 reduce3
state5 reduce2 reduce2
state6 accept

Parsing input string c:

Input queue Parse stack Action
c 1 Apply action of shift2 which corresponds to state1 and c in our parse table
1 c 2 Apply action of reduce5 which corresponds to state2 and $ in our parse table
1 C Apply action of goto3 which corresponds to state1 and C in our parse table
1 C 3 Apply action of reduce4 which corresponds to state3 and $ in our parse table
1 B Apply action of goto4 which corresponds to state1 and B in our parse table
1 B 4 Apply action of reduce3 which corresponds to state4 and $ in our parse table
1 A Apply action of goto5 which corresponds to state1 and A in our parse table
1 A 5 Apply action of reduce2 which corresponds to state5 and $ in our parse table
1 S Apply action of goto6 which corresponds to state1 and S in our parse table
1 S 6 Accept, since this action corresponds to state6 and $ in our parse table

In the above example, observe how frequently the parser is in state1.

Since we are chaining productions that don't require the processing of additional terminals for each step, such as:

  1. S → A
  2. A → B
  3. B → C

we are able to apply a goto action followed by a reduce action multiple times in sequence. This parsing process is centered around state1.

Motivating example

Example

Observe the following grammar:

S → A
A → B
B → C D
C → c
D → d

This grammar has the following augmented grammar:

S' → S
S → A
A → B
B → C D
C → c
D → d

It has the following LR(0) parse table:

c d $ S A B C D S'
state1 shift2 goto8 goto7 goto6 goto3
state2 reduce5 reduce5 reduce5
state3 shift4 goto5
state4 reduce6 reduce6 reduce6
state5 reduce4 reduce4 reduce4
state6 reduce3 reduce3 reduce3
state7 reduce2 reduce2 reduce2
state8 accept

Parsing input string cd:

Input queue Parse stack Action
c d 1 Apply action of shift2 which corresponds to state1 and c in our parse table
d 1 c 2 Apply action of reduce5 which corresponds to state2 and d in our parse table
d 1 C Apply action of goto3 which corresponds to state1 and C in our parse table
d 1 C 3 Apply action of shift4 which corresponds to state3 and d in our parse table
1 C 3 d 4 Apply action of reduce6 which corresponds to state4 and $ in our parse table
1 C 3 D Apply action of goto5 which corresponds to state3 and D in our parse table
1 C 3 D 5 Apply action of reduce4 which corresponds to state5 and $ in our parse table
1 B Apply action of goto6 which corresponds to state1 and B in our parse table
1 B 6 Apply action of reduce3 which corresponds to state6 and $ in our parse table
1 A Apply action of goto7 which correspoinds to state1 and A in our parse table
1 A 7 Apply action of reduce2 which corresponds to state7 and $ in our parse table
1 S Apply action of goto8 which corresponds to state1 and S in our parse table
1 S 8 Accept, since this action corresponds to state8 and $ in our parse table

Observe the apperance of state1 in the above example.

When processing terminal input symbols a and b, the parser leaves state1 until CD is reduced to B.

When applying the reduction of B to A and A to S, the parser is able to use the repeated goto and reduce actions to stay centered around state1 as observed in the previous example.

The closure concept

In considering the set of productions to begin processing within a particular LR(0) parse state, we need a way to expand a given set as was shown in the above examples.

In particular, we can append productions to the set to be considered if they are reachable from the existing set without processing additional terminals. We will call this extended set of production rules the closure of the original set.

Conclusion

Now, we've seen motivation for the closure concept when constructing an LR(0) parse table. Next, we will introduce a special symbol to represent where we are in production processing so that we can handle closures appropriately.


GitHub Repository: https://github.com/bprollinson/ripal

Copyright © 2017 Brendan Rollinson-Lorimer