RIPAL: Responsive and Intuitive Parsing for the Analysis of Language

Background

We've seen the process for constructing an LR(0) DFA for a grammar that derives a fixed string through a single production.

In this section, we will expand this process to handle a grammar that contains a production chain.

Our grammar

The language we will be analyzing is specified by the following grammar:

S → A
A → a

This grammar has the following augmented grammar:

S' → S $
S → A
A → a

Our parse table

This grammar has the following LR(0) parse table:

	a	$	S	A	S'
state₁	shift₂		goto₄	goto₃
state₂	reduce₃	reduce₃
state₃	reduce₂	reduce₂
state₄		accept

Parsing input string a:

Input queue	Parse stack	Action
a	1	Apply action of shift₂ which corresponds to state₁ and a in our parse table
	1 a 2	Apply action of reduce₃ which corresponds to state₂ and $ in our parse table
	1 A	Apply action of goto₃ which corresponds to state₁ and A in our parse table
	1 A 3	Apply action of reduce₂ which corresponds to state₃ and $ in our parse table
	1 S	Apply action of goto₄ which corresponds to state₁ and S in our parse table
	1 S 4	Accept, since this action corresponds to state₄ and $ in our parse table

Initial state

As before, we will construct our initial parse state using a closure starting with our initial qualified production rule.

Our initial set of qualified productions is:

{S' → · S $}

Calculating the closure set, we end up with:

{S' → · S $, S → · A, A → · a}

We can't move any farther in our closure calculation since we've hit a dead end with the terminal a. Advancing would require the processing of the input symbol a.

We can visualize our initial parse state as:

DFA state 1. This state contains the following productions: S prime produces dot S $, S produces dot A, A produces dot a.

Handling terminal a

Observe the following sequence of parse stack states encountered in the parsing example above:

1
1 a 2
1 A

Before applying our reduction corresponding to the rule A → a, we must first shift terminal a onto the stack and move into state₂.

To do so, we take the following qualified production:

A → · a

We shift the dot past the terminal a that we are processing and create the following qualified production set from this action:

{A → a ·}

Since the dot is at the end of the production, this closure can't be further expanded. We end up with state₂ being defined as follows:

DFA state 2. This state contains the following production: A produces a dot.

Finally, we connect state₁ and state₂ with the appropriate transition as follows:

DFA containing two states. State 1 contains the following productions: S prime produces dot S $, S produces dot A, A produces dot a. State 2 contains the following production: A produces a dot. State 1 transitions to state 2 on terminal symbol a.

Handling nonterminal A

Observe the following sequence of parse stack states encountered in the parsing example above:

1 A
1 A 3
1 S

Before applying our reduction corresponding to the rule S → A, we must first recognize that we have nonterminal A on the top of the parse stack. Once we know this, we use a goto action to enter state₃ where this reduction can be applied.

In handling nonterminal A, we start with the following qualified production from our start state:

S → · A

and shift the dot past A, creating the following closure to start state₃:

{S → A ·}

Since the dot is at the end of the production, this closure can't be further expanded. We end up with state₃ being defined as follows:

DFA state 3. This state contains the following production: S produces A dot.

Since we use a goto action to transition from state₁ to state₃ when preparing to handle nonterminal A, we extend our DFA with a subsequent transition to state₃:

DFA containing three states. State 1 contains the following productions: S prime produces dot S $, S produces dot A, A produces dot a. State 2 contains the following production: A produces a dot. State 1 transitions to state 2 on terminal symbol a. State 3 contains the following production: S produces A dot. State 1 transitions to state 3 on nonterminal symbol A.

Handling end of string

After reducing A to S, from state₁, we want to process the end of string according to our standard approach.

Extending our DFA, we end up with a final diagram of:

DFA containing five states. State 1 contains the following productions: S prime produces dot S $, S produces dot A, A produces dot a. State 2 contains the following production: A produces a dot. State 1 transitions to state 2 on terminal symbol a. State 3 contains the following production: S produces A dot. State 1 transitions to state 3 on nonterminal symbol A. State 4 contains the following production: S prime produces S dot $. State 1 transitions to state 4 on nonterminal symbol S. State 5 contains the following production: S prime produces S $ dot. State 4 transitions to state 5 on end of string symbol $. State 5 is an accepting state.

Conclusion

We've seen the LR(0) DFA construction process for a grammar that derives a terminal symbol through a production chain.

Next, we will investigate a similar process related to the derivation of the empty string through a production chain.