RIPAL: Responsive and Intuitive Parsing for the Analysis of Language

Background

We've learned how to construct LL(1) parse tables for context-free grammars containing epsilon productions. This section will introduce the parsing techniques needed to handle these parse table additions.

Example grammar and parse table

Example

The following grammar

S → a B
B → C
C → ε

has the following corresponding parse table:

	a	$
S	1
B		2
C		3

Example parse

Example

Parsing input string a:

Input queue	Parse stack	Action
a$	S	Apply rule 1 since S and a correspond to this production in our table
a$	aB	Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches
$	B	Apply rule 2 since B and $ correspond to this production in our table
$	C	Apply rule 3 since C and $ correspond to this production in our table
$		Accept, since $ is at head of input queue and parse stack is empty

A few notes about the parsing procedure here:

Adding the end of string symbol $ to the end of the input queue allows us to apply ε productions when there are no more terminal nodes to process
Since the input queue always starts with $ at the end, we accept when this symbol is encountered and the parse stack is empty

Additional parsing rules

First, in order to handle applying all of the necessary productions, we will change the nonterminal processing rule to:

If the top symbol is a nonterminal (n_next):

Look up the parse table entry (a_next) corresponding to the top symbol from the stack (n_next) and the first symbol (terminal or $) from the input queue (sym_next)
1. If a_next = ø, reject
2. Otherwise, look up our next production p_next = n_next → sym_next₁ sym_next₂ ... sym_{next_n}
3. Add sym_next₁, sym_next₂, ..., sym_{next_n} to the top of the parse stack in reverse order such that sym_next₁ is at the top of the stack

Here, we've simply allowed the use of the symbol $ as a lookup in our parse table. Note that the rule used for removing terminals from the input queue is unchanged.

In addition to this, since we allow productions to still occur when the end of the input string has been encountered, we need to change the accept / reject conditions to the following:

If $ is at the head of the input queue and parse stack is empty at the end of a processing step, accept
If $ is at the head of the input queue and a terminal is at the top of the parse stack at the end of a processing step, reject
If $ is not at the head of the input queue and the parse stack is empty at the end of the processing step, reject

Note that, as before, parsing can also fail upon not finding a parse table entry.

Another successful parse example

Example

S → A B C
A → a
A → ε
B → b
C → c
C → ε

has the following parse table:

	a	b	c	$
S	1	1
A	2	3
B		4
C			5	6

Parsing the input string abc:

Input queue	Parse stack	Action
abc$	S	Apply rule 1 since S and a correspond to this production in our table
abc$	ABC	Apply rule 2 since A and a correspond to this production in our table
abc$	aBC	Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches
bc$	BC	Apply rule 4 since B and b correspond to this production in our table
bc$	bC	Remove b from head of input queue and parse stack and continue parsing since terminal symbol matches
c$	C	Apply rule 5 since C and c correspond to this production in our table
c$	c	Remove c from head of input queue and parse stack and continue parsing since terminal symbol matches
$		Accept, since $ is at head of input queue and parse stack is empty

Parsing input string b:

Input queue	Parse stack	Action
b$	S	Apply rule 1 since S and b correspond to this production in our table
b$	ABC	Apply rule 3 since A and b correspond to this production in our table
b$	BC	Apply rule 4 since B and b correspond to this production in our table
b$	bC	Remove b from head of input queue and parse stack and continue parsing since terminal symbol matches
$	C	Apply rule 6 since B and $ correspond to this production in our table
$		Accept, since $ is at head of input queue and parse stack is empty

Parse failure examples

Example

S → ab

has the following corresponding parse table:

	a	$
S	1

Note that the $ column is empty here and is only included to be explicit in this illustration.

Parsing input string a:

Input queue	Parse stack	Action
a$	S	Apply rule 1 since S and a correspond to this production in our table
a$	ab	Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches
$	b	Reject, since $ is at head of input queue and parse stack has a terminal at the top

Parsing input string abc:

Input queue	Parse stack	Action
abc$	S	Apply rule 1 since S and a correspond to this production in our table
abc$	ab	Remove a from head of input queue and parse stack and continue parsing since terminal symbol matches
bc$	b	Remove b from head of input queue and parse stack and continue parsing since terminal symbol matches
c$		Reject, since terminal c is at head of input queue and parse stack is empty

Conclusion

The parsing techniques used here complete the theory needed to fully understand LL(1) parsing. Next, we will introduce the concept of bottom-up parsing in order to facilitate discussion of a wider variety of parsing algorithms.

RIPAL: Responsive and Intuitive Parsing for the Analysis of Language

Pages

Handling end-of-string productions

Background

Example grammar and parse table

Example parse

Additional parsing rules

Another successful parse example

Parse failure examples

Conclusion