In the last section, we saw the underlying idea behind bottom-up parsing. In this section, we will introduce the specific core constructs and processing rules used in the bottom-up LR parsing algorithm. We will keep our constructs simple at first, and begin to explore the construction of LR(0) parsers.
To apply the LR(0) parsing algorithm, we need the following data structures set up:
The LR(0) parse table
Much like in an LL(1) parser, an LR(0) parser uses a parse table to make decisions. However, the LR(0) parse table is somewhat more complex than the LL(1) equivalent.
As a result, we will build up a foundation of more LR parsing concepts before introducing it.
The input queue
As we done in the LL(1) parsing algorithm, we place our input string into an input queue, with one input symbol in each slot.
The queue looks like:
(head of queue) | σ1 | σ2 | ... | σ2 |
σi ∈ Σ
The input queue here has a very similar setup to the LL(1) input queue.
The operations we need to support on the input queue are:
Note that we've added the peek operation.
The parse stack
Similar to in LL(1) parsing, our parse stack looks like:
(top of stack) | sym1 | sym2 | ... | symn |
symi ∈ T ∪ N
As before, we still want to support the following operations:
In order to understand bottom-up LR(0) parsing, we need to first understand the two key LR parse actions: shift and reduce.
Definition
In the LR(0) shift action, we dequeue the first symbol from the input queue and push it onto the top of the parse stack.
Definition
Given a production Ni → symi1, symi2, ..., symin in our context-free grammar G, we can apply an LR(0) reduce action as follows:
Pop n symbols from the top of the parse stack (which should match si1 ... symin) and push Ni onto the top of the parse stack.
Since we need to accept the input string after it has been reduced to the grammar's start symbol, our LR(0) parser must have an accept action.
Given the grammar
S → ABC
A → a
B → b
C → cc
we can derive the string abcc in reverse as follows:
Input queue | Parse stack | Action |
---|---|---|
abcc | Apply shift action | |
bcc | a | Apply reduce action, since a matches the right-hand side of A → a |
bcc | A | Apply shift action, since the parse stack does not contain the entire right-hand side of any production |
cc | Ab | Apply reduce action, since b matches the right-hand size of B → b |
cc | AB | Apply shift action, since we want ABC on the parse stack to reduce to S |
c | ABc | Apply shift action, since we need two cs to reduce to C |
ABcc | Apply reduce action for production C → cc | |
ABC | Apply reduce action for production S → ABC | |
S | Accept, since we've reduced our terminal string to the start symbol |
We've gone over some of the fundamental operations of LR parsing which will be first used in an illustration of the LR(0) parsing algorithm. Next, we will look at the motivation for requiring state managemnt in our parsing process.
GitHub Repository: https://github.com/bprollinson/ripal
Copyright © 2017 Brendan Rollinson-Lorimer