RIPAL: Responsive and Intuitive Parsing for the Analysis of Language

Pages

Order of operations in regular expressions

Background

Now that we're exploring the operators of concatenation, union and Kleene closure over regular expressions, we should define the order of precedence of these operators. This is important in order to illustrate a singular meaning for any regular expression.

Concatenation vs union

Example

Σ = {a, b, c}

R = ab ∪ c

Without specifying operator precedence, this could be interpreted as:

  1. (ab) ∪ c OR
  2. a(b ∪ c)

By convention, we state that the first interpretation is correct. The concatention operator has higher precedence than the union operator.

As a result, our regular expression describes the strings

  1. ab
  2. c

but not

  1. ac

Note that the string ab would be described by either interpretation.

Concatenation vs. Kleene closure

Example

Σ = {a, b}

R = ab*

Without specifying operator precedence, this could be interpretated as (using parentheses to denote higher precedence):

  1. a(b*) OR
  2. (ab)*

By convention, we state that the first interpretation is correct. The Kleene closure operator has higher precedence than the concatenation operator.

As a result, our regular expression describes the strings

  1. a
  2. ab
  3. abb
  4. ...

but not

  1. ε
  2. abab
  3. ababab
  4. ...

Note that the string ab would be described by either interpretation.

Union vs. Kleene closure

Example

Σ = {a, b}

R = a ∪ b*

Without specifying operator precedence, this could be interpretated as (using parentheses to denote higher precedence):

  1. a ∪ (b*) OR
  2. (a ∪ b)*

By convention, we state that the first interpretation is correct. The Kleene closure operator has higher precedence than the union operator.

As a result, our regular expression describes the strings

  1. a
  2. ab
  3. abb
  4. ...

but not

  1. ε
  2. abab
  3. ababab
  4. ...

Parentheses

A matching pair of parentheses can be used, as shown in interpretations above, to enforce a certain order of precedence when the default precedence rules don't suffice.

Example

Σ = {a, b, c}

R = a(b ∪ c)

This regular expression matches the language:

L = {ab, ac}

Example

Σ = {a, b}

R = (ab)*

This regular expression matches the language:

L = {ε, ab, abab, ababab, ...}

Example

Σ = {a, b}

R = (a ∪ b)*

This regular expression matches the language:

L = {ε, a, b, aa, ab, ba, ...}


GitHub Repository: https://github.com/bprollinson/ripal

Copyright © 2017 Brendan Rollinson-Lorimer