Regular Expressions And Languages

For example,

{a; ba}* = {a; ba}⁰∪ {a; ba}¹[ {a; ba}²∪. . .

= {%} [ {a; ba} ∪{aa; aba; baa; baba} ∪. . .

Suppose w ∈Str. By Proposition 3, we have that, for all x,

x ∈{w}* iff x ∈{w}ⁿ; for some n ∈N;

iff x ∈{wn}; for some n ∈N;

iff x ∈wn; for some n ∈N:

If we write {0; 1}*, then this could mean:

All strings over the alphabet {0; 1} (Section 2.1); or The closure of the language {0; 1}. Fortunately, these languages are equal (both are all strings of 0's and 1's), and this kind of ambiguity is harmless. We assign our operations on languages relative precedences as follows:

Highest: closure ((.)*) and raising to a power ((.)n);
Intermediate: concatenation (@, or just juxtapositioning);
Lowest: union (∪), intersection (\) and difference (¡).

For example, if n ∈N and A;B;C ∪Lan, then A*BCⁿ[ B abbreviates ((A* )B(Cⁿ)) [ B. The language ((A ∪B)C)* can't be abbreviated, since removing either pair of parentheses will change its meaning. If we removed the outer pair, then we would have (A ∪B)(C*), and removing the inner pair would yield (A ∪(BC))*

Proposition 3:

For all α∈Reg and n ∈N, L(αn) = L(α)n.

Proof. An easy mathematical induction on n. ® may be fixed at the beginning of the proof. 2

An example consequence of the lemma is that L((0 1)3) = L(0 1)3 = f0; 1g3.

We de¯ne the alphabet of a regular expression α(alphabet (α)) by recursion:

alphabet(%) = φ;

alphabet($) = φ;

alphabet(a) = {a} for all a ∈Sym;

alphabet(*(α)) = alphabet(α); for all α∈Reg;

alphabet(@(α ; β)) = alphabet(α) ∪alphabet(β); for all α; β ∈Reg;

alphabet( (α; β)) = alphabet(α) ∪alphabet(β); for all α; β∈Reg:

This is a good definition, since the union of two alphabets is an alphabet.

For example, alphabet (0*10* %) = {0; 1}.