Regular Expressions And Languages
For example,
{a; ba}* = {a; ba}0∪ {a; ba}1[ {a; ba}2∪. . .
= {%} [ {a; ba} ∪{aa; aba; baa; baba} ∪. . .
Suppose w ∈Str. By Proposition 3, we have that, for all x,
x ∈{w}* iff x ∈{w}n; for some n ∈N;
iff x ∈{wn}; for some n ∈N;
iff x ∈wn; for some n ∈N:
If we write {0; 1}*, then this could mean:
All strings over the alphabet {0; 1} (Section 2.1); or The closure of the language {0; 1}. Fortunately, these languages are equal (both are all strings of 0's and 1's), and this kind of ambiguity is harmless. We assign our operations on languages relative precedences as follows:
- Highest: closure ((.)*) and raising to a power ((.)n);
- Intermediate: concatenation (@, or just juxtapositioning);
- Lowest: union (∪), intersection (\) and difference (¡).
For example, if n ∈N and A;B;C ∪Lan, then A*BCn[ B abbreviates ((A* )B(Cn)) [ B. The language ((A ∪B)C)* can't be abbreviated, since removing either pair of parentheses will change its meaning. If we removed the outer pair, then we would have (A ∪B)(C*), and removing the inner pair would yield (A ∪(BC))*
Proposition 3:
For all α∈Reg and n ∈N, L(αn) = L(α)n.
Proof. An easy mathematical induction on n. ® may be fixed at the beginning of the proof. 2
An example consequence of the lemma is that L((0 1)3) = L(0 1)3 = f0; 1g3.
We de¯ne the alphabet of a regular expression α(alphabet (α)) by recursion:
alphabet(%) = φ;
alphabet($) = φ;
alphabet(a) = {a} for all a ∈Sym;
alphabet(*(α)) = alphabet(α); for all α∈Reg;
alphabet(@(α ; β)) = alphabet(α) ∪alphabet(β); for all α; β ∈Reg;
alphabet( (α; β)) = alphabet(α) ∪alphabet(β); for all α; β∈Reg:
This is a good definition, since the union of two alphabets is an alphabet.
For example, alphabet (0*10* %) = {0; 1}.