# Automata And Languages

**AUTOMATA AND LANGUAGES**: An automaton is able to determine whether a sequence of words belongs to a specific language. This language consists of a set of words over some finite alphabet. Depending on the type of automaton used, this sequence of words may be finite or infinite. If these sequences of words correspond to sequences of events and actions, we can construct an automaton that accepts correct sequences of events and actions in a system, and thus solve the verification problem as follows.

With the introduction of more concepts, we can use an automaton to represent a process or system. More precisely, a specification automaton represents the desired specification of a system, and an implementation automaton models an implementation attempting to satisfy the given specification. Our goal is to verify that the implementation satisfies the specification. This problem can now be viewed as the language inclusion problem (also known as the language containment problem), that is, to determine whether the language accepted by the implementation automaton is a subset of the language accepted by the specification automaton.

This section introduces several classical types of automata and the languages they accept. These automata are deterministic finite automata and nondeterministic finite automata. The languages include regular languages.

**Languages and Their Representations**: First, we define the terminology for languages. An alphabet is a finite set of symbols, which can be Roman letters, numbers, events, actions, or any object. A string over an alphabet is a finite sequence of symbols selected from this alphabet. An empty string has no symbols and is denoted by e. The set of all strings over an alphabet is written as Σ∗. The length of a string is the number of symbols in it. To refer to identical symbols at different positions in a string, we say these are occurrences of the symbol, just like saying instances (or iterations) of a process in a computer system.

The concatenation of two strings x1 and x2, written as x1x2, is the string x1 followed by the string x2. A subtstring of a string x is a subsequence of x. A language

is any subset of Σ∗. The complement, union, and intersection operations can be applied to languages since languages are sets. The language operation Kleene star (also called closure) of a language L, written as LΣ∗, is the set of strings consisting of the concatenation of zero or more strings from L. Now we describe how to use strings to represent languages. Since the set Σ∗ of strings over an alphabet is countably infinite, the number of possible representations of languages is countably infinite. However, the set of all possible languages over a given alphabet is uncountably infinite, and thus finite representations cannot be used to represent all languages. We next focus on languages that can be represented by finite representations. A regular expression specifies a language by a finite string consisting of single symbols, ∅, possibly parentheses, and the symbols ∪ and Σ∗. We now define regular expressions more formally.

**Regular Expressions: **The regular expressions over an alphabet consist of the strings over the alphabet ∪ { ), (, ∅,∪, Σ∗} such that:

1. Each member of and ∅ is a regular expression.

2. If α and β are regular expressions, then (α ∪ β) is a regular expression.

3. If α and β are regular expressions, then (αβ) is a regular expression.

4. If α is a regular expression, then α∗ is a regular expression.

5. All regular expressions must satisfy the above rules.

Because every regular expression represents a language, we define a function L mapping from strings to languages such that for any regular expression α, L(α) is the language represented by α with the following properties.

1. For each a ∈ , L(a) = {a}, and L(∅) = ∅.

2. If α and β are regular expressions, then L((α ∪ β)) = L(α) ∪ L(β).

3. If α and β are regular expressions, then L((αβ)) = L(α)L(β).

4. If α is a regular expression, then L(α∗ ) = L(a)∗.