Automata | Comp. Sc. Engg.

Strings And Languages

The mathematical study of the “Theory of Computation” begins by understanding the Mathematics of strings of symbols.

Alphabet: It is defined as a finite set of symbols.

Example: Roman alphabet {a, b, ...... z}.

“Binary Alphabet” {0, 1} is pertinent to the theory of computation.

String: A “string” over an alphabet is a finite sequence of symbols from that alphabet, which is usually written next to one another and not separated by commas.

  1. If Σa = {0,1} then 001001 is a string over Sa .
  2. If Σb = {a, b, K, z) then axyrpqstcd is a string over Sb .

Length of String: The “length” of a string is its length as a sequence. The length of a string w is written as |w|.

Example: |10011| = 5

Empty String: The string of zero length is called the “empty string”. This is denoted by Î.

The empty string plays the role of 0 in a number system.

Reverse String: If w w w wn = 1 2K where each wi ÎS, the reverse of w is

wn wn-1 . . . w1.

Substring: z is a substring of w if z appears consecutively within w.

As an example, ‘deck’ is a substring of ‘abcdeckabcjkl’.

Concatenation: Assume a string x of length m and string y of length n, the concatenation of x and y is written xy, which is the string obtained by appending y to the end of x, as in x1, x2 Kx m y1 y2K yn .

To concatenate a string with itself many times we use the “superscript” notation:

Suffix: If w = xv for some x, then v is a suffix of w.

Prefix: If w = vy for some y, then v is a prefix of w.

Lexicographic ordering: The Lexicographic ordering of strings is the same as the dictionary ordering, except that shorter strings precede longer strings.

The lexicographic ordering of all strings over the alphabet {0, 1} is (∈, 0, 1, 00, 01, 10, 11, 000, K ).

Language: Any set of strings over an alphabet Σ is called a language. The set of all strings, including the empty string over an alphabet Σ is denoted as Σ*.

Infinite languages L are denoted as

L = {w ∈ Σ* : w has property P}

Examples:

(a) L1 = {w ∈ {0,1}* :w has an equal number of 0' s and 1' s}

(b) L {w w wR}

2 = ∈ Σ * : = where wR is the reverse string of w.

Concatenation of Languages: If L1 and L2 are languages over Σ, their concatenation is L = L1 · L2, or simply L = L1L2, where

L = {w∈ Σ * :w = x · y for some x ∈ L1 , and y∈ L2}

Example: Given Σ = {0,1}

L1 {w w } = ∈ Σ * : has an even number of 0' s

L2 w w = { : starts with a 0 and the rest of the symbols are 1' s} then

L1L2 {w w } = : has an odd number of 0' s

Kleene Star: Another language operation is the “Kleene Star” of a language L, which is denoted by L*. L* is the set of all strings obtained by concatenating zero or more strings from L.