The Y Operator

The Y operator: There exists a class of (almost “magical”) functions, the most popular one being function Y below, that can find the unique fixed-points of functions such as H!
Y = (lambda x. (lambda h. x(h h)) (lambda h. x(h h)))
In other words, we claim that (Y f) is the fixed-point of any arbitrary function f. Things seem almost too good to be true: if we desire the fixed-point of any function f expressed in Lambda calculus, simply “send Y after it.” In other words, Y (f) would be the fixed-point. For the above to be true, the following fixed-point equation must hold:
(Y f) = f(Y f).
Here is how the proof goes:
Y f = (lambda x. (lambda h. x(h h)) (lambda h. x(h h))) f
= (lambda h. f(h h)) (lambda h. f(h h)) <-- look --|
|
|
= f( (lambda h. f(h h)) (lambda h. f(h h)) ) -------
= f( Y f ).
In the above derivation steps, in the penultimate step, we have f applied to a big, long Lambda form that was obtained in the second step as an expansion of Y f. Therefore, in the last step, we can obtain the simplification indicated. Okay, now, finally, we have successfully “de- Freded” our original recursive definition for Fred. We can write Fred as
Fred = Y (lambda y . lambda x . if(x=0) then 0 else x y(x-1)) where the right-hand side contains no trace of Fred. The right-hand side is now an irredundant name defining what was originally cast as an explicit recursion.

Illustration of reduction: While we have shown how to turn the equation Fred = H Fred into Fred = (Y H), does the new irredundant name really capture the reduction semantics of functional evaluation? Let us demonstrate that indeed it does, by applying Fred to an argument, say 5: Fred 5
= (Y H) 5
= H (Y H) 5
= (lambda y . lambda x . if(x=0) then 0 else x y(x-1)) (Y H) 5
= (lambda x . if(x=0) then 0 else x (Y H)(x-1)) 5
= (if(5=0) then 0 else 5 (Y H)(5-1))
= 5 (Y H) 4
= ...
= 5 4 (Y H) 3
= ...
= 5 4 3 2 1 0
= 15
From the above, one can observe that the main characteristic of Y is that it has the ability to “self-replicate” a Lambda expression. Notice how a copy of (Y H) is “stashed away” “just in case” there would be another recursive call. Self-replication, unfortunately, is also the basis on which many malicious programs such as computer viruses operate. In advanced computability theory, the deep connections between “selfreplication” and computability are captured in the so-called recursion theorem. The interested reader is encouraged to read up on this topic, including Ken Thompson’s widely cited article “Reflections on trusting
trust,” to be fully informed of the true potentials, as well as societal consequences, of computers. On one hand, computers are mankind’s most impressive invention to date; on the other hand, they are prone to abuse, stemming either from innocent oversight or malicious intent - in both cases demanding the eternal vigil of the computing community to guard against, detect outbreaks, and restore normal operations if bad things do happen.

Recursive Definitions as Solutions of Equations: The reduction behavior using Y indeed tracks the normal function evaluation method.
Fred 5 = (lambda x . if(x=0) then 0 else x Fred(x-1)) 5
= if(5=0) then 0 else 5 Fred(5-1)
= 5 (Fred 4)

= (lambda x . if(x=0) then 0 else x Fred(x-1)) 4
= 5 (if(4=0) then 0 else 4 Fred(4-1))
= 5 4 (Fred 3)
= ...
= 5 4 3 2 1 0
= 15.
What really is the advantage of a fixed-point formulation? To better understand this, let us study the connection between recursion and solving for functions more deeply. The fact that we have equated Fred to a Lambda term (lambda x. if (x=0) then 0 else x Fred(x-1)) containing Fred suggests that Fred is a solution for f in an equation of the form

f = (λx.if (x = 0) then 0 else x f(x − 1)).
How many solutions exist for f? In other words, how many different functions can be substituted in place of f and satisfy the equation? Also, if there are multiple possible solutions, then which of these solutions did (Y H) correspond to? Might the function f0 below, which is undefined over its entire domain N, for instance, be a solution?
f0 = (λx. ⊥)
Here, ⊥ stands for “undefined” or “bottom” value. Substituting f0 for f, the right-hand simplifies to the function
(λx.if (x = 0) then 0 else x ⊥),
or
(λx.if (x = 0) then 0 else ⊥).
This function is different from f0 in that it is defined for one input, namely 0. Hence the above equation is not satisfied. Calling this function f1, let us see whether it would serve as a solution. Substituting f1 on the right-hand side, we get
(λx.if (x = 0) then 0 else x f1(x − 1)).
Simplifying f1(x − 1), we get
(λx.if (x = 0) then 0 else ⊥)(x − 1)
= if ((x − 1) = 0) then 0 else ⊥
= if (x = 1) then 0 else ⊥.
The right-hand side now becomes
(λx.if (x = 0) then 0 else x (if (x = 1) then 0 else ⊥))