Review Of Probability Theory

This has 64 summands, each of whose values needs to be estimated from empirical data. For the estimates to be of good quality, each of the instances that appear in the summands should appear a sufficiently large number of times in the empirical data. Often such a large amount of data is not available.
However, the computation can be simplified under a special but common condition: independence of variables.
Two random variables A and B are independent iff
P(A,B) = P(A)P(B)
i.e., the joint distribution can be obtained from the marginals
This is quite a strong statement: It means for any value x of A and any value y of B
P(A = x,B = y) = P(A = x)P(B = y)
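
As an illustration, the following Python sketch checks the product condition for every pair of values in a small joint table. The two joint tables and the helper names `marginals` and `is_independent` are hypothetical, chosen only for this example; the tolerance `tol` is an assumption to absorb floating-point rounding.

```python
from itertools import product


def marginals(joint):
    """Return the marginals P(A) and P(B) of a joint table {(x, y): p}."""
    p_a, p_b = {}, {}
    for (x, y), p in joint.items():
        p_a[x] = p_a.get(x, 0.0) + p
        p_b[y] = p_b.get(y, 0.0) + p
    return p_a, p_b


def is_independent(joint, tol=1e-9):
    """Check P(A = x, B = y) = P(A = x) P(B = y) for every pair (x, y)."""
    p_a, p_b = marginals(joint)
    return all(
        abs(joint.get((x, y), 0.0) - p_a[x] * p_b[y]) <= tol
        for x, y in product(p_a, p_b)
    )


# Hypothetical joint built as a product of marginals
# P(A) = (0.4, 0.6) and P(B) = (0.3, 0.7): independent by construction.
independent_joint = {
    (0, 0): 0.12, (0, 1): 0.28,
    (1, 0): 0.18, (1, 1): 0.42,
}

# Hypothetical joint in which A and B tend to take the same value:
# the marginals are uniform, but the product condition fails.
dependent_joint = {
    (0, 0): 0.40, (0, 1): 0.10,
    (1, 0): 0.10, (1, 1): 0.40,
}

print(is_independent(independent_joint))
print(is_independent(dependent_joint))
```

The check has to hold for every pair of values; a single pair that violates the product condition is enough to make the variables dependent.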
Note that the independence of two random variables is a property of the underlying probability distribution. We can have

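Conditional probability is defined as:
P(A | B) = P(A,B) / P(B), provided P(B) > 0
It means for any value x of A and any value y of B with P(B = y) > 0
P(A = x | B = y) = P(A = x,B = y) / P(B = y)
If A and B are independent then
P(A | B) = P(A)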
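
A minimal Python sketch of conditioning on a joint table, using the standard definition P(A = x | B = y) = P(A = x,B = y) / P(B = y). The joint table below is hypothetical and was built as a product of marginals, so it should show that conditioning on B leaves the distribution of A unchanged. The helper names `marginal_a` and `conditional_a_given_b` are assumptions made for this example.

```python
def marginal_a(joint):
    """Return P(A) from a joint table {(x, y): p} by summing out B."""
    p_a = {}
    for (x, _), p in joint.items():
        p_a[x] = p_a.get(x, 0.0) + p
    return p_a


def conditional_a_given_b(joint, y):
    """Return P(A | B = y) as a dict {x: probability}."""
    p_y = sum(p for (_, b), p in joint.items() if b == y)
    if p_y == 0:
        raise ValueError("P(B = y) is zero, so the conditional is undefined")
    return {x: p / p_y for (x, b), p in joint.items() if b == y}


# Hypothetical independent joint: P(A) = (0.4, 0.6), P(B) = (0.3, 0.7).
joint = {
    (0, 0): 0.12, (0, 1): 0.28,
    (1, 0): 0.18, (1, 1): 0.42,
}

p_a = marginal_a(joint)
for y in (0, 1):
    cond = conditional_a_given_b(joint, y)
    # Under independence, each conditional matches the marginal P(A).
    print(y, cond, p_a)
```

For a joint table that is not a product of its marginals, the printed conditionals would differ from P(A) for at least one value of y.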
Conditional probabilities can represent causal relationships in both directions.
From cause to (probable) effects

From effect to (probable) cause
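
To make the two directions concrete, here is a small Python sketch that conditions one joint table both ways. The variables `cause` and `effect` and all of the numbers are made up for illustration; they do not come from the notes. Conditioning only describes the statistical relationship in the table and does not by itself establish which variable is the cause.

```python
def prob(joint, event):
    """Total probability of the outcomes for which event(outcome) is true."""
    return sum(p for outcome, p in joint.items() if event(outcome))


def conditional_prob(joint, event, given):
    """P(event | given) = P(event and given) / P(given)."""
    p_given = prob(joint, given)
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(joint, lambda o: event(o) and given(o)) / p_given


# Hypothetical joint over (cause, effect), both Boolean.
joint = {
    (True, True): 0.08, (True, False): 0.02,
    (False, True): 0.09, (False, False): 0.81,
}


def has_cause(outcome):
    return outcome[0]


def has_effect(outcome):
    return outcome[1]


# From cause to (probable) effect: P(effect | cause) = 0.08 / 0.10
p_effect_given_cause = conditional_prob(joint, has_effect, has_cause)

# From effect to (probable) cause: P(cause | effect) = 0.08 / 0.17
p_cause_given_effect = conditional_prob(joint, has_cause, has_effect)

print(round(p_effect_given_cause, 4))
print(round(p_cause_given_effect, 4))
```

The two conditionals are generally different numbers even though they come from the same joint distribution, because they divide the same joint probability by different marginals.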