I’m struggling with the concept of conditional expectation. First of all, if you have a link to any explanation that goes beyond showing that it is a generalization of elementary intuitive concepts, please let me know.

Let me get more specific. Let (Ω, A, P) be a probability space and X an integrable real random variable defined on (Ω, A, P). Let F be a sub-σ-algebra of A. Then E[X|F] is the a.s. unique random variable Y such that Y is F-measurable and, for any A ∈ F, E[X·1_A] = E[Y·1_A], where 1_A denotes the indicator of A.

The common interpretation seems to be: “E[X|F] is the expectation of X given the information of F.” I’m finding it hard to get any meaning from this sentence.

In elementary probability theory, expectation is a real number. So the sentence above makes me think of a real number instead of a random variable. This is reinforced by E[X|F] sometimes being called “conditional expected value”. Is there some canonical way of getting real numbers out of E[X|F] that can be interpreted as elementary expected values of something?

In what way does F provide information? Knowing that some event occurred is something I would call information, and in that case I have a clear picture of conditional expectation. To me, F is not a single piece of information, but rather a “complete” set of pieces of information one could possibly acquire in some way.

Maybe you will say there is no real intuition behind this, and that E[X|F] is just what the definition says it is. But then, how does one see that a martingale is a model of a fair game? Surely, there must be some intuition behind that!

I hope this gives you some impression of my misconceptions, and that you can rectify them.

**Answer**

Maybe this simple example will help. I use it when I teach conditional expectation.

(1) The first step is to think of E(X) in a new way: as the best estimate for the value of a random variable X in the absence of any information.

To minimize the squared error

$$E[(X-e)^2] = E[X^2 - 2eX + e^2] = E(X^2) - 2eE(X) + e^2,$$

we differentiate with respect to e to obtain 2e − 2E(X), which is zero at e = E(X).

For example, if I throw a fair die and you have to estimate its value X, then according to the analysis above your best bet is to guess E(X) = 3.5.

On specific rolls of the die, this will be an over-estimate or an under-estimate, but in the long run it minimizes the mean square error.
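As a quick sanity check (my own sketch, not part of the original answer), here is a small Python computation confirming that the guess e = 3.5 minimizes the mean squared error for a fair die:

```python
# Verify numerically that e = E(X) = 3.5 minimizes E[(X - e)^2] for a fair die.
faces = [1, 2, 3, 4, 5, 6]

def mse(e):
    """Mean squared error of guessing e, averaged over the six equally likely faces."""
    return sum((x - e) ** 2 for x in faces) / len(faces)

# Scan candidate guesses between 1.0 and 6.0 in steps of 0.1 and pick the best.
candidates = [i / 10 for i in range(10, 61)]
best = min(candidates, key=mse)
print(best)        # 3.5, matching E(X)
print(mse(3.5))    # 35/12 ≈ 2.9167, the variance of the die
```

Any other guess produces a strictly larger mean squared error, as the quadratic in e above shows.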

(2) What happens if you *do* have additional information?

Suppose that I tell you that X is an even number.

How should you modify your estimate to take this new information into account?

The mental process may go something like this: “Hmmm, the possible values *were* {1,2,3,4,5,6}, but we have eliminated 1, 3 and 5, so the remaining possibilities are {2,4,6}. Since I have no other information, they should be considered equally likely, and hence the revised expectation is (2+4+6)/3 = 4.”

Similarly, if I were to tell you that X is odd, your revised (conditional) expectation is 3.
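The two revised expectations can be checked directly (a trivial sketch of my own, not in the original answer):

```python
# The conditional expectations of a fair die given its parity.
even_faces = [2, 4, 6]
odd_faces = [1, 3, 5]

e_given_even = sum(even_faces) / len(even_faces)  # (2+4+6)/3
e_given_odd = sum(odd_faces) / len(odd_faces)     # (1+3+5)/3

print(e_given_even)  # 4.0
print(e_given_odd)   # 3.0
```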

(3) Now imagine that I will roll the die and tell you the parity of X; that is, I will tell you whether the die comes up odd or even. You should now see that a single numerical response cannot cover both cases. You would respond “3” if I tell you “X is odd”, and “4” if I tell you “X is even”.

A single numerical response is not enough because the particular piece of information that I will give you is **itself random**.

In fact, your response is necessarily a function of this particular piece of information.

Mathematically, this is reflected in the requirement that E(X | F) must be F-measurable.
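To make this concrete, here is a sketch of my own (the sample space, names, and setup are mine, not from the answer): with Ω = {1,…,6} and F generated by the even/odd partition, one can write E(X | F) explicitly as a function of the outcome, and verify that it is constant on each atom of F and satisfies the defining partial-averaging property:

```python
# A concrete construction of E(X | F) for a fair die, where F = σ(parity).
omega = [1, 2, 3, 4, 5, 6]   # sample space; each outcome has probability 1/6

def cond_exp_parity(w):
    """Value of the random variable E(X | F) at outcome w:
    the average of X over the atom of F containing w (even or odd faces)."""
    atom = [x for x in omega if x % 2 == w % 2]
    return sum(atom) / len(atom)

values = {w: cond_exp_parity(w) for w in omega}
print(values)  # {1: 3.0, 2: 4.0, 3: 3.0, 4: 4.0, 5: 3.0, 6: 4.0}

# Partial-averaging check on the atom A = {even faces}:
# E[X·1_A] should equal E[Y·1_A], where Y = E(X | F).
lhs = sum(w for w in omega if w % 2 == 0) / 6          # E[X·1_A]
rhs = sum(values[w] for w in omega if w % 2 == 0) / 6  # E[Y·1_A]
print(lhs, rhs)  # 2.0 2.0

# Tower property: E[E(X | F)] = E(X) = 3.5.
print(sum(values.values()) / 6)  # 3.5
```

Note that `values` takes only two distinct numbers, 3.0 and 4.0, one per atom of F, which is exactly what F-measurability amounts to for this finite σ-algebra.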

I think this covers point 1 in your question, and tells you why a single real number is not sufficient.

Also, concerning point 2, you are correct in saying that the role of F in E(X | F) is not to provide a single piece of information, but rather to describe which specific pieces of (random) information may occur.

**Attribution**
*Source: Link, Question Author: Stefan, Answer Author: Community*