by srean 2597 days ago
I think it will help to think in terms of conditioning on, for example, a coarser sigma algebra. You get another random variable, one that is measurable with respect to the sigma algebra you conditioned on. If that sigma algebra is coarser, so is the new function you obtain by conditioning.

Let's talk about a fair die roll to make it concrete: let the rolled number be X and let E be the event that we rolled an even number. Then P(X=6|E) = 1/3, and P(X|E) is a distribution where 1, 3, 5 have zero probability mass and 2, 4, 6 have 1/3 each.
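For anyone who wants to check those numbers, here is a minimal sketch, assuming a fair six-sided die with uniform mass on {1,...,6}. The helper name `conditional_pmf` is made up for illustration.

```python
from fractions import Fraction

# Fair die: uniform probability mass on the six faces.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def conditional_pmf(pmf, event):
    """Return P(X = x | event) for every x, using P(A | B) = P(A, B) / P(B)."""
    p_event = sum(p for x, p in pmf.items() if event(x))
    return {x: (p / p_event if event(x) else Fraction(0)) for x, p in pmf.items()}

def is_even(x):
    return x % 2 == 0

pmf_given_even = conditional_pmf(pmf, is_even)

print(pmf_given_even[6])  # 1/3
print(pmf_given_even[1])  # 0
```

Note that this is a conditional distribution over the six faces, not a new function on the sample space.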

If we consider X|E as a random variable, what is its value if we roll an odd number? Undefined? What does that mean? Random variables always have some value.

Sure you can build a new event space (sigma algebra) but then you can't use random variables over the original one.

Let's consider two independent rolls, X and Y. You can't compute the joint distribution P(Y, (X|E)), it just doesn't make sense as the two "variables" are defined over different spaces. Note that this is not the same as P(X,Y | E). The latter is simply a conditional probability, without any concept of "conditional random variables".

Again, this is totally obvious to people who have experience with probabilities, but could be confusing to students. Such cases are where students who try to understand the details may be left more confused than students who just want to get the main idea.

Sure you can. The TLDR would be "piecewise constant projection"

I think picking up a standard graduate probability book will clear this up better than any long comment trail. There are no problems defining a coarser sigma algebra from an original one and then defining a function measurable on the new sigma algebra. Note this continues to be an r.v. in the original space, as measurability is preserved. A consistent definition of the values of the conditioned r.v. would be the piecewise constant approximation of the original r.v. over the indivisible elements (atoms) of the coarser sigma algebra.
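A rough sketch of that piecewise constant construction, assuming the fair die from earlier in the thread and the coarser sigma algebra generated by "even or odd". The function name `cond_expectation` and the variable names are illustrative only.

```python
from fractions import Fraction

omega = range(1, 7)                         # original sample space of one fair roll
prob = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w                                # the rolled number

# Atoms of the coarser sigma algebra generated by "even or odd".
atoms = [{1, 3, 5}, {2, 4, 6}]

def cond_expectation(X, prob, atoms):
    """Map each outcome w to the probability-weighted mean of X over w's atom."""
    value = {}
    for atom in atoms:
        p_atom = sum(prob[w] for w in atom)
        mean = sum(X(w) * prob[w] for w in atom) / p_atom
        for w in atom:
            value[w] = mean
    return value

Z = cond_expectation(X, prob, atoms)
print([str(Z[w]) for w in omega])  # ['3', '4', '3', '4', '3', '4']
```

The result Z takes a value on every outcome of the original space, odd rolls included, and is constant on each atom.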

Let me try another route.

You seem to be accepting of a conditional expectation. Now, what is a conditional expectation if not a function? All we need is that the function be measurable with respect to the new sigma algebra, and that's ensured by construction. Hope that helps some.

> I think picking up a standard graduate probability book will clear this up better than any long comment trail.

Can you recommend one? I just picked up Probability and Measure by Billingsley and it does not mention "conditional random variable" a single time in over 600 pages. It does have a lot of "conditional probability", "conditional distribution", "conditional expectation" etc.

> You seem to be accepting of a conditional expectation.

Conditional expectation is defined in terms of conditional probabilities, and those are in turn explicitly defined as P(A|B)=P(A,B)/P(B), so there's nothing not to accept.

Billingsley is pretty darn good. It might have left the connection as a dotted line, given that the notion is no different from conditional expectation. The only connection you have to make is that conditional expectation is a function and a random variable. You must have seen the expectation taken of a conditional expectation. That should convince you that conditional expectation is indeed a random variable. Since that r.v. was obtained by conditioning, it's not a stretch to call it a conditioned r.v.
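To make "expectation of a conditional expectation" concrete, here is a small check on the fair die, conditioning on even versus odd. The atom means 3 and 4 are worked out by hand as (1+3+5)/3 and (2+4+6)/3; the variable names are illustrative.

```python
from fractions import Fraction

prob = {w: Fraction(1, 6) for w in range(1, 7)}

# E[X | even/odd] written out as a function on all six outcomes:
# 4 on the even rolls, 3 on the odd rolls.
cond_exp = {w: Fraction(4) if w % 2 == 0 else Fraction(3) for w in prob}

e_x = sum(w * p for w, p in prob.items())                 # E[X]
e_cond = sum(cond_exp[w] * p for w, p in prob.items())    # E[E[X | even/odd]]

print(e_x, e_cond)  # 7/2 7/2
```

Taking the outer expectation only makes sense because the inner object is a function on the sample space.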

Any book that explains conditioning over a sigma algebra should suffice. You could try Loève, Dudley or Neveu, but I don't remember if it's mentioned explicitly.

BTW, conditional expectation is really more fundamental than conditional probability. It's the former that yields the latter in measure-theoretic probability. If you want to drink from the source, that would be Kolmogorov.
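One way to see the former yielding the latter on the die example: take the conditional expectation of the indicator of A = {X = 6} given the even/odd sigma algebra. This is only a sketch on a finite space, and the names are illustrative.

```python
from fractions import Fraction

prob = {w: Fraction(1, 6) for w in range(1, 7)}
atoms = [{1, 3, 5}, {2, 4, 6}]          # atoms of the even/odd sigma algebra

def indicator_six(w):
    return 1 if w == 6 else 0           # indicator of the event A = {X = 6}

# E[1_A | G]: average the indicator over each atom of G.
cond_prob = {}
for atom in atoms:
    p_atom = sum(prob[w] for w in atom)
    mean = sum(indicator_six(w) * prob[w] for w in atom) / p_atom
    for w in atom:
        cond_prob[w] = mean

print(cond_prob[2], cond_prob[1])  # 1/3 0
```

On even outcomes the value is the familiar P(X=6 | even) = 1/3, and on odd outcomes it is P(X=6 | odd) = 0.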

Finally if you are reading Billingsley you are adequately qualified to call yourself a mathematician.

It's getting a little tedious. Please show me a concrete citation of a serious textbook (not a tutorial/handout by a grad student or a paper by a random researcher) that puts the three words "conditional random variable" next to each other (consistently, not simply as a one-off potential mistake). Google doesn't show serious sources for it.

While I agree with isolated points of your comment, I don't think they add up to a useful, coherent concept of a conditional random variable.

That's a little too much to ask. Perhaps if they were grep'able I could have obliged; unfortunately I don't have a photographic memory.

More concretely, it's just another name for conditional expectation. I am assuming you are aware that conditional expectation is a random variable obtained via conditioning (equivalently, as a piecewise approximation in L_2). If you aren't familiar with that viewpoint, that would be the place to start. Kolmogorov, Neveu, Dudley and Billingsley will all cover that viewpoint.
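A small numerical illustration of the L_2 view, again on the fair die: among functions that are constant on the odd rolls and constant on the even rolls, look for the one closest to X in mean squared error. The grid search is a crude stand-in chosen for illustration, not how one would prove it.

```python
from fractions import Fraction
from itertools import product

prob = {w: Fraction(1, 6) for w in range(1, 7)}

def mse(c_odd, c_even):
    """E[(X - g)^2] where g equals c_odd on odd rolls and c_even on even rolls."""
    return sum(p * (w - (c_even if w % 2 == 0 else c_odd)) ** 2
               for w, p in prob.items())

# Candidate constants: multiples of 1/2 between 1 and 6.
grid = [Fraction(k, 2) for k in range(2, 13)]
best = min(product(grid, grid), key=lambda c: mse(*c))

# best == (3, 4): the minimiser is the pair of atom means.
print(best == (3, 4), mse(*best))  # True 8/3
```

The minimising constants are the averages of X over {1,3,5} and {2,4,6}, which is the piecewise constant function described earlier in the thread.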

> If we consider X|E as a random variable, what is its value if we roll an odd number? Undefined? What does that mean? Random variables always have some value.

Random variables have some value on their domain, and for the random variable X | E=1 the sample space is restricted to the elementary events {2,4,6} which make up the composite event E=1. The original sample space is partitioned into the subsets {1,3,5} and {2,4,6} when we condition on the values of the random variable E (0: odd, 1: even).

> Sure you can build a new event space (sigma algebra) but then you can't use random variables over the original one.

I guess we all agree then.

> Let's consider two independent rolls, X and Y. You can't compute the joint distribution P(Y, (X|E)), it just doesn't make sense as the two "variables" are defined over different spaces.

The variables X and Y describing independent rolls are also defined over different spaces and to have a joint distribution you have to define a "common" sample space of the form {x=1,y=1},{x=2,y=1},..,{x=6,y=6}.

You could do the same for a roll of a dice and the toss of a coin. Or do you think that computing the joint distribution of a coin toss and a dice roll doesn't make sense because they are defined over different spaces?

> You could do the same for a roll of a dice and the toss of a coin. Or do you think that computing the joint distribution of a coin toss and a dice roll doesn't make sense because they are defined over different spaces?

Of course it doesn't! You first have to define them on a common space (the Cartesian product), and for that you have to specify their joint probabilities. One example might be that you model them as independent. Otherwise we wouldn't know how the coin and the dice relate. Sure independence is usually a good default assumption, but it's still a necessary step.
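A minimal sketch of that step for a fair coin and a fair die. Independence is an explicit modelling assumption here, and it is exactly what pins down the joint mass on the product space.

```python
from fractions import Fraction
from itertools import product

die = {x: Fraction(1, 6) for x in range(1, 7)}
coin = {z: Fraction(1, 2) for z in ("H", "T")}

# Common sample space: the Cartesian product of the two outcome sets.
# Assumption: independence, so the joint mass is the product of the marginals.
joint = {(x, z): die[x] * coin[z] for x, z in product(die, coin)}

print(len(joint), joint[(6, "H")])  # 12 1/12

# Sanity check: summing out the coin recovers the die's marginal.
marginal_die = {x: sum(joint[(x, z)] for z in coin) for x in die}
print(marginal_die == die)  # True
```

A different joint assignment with the same marginals would be an equally valid model, which is why the choice has to be stated.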

What did you mean by the following paragraph then?

> Let's consider two independent rolls, X and Y. You can't compute the joint distribution P(Y, (X|E)), it just doesn't make sense as the two "variables" are defined over different spaces.

Do you agree that you cannot compute the joint distribution P(Y,X) either because the two variables are defined over different spaces?

I meant them to be defined on the same space. It's a single experiment, the outcome of which are two rolls that happen to be independent.

If you mean that the space for this single experiment composed of two rolls (random variables X and Y) is the cartesian product of {x=1,x=2,x=3,x=4,x=5,x=6} and {y=1,y=2,y=3,y=4,y=5,y=6}, then I agree.

But the fact that each variable alone is defined on the "same" sample space {1,2,3,4,5,6} is irrelevant.

The situation is no different from the joint probability for random variables X and Z corresponding to a single experiment consisting of a dice roll and a coin toss, where the relevant space is the cartesian product of {x=1,x=2,x=3,x=4,x=5,x=6} and {z=1,z=2}.

And it is also similar for the situation you asked about, with a random variable Y and a "conditional" random variable X|Even. The relevant space is the cartesian product of {y=1,y=2,y=3,y=4,y=5,y=6} and {x=2,x=4,x=6}.
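A sketch of that 18-point product space, assuming two independent fair rolls, compared against ordinary conditioning on the 36-point space. The dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product

y_pmf = {y: Fraction(1, 6) for y in range(1, 7)}
x_given_even = {x: Fraction(1, 3) for x in (2, 4, 6)}

# Joint of Y and "X | Even" on the 18-point product space,
# assuming independence between the two rolls.
joint_cond = {(y, x): y_pmf[y] * x_given_even[x]
              for y, x in product(y_pmf, x_given_even)}

# The same quantity via ordinary conditioning: P(Y, X | X even)
# computed on the 36-point space of two fair rolls.
full = {(y, x): Fraction(1, 36) for y, x in product(range(1, 7), repeat=2)}
p_even = sum(p for (y, x), p in full.items() if x % 2 == 0)
via_conditioning = {k: p / p_even for k, p in full.items() if k[1] % 2 == 0}

print(len(joint_cond), joint_cond == via_conditioning)  # 18 True
```

Under these assumptions the product-space construction gives the same numbers as P(X,Y | E) restricted to the outcomes where X is even, so the two sides of the thread agree on the arithmetic even where they differ on the terminology.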