by NougatRillettes 2598 days ago
Thanks for your comments! I don't see how what is discussed here conflicts with the notation I introduced into the post, do you still believe there is a soundness issue in what I have written?

I'm just saying that the notation of say A * X|Y * B seemed unfamiliar to me. I only know conditional notation within a P(...). Or an expectation, etc. Apparently your way of writing is used by others as well, but it may be good to know that it is not fully rigorous.

Again, there are different people preferring different presentations. I as a student was often frustrated by abused notations and was often confused by such things when trying to understand something in detail. For a more cursory and "practical" understanding it could be good enough.

> it may be good to know that it is not fully rigorous

What is the problem with A|B=b being a random variable? (Apart from your unfamiliarity with the concept, I mean.)

Edit: I don’t say there are no problems, I ask what do you think the problem is? There is no problem in the discrete case. In the continuous setting things are indeed more complicated (but if the limiting process is well defined there are no issues).

Note that the same lack of rigour that you find in conditional random variables affects conditional probabilities. If you can accept the latter there is no reason to reject the former.

A random variable is a different concept from a distribution. For me personally it is helpful to keep them separate, but I can see that others may not care about the complete conceptual picture.

In the PDF file linked above I can see conditional probabilities, conditional distributions and conditional expectation etc, which are all valid and rigorous. I can see that the author thinks it's a good idea to merge these into a single concept of conditional random variable for didactic reasons, but that's not a rigorous concept.

Practically, if you have two random variables then you can take their joint distribution. What would be the joint distribution of (A|B) and (C|D)? For actual random variables it's simple: you can take intersections in event space, but a "conditional random variable" does not correspond to any subset of the event space.

Very simply speaking (this is my working model, not the exact precise math definition which involves a lot of measure theory): in probability theory we have an event space containing atomic events that cover all possible outcomes for the whole experiment/observation. A random variable is a function that maps from each such potential (atomic) event to a number. That's right. The random variable is a function but not the mass function, which maps from a number to a probability.
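To make the distinction concrete, here is a minimal sketch for a single fair die roll. The names `omega`, `prob`, `parity` and `pmf` are my own illustrative choices, not anything from the post or the linked PDF. The random variable is a function on outcomes; the mass function is a separate object that maps values to probabilities.

```python
from fractions import Fraction

# Sample space for one fair die roll: the atomic outcomes.
omega = [1, 2, 3, 4, 5, 6]
# Probability assigned to each atomic outcome (assumed uniform).
prob = {w: Fraction(1, 6) for w in omega}

def parity(w):
    """A random variable: maps each atomic outcome to a number (1 = even, 0 = odd)."""
    return 1 if w % 2 == 0 else 0

def pmf(rv):
    """Mass function of rv: maps each value rv can take to its probability."""
    out = {}
    for w in omega:
        out[rv(w)] = out.get(rv(w), Fraction(0)) + prob[w]
    return out

# parity is defined on outcomes; parity_pmf is defined on values.
parity_pmf = pmf(parity)
```

Note that `parity` takes an outcome and returns a number, while `parity_pmf` takes a number and returns a probability, so the two cannot be used interchangeably.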

Conditional probability P(A|B) is an expression defined to mean P(A,B)/P(B), provided P(B) > 0. That's a clear definition. I have yet to see the actual definition of a conditional random variable.

Again, disclaimer 1: I can see the practicality of disregarding formality. Still I argue this is best done only when you do know better but it would be tedious to be technically correct all the time. But as a beginner I find it more useful to keep track of the correct concepts. For example not distinguishing random variables and distributions can be very confusing when considering more advanced things, like mutual information and KL-divergence. The former operates on random variables, the latter on distributions. I remember this was a difficult realization for me because the material we used didn't emphasize the difference enough, probably in the name of practicality.

Disclaimer 2: my point is a minor one overall.

> Practically, if you have two random variables then you can take their joint distribution.

If they are defined in the same sample space.

> a "conditional random variable" does not correspond to any subset of the event space

I would say it's exactly the other way around, the domain of a "conditional random variable" is a subset of the domain of the "unconditioned" random variable (the subset where the conditioning holds).

I think it will help if you think in terms of conditioning on, for example, a coarser sigma algebra. You would get another random variable that is measurable with respect to the sigma algebra you conditioned on. If that is coarser, so is the new function you obtain by conditioning.

Let's talk about a fair die roll to make it concrete, and let the rolled number be X and let the event that we rolled an even number be E. P(X=6|E) = 1/3. P(X|E) is a distribution where 1, 3, 5 have 0 probability mass and 2, 4, 6 have 1/3 each.
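The numbers above can be checked with a small sketch that applies P(A|B) = P(A,B)/P(B) directly on the six outcomes. The helper names (`prob`, `cond_prob`, `is_even`, `cond_dist`) are made up for illustration.

```python
from fractions import Fraction

omega = range(1, 7)                        # outcomes of one fair die roll
p = {w: Fraction(1, 6) for w in omega}     # uniform probability on outcomes

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    return sum(p[w] for w in omega if event(w))

def is_even(w):
    """The event E: the roll is even."""
    return w % 2 == 0

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B); assumes P(B) > 0."""
    return prob(lambda w: a(w) and b(w)) / prob(b)

# The conditional distribution P(X = x | E) for every face x.
cond_dist = {x: cond_prob(lambda w, x=x: w == x, is_even) for x in omega}
```

Here `cond_dist` is a distribution over the values 1..6. It says nothing about which function on the sample space, if any, should be called "X|E".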

If we consider X|E as a random variable, what is its value if we roll an odd number? Undefined? What does that mean? Random variables always have some value.

Sure you can build a new event space (sigma algebra) but then you can't use random variables over the original one.

Let's consider two independent rolls, X and Y. You can't compute the joint distribution P(Y, (X|E)); it just doesn't make sense, as the two "variables" are defined over different spaces. Note that this is not the same as P(X,Y | E). The latter is simply a conditional probability, without any concept of "conditional random variables".

Again, this is totally obvious to people who have experience with probabilities, but could be confusing to students. Such cases are where students who try to understand the details may be left more confused than students who just want to get the main idea.

Sure you can. The TLDR would be "piecewise constant projection"

I think picking up a standard graduate probability book will clear this up better than any long comment trail. There is no problem defining a coarser sigma algebra from the original one and then defining a function measurable with respect to the new sigma algebra. Note that this continues to be an r.v. on the original space, as measurability is preserved. A consistent definition of the values of the conditioned r.v. would be the piecewise constant approximation of the original r.v. over the indivisible elements of the coarser sigma algebra.

Let me try another route.

You seem to accept conditional expectation. Now, what is a conditional expectation if not a function? All we need is for that function to be measurable with respect to the new sigma algebra, and that's ensured by construction. Hope that helps.
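For the die example, a rough sketch of this "piecewise constant" function might look as follows. It assumes the coarser sigma algebra is the one generated by the partition {odd, even}; `cond_expectation` and the other names are hypothetical helpers, not a library API.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]                 # outcomes of one fair die roll
p = {w: Fraction(1, 6) for w in omega}     # uniform probability on outcomes

# Atoms of the coarser sigma algebra generated by "odd" / "even".
partition = [{1, 3, 5}, {2, 4, 6}]

def X(w):
    """The original random variable: the rolled number."""
    return w

def cond_expectation(rv, partition):
    """E[rv | partition] as a function on the whole sample space.

    On each atom the result is constant and equals the
    probability-weighted average of rv over that atom.
    """
    values = {}
    for atom in partition:
        mass = sum(p[w] for w in atom)
        avg = sum(rv(w) * p[w] for w in atom) / mass
        for w in atom:
            values[w] = avg
    return values

ce = cond_expectation(X, partition)
```

The result `ce` has a value for every outcome, odd or even: it is 3 on {1, 3, 5} and 4 on {2, 4, 6}, and its overall expectation matches that of X.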

> If we consider X|E as a random variable, what is its value if we roll an odd number? Undefined? What does that mean? Random variables always have some value.

Random variables have some value on their domain, and for the random variable X | E=1 the sample space is restricted to the elementary events {2,4,6}, which make up the composite event E=1. The original sample space is partitioned into the subspaces {1,3,5} and {2,4,6} when we condition on the values of the random variable E (0: odd, 1: even).

> Sure you can build a new event space (sigma algebra) but then you can't use random variables over the original one.

I guess we all agree then.

> Let's consider two independent rolls, X and Y. You can't compute the joint distribution P(Y, (X|E)), it just doesn't make sense as the two "variables" are defined over different spaces.

The variables X and Y describing independent rolls are also defined over different spaces and to have a joint distribution you have to define a "common" sample space of the form {x=1,y=1},{x=2,y=1},..,{x=6,y=6}.

You could do the same for the roll of a die and the toss of a coin. Or do you think that computing the joint distribution of a coin toss and a die roll doesn't make sense because they are defined over different spaces?
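As a sketch of that "common" sample space for two rolls, assuming independence and fair dice (the names `omega`, `X`, `Y` and `joint` are illustrative only):

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
# Common sample space: all ordered pairs (x, y) of the two rolls.
omega = list(product(faces, faces))
# Independence of two fair rolls gives each pair probability 1/36.
p = {w: Fraction(1, 36) for w in omega}

def X(w):
    """First roll, as a random variable on the product space."""
    return w[0]

def Y(w):
    """Second roll, as a random variable on the product space."""
    return w[1]

def joint(rv1, rv2):
    """Joint mass function of two random variables defined on omega."""
    out = {}
    for w in omega:
        key = (rv1(w), rv2(w))
        out[key] = out.get(key, Fraction(0)) + p[w]
    return out

joint_xy = joint(X, Y)
```

Replacing the second factor with the two sides of a coin would give the die-and-coin case in the same way, with 12 pairs instead of 36.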