Hacker News
by yarky 1345 days ago
When building an MCMC sampler, I was too lazy to properly code a matrix approximation needed to avoid some mathematical black hole and the corresponding underflow. It was cheaper to just ignore the faulty simulations.

Turns out our results were better than those in the papers we compared against, in both run time and precision.

I am not that familiar with ML, but can't you just ignore those faulty weights?

2 comments

With MCMC, depending on the application, it seems risky to just toss out the NaN/inf results. I'd guess these numerical issues are more likely to occur in certain regions of the state space you're sampling from, so your resulting sample could end up a bit biased. In some cases the bias may be small or otherwise unimportant, so the speed-up and simpler code of filtering NaN/inf results is worth it, but in other cases (like when the MCMC samples feed into some chain of downstream computations) the bias may have sneaky, insidious effects.
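A toy sketch of the concern, not the parent's actual sampler: a random-walk Metropolis chain on a standard normal, where `fail_above` is a made-up threshold standing in for a numerically unstable region. Treating a NaN log-density as a plain rejection is my assumption about what "ignoring the faulty simulations" amounts to.

```python
import numpy as np

def log_density(x, fail_above=None):
    """Standard normal log-density, up to an additive constant.

    `fail_above` is a hypothetical stand-in for an unstable region:
    any x above it returns NaN, as an underflow might.
    """
    if fail_above is not None and x > fail_above:
        return float("nan")
    return -0.5 * x * x

def metropolis(n_steps, fail_above=None, step=2.4, seed=0):
    """Random-walk Metropolis that silently skips non-finite evaluations."""
    rng = np.random.default_rng(seed)
    x = 0.0
    lp = log_density(x, fail_above)
    samples = np.empty(n_steps)
    failed_at = []  # proposals whose log-density was not finite
    for i in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = log_density(prop, fail_above)
        if not np.isfinite(lp_prop):
            # "Just ignore it": the faulty evaluation becomes a rejection.
            failed_at.append(prop)
        elif np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples, np.array(failed_at)

clean, _ = metropolis(50_000)
filtered, failed_at = metropolis(50_000, fail_above=1.5)

print("mean, no failures:      ", clean.mean())
print("mean, failures ignored: ", filtered.mean())
print("smallest failing proposal:", failed_at.min())
```

The second mean should come out noticeably below zero, because the chain never enters the failing region and so effectively targets the normal truncated at 1.5. Whether a real sampler's failures cluster like this is exactly the open question.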
I didn't think deeply about this back then, since my parameter estimates were close to or better than those in the literature I compared against, but now I'm interested in checking the distribution of those NaN/inf results. If I recall correctly, they were uniformly distributed throughout an adaptive phase.
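A rough sketch of how that check could look, assuming a per-iteration boolean array `failed` (a hypothetical name, True where a simulation returned NaN/inf). The data below is synthetic, made up purely to exercise the functions.

```python
import numpy as np

def failure_counts_by_block(failed, n_blocks=10):
    """Count failures in consecutive, equally sized blocks of iterations."""
    failed = np.asarray(failed, dtype=bool)
    usable = (len(failed) // n_blocks) * n_blocks  # drop the ragged tail
    return failed[:usable].reshape(n_blocks, -1).sum(axis=1)

def chi_square_vs_uniform(counts):
    """Pearson chi-square statistic against equal expected counts per block."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() / len(counts)
    return float(((counts - expected) ** 2 / expected).sum())

# Synthetic stand-in: failures concentrated in the first 20% of the run.
rng = np.random.default_rng(1)
n = 10_000
rate = np.where(np.arange(n) < 2_000, 0.10, 0.01)
failed = rng.uniform(size=n) < rate

counts = failure_counts_by_block(failed)
print("failures per block:", counts)
print("chi-square (9 degrees of freedom):", chi_square_vs_uniform(counts))
```

With 10 blocks the statistic has 9 degrees of freedom, so values well above roughly 16.9 (the 5% critical value) would argue against uniformity. MCMC iterations are autocorrelated, so this is only a coarse screen. Uniformity over iterations also would not rule out clustering in state space, so binning the failures by proposed state is probably the more telling check.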
When people talk about AI taking over the world, a funny image pops up in my head where a robot is trying to enter a frying pan. When you ask it why it's doing that, it says "because I feel like [NaN, NaN, 2.45e24, NaN]", which is a perfectly valid reason.

I'm not at all caught up with this side of ML, but my first instinct is that faulty weights would lead to interpretability issues. The numbers represented by NaN/Inf vastly outnumber the ones within precision range, so interpreting them is much more of a guess.