Hacker News
by currymj 1113 days ago
I don't endorse the following as an accurate history, but it is a narrative that is taught within machine learning and probably has some elements of truth to it.

If you want to take a system that's working on Boolean logic, and introduce uncertainty, replacing true/false with probabilities does a great job of this. However in the 1960s/1970s, people believed it was hopeless to have AI systems use probability to deal with uncertainty. This is because probability requires you to use Bayes' Theorem, and computing the denominator in Bayes' Theorem requires summing over an exponentially large number of different outcomes.
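To make the "exponentially large sum" concrete, here is a minimal brute-force sketch. The model is a hypothetical toy (a uniform prior over n binary hidden variables and a made-up likelihood), not anything from a historical system; the point is only that the denominator of Bayes' theorem, computed naively, needs one term per joint assignment, i.e. 2^n terms.

```python
from itertools import product


def evidence(n, prior, likelihood):
    """Naively compute P(e) = sum over all hidden states h of P(h) * P(e | h).

    Returns the sum and the number of terms that had to be visited.
    """
    total = 0.0
    terms = 0
    for h in product((0, 1), repeat=n):  # every assignment of n binary variables
        total += prior(h) * likelihood(h)
        terms += 1
    return total, terms


# Hypothetical toy model (an assumption for illustration only).
N = 10


def uniform_prior(h):
    # Each of the 2**N hidden states is equally likely.
    return 0.5 ** len(h)


def toy_likelihood(h):
    # The evidence is more likely the more hidden variables are "on".
    return sum(h) / len(h)


z, terms = evidence(N, uniform_prior, toy_likelihood)
print(f"terms visited: {terms}, P(e) = {z:.3f}")
```

Each extra variable doubles the number of terms, which is the blow-up that methods such as belief propagation and MCMC are meant to sidestep.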

Thus people came up with all kinds of alternative systems to avoid dealing with probability. Among these would have been fuzzy logic.

However people came up with ways to cope with the computational intractability (belief propagation on Bayesian networks, better Markov-chain Monte Carlo algorithms, etc.), so probability became practically viable. And on the merits, if you can deal with the computational issues, probability seems to be much nicer than these other formal systems. So since the 1980s/1990s the probabilistic approach to AI has become dominant (even if deep learning has displaced the actual models).


When was the reparametrization trick developed?
I'll first admit you likely know way more than I about this. So this is not an attack on your statements, but an ask for clarification.

>replacing true/false with probabilities does a great job of this ... people came up with all kinds of alternative systems to avoid dealing with probability

A probability is a number between 0 and 1, instead of either 0 or 1. A number between 0 and 1, instead of either 0 or 1, is called fuzzy logic.

so I'm really lost about what you're saying here.

In addition to the other replies here, one fundamental difference between probabilistic and fuzzy logic is that fuzzy logic is truth functional and probabilistic logic isn't. Truth functional means, for instance, that if we know the (numerical) truth values of the propositions A and of B then we also know the (numerical) truth values of the propositions (A and B), (A or B), and so on. In probabilistic logic this does not hold. That is, P(A and B) is not fully determined by P(A) and P(B). If A and B are independent, we have P(A and B) = P(A)P(B), but in general we only know that P(A)+P(B)-1 <= P(A and B) <= min(P(A),P(B)). I also believe there's no generally accepted notion of conditioning in fuzzy logic, whereas conditioning is crucial in any probabilistic approach, see e.g. Bayes' theorem.
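A small sketch of the point about truth functionality, with made-up numbers. The function names are mine. The lower bound is clamped at 0, which is slightly tighter than the inequality as written above when P(A)+P(B) < 1. The helper builds the full 2x2 joint table for a proposed value of P(A and B) and checks that it is a legal distribution, showing that several different values are compatible with the same marginals.

```python
def frechet_bounds(p_a, p_b):
    """Bounds on P(A and B) when only the marginals P(A), P(B) are known."""
    lower = max(0.0, p_a + p_b - 1.0)
    upper = min(p_a, p_b)
    return lower, upper


def joint_table(p_a, p_b, p_ab):
    """Joint distribution over (A, B) implied by the marginals and P(A and B)."""
    return {
        ("A", "B"): p_ab,
        ("A", "not B"): p_a - p_ab,
        ("not A", "B"): p_b - p_ab,
        ("not A", "not B"): 1.0 - p_a - p_b + p_ab,
    }


def is_valid(table, tol=1e-12):
    """A table is a distribution if all entries are >= 0 and they sum to 1."""
    return all(v >= -tol for v in table.values()) and abs(sum(table.values()) - 1.0) < tol


p_a, p_b = 0.5, 0.75  # illustrative marginals
lower, upper = frechet_bounds(p_a, p_b)
independent = p_a * p_b

# Three different values of P(A and B), all consistent with the same marginals.
for p_ab in (lower, independent, upper):
    print(p_ab, is_valid(joint_table(p_a, p_b, p_ab)))

# A truth-functional (Zadeh) conjunction always returns one fixed number.
print("fuzzy AND:", min(p_a, p_b))
```

With these marginals, anything between the two bounds yields a legal joint table, so P(A) and P(B) alone do not pin down P(A and B); the min-based fuzzy conjunction coincides with the upper bound.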
Truth functionality always struck me as a bodge in fuzzy logic, because you get to choose the implementation of (A and B) fairly arbitrarily. You're getting to choose some mix that lets you pretend the degree of dependence between the variables doesn't matter. It's a useful engineering hack, but a hack nonetheless.
> so I'm really lost about what you're saying here.

Numeric representation may be the same[0], but the rules of how to do math on them, and what that math means, are different.

--

[0] - AFAIK dealing with Bayesian math, it's usually more useful to take a logarithm of all probabilities, so you can add them instead of multiplying. This translation expands the range from <0, 1> to <-infinity, +infinity>, and incidentally makes it clear why some people say that "zero and one are not probabilities".

`log(x)` does not give `+inf` on the range `(0,1)`, I think you're thinking of "log odds" aka a "logit" which is `log(x/(1-x))`
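A quick check of the difference, using only the standard library (the `logit` helper is defined here for illustration, it is not a `math` function):

```python
import math


def logit(p):
    """Log odds: maps (0, 1) onto the whole real line."""
    return math.log(p / (1.0 - p))


for p in (0.001, 0.5, 0.999):
    # log(p) is never positive on (0, 1); logit(p) changes sign at p = 0.5.
    print(f"p={p}: log={math.log(p):.3f}, logit={logit(p):.3f}")
```

As p approaches 0 both quantities head toward -infinity, but only the logit heads toward +infinity as p approaches 1; plain `log` approaches 0 there.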
The 0..1 of fuzzy logic isn't the 0..1 of probability. When, in fuzzy logic, you say "X is true to degree 0.7" you are not saying "70% of the time, X is true" or "70% of samples display X"[0]. You are saying that the state you observe conforms to X by 70%. You can, at the same time, say "Y is true to degree 0.4". X and Y are allowed to be somewhat contradictory: there's no need for their degrees to sum to 1.

[0] Not necessarily. You might choose to measure X that way, but it's not required.

You'll need to be more specific, because probability can be described the same way, numerically.
Fuzzy logic deals with degrees/scores, probability theory deals with how likely something is.

When you say that "The hotel room is 75% clean", that's a degree. It means a room that's not as clean as a "90% clean" hotel room, but definitely cleaner than a "50% clean" hotel room on some kind of cleanliness scale that you have. You need a scale because the boundary between clean and unclean is not sharp, but fuzzy. Fuzzy logic gives you tools to construct well-behaved scales for logical combinations of variables that already have scales associated with them. E.g. if you have a scale for a motorcycle being loud and a scale for a motorcycle being expensive (both fuzzy concepts), the tools of fuzzy logic can e.g. give you a scale for loud OR expensive.

In contrast, when you say that "The hotel room is clean with probability 75%", you're reasoning about how likely it is that the room is clean under uncertainty. Maybe the cleaners only work 3 out of 4 days, and you're unsure what day it is. But these are not degrees of cleanliness: if you say that a room has a 75% chance of being clean you're not claiming that it's cleaner than some other room that has a 50% chance of being clean.

The concepts involved in probability need not be fuzzy, or even measured on a scale. E.g. when one says "there's a 75% chance that the car repair will cost more than $50", there is no fuzziness involved in whether the repair costs more than $50 or not: in the end, you'll get a bill, and it will state a number that is unambiguously either above $50 or not above $50, a pure binary variable, no scales involved.

A simple example I once heard was,

Probability: “There’s a 70% chance the grass will be wet tomorrow morning”

Fuzzy logic: “The grass will be 70% wet tomorrow morning”

Fuzzy logic uses very different rules.

It is common to say that the 'truthiness' of (a and b) is the minimum of the truthiness of a and of b. Likewise, 'or' becomes the maximum, and 'not' becomes 1 - truthiness. Then normal predicate logic works by combining these rules the usual way. E.g. (a implies b) becomes (b or not a), which is max(b, 1-a).
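Those rules are short enough to write out directly. This is a minimal sketch with function names of my own choosing and arbitrary example degrees:

```python
def f_and(a, b):
    return min(a, b)


def f_or(a, b):
    return max(a, b)


def f_not(a):
    return 1.0 - a


def f_implies(a, b):
    # (a implies b) rewritten as (b or not a)
    return f_or(b, f_not(a))


a, b = 0.7, 0.4  # arbitrary degrees of truth
print("a and b     :", f_and(a, b))
print("a or b      :", f_or(a, b))
print("a implies b :", f_implies(a, b))
# Unlike crisp logic, (a and not a) is not forced to be 0.
print("a and not a :", f_and(a, f_not(a)))
```

The last line is one place where the difference from Boolean logic shows up: with these operators a statement and its negation can both be partly true.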

Bayesian probability is much more sophisticated, but also more difficult to calculate.

Beyond the already well explained aspects of truth-functionality, you're thinking of t-norm fuzzy logic (basically the only one actually still studied today). There were other kinds.
I believe fuzzy logic had its own axioms which aren't the axioms of propositional logic extended to probabilities. So the representations look similar but are manipulated differently.
For example, in Bayesian logic:

x AND y is xy, x OR y is x + y - xy

Whereas in (Zadeh) fuzzy logic:

x AND y is min(x, y), x OR y is max(x, y)

IIRC, “fuzzy logic” is actually a class that includes all generalizations of crisp binary logic to continuous values over [0,1] with operators meeting a set of definitions which basically boil down to “reduces to crisp logic when the input values are constrained to 0 and 1”, so that Bayesian logic is a fuzzy logic.

The Zadeh operators in particular I remember being constructed, or at least rationalized, as ways to combine the degree of truth of propositions as distinct from the probability of truth of uncertain propositions. But I think interest in the kind of epistemic differences in alternative extensions to propositional logic faded with the lack of a practical need in terms of computational efficiency to avoid Bayesian probability (and I think there was also a separate philosophical battle and the side in favor of “Bayesianism is the only meaningful extension of propositional logic” was winning that battle when the computational problems were resolved, which helped sweep aside the alternatives.)
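The "reduces to crisp logic at 0 and 1" criterion can be checked mechanically for the two operator pairs quoted above. This is a sketch with names of my own; the product pair is labeled as such rather than as "Bayesian", since it only matches probabilities when x and y are independent.

```python
from itertools import product


def zadeh_and(x, y):
    return min(x, y)


def zadeh_or(x, y):
    return max(x, y)


def product_and(x, y):
    return x * y


def product_or(x, y):
    return x + y - x * y


def reduces_to_crisp(f_and, f_or):
    """True if the operators agree with Boolean AND/OR on inputs in {0, 1}."""
    for x, y in product((0, 1), repeat=2):
        if f_and(x, y) != int(bool(x) and bool(y)):
            return False
        if f_or(x, y) != int(bool(x) or bool(y)):
            return False
    return True


print("Zadeh pair reduces to crisp logic:  ", reduces_to_crisp(zadeh_and, zadeh_or))
print("Product pair reduces to crisp logic:", reduces_to_crisp(product_and, product_or))

# Away from the endpoints the two pairs disagree.
x = y = 0.5
print("AND at 0.5:", zadeh_and(x, y), "vs", product_and(x, y))
print("OR  at 0.5:", zadeh_or(x, y), "vs", product_or(x, y))
```

Both pairs pass the endpoint check, so the criterion by itself does not single out one generalization; the pairs only differ on intermediate values.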

> For example, in Bayesian logic: x AND y is xy, x OR y is x + y - xy

No it isn't. P(x AND y) = P(x)P(y) only in the special case where x and y are independent. Unlike fuzzy logic, probabilistic logic is not truth functional.

> No it isn’t. P(x AND y) = P(x)P(y) only if x and y are independent.

You are obviously correct, and I shouldn’t post when I should be sleeping.

> Unlike fuzzy logic, probabilistic logic is not truth functional.

It's been a long time since I had much engagement with fuzzy logic, but I distinctly remember it being constructed as a class such that Bayesian probability was a fuzzy logic, though the others of interest were much simpler. But that may be as wrong as the other part of that post...