Hacker News
by andrewla 3177 days ago
My only objection to this is a semantic one -- the word "algorithm" is not being well-served here. The correct word for this sort of thing is "heuristic". The concern isn't that algorithms themselves are incorrect, the concern is that the problem they are trying to solve is a heuristic one, not a formal one.

Saying "let's write an algorithm to improve search results" is meaningless; the accurate phrasing is "let's design and implement a heuristic that improves search results". The algorithmic part of this is how to efficiently implement that heuristic.

I can usually get through articles like this by silently replacing "algorithm" with "heuristic"; the problem arises when some articles attempt to draw equivalencies between "algorithmic" concepts, like running time and space, and "heuristic" concepts, like optimizing for the wrong thing.

6 comments

Algorithms running on "social hardware" can be surprisingly formal. A famously well-documented example is the early modern witch hunts. The humorous depiction in Monty Python and the Holy Grail does a surprisingly good job of conveying the algorithmic nature.
Many aspects of the law are algorithmic. Even though there is no one, settled formal definition for algorithm, statutory and common law meet many informal definitions. Laws usually lay out an ordered, (theoretically) unambiguous set of steps for deciding a legal issue. When lawyers talk about "elements of a test," they are referring to this structured logic.

For example, the elements required to prove a negligence claim are:

1. Duty

2. Breach of Duty

3. Cause in Fact

4. Proximate Cause

5. Damages

When evaluating a negligence claim, a lawyer first tries to determine if the defendant owed the plaintiff any duty of care, then whether the defendant breached that duty, then if that breach was the factual cause of a harm suffered by the plaintiff, then whether the causal relationship was close enough to be considered legally proximate, and then, finally, whether the plaintiff actually suffered measurable damages.
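
To make the "ordered set of steps" point concrete, here is a minimal Python sketch of that walk through the elements. It is only an illustration: the function name and the boolean `findings` dictionary are hypothetical, and in real cases each element is a contested judgment, not a flag.

```python
from typing import Optional

# The five elements, in the order listed above.
ELEMENTS = ["duty", "breach of duty", "cause in fact", "proximate cause", "damages"]

def first_unmet_element(findings: dict[str, bool]) -> Optional[str]:
    """Return the first element that is not established, or None if all are.

    `findings` is a hypothetical input mapping each element name to a
    yes/no finding; a missing element counts as not established.
    """
    for element in ELEMENTS:
        if not findings.get(element, False):
            return element
    return None

# Hypothetical case: everything is established except proximate cause.
case = {
    "duty": True,
    "breach of duty": True,
    "cause in fact": True,
    "proximate cause": False,
    "damages": True,
}
print(first_unmet_element(case))
```

The claim fails at the first element that cannot be shown, which is why the order of evaluation matters.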

Arguably, that superficially algorithmic process frequently breaks down in practice. For example, it's often easier to start with the damages suffered by a plaintiff and work backwards by identifying the causes, then who was responsible for those causes and any duties they may have owed to the plaintiff. However, regardless of how the lawyer and plaintiff identify whom to sue, they must frame their pleadings to allege the elements of the tort in the order specified by their jurisdiction's law, so the actual practice of law in court amounts to an algorithmic exercise.

Along those lines, here are some of my comments on this general topic from an email I posted to the Doug Engelbart Unfinished Revolution II Colloquium in 2000: http://www.dougengelbart.org/colloquium/forum/discussion/012...

===

... I personally think machine evolution is unstoppable, and the best hope for humanity is the noble cowardice of creating refugia and trying, like the duckweed, to create human (and other) life faster than other forces can destroy it. [Although in 2017 I'd add other possibilities like symbiosis or trying to create friendlier AI as a partner (or at least AIs with a sense of humor -- see James P. Hogan's AIs, or the ones like Libbry in the EarthCent Ambassador series, or the Old Guy Cybertank series example), improved sensemaking through better intelligence-augmenting tools, and trying to help human society be more compassionate in the hopes our path out of a singularity will somehow reflect our path going in...]

Note, I'm not saying machine evolution won't have a human component -- in that sense, a corporation or any bureaucracy is already a separate machine intelligence, just not a very smart or resilient one. This sense of the corporation comes out of Langdon Winner's book "Autonomous Technology: Technics out of control as a theme in political thought".

You may have a tough time believing this, but Winner makes a convincing case. He suggests that all successful organizations "reverse-adapt" their goals and their environment to ensure their continued survival. These corporate machine intelligences are already driving for better machine intelligences -- faster, more efficient, cheaper, and more resilient.

People forget that corporate charters used to be routinely revoked for behavior outside the immediate public good, and that corporations were not considered persons until around 1886 (that decision perhaps being the first major example of a machine using the political/social process for its own ends).

Corporate charters are granted supposedly because society believes it is in the best interest of society for corporations to exist. But, when was the last time people were able to pull the "charter" plug on a corporation not acting in the public interest? It's hard, and it will get harder when corporations don't need people to run themselves.

I'm not saying the people in corporations are evil -- just that they often have very limited choices of actions. If corporate CEOs do not deliver short-term profits, they are removed, no matter what they were trying to do. Obviously there are exceptions for a while -- William C. Norris of Control Data was one of them, but in general, the exception proves the rule. Fortunately though, even in the worst machines (like in WWII Germany) there were individuals who did what they could to make them more humane ("Schindler's List" being an example).

Look at how much William C. Norris of Control Data got ridiculed in the 1970s for suggesting the then radical notion that "business exists to meet society's unmet needs". Yet his pioneering efforts in education, employee assistance plans, on-site daycare, urban renewal, and socially-responsible investing are in part what made Minneapolis/St. Paul the great area it is today. Such efforts are now being duplicated to an extent by other companies. Even the company that squashed CDC in the mid 1980s (IBM) has adopted some of those policies and directions. So corporations can adapt when they feel the need.

Obviously, corporations are not all powerful. The world still has some individuals who have wealth to equal major corporations. There are several governments that are as powerful or more so than major corporations. Individuals in corporations can make persuasive pitches about their future directions, and individuals with controlling shares may be able to influence what a corporation does (as far as the market allows).

In the long run, many corporations are trying to coexist with people to the extent they need to. But it is not clear what corporations (especially large ones) will do as we approach this singularity -- where AIs and robots are cheaper to employ than people. Today's corporation, like any intelligent machine, is more than the sum of its parts (equipment, goodwill, IP, cash, credit, and people). Its "plug" is not easy to pull, and it can't be easily controlled against its short term interests.

What sort of laws and rules will be needed then? If the threat of corporate charter revocation is still possible by governments and collaborations of individuals, in what new directions will corporations have to be prodded? What should a "smart" corporation do if it sees this coming? (Hopefully adapt to be nicer more quickly. :-) What can individuals and governments do to ensure corporations "help meet society's unmet needs"?

Evolution can be made to work in positive ways, by selective breeding, the same way we got so many breeds of dogs and cats. How can we intentionally breed "nice" corporations that are symbiotic with the humans that inhabit them? To what extent is this happening already as talented individuals leave various dysfunctional, misguided, or rogue corporations (or act as "whistle blowers")? I don't say here the individual directs the corporation against its short term interest. I say that individuals affect the selective survival rates of corporations with various goals (and thus corporate evolution) by where they choose to work, what they do there, and how they interact with groups that monitor corporations. To that extent, individuals have some limited control over corporations even when they are not shareholders. Someday, thousands of years from now, corporations may finally have been bred to take the long term view and play an "infinite game".

However, if preparations fail, and if we otherwise cannot preserve our humanity as is (physicality and all), we must at least adapt with grace whatever of our best values we can preserve or somehow embody in future systems. So, an OHS/DKR [Open Hyperdocument System / Dynamic Knowledge Repository] to that end (determining our best values, and strategies to preserve them) would be of value as well.

When aluminum was first discovered around 1827, and for decades afterward, it was worth more than platinum, and now just under two centuries later we throw it away. In perhaps only two decades from now, children may play "marbles" using diamonds, and a child won't bother to pick a diamond up from the street unless it is exceptionally pretty (although you or I probably would out of habit -- "see a diamond, pick it up, and all the day you have good luck").

This long essay is my own current perspective on this developing situation, and part of the process of my formulating my own thinking on these trends and how I as an individual will respond to them.

To conclude, I think all the "classical" problems like food, energy, water, education, and materials will be technically solvable by 2050 even if we don't do much specifically about them (and like hunger are solved today except for politics). The dynamics of technology and economics are just taking us there whether we like it or not. Those goods may all essentially be "free" or "extremely cheap" by 2050. Obviously the complex politics of these issues need to be resolved, and the solutions need to be actually implemented. If they are "extremely cheap", people still need a tiny amount of income to buy them.

Still, I think Doug [Engelbart] is right. We face huge problems that only collaborative efforts can solve -- especially the problems of intelligent machines, technology-amplified conflict, and a complete disruption of our scarcity-based materialistic economic and social systems. These problems dwarf technical issues like energy, food, goods, education, and water.

The problem has always been, and will always be, "survival with style" (to amplify Jerry Pournelle). The next twenty years will fundamentally change what the survival issues are: environment, threats, and allies. They will also very well change what we value as "style" -- when diamonds are cheap as glass [perhaps from nanotechnology], what will one give to impress?

===

Just sayin... point me to the person who has the habit of seeing diamonds on the ground and picking them up (and doesn't work in a strip mine). Habits aren't things we want - they're things we do.
Talking Machines had an episode on the difference between algorithms and models, and on how the general public understands the meaning of the word "algorithm". In general, the two terms get conflated, and it's hard to be absolutist about the distinction, at the very least.

The general public (& journalists) use the word 'algorithm' to mean any computerized process that "does things to them", such as your Facebook news feed, or what a credit agency does.

This is a different meaning from how social scientists use these words.

http://www.thetalkingmachines.com/blog/2017/9/22/the-long-vi...

In the episode, he talks about how even something like Principal Components Analysis (PCA), which we would normally call an algorithm because it follows a discrete sequence of steps, can also be thought of as resting on something that resembles a model.

I don't think there's a correct word here. He's talking about widely differing things and using a vague word to try to relate those things. The correct thing is to reject the relation and treat each of the notions he's talking about (such as web search results, government policy, and economic models) as the distinct things that they are.
We already have a great word that exactly describes our approach to capitalism though: ideology.
Yes, a lot of people use "algorithm" to mean simply "procedure". I'm glad to see someone besides me pointing out the distinction between algorithms, in the strict sense, and heuristics. Complaining about the misuse of technical terms is unlikely to have an impact on usage in the popular press ([0]), but I think it's appropriate in a technical discussion.

One of the most egregious misuses of "algorithm", in my opinion, is the term "genetic algorithms". Not only are these not algorithms in the strict sense, but referring to the procedures as "genetic heuristics" or "genetic search" would be much clearer.

[0] https://news.ycombinator.com/item?id=10475884

Genetic algorithms are algorithms in that they are a description of a specific process or set of processes (which may be used in heuristics, or studied in isolation) at an implementation level of abstraction. "Genetic heuristics" suggests heuristic applications, rather than algorithms in isolation, and "genetic search" suggests a specific application, but a "genetic algorithm" can be extremely simple and isn't fundamentally different than an algorithm like quicksort. Either could be used as part of some heuristic, or not.
> a "genetic algorithm" can be extremely simple and isn't fundamentally different than an algorithm like quicksort

Ah, but it is. What do you get when you run quicksort? You get a sorted list. What do you get when you run a genetic "algorithm"? You get ... the result of performing that procedure. There's no other formally specifiable postcondition. You certainly aren't guaranteed to get a perfect solution to the problem you were trying to solve. That's why it's a heuristic: if you run this procedure a certain number of times, you might get a useful result. Maybe. And the way it works is by searching a space of possibilities in a certain way. That's not an application; that's the whole point.
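
For what it's worth, here is a rough Python sketch of what "formally specifiable postcondition" means for quicksort. The checker function is my own illustrative naming, and this is a simple non-in-place variant, not a tuned implementation.

```python
from collections import Counter

def quicksort(items):
    """Plain (not in-place) quicksort using the first item as the pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

def satisfies_sort_postcondition(before, after):
    """True if `after` is an ordered permutation of `before`."""
    ordered = all(a <= b for a, b in zip(after, after[1:]))
    return ordered and Counter(before) == Counter(after)

data = [3, 1, 2, 3, 0]
print(satisfies_sort_postcondition(data, quicksort(data)))
```

The point is that the check is stated entirely in terms of the problem ("the output is the input, in order"), and it must hold for every input.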

I don't understand what you mean about formally specifiable postconditions. Genetic algorithms can have formally specifiable postconditions, they just aren't definable in terms of the problem space to which they're typically applied. The postconditions can be defined in terms of the state of the genetic system. I also am not able to find, anywhere, a formal definition of algorithm that unambiguously wouldn't include genetic algorithms as I outlined them in the above comment.

A heuristic explores a problem in a specific domain. An algorithm specifies a process at a level suitable for machine implementation. Genetic algorithms are, thus, algorithms, though they are typically applied in heuristics to explore real-world problem spaces.
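
As a hedged illustration of postconditions "in terms of the state of the genetic system", here is a toy genetic algorithm in Python. Everything about it is an assumption picked for brevity: the OneMax fitness function, truncation selection, one-point crossover, a single bit flip per child, and the parameter values.

```python
import random

def one_max(bits):
    """Toy fitness function (an assumption): the number of 1 bits."""
    return sum(bits)

def evolve(pop_size=20, length=16, generations=30, seed=0):
    """Run a tiny GA; return the final population and best fitness per step."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=one_max, reverse=True)
        best_history.append(one_max(population[0]))
        # Truncation selection: the fitter half survives unchanged,
        # so the current best individual is never lost.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)       # one-point crossover
            child = mother[:cut] + father[cut:]  # new list; parents untouched
            child[rng.randrange(length)] ^= 1    # flip one bit as mutation
            children.append(child)
        population = survivors + children
    population.sort(key=one_max, reverse=True)
    best_history.append(one_max(population[0]))
    return population, best_history

population, history = evolve()
# Postconditions on the state of the genetic system:
print(len(population) == 20)
print(all(a <= b for a, b in zip(history, history[1:])))
```

What can be guaranteed here is that the population size is preserved and the best fitness never decreases. Nothing in the procedure guarantees that the all-ones string is ever found, which is the problem-space guarantee the parent comment is asking about.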

Well, I'll grant you that lots of people use the word "algorithm" in a way that doesn't respect the distinction andrewla and I are trying to draw.

Nonetheless, I believe there is a real distinction here. As it happens, there is a discussion about algorithms textbooks on HN right now [0]. I contend that GAs are not the kind of thing that would ever be described in a textbook on algorithms, no matter how comprehensive.

> Genetic algorithms can have formally specifiable postconditions, they just aren't definable in terms of the problem space to which they're typically applied.

Actually, I think this would be a pretty good definition of the distinction, if we changed "genetic algorithms" to "heuristics". I think if you look at the procedures described in algorithms textbooks, you'll see they all have postconditions definable in terms of the problem space.

[0] https://news.ycombinator.com/item?id=15423045

> The concern isn't that algorithms themselves are incorrect, the concern is that the problem they are trying to solve is a heuristic one, not a formal one.

Heuristic problems are simply problems that aren't yet formally understood. I don't think it's meaningless to use "algorithm" in the examples you cite, as long as it's understood that a good algorithmic solution requires a good model of the actual problem being solved.

well, heuristics are made up of algorithms.
This is not true. A (bad) heuristic for search results might be "rank the documents by the total number of occurrences of each search term in that document".

That's not an algorithm -- that's a desired result. Similar to how "sorting a list" is a description of a class of algorithms; it gives no description of how a machine can accomplish that goal.

The difference between the heuristic above and "sort a list" is that the success criteria of the latter can be very well defined, whereas the heuristic presented is an attempt at approximating the desired result, which is something like "present the best search results first, for some meaning of best".

>"rank the documents by the total number of occurrences of each search term in that document"

I fail to see how this is not an algorithm. The heuristic (rank search results from most to least relevant) is backed by an algorithm (find occurrence of word, sort document based on occurrences). I like to approximate the two by thinking of heuristics as an approach to solving a given problem while algorithms are actions taken to get to the end results.

There's an algorithm to rank the documents by an arbitrary metric. And there's an algorithm to calculate the number of occurrences of each search term in that document.

However, those are insignificant implementation details - all the logic (and all the good and bad results) comes from the arbitrary decision to use the number of occurrences as meaningful for measuring the relevance, from the choice of heuristic.
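
A small Python sketch of that split, with the caveat that the function names and the three sample documents are made up for illustration. The sorting and counting are the routine parts; the choice to pass a term count in as the score is the heuristic.

```python
import re
from collections import Counter

def term_count_score(document, terms):
    """Heuristic metric: total occurrences of the search terms in the document."""
    counts = Counter(re.findall(r"\w+", document.lower()))
    return sum(counts[term.lower()] for term in terms)

def rank(documents, score):
    """Generic part: order documents by any score function, highest first."""
    return sorted(documents, key=score, reverse=True)

# Hypothetical documents; the second one is plain keyword stuffing.
docs = [
    "Cats and dogs.",
    "cat cat cat cat",
    "A guide to caring for your cat.",
]
ranked = rank(docs, lambda d: term_count_score(d, ["cat"]))
print(ranked[0])
```

Both functions do exactly what they are specified to do, and the keyword-stuffed document still comes out on top. Swapping in a different score function changes the results without touching `rank` at all.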

That's just saying it's the wrong algorithm, not that it isn't an algorithm. Every computation of pi has truncated the result, which is a 'heuristic' decision, but that doesn't invalidate the fact that pi is computed using an algorithm.
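
A quick Python sketch of the pi point, using the Leibniz series purely because it is short (it converges far too slowly to be a practical choice). The function names are mine.

```python
import math

def leibniz_pi(terms):
    """Partial sum of 4 * (1 - 1/3 + 1/5 - ...), cut off after `terms` terms."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total

def truncation_bound(terms):
    """Alternating-series bound on the error: the size of the first omitted term."""
    return 4 / (2 * terms + 1)

for n in (10, 100, 1000):
    error = abs(math.pi - leibniz_pi(n))
    print(n, error <= truncation_bound(n))
```

Where to stop is a judgment call, but the procedure is fully specified and the error from stopping early has a provable bound. Note that `math.pi` is itself a rounded floating-point value, so even the reference here is truncated.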

All algorithms approximate things, after all. That's simply a consequence of abstraction.