
by captainmuon 1748 days ago

> Security starts with deep understanding.

I wonder if the way we are approaching it is wrong. We are basically putting text through a deep learning black box. The model might have learned some abstractions, but all in all it is just playing word games and trying to guess the most likely continuation of a string. Maybe we should go in the other direction and base such an AI on a really massive ontology. Instead of unstructured strings, put highly structured facts into the model.

For example, just like in Copilot you'd start with:

    def login_user(username, password):

But the ontology would also know things like:

- This is a web application and this function is going to be called after submitting a form

- Security specialist Bob says you should always hash your passwords

- Specialist Anne says you should use bcrypt

- Tom says Anne is 95% trustworthy

... and thousands more facts. And then it would take them all into consideration, build a representation of the problem you are trying to solve, find a strategy, and only in the end generate code.
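A minimal sketch of what such a fact store might look like, to make the idea concrete. Everything here is an assumption for illustration: the `Fact` class, the predicate names, the trust scores other than Anne's 95%, and the simple "filter by trust, then sort" strategy are hypothetical and not taken from any real ontology system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One structured statement: a source asserts (subject, predicate, obj)."""
    source: str
    subject: str
    predicate: str
    obj: str


# Hypothetical facts mirroring the list above.
FACTS = [
    Fact("context", "login_user", "called_after", "form_submit"),
    Fact("Bob", "password", "should_be", "hashed"),
    Fact("Anne", "password_hash", "should_use", "bcrypt"),
]

# Trust per source. Only Anne's 0.95 comes from the comment; the rest are assumed.
TRUST = {"context": 1.0, "Bob": 0.8, "Anne": 0.95}


def recommendations(facts, trust, threshold=0.5):
    """Return (trust, fact) pairs for sources at or above the threshold, most trusted first."""
    scored = [(trust.get(f.source, 0.0), f) for f in facts]
    kept = [(t, f) for t, f in scored if t >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


def plan(facts, trust):
    """Turn trusted facts into a readable strategy, before any code is generated."""
    return [
        f"{f.subject} {f.predicate} {f.obj} (source: {f.source}, trust {t:.2f})"
        for t, f in recommendations(facts, trust)
    ]


if __name__ == "__main__":
    for step in plan(FACTS, TRUST):
        print(step)
```

A real system would need actual inference over the facts rather than a sort, but the shape is the point: the strategy is derived from attributed, weighted statements, and code generation would only come afterwards.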

I have a feeling that there was a qualitative leap going from simple neural networks and multivariate methods to "deep learning" and modern machine learning, and that this is mainly driven by scale and available computing power. Now what if we try the same thing for ontologies, expert systems, and triple store databases? I think the difference will be between some AI parroting what it read on Wikipedia (direct speech), and a smarter AI being able to reason about what it read on Wikipedia (indirect speech).

3 comments

lmilcin 1748 days ago

I think one way this could be improved is, instead of giving an exact answer (which is provably impossible to do correctly), it could point the developer to other repositories where other people were solving a similar problem.

There are already services that do this for you and I actually find them useful. For example, I might be trying to use a function from some library and it fails. If I get pointed to some public repositories that use the same library function for a similar purpose, I may learn that I am missing some critical setup. I can also browse different uses of this function/library and see how it is, at the very least, used successfully by others.
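As a rough sketch of that kind of lookup, the snippet below scans a small in-memory corpus for files that call a given function. The corpus, the file names, and the helper names `calls_function` and `find_usages` are all hypothetical; a real service would index actual repositories and would need far more than name matching.

```python
import ast


def calls_function(source: str, name: str) -> bool:
    """Return True if the source calls `name`, either bare or as an attribute."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Skip files that do not parse instead of failing the whole search.
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id == name:
                return True
            if isinstance(func, ast.Attribute) and func.attr == name:
                return True
    return False


def find_usages(corpus: dict[str, str], name: str) -> list[str]:
    """Return the sorted paths of all files in the corpus that call `name`."""
    return sorted(path for path, src in corpus.items() if calls_function(src, name))


if __name__ == "__main__":
    # Hypothetical corpus: path -> file contents.
    corpus = {
        "repo_a/auth.py": "import bcrypt\nh = bcrypt.hashpw(b'pw', bcrypt.gensalt())\n",
        "repo_b/hello.py": "print('hi')\n",
    }
    print(find_usages(corpus, "hashpw"))
```

Matching on the call name alone will produce false positives when unrelated libraries share a function name, so this only narrows down where to look.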


perl4ever 1747 days ago

There was a project started in 1984 to do that:

https://en.wikipedia.org/wiki/Cyc

Supposedly an attempt to assemble a database of "common sense" facts and reasoning.

It has always been controversial and it's not clear what kind of success it's had.


DonHopkins 1748 days ago

You're touching on the "Neat -vs- Scruffy" dichotomy in AI. (But it's not necessarily a dichotomy -- they can be combined!)

https://en.wikipedia.org/wiki/Neats_and_scruffies

From the "Scruffy" side, there's Charles Rich's classic work on "Programmer's Apprentice".

https://dspace.mit.edu/handle/1721.1/6054

https://dspace.mit.edu/bitstream/handle/1721.1/6054/AIM-1004...

>The Programmer's Apprentice Project: A Research Overview

>MIT AI Lab Memo No. 1004, November 1987.

>Rich, Charles; Waters, Richard C.

>Abstract: The goal of the Programmer's Apprentice project is to develop a theory of how expert programmers analyze, synthesize, modify, explain, specify, verify, and document programs. This research goal overlaps both artificial intelligence and software engineering. From the viewpoint of artificial intelligence, we have chosen programming as a domain in which to study fundamental issues of knowledge representation and reasoning. From the viewpoint of software engineering, we seek to automate the programming process by applying techniques from artificial intelligence.

https://dspace.mit.edu/handle/1721.1/41967

https://dspace.mit.edu/bitstream/handle/1721.1/41967/AI_WP_1...

>Plan Recognition in a Programmer's Apprentice. Ph.D. Thesis proposal.

>MIT AI Lab Working Paper 147, May 1977.

>Rich, Charles

>Abstract: Brief Statement of the Problem: Stated most generally, the proposed research is concerned with understanding and representing the teleological structure of engineered devices. More specifically, I propose to study the teleological structure of computer programs written in LISP which perform a wide range of non-numerical computations. The major theoretical goal of the research is to further develop a formal representation for teleological structure, called plans, which will facilitate both the abstract description of particular programs, and the compilation of a library of programming expertise in the domain of non-numerical computation. Adequacy of the theory will be demonstrated by implementing a system (to eventually become part of a LISP Programmer's Apprentice) which will be able to recognize various plans in LISP programs written by human programmers and thereby generate cogent explanations of how the programs work, including the detection of some programming errors.


captainmuon 1748 days ago

Thanks for the term and the resources! Sometimes one has a vague idea and it's really nice to see that this is a thing people put thought into.

Funny that I would describe a solution based on machine learning as scruffy and a solution based on Bayesian logic and knowledge databases as neat, whereas Wikipedia defines it the other way around.


DonHopkins 1748 days ago

There's some fascinating historical back-story and quotes on the talk page of that Wikipedia entry, and also an interesting question about how machine learning is neat:

https://en.wikipedia.org/wiki/Talk:Neats_and_scruffies

>Roger Schank first used those terms "scruffy" and "neat" at an AI conference in the 1970s. He proudly called himself a scruffy. 71.183.59.144 (talk) 02:17, 26 October 2011 (UTC)

>The terminology is sourced to the late 1970s or early 1980s and originated by Schank according to this:

>"In particular, certain personality traits go hand and hand with certain styles of research. Schank and Abelson hit upon one such phenomenon along these lines and dubbed it the neats vs. the scruffies. These terms moved into the mainstream AI community during the early 80s, shortly after Abelson presented the phenomenon in a keynote address at the Annual Meeting of the Cognitive Science Society in 1981. Here are some selected excerpts from the accompanying paper in the proceedings:"

>The article quotes a lengthy excerpt of this keynote address, some of which I include below

>“The study of the knowledge in a mental system tends toward both naturalism and phenomenology. The mind needs to represent what is out there in the real world, and it needs to manipulate it for particular purposes. But the world is messy, and purposes are manifold. Models of mind, therefore, can become garrulous and intractable as they become more and more realistic. If one’s emphasis is on science more than on cognition, however, the canons of hard science dictate a strategy of the isolation of idealized subsystems which can be modeled with elegant productive formalisms. Clarity and precision are highly prized, even at the expense of common sense realism. To caricature this tendency with a phrase from John Tukey (1969), the motto of the narrow hard scientist is, “Be exactly wrong, rather than approximately right”.

>The one tendency points inside the mind, to see what might be there. The other points outside the mind, to some formal system which can be logically manipulated [Kintsch et al., 1981]. Neither camp grants the other a legitimate claim on cognitive science.... an unnamed but easily guessed colleague of mine (Schank?), who claims that the major clashes in human affairs are between the “neats” and the “scruffies”. The primary concern of the neat is that things should be orderly and predictable while the scruffy seeks the rough-and-tumble of life as it comes ... The fusion task is not easy. It is hard to neaten up a scruffy or scruffy up a neat. It is difficult to formalize aspects of human thought that are variable, disorderly, and seemingly irrational, or to build tightly principled models of realistic language processing in messy natural domains.

>What are the difficulties in starting out from the scruffy side and moving toward the neat? The obvious advantage is that one has the option of letting the problem area itself, rather than the available methodology, guide us about what is important. The obstacle, of course, is that we may not know how to attack the important problems. More likely, we may think we know how to proceed, but other people may find our methods sloppy. We may have to face accusations of being ad hoc, and scientifically unprincipled, and other awful things."

>Source is Chapter 5 of this book edited by Schank and published in 1994, titled "Beliefs, Reasoning, and Decision Making: Psycho-logic in Honor of Bob Abelson". Article needs clean-up, which I am doing now.--FeralOink (talk) 13:58, 2 August 2021 (UTC)

https://books.google.com/books/about/Beliefs_Reasoning_and_D...

>How is machine learning neat?

>Machine learning is only provably correct for the known examples it was trained for. If that is not an ad hoc approach to AI, then I don't know what is. Big data is the epitome of a scruffy. No model, just data, no formalism, besides fitting a curve/model to the given data. It is the exact same approach that scruffies follow: abstracting from examples for specific sub tasks.<unsigned>

>Just because some mathematical methods are employed, like optimization for a sub-problem, i.e. curve fitting, does not make the approach itself neat.

>Obviously, scruffies also use mathematically rigorous approaches, when employing provably correct algorithms, such as searching trees, or certain signal processing approaches.

>So far, the only valid "neats", are those doing GOFAI: they use a minimal model and deduce everything based on it, with no added assumptions or axioms along the way.

>Machine learning is only based on added assumptions/axioms: the training data. New for each problem, no general model.<unsigned>

>Yeah, I noticed that too. Not sure who introduced machine learning to the article. I'm trying to clean up, e.g. removing the jargon about scruffies just being casual hackers throwing stuff together in an ad hoc manner. I don't know enough about the people involved though. I know about the methods you mention (curve fitting, converging series, mathematical modeling) but not necessarily who did what. I don't even know whether most of these guys, the neats OR the scruffies, would be comfortable with "big data" (i.e. lots of specious results with very low cost of being wrong).--FeralOink (talk) 10:48, 3 August 2021 (UTC)
