Hacker News new | ask | show | jobs
by alankay1 3191 days ago
"More or less" ... I can see that it's hard after decades of "data-centric" perspectives to think in terms of "computers" rather than "data", and about semantics rather than pragmatics. It's not "data contains procedures" but that objects are (a) semantically computers, and are impervious to attack from the outside (a computer has to let an attack happen via its own internal programming), and (b) what's inside can be anything that is able to deal with messages in a reasonable way. In Smalltalk, these are more objects (there are only objects in Smalltalk). The way the internals of typical Smalltalk objects are organized could be done better (we used several schemes, but all of them had variables and methods (which were also objects).

So "2" is not "data" in Smalltalk (unfortunately, it is in Java, etc.)

We had planned that the interior of objects should be an "address space of objects" because this is a better and more recursive way to do modularization, and to allow a different inter-viewing scheme (we eventually did some of this for the Etoys system for children about 20 years ago). But the physical computers at Parc were tiny, and the code we could run was tiny (the whole system in the Ted Nelson demo video was a little over 10,000 lines of code for everything). So we stayed with our top priority: to make a real and highly interactive system that was very comprehensive about what it and the user could do together.

3 comments

Taking your feedback into consideration, I had the thought that it would be more accurate to talk about things like "2" in an OOP context as symbols with semantics (which provide meaning to them), not data, since "data" connotes more a collection of inputs/quantities, where we may be able to attach a meaning to it, or not, and that wasn't what I was going after. I was going after a relationship between information and semantics that can be associated with it, but trying to provide a transition point from the idea of data structures to the idea of objects, for someone just learning about OOP. Doing a sleight of hand may not do the trick.

My starting point was to use a very interesting concept, when I encountered it, in SICP, where it discussed using procedures to emulate data, and everything that can be done with it. It seemed to help explain for the first time what "code is data" meant. It illustrated the inversion I was talking about:

https://tekkie.wordpress.com/2010/07/05/sicp-what-is-meant-b...

"In 2.1.3 it questions our notion of “data”, though it starts from a higher level than most people would. It asserts that data can be thought of as this constructor and set of selectors, so long as a governing principle is applied which says a relationship must exist between the selectors such that the abstraction operates the same way as the actual concept would." It went on to illustrate how in Lisp/Scheme you could use functions to emulate operations like "cons", "car", and "cdr", completely in procedural space, without using data structures at all.

This is what I illustrate with "2 + 2", and such, that code is doing everything in this operation, in OOP. It's not a procedure applied to two operands, even though that's how it looks on the surface.

Yes, the SICP follows "simulate data" ideas much further back in the past, including the B5000 computer, and especially the OOP systems I did. But the big realization is that there are very few things that are helpful when they are passive, and the non-passive parts are the unique gift of computing. The question is not whether ideas from the past can be simulated (easy to see they can be if the building blocks are whole computers) but what do we "mean by 'meaning' "?

Good answers to this are out of the scope of HN, but we should be able to imagine -processes- moving forward in both our time and in simulated time that can answer our questions in a consistent way, and can be set to accomplish goals in a reasonable way.

I was using "data" in the spirit of a saying I heard many years ago in CS, that, "Data is code, and code is data." It seems that people in CS are still familiar with this phrase. I was focusing on the latter part of that phrase. I was trying to answer the question that I think is often implied once you start talking to people about real OOP, "What about data?" I almost don't like the term "data" when talking about this, because as you say, it gets one away from the focus on semantics, but whenever you're talking to people in the computing field, such as it is, I think this question is unavoidable, because people are used to thinking of code and information as separate, hence the notion of data structures. People need a way to translate in their minds between what they've done with information before, and what it can be. So, I used the term "data" to talk about "literal objects" (like "2", or other kinds of input), but I was using the description of "processors" (ie. computers), "containing procedures," which can also be thought of as "operators."

I think the idea of an "inversion" is quite apt, because as you've said before, the idea of data structures is that you have procedures acting on data. With real objects, you still have the same essential elements in programming, the same stuff to deal with, but the kinds of things programmers typically think about as "data" are objects/computers in OOP, with intrinsic semantics. So, you're still dealing with things like "2", just as procedures acting on data do, but instead of it being just a "dead" symbol, that can't do anything, "2" has semantics associated with an interface. It's a computer.

> We had planned that the interior of objects should be an "address space of objects" because this is a better and more recursive way to do modularization

Something that nags me in the back of my mind is that messages are not just any object, they always have the selector attached. Why not let objects handle any other object as a message? Is this what you mean by the above?

Thinking about the biological analogy (maybe taking it too far...): the system of cells is distinct from the system of proteins inside the cells and going up the layers we have the systems of creatures. So the way proteins interact is different from how cells interact, etc. but each system derives its distinct behaviors from the lower ones. Also, the messages are typically not the entities themselves but other lower level stuff (cells communicate using signals that are not cells). So in a large scale OO system we might see layers of objects emerge. Or maybe we need a new model here, not sure.

Take a look at the first implemented Smalltalk (-72). It implemented objects internally as a "receive the message" mechanism -- a kind of quick parser -- and didn't have dedicated selectors. (You can find "The Early History of Smalltalk" via Google to see more.)

This made the first Smalltalk "automatically extensible" in the dimensions of form, meaning, and pragmatics.

When Xerox didn't come through with a replacement for the Alto we (and others at Parc) had to optimize for the next phases, and this led to the compromise of Smalltalk-76 (and the succeeding Smalltalks). Dan Ingalls chose the most common patterns that had proved useful and made a fixed syntax that still allowed some extension via keywords. This also eliminated an ambiguity problem, and the whole thing on the same machine was about 180 times faster.

I like your biological thinking. As a former molecular biologist I was aware of the vast many orders of magnitude differences in scale between biology and computing. (A typical mammalian cell will have billions of molecules, etc. A typical human will have 10 Trillion cells with their own DNA and many more in terms of microbes, etc.) What I chose was the "Cambrian Revolution Recursively": that cells could work together in larger architecture from biology, and that you can make the interiors of things at the same organization of the wholes in computing because of references -- you don't have to copy. So just "everything made from cells, including cells", and messages made from cells, etc.

Some ideas you might find interesting are in an article I wrote in 1984 -- called "Computer Software" -- for a special issue of Scientific American on "Software". This talks about the subject in general, and looks to the possibility of "tissue programming" etc.

I should have mentioned a few other things for the later Smalltalks. First, selectors are just objects. Second, you could use the automatic "message not understood" mechanism to field an unrecognized object. I think I'd do this by adding a method called "any" and letting it take care of arbitrary unknown objects ...
> adding a method called "any"

Right, I understand there are ways to do this with methods but my question was more about the purity aspect, which you already addressed above.

A selector is an object -- so that is pure -- and its use is a convention of the messaging, and the message itself is one object, that is an instance of Class message.

What's fun is that every Smalltalk contained the tools to make their successors while still running themselves. In other words, we can modify pretty much anything in Smalltalk on the fly if we choose to dip into the "meta" parts of it, which are also running. In Smalltalk-72, a message send was just a "notify" to the receiver that there was a message, plus a reference to the whole message. The receiver did the actual work of looking at it, interpreting it, etc.

This is quite possible to make happen in the more modern Smalltalks, and would even be an interesting exercise for deep Smalltalkers.

> A selector is an object -- so that is pure -- and its use is a convention of the messaging

The selector 'convention' is hard coded in the syntax - this appears to elevate selector based messaging over other kinds. But now I'm rethinking this differently - i.e. selectors isn't part of the essence, but a specific choice that could be replaced (if we find something better.)