Hacker News new | ask | show | jobs
by cyberferret 2949 days ago
While I believe 'plain English' type programming languages are a great concept, the reality is that it just introduces far more ambiguity into the mix, and you are just left trying to guess what the language designer's vagaries are.

I came across this years ago when trying to get to grips with the then new fangled Ruby language. I kept having to go back to the documentation to remember the best way to convert a string to all uppercase... was it:

  str.upper
  str.uppercase
  str.ucase
  str.upcase
  str.capitalize (<- Don't even get me started on the regional differences of 'ise' vs 'ize' between US and UK English variants)
  ???
Even here, I would probably start using Avail, then in a few weeks I would be scratching my head and asking, was it:

  Print 1 to 10, as a comma-separated list
or

  Display 1-10, in CSV format
11 comments

Yeah, that was my problem with AppleScript. It’s a read-only language to me. It took me years to try other languages and discover I could actually write code.
"The experiment in designing a language that resembled natural languages (English and Japanese) was not successful." -- William Cook

Any programming language that tries to do "natural language" should at least reference the AppleScript HOPL Paper[1] and say how they are doing things differently to address the (now) obvious problems.

(Oh, and there is a LOT of good stuff in the paper, definitely worth a read).

[1] http://www.cs.utexas.edu/~wcook/Drafts/2006/ashopl.pdf

I think all experiments of languages that resembled natural languages will never work until one will try to ascend to a more high-level paradigm.

Actuals programming language are based on a "forklift paradigm" : you describe the path of a forklift in a virtual warehouse where data's are in some places and your forklift move them from places to places. So a natural language isn't design to express this kind of semantic.

In PPIG ( Psychology of Programming Interest Group ) 2006 was published a very interesting paper where is studied the metaphors which was shared by the core java library authors, the mental model said in another way. It reaveals very interesting facts : Programmers tend to think the world in term of Belief-Intention agents. Agent are members of a society who trade and own datas and are subject to legal constraints. A method calls is a speech act. The execution is mentalized as a journey in a a spatial world. Etc. There's a lot of metaphors like that, and it reveals tracks to conceive a more high level language.

So there's a lot of work in this field (Behavior-Tree based language, time dimension include inside the semantics, etc.).

Note that it is also possible to express non ambiguigous semantics in natural language : Attempto Controlled English is an exemple [2].

Here an toy example of what you could do with this language :

"MyWebPage is a webpage. MyWebPage contains a textfield which a logical name is-named NameTextField. His label is "Name". MyWebPage contains a button which a logical name is-named SubmitButton. His label is "Submit". If User clicks on Submit and NameTextField is empty then NameTextField ' s css class becomes AlertEmptyInput. "

[1] http://www.ppig.org/sites/default/files/2006-PPIG-18th-black... [2] http://attempto.ifi.uzh.ch/site/

I tried to learn AppleScript at one point because it had to do some automation on a Mac. It is an exceedingly confusing language, because the syntax does not clearly reflect the underlying semantic and processing model. The English-like syntax just obscures what is actually going on. Yeah it is easy to read, and readability is important, but I think they forgot that code have to be written before it can be read.
Just to add another data point, I didn't find it so. I can code up applescript if I want to completely relying on dynamic documentation. In fact, what surprised me was the ease with which I could pick it up after nearly a decade of touching it. The vocabulary explorer is awesome for scripting applications .. if only applications would care about them. Script editor offers coding up using JavaScript as well as applescript, but I'd any day pick AS over JS for its purpose because mental model is clearly established.
The vocabulary explorer is indeed awesome and a great idea. Unfortunately not even Apple is very good at supporting them. And with Sal Soghoian's departure, automation at Apple will probably either change a lot or die.

My problem was the language syntax as a whole. But I'm particularly sensitive and picky on this issue. ObjC and Java's verbosity repel me so strongly I could never properly learn them, for instance, and so on.

I'm hopeful that these sorts of languages can enjoy a renaissance in the form of suggestive interfaces similar to the ones that Android/iPhone users get when typing. It'd be much easier to program in Applescript if I had an editor that was letting me choose from a list of valid words rather than me having to remember the perfect incantation to do what I want.
> I kept having to go back to the documentation to remember

1. Why would you want to remember this? It's in the docs. If you use it often, you'll learn it eventually. If you don't, you have no need to remember this. Don't burden your memory unnecessarily.

2. A well written and searchable documentation adds very little overhead anyway, on the order of a few seconds. Yeah, writing from memory is faster, but not by that much, and that advantage disappears almost completely if you have good auto-completion and docs support in your editor.

3. There are hundreds of languages out there, even if you learn the library of one language, it doesn't help with other languages at all. Learning to quickly search reference docs, along with learning some basic concepts/assumptions of each language, is much more sustainable if you're thinking of becoming polyglot.

EDIT: please, don't turn HN into Reddit, write a comment if you disagree, instead of downvoting.

I agree with you in principle. Over the past 4 decades, I have programmed in dozens of programming languages, and remembering things like FOR or CASE loop structures is almost impossible for me now, and I constantly refer to docs.

However, as a starting point for me, it would be easier if the languages all used a common naming convention for things like .upper() etc. Common mistake for me is to try .upper(), then .uppercase(), then have to leave my code and refer to the docs after repeated runtime errors. If I can make a reasonably intelligent guess within 2 or 3 tries, then it bodes well for me. Otherwise it is a productivity hit.

To extrapolate this - the corollary to .upcase() is, to my mind .lowercase(), but it is in fact .downcase(). Makes sense logically ('down' is the opposite of 'up'), but syntax wise, I never say to anyone "You need to write your username in down case...". If the naming conventions for functions followed the English language definitions, then my hit/miss ration will improve, and so will my productivity.

NB: I didn't downvote you (I can't anyway as I don't have the necessary karma to downvote immediate child answers to my posts). I thought you raised a valid point worthy of discussion.

Or you use tooling that is aware of the syntax and available methods and can provide you live assistance and documentation at write time.
> it would be easier if the languages all used a common naming convention for things like .upper() etc.

Well, it would definitely be easier if all the languages used the same conventions. The problem here, I think, is that all the possible conventions are arbitrary and objectively as good as any other, so there are many of them and settling on a single one across languages is not going to happen.

It's much more important to choose one convention for a language and to follow it everywhere - this improves guessability of identifiers in that language. The convention chosen may be familiar to you or not, but that doesn't mean the more familiar one is any better. So, I think, it's more important for a language to follow a single convention consistently, while choosing which convention to follow is less important.

That being said, I agree that it would be easier to learn new languages if they all followed a single standard for naming things, I just don't think it's going to happen.

> If I can make a reasonably intelligent guess within 2 or 3 tries, then it bodes well for me. Otherwise it is a productivity hit.

Agreed, but note that the 2-3 tries are informed by your experiences to date - someone with a different history of programming languages could find your guesses weird and would have completely different ones themselves.

Another thing, I agree on the productivity hit in principle, just want to note that there are ways of mitigating it. I use many languages regularly (I'm a hobbyist-polyglot, I learned at least 30-40 languages to varying degrees) and I think that a good editor integration (most notably "semantic auto-complete" and "inline docs") can make up for a lot of cases like the `upcase`. For example, if I write this (`|` is cursor position):

    a = "asd"
    a.up|
and hit <tab>, I get a popup with the suggestion of a correct name (`upper`) along with a bit of documentation. The level of support for this obviously varies a lot across languages, but where it's available I find it mostly fixes the problem of having to hit the docs too often.

> NB: I didn't downvote you

It's impossible to downvote immediate child posts, no matter the karma level, so it obviously couldn't be you :)

> it would be easier if the languages all used a common naming convention for things like .upper() etc.

It would be easier if everyone would just speak English all the time, but I'm not sure it would be better.

Thing is, Avail actually is not a 'plain English' type programming language. The website states things very confusingly.

But look at the FAQ: http://www.availlang.org/about-avail/documentation/faq.html

Commenter dcw303 in this thread puts it better than I can.

>While I believe 'plain English' type programming languages are a great concept

I will refute that notion and suggest that 'plain English' type languages are an incredibly foolish diversion into a world where we pretend formal languages don't exist, are unimportant, or that English is one, or that it somehow can be laboriously contorted into a useful approximation of one.

> I kept having to go back to the documentation to remember the best way to convert a string to all uppercase... was it:

Wouldn't you have the same issue with just any programming language?

Point taken. But in the above example, why does Ruby do it as 'upcase' instead of the more English-y and popular 'uppercase' as most other languages use?

Take a more esoteric example - to capitalise just the first letter of a string. By definition, this is called 'proper case', and most other languages I know use .proper() to achieve this. Except Ruby, which decided to use .titleize() (and there is that 'ize' again just to confound me further).

If a language is going to be 'English-y', then sticking to actual English words for things such as uppercase, proper case etc. would be handy. Going outside of those constraints just increases the guessing game workload for the programmer.

Pascal uses upcase, and so does many others, so it has a very long history.

.titleize() is from ActiveSupport, not Ruby.

But proper() would have confused me much more. First time I've ever heard it called that. 'proper' would sound ambiguous to me, because it doesn't indicate to me it's about case at all.

But of course this just reinforces the point that naming is hard.

Except titleize changes not the first letter of a string, but the first letter of every word. That's title case (https://en.wiktionary.org/wiki/title_case).

Capitalize (https://ruby-doc.org/core-2.2.0/String.html#method-i-capital...) on the other hand does something even weirder from your perspective, as it also changes letters to lower case.

The rule they’re following is that method names should be verbs, and upcase / downcase are shorter than alternatives like to_upper etc. So while it’s an unusual choice it’s not an arbitrary one
Inform 7 thoroughly explored this space: http://inform7.com/
I've tried several times to figure out Inform 7, and I always go back to the old C-ish Inform 6 instead. I think my brain is damaged by prolonged exposure to traditional programming languages.
I had the exact same experience. I really wanted to like Inform 7 (and plenty of people do) but it feels completely counter-intuitive because it wants to feel like natural language but it's still "just" a programming language.

It still has a fairly rigid syntax - it's just more verbose than a regular programming language. That is probably an oversimplification, but that's how it felt to me at least.

The language could include all of the possible expressions as valid alternatives. There are good reasons why most programming languages don't do that (it really helps comprehension if there's only one standardized way to do something), but if a language wants to be as close to English as possible, it will have to include the whole kitchen sink of synonyms. It might have to educate the users about the difference between a minus and a dash, though, if "1-10" and "1–10" (that's an en-dash) should evaluate to a number and a list, respectively.
Some of the keywords and operators (punctuation) of Avail methods are “completely optional”, in that the caller’s choice to include it or not (or alternative prepositions in some cases) is solely to improve readability of the result. For example, the ordinary field-getter has the form “_’s??thing”. The double-question-mark should actually be a single Unicode character (I’m on my phone). The question-mark makes the “s” optional, so you can say “cat’s thing” or “Jess’ thing”. But there’s nothing stopping you from saying “cat’ thing”, other than the derisive laughter of your peers. :-)
"an efficient compiler that concurrently explores all possible parses in pursuit of a semantically unambiguous interpretation."

This suggests they have a solution to the ambiguity problem

Thanks for the clarification. I missed that on the initial skim of the article. (And yes, I appreciate the irony of me introducing the 'ambiguity' arguments into a topic which deals specifically with the removal of such) :)
It's still arguable that there could be some ambiguity. The compiler makes sure the statement is interpreted in an unambiguous way on the computer's end, but the writer might still not know which of a number of ways to express what she wants to state.
I went into this design decision, having looked very closely at the specific success of the SHRDLU system (1971?). Local disambiguation was very effective, and Avail goes even further by having exceptionally strong types to help distinguish meaning. We also have grammatical restrictions to express a unique form of precedence rules, and semantic restrictions to statically identify a large number of inappropriate uses of methods, like “dog”[4], which is a compile-time subscript bounds error.
This sounds extremely dangerous.

Millions of dollars are wasted each year on mistyped ==/= operators, but this is some next level evil.

Very true. The core Avail language uses “_=_” for comparison, “_:_” to declare a variable, “_:=_” got assignment, and “_::=_” for constants (think “final”). It also has “_:_:=_” for the case when you want to declare a variable of a general type, but initialize it with something more specific (say a counter that starts at 1).

But yes. There’s a lot of deeply evil things that can be done in Avail, right down to the lexical scanning. Like forbidding certain variable names, or manufacturing a run of tokens every time a certain emoji is encountered. There are also a lot of safeties, like sealing methods to prevent someone overriding “_+_” for [3’s type, 4’s type]->9’s type.

Oh, and writing “x = x + 1;” as a statement will be flagged as a syntax error, because it would produce a Boolean value which is not allowed to be silently discarded like in Java or C. “Discard:_” makes these rarely needed situations absolutely obvious.

Exactly my thoughts whenever I have to touch AppleScript.
Perhaps lojban would have been a better choice.