Hacker News
by throwaway1492 1069 days ago
As someone who worked on a 35 million LOC COBOL and JCL system for several years in the late '90s, I don't find this funny at all. COBOL was purported to be readable, with English-like syntax to make it more understandable, giving gems such as:

    MULTIPLY TAX-RATE OF STATE(43) BY BALANCE GIVING SALES-TAX-AMOUNT.
Note the period is significant.

There are multiple responses to the above parent noting how readable the above line of code is and yet how it might go wrong.

The first pitfall of using English as a programming language that occurs to me (as a totally Cobol ignorant person):

* Human languages tend to be nebulous around the edges and fluid, often with a single word taking on multiple meanings and the same function served by multiple words. There are multiple ways to express the same concept. This allows the language to change and evolve with the times.

* OTOH programming languages need to be specific and exact for the computer to be able to interpret them and to ensure they function as expected across devices and over time.

This means that to make English function as a programming language, we will have to take the existing language, whittle away most of the senses and many of the words, assign one function to each word, and use a very trimmed-down version of English.

Now you will have to know two forms of English:

* The human one
* The computer-compatible one

Worse, our human version of the language can often trip up our computer-compatible one. I imagine even debugging would be harder, because when you look at the code, the brain would see perfectly good English and not register any issue with the punctuation or tokens the interpreter expects.

Consider the below version.

    MULTIPLY TAXRATE FROM STATE(43) WITH BALANCE GIVING SALES-TAX-AMOUNT.
I have made 3 changes here, which might or might not work with COBOL (I am Cobol ignorant ;) ). But if a person tries to find the changes or debug, it would be difficult for the human brain to register what is wrong, as this is perfectly good English.
> I have made 3 changes here, which might or might not work with COBOL (I am Cobol ignorant ;) ).

I've never used COBOL before, but this thread and your post made me curious and I installed GNU Cobol and found a sample program that demonstrates a few language features.

After playing around with it a little, I think I've understood that "OF" is like a struct element (or object property) accessor, the parens are an array index, and the hyphen is a syntactically significant part of the identifier. "BY" is apparently a mandatory part of the multiplication operator.
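
A rough way to picture that reading, as a minimal Python sketch rather than anything from the sample program: `State`, `cobol_index`, and all the figures are made up for illustration, the hyphenated COBOL names are mapped to underscore names because Python identifiers cannot contain hyphens, and subscripts are assumed to count from 1.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class State:
    # Hypothetical record with a single field; COBOL's TAX-RATE becomes
    # tax_rate because a hyphen is not allowed in a Python identifier.
    tax_rate: Decimal


def cobol_index(table, i):
    """Look up a 1-based, COBOL-style subscript in a 0-based Python list."""
    if not 1 <= i <= len(table):
        raise IndexError(f"subscript {i} out of range 1..{len(table)}")
    return table[i - 1]


# Made-up data: 50 entries, all zero except the 43rd.
state = [State(Decimal("0.00")) for _ in range(50)]
state[42] = State(Decimal("0.07"))  # the 43rd entry when counting from 1

balance = Decimal("200.00")

# MULTIPLY TAX-RATE OF STATE(43) BY BALANCE GIVING SALES-TAX-AMOUNT.
#   "OF"  -> attribute access on the record
#   "(43)" -> subscript into the table
sales_tax_amount = cobol_index(state, 43).tax_rate * balance
print(sales_tax_amount)
```

With these invented figures the result is 14: a 7% rate applied to a balance of 200.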

Conclusion: each of your three changes (while, as you said, perfectly reasonable from the English language point of view) would indeed break the COBOL code!
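
To make the point concrete without a compiler at hand, here is a toy checker in Python. It is only a sketch: the regular expression covers this one statement shape and is in no way a real COBOL grammar, and `DECLARED` is a hypothetical stand-in for whatever names the program would have declared.

```python
import re

# Toy grammar for exactly one statement shape (not real COBOL parsing):
#   MULTIPLY <field> OF <table>(<n>) BY <name> GIVING <name>.
NAME = r"[A-Z][A-Z0-9-]*"
STATEMENT = re.compile(
    rf"MULTIPLY (?P<field>{NAME}) OF (?P<table>{NAME})\((?P<index>\d+)\) "
    rf"BY (?P<factor>{NAME}) GIVING (?P<target>{NAME})\."
)

# Hypothetical declared names, standing in for the program's data definitions.
DECLARED = {"TAX-RATE", "STATE", "BALANCE", "SALES-TAX-AMOUNT"}


def check(line):
    """Return a list of problems found in one line; an empty list means accepted."""
    match = STATEMENT.fullmatch(line.strip())
    if match is None:
        return ["statement does not match MULTIPLY ... OF ... BY ... GIVING ... ."]
    names = [match[group] for group in ("field", "table", "factor", "target")]
    return [f"undeclared name {name}" for name in names if name not in DECLARED]


original = "MULTIPLY TAX-RATE OF STATE(43) BY BALANCE GIVING SALES-TAX-AMOUNT."
altered = "MULTIPLY TAXRATE FROM STATE(43) WITH BALANCE GIVING SALES-TAX-AMOUNT."

print(check(original))
print(check(altered))
# Undo only the two keyword swaps: the renamed identifier is still caught.
print(check(altered.replace("FROM", "OF").replace("WITH", "BY")))
```

The original line comes back with no problems. The altered line fails on shape, and even with the keywords restored, `TAXRATE` is reported as an undeclared name.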

By the way, I remember this being a bit of an issue when I tried programming in HyperTalk: on occasion I would unintentionally come up with an English synonym for some natural-language HyperTalk code, and it wouldn't be valid HyperTalk.

The existence of Gnu COBOL:
Furthermore, vim turned out to have a built-in COBOL syntax highlighting mode which was activated when I opened the source first!
It's actually a pretty interesting question why programming languages that work like that are a bad idea.

From decades of experience, I know intuitively that they are, but I find myself unable to formulate a concise description of what exactly is wrong with this approach.

Because they turn out to be just as precisely fiddly as any other programming language, but deceptively appear not to be.

If you treat them as English you'll get burned. You have to still treat them as very precise formal languages where apparently trivial/irrelevant details are significant, and minor hard-to-spot mistakes will break your program.

Having more explicitly formal/structured syntax makes it easier to distinguish the different parts of the language, make sense of the details, and figure out what is or isn't allowed.

The best description I have for languages like this, e.g. AppleScript, is that they are "read-only languages" (the opposite of old-style Perl, which has been called a "write-only language"). That is, if you have an already written program in hand, it will be easier for a complete novice to read (at least, single isolated lines of code will be). But writing new programs is a huge pain in the ass.

My guess would be that it's because spoken languages and programming languages are fundamentally different. In trying to make one fit the other, you might end up with a programming language that looks and reads a lot like English but has almost become self-obfuscating: your brain may automatically try parsing it using the rules for English rather than the rules for the programming language, making things look like they should work even when they don't. The example in the comment you're replying to works pretty well. We probably largely ignore punctuation at the end of sentences, in the sense of consciously seeing it as opposed to just inserting a pause in our mental cadence, so if periods are suddenly important pieces of syntax in statements that look like English, you could easily start messing up their use.
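
A small Python sketch of how much one period can matter. It relies on the COBOL rule that an `IF` written without `END-IF` runs until the period that ends its sentence. Everything else is a simplification: the paragraph names `CHARGE-TAX` and `SEND-INVOICE` are invented, and the splitter assumes no decimal literals or quoted strings, so every `.` counts as a terminator.

```python
import re


def sentences(source):
    """Split toy COBOL text into sentences at each terminating period.

    Simplification: assumes every '.' ends a sentence (no decimal
    literals, no quoted strings).
    """
    return [part.strip() for part in source.split(".") if part.strip()]


def guarded_by_if(source):
    """List the paragraphs performed inside the first sentence's IF."""
    first = sentences(source)[0]
    if not first.startswith("IF "):
        return []
    # In this toy model, every PERFORM in the IF's sentence is conditional.
    return re.findall(r"PERFORM ([A-Z][A-Z0-9-]*)", first)


with_period = "IF BALANCE > 0 PERFORM CHARGE-TAX. PERFORM SEND-INVOICE."
without_period = "IF BALANCE > 0 PERFORM CHARGE-TAX PERFORM SEND-INVOICE."

print(guarded_by_if(with_period))
print(guarded_by_if(without_period))
```

The two inputs differ by a single character, yet in the second one `SEND-INVOICE` also ends up under the condition.
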
I might offer an alternative perspective.

In my opinion it’s because of English’s evolution over time. Something that might have made perfect sense in the past becomes an antiquated way of saying the same thing in the future. It also deliberately encodes the author’s belief about how the language should be spoken, ignoring any regional variances you see in the real world.

When you abstract away the English meaning of code into something new and unchanging, you provide stability not seen in natural language.

Too much ambiguity that needs a large amount of context to resolve.
But why is that a problem? That's literally how we speak every single day.

In fact, given that this is how all the languages that humans are already familiar with work, it's hard to see why this wouldn't be the best approach for constructing a programming language.

Because:

- humans can ask follow-up questions to clarify any ambiguity; computers don't have the capacity to understand ambiguity, let alone ask for clarification.

- computers are trusted to work w/o failure almost 100% of the time. Humans err (to be human is to err, anyway), so we'd never put a human in charge of the critical things we use computers for.

I agree with your sarcastic reply that naturally grown languages are simply horrible for shared comprehension!

For instance, the poem (or should I say program) used as the example in the post has several different interpretations, and at least some of those, teachers would say, are flabbergastingly incorrect.

It’s quite literally a problem, every day.
Because "computers aren't human"
That is surprisingly readable and makes me want to go learn COBOL.

Could I now get you to criticize eating healthy, please? :)

Before jumping in with both feet, please check out the ALTER statement. GOTO on steroids and more.
> The ALTER statement changes the transfer point specified in a GO TO statement.

What could go wrong?
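
For anyone who hasn't met it, a toy Python model of what that quoted sentence implies. This is a sketch, not COBOL: the paragraph names `DISPATCH`, `FIRST-RUN`, and `LATER-RUNS` are invented, and the jump is reduced to a dictionary lookup.

```python
def run_dispatch(passes):
    """Trace where a single GO TO lands on each pass through DISPATCH."""
    # Models:  DISPATCH. GO TO FIRST-RUN.
    # The target is ordinary mutable state, which is the whole problem.
    goto_target = {"DISPATCH": "FIRST-RUN"}

    def alter(paragraph, new_target):
        # Models:  ALTER paragraph TO PROCEED TO new_target.
        goto_target[paragraph] = new_target

    trace = []
    for _ in range(passes):
        target = goto_target["DISPATCH"]  # where the GO TO leads right now
        trace.append(target)
        if target == "FIRST-RUN":
            # Hypothetical one-time setup that rewires the jump afterwards.
            alter("DISPATCH", "LATER-RUNS")
    return trace


print(run_dispatch(3))
```

The text of the `GO TO` never changes, but after the first pass it leads somewhere else, so reading the source no longer tells you where control goes.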

Finalizing in COBOL:

    I am altering the data. Pray I do not alter it further.
T-Shirt material which a vanishingly small number of victims will understand.
Fair critique. That's why INTERCAL was so much better.
I hear the successor language is much better.

It's called ADD 1 TO COBOL GIVING COBOL.

I have never read COBOL before now, but using another translation in this thread as my guide, and minifying the variable label:

    C++
Nice.
I'm so excited about the new language features in ADD 1 TO COBOL GIVING COBOL EX TWENTY-THREE.
I have no doubt English-like syntax is a terrible idea, but this seems like a surprisingly readable example.
Agreed, I rather like this line of code, but I must admit that the C-like version is perhaps better:

    sales_tax_amount = balance * state[42].tax_rate;
(Assuming I've guessed the meaning of the Cobol correctly - perhaps that's the rub.)

(ETA: Added semicolon.)

(ETA2: Fixed critical indexing bug.)

You forgot the semicolon.

Note the semicolon is significant. Barbaric, I know.

I think indexes start at 1 in COBOL. I had to look it up though. (aiui, they're called TABLEs and the index can start from any number)

So that'd be state[42].

Not all heroes wear capes.

I've put in a PR to upstream. LGTM

    #define MULTIPLY sales_tax_amount = balance * state[43].tax_rate; //