Tulip – An untyped functional language (jneen.net)
77 points by jneen 4096 days ago
7 comments

This might seem strange, but if there's one thing from Common Lisp that should receive wider adoption in other languages, it's hyphenated names. They are so much more readable than anything else (well, C with underscores comes close).
This is such a good idea I just added it to my current toy language:

https://github.com/TazeTSchnitzel/Firth/commit/7b9bf0b4c090e...

Thanks for the idea! :)

I've been playing around with creating a toy language which treats '-' as a name for a function. It means there always needs to be white space around a - (the syntax of the language isn't like LISP), but that increases readability at the cost of two extra key presses.
Also, note that, in the same breath practically, you can support true negative integer constants:

   -1234
which are distinguished because there is no whitespace. You can distinguish the unary operator being applied to 1234 from a true -1234 constant.
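A minimal sketch of that rule in Python, assuming a toy language whose tokens are separated by whitespace. The names `classify` and `tokenize` and the exact name pattern are hypothetical and not taken from any language mentioned in this thread. A lone `-` is a name, a `-` glued to digits is part of an integer literal, and a `-` between word characters belongs to a hyphenated name.

```python
import re

# Assumed token shapes for a hypothetical toy language.
INT_RE = re.compile(r"-?[0-9]+")
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(?:-[A-Za-z0-9_]+)*")


def classify(token):
    """Classify one whitespace-delimited token."""
    if INT_RE.fullmatch(token):
        return ("int", int(token))      # "-1234" is a single negative literal
    if token == "-":
        return ("name", "-")            # a lone dash names the minus function
    if NAME_RE.fullmatch(token):
        return ("name", token)          # "foo-bar" is one hyphenated name
    raise ValueError(f"unrecognized token: {token!r}")


def tokenize(source):
    """Split on whitespace, then classify each piece."""
    return [classify(piece) for piece in source.split()]
```

With these rules, `tokenize("foo-bar - -1234")` yields one hyphenated name, the minus function, and a negative literal, while `tokenize("- 1234")` yields the minus function followed by a positive literal.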
This would be an easy text-transformation that you could do in vim. Upon opening the file, translate all dashes without spaces "foo-bar" to add spaces "foo - bar". Then convert all underscores to dashes. "foo_bar" to "foo-bar". On save, invert the process.

You'd have to actually run the language's parser in order to do the transformation to avoid changing strings, and even then it'd only work if the parser output kept track of the original line and character so that you could know where to make the change.
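A rough sketch of the round trip in Python, with a regex standing in for the real parser. It assumes the only thing to protect is double-quoted string literals with backslash escapes, which is a simplification: comments, other literal forms, and leading underscores are not handled. The function names `to_display` and `to_source` are hypothetical.

```python
import re

# Assumption: strings are double-quoted and use backslash escapes.
STRING_RE = re.compile(r'("(?:\\.|[^"\\])*")')


def _outside_strings(text, rewrite):
    """Apply rewrite() to everything except string literals."""
    parts = STRING_RE.split(text)
    # With one capturing group, odd indices hold the string literals.
    return "".join(p if i % 2 else rewrite(p) for i, p in enumerate(parts))


def to_display(source):
    """On open: 'a-b' becomes 'a - b', then 'a_b' becomes 'a-b'."""
    def rewrite(code):
        code = re.sub(r"(?<=\w)-(?=\w)", " - ", code)
        return code.replace("_", "-")
    return _outside_strings(source, rewrite)


def to_source(display):
    """On save: 'a-b' becomes 'a_b', then 'a - b' becomes 'a-b'."""
    def rewrite(code):
        code = re.sub(r"(?<=\w)-(?=\w)", "_", code)
        return re.sub(r"(?<=\w) - (?=\w)", "-", code)
    return _outside_strings(display, rewrite)
```

The round trip is not byte-exact in general: a file that already contained `a - b` with spaces would be saved back as `a-b`, so a real implementation would need the position tracking described above.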

This sort of text-transformation is something I've long wished my text editor did. At a previous job the standard was three-spaces of indent, regardless of the language.

> This sort of text-transformation is something I've long wished my text editor did.

Emacs does this in some cases: specifically for camelCasedWords (http://www.masteringemacs.org/article/making-camelcase-reada...) and for the word 'lambda' which can be displayed as a symbol. There are many other modes which "overlay" some text over how it looked originally.

Easier to type too. No need to press the shift key.
Perl6, Clojure, Racket, Rebol, Red, Factor & Forth are some other languages that allow hyphenated names.

And I agree with you that hyphens are more readable. They're also good for adding extra semantic meaning - https://news.ycombinator.com/item?id=3978992

Add to that list, GNU Make. In gmake, you sometimes need things that look like paths to be variable names:

   VAR_$(PATH) := whatever
where PATH could be path/to/foo-parser.o, say.
If that sort of thing interests you, also check out my language 'tab' (https://bitbucket.org/tkatchev/tab).

Tab is a statically-typed, functional, type-inferred language that occupies a niche between bash and python.

It's also not Turing-complete but can compute almost everything you could ever think of.

(I wish more languages aimed for Turing-incompleteness -- unsurprisingly, it turns out Turing-incomplete languages have big benefits for performance and resource management.)

How is yours not Turing-complete and what benefits are there?
> I strongly dislike macros that can hide in code. I get really frustrated when I open a source file and see (foo ...) and can’t tell whether it’s a function or a macro until I read documentation.

Well... that's just, like, your opinion, man.

Seriously though. In Elixir, for example, much of the language itself is implemented via its own macros, which demonstrates a certain nice extensibility. If Elixir followed this same pattern, it would get really annoying really quickly, as even simple if statements would require a leading slash.

Also, I preferred "unf" ;)

Yep, it's my opinion, and that's why I put it into the design. Lots of language design comes from opinions. I hope it's borne out. FWIW it's the same approach Rust has taken, where macros have to end with a ! to make them visually distinct.
That might be because Rust might not eat its own dogfood in that department, and build some of its own functionality out of its macro system.

But I can see the appeal of just "knowing" at a glance if it's a macro or not.

I think the answer would basically be determined by how much of the language itself uses its own macro system AND what type of macro system it actually is. If it's significant, having special syntax would just look weird.

Rust does eat its own dogfood with regard to macros, and over time has steadily replaced former language-level features like `log` and `panic` with macros. Syntactic distinction is a philosophical choice in service of making costs more explicit (and while it's true that functions can hide behavior, overuse of macros can trigger enormous code bloat, such as the `regex!` macro which compiles your regex into a state machine).

(There are also valid technical reasons for requiring syntactic distinction, as the sheer flexibility of Rust's macros in their ability to create new syntax runs the risk of making it a nightmare to parse if you remove the unambiguous ability of the compiler to drop into macro-parsing mode. These challenges aren't insurmountable, just very hairy.)

If you see (foo ...) but don't actually know what foo does, it doesn't matter all that much whether it is a function or an operator. Even if you know it's a function, that just tells you how the arguments are evaluated; but not what happens with those values. Untold effects could hide behind a function call.
This is also true for Racket. The language is basically all macros built on top of each other. While this superficial distinction between macros and other constructs serves a purpose, I think that purpose is largely misguided and invented.

What is the need for knowing if it's a macro or not when you could just know how it works (what'll it spit out / do?)?

While I do believe in limiting stuff for the sake of simplicity, this notation will actually discourage the developer from using the macro system fully, simply because someone wants there to be a non-forced distinction between macros and other constructs in the code.

Link is down for me.

Google Cache Text-Only:

http://webcache.googleusercontent.com/search?q=cache:cOp3ebJ...

Argh, thanks for the cache link. I'm still on heroku free-tier :\
This looks cool -- is there any source code? What language is it written in?

"Tulip is still in active development, and I could use a whole lot of help, both filling in the design gaps here and actually churning out the implementation"

OK interesting, it actually appears to be written in RPython, not full Python:

https://github.com/jneen/tulip/blob/master/tulip/libedit.py

(RPython is the "static" subset of Python used to bootstrap PyPy)

They have moved towards treating RPython as a framework:

http://rpython.readthedocs.org/

Yeah, it's basically a toolkit for building jitted languages. Basically the easiest way to get a tracing jit these days. So it'll be a self-hosting jit similar to pypy or pixie.
Note that it definitely has types, they just aren't required explicitly. It seems to use dynamic type matching.
It's unityped! It has one type with infinitely many variants/tags (.<string>). Match failure occurs at runtime as in any other safe typed language such as Haskell or ML.
>Match failure occurs at runtime as in any other safe typed language such as Haskell or ML.

That is incredibly disingenuous. The only way a Haskell or ML program could be as colossally unsafe as a unityped language program is if the programmer used only one giant sum type for the entire program, and most functions in the program were non-total with respect to that type.

"Unityping" provides no static type safety. It is isomorphic to, and usually a euphemism for, the lack of any static type system.

Yeah, it was a difficult decision to remove types - I'd gotten myself into a corner trying to tack on dependent types, and it just wasn't happening. My bet is that unlike most of the un{i,}typed languages out there (most of which I'd categorize as lisps and smalltalks), tulip provides tagging and destructuring that allows the programmer to maintain some level of control over the polymorphism. Tulip will panic at runtime for non-total functions, but ideally you'll have the tools necessary to keep the panic as close to the problem as possible.
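To make the "one type, many tags" idea concrete, here is a small Python sketch. It is not Tulip's implementation; `Tagged`, `MatchFailure` and `area` are hypothetical names chosen for illustration. Every value has the same Python type, clauses destructure by tag and arity, and a value that no clause accepts fails at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """The single value type: a tag plus positional fields."""
    tag: str
    args: tuple = ()


class MatchFailure(Exception):
    """Raised when no clause matches, i.e. the function is not total."""


def area(shape):
    # Each clause checks the tag and arity, then destructures the fields.
    if shape.tag == "circle" and len(shape.args) == 1:
        (radius,) = shape.args
        return 3.14159 * radius * radius
    if shape.tag == "rect" and len(shape.args) == 2:
        width, height = shape.args
        return width * height
    # Nothing rejects a "triangle" before the program runs.
    raise MatchFailure(f"area: no clause matches .{shape.tag}")
```

Calling `area(Tagged("rect", (2, 3)))` succeeds, while `area(Tagged("triangle", (1,)))` raises `MatchFailure` only when that call is actually executed.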
If you customize your runtime behavior based on any metadata about the value on which you operate, you have multiple types. Attempting to change the syntax will not change the fact that you need to differentiate behavior for numbers and strings.
> the fact that you need to differentiate behavior for numbers and strings

That's actually not exactly right: for example in Forth you really have no types at all.

Also, if someone uses a word such as "unityped" or "type with infinitely many variants" you should immediately know that any mention that "there are types, alright, just checked at runtime" will be immediately rejected. Majority of static typing fanatics are like that.

"Majority of static typing fanatics are like that."

That's not the problem. The problem is that to a first approximation, every language is "type safe" in the sense that you can't add a string to a number. Even in those languages where it looks like you can, it's because of a certain usually-limited set of automatic coercions, not because you can actually add a number to a string.

Truly adding a number to a string looks like this:

   number: 0x000000000000002a
   string: 0x7ffb000000007264
   result: 0x7ffb00000000728e
The string is, of course, a pointer, and the result, of course, is gibberish. This is why "no" languages to speak of implement this form of "untyped language"; it isn't what anybody actually wants. (Assembler, of course, has it, but that's an exception for obvious reasons.)
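The arithmetic above can be reproduced in Python by treating both operands as raw 64-bit integers. This is only an illustration of the numbers in the example; the variable names are made up, and the mask models a 64-bit machine add.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF

number = 0x000000000000002A          # the integer 42
string_pointer = 0x7FFB000000007264  # address value copied from the example above

# An untyped machine add does not care that one operand is an address.
result = (number + string_pointer) & MASK_64
print(hex(result))  # 0x7ffb00000000728e

# Python, like most languages, refuses the same operation on real values.
n = 42
s = "some text"
try:
    n + s
    refused = False
except TypeError:
    refused = True
```

After this runs, `refused` is `True`: the language checks the operand types at runtime instead of adding the bits.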

A term that describes essentially 100% of languages is not a useful one, so static typing usually refers to a language whose type system is somehow more restrictive at compile time than "Everything is a variant type and we'll work it out at runtime".

> The problem is that to a first approximation, every language is "type safe" in the sense that you can't add a string to a number

We're not discussing a concept of "type safety" here at all, but rather a concept of "untypedness". I just can't agree that, for example, Common Lisp (with CLOS), Smalltalk or Python are "untyped". They are not: an untyped language is one which has no type errors at either compile time or runtime (unless I'm very mistaken). An obvious example is Assembler, but Forth or TCL qualify too. And quite a few others do too. See here: http://en.wikipedia.org/wiki/Programming_language#Typed_vers...

> so static typing usually refers to a language whose type system is somehow more restrictive at compile time than "Everything is a variant type and we'll work it out at runtime"

Again, it was never suggested that Tulip has "static types". It doesn't of course.

What I said is that it has types. I don't want to discuss how much better "static typing" is than "dynamic typing" or vice versa, this makes for a very boring discussion similar to Emacs vs. Vim and I'm not interested in it at all. I just object to the notion that "static types" are the only kind of types we can ever have in a language.

The problem is with "static typing fanatics", really. They'd like to bend the terminology in a way which helps them promote static typing, for example by equating all types with static types. This is both dishonest and unnecessary. No serious static typing advocate would do this (I hope) - static typing is a great idea able to stand on its own, there's no need to lie about "the other side" of the argument.

Well, all fanatics are like that. Way too much Kool-Aid, way too little critical thinking.

B and early C are untyped like this
you can easily add a number to a char in C and get a gibberish character, which is why C is a weakly typed language
Yep! It focuses more on dynamic type-checks than on static typing though, so I put it in the category of "untyped functional" - more like clojure and erlang than haskell or ml.
Well, dynamic types are still types. :) It also seems strongly typed through a lack of implicit conversions between types.

I would say this is more like go than anything, though it seems to lack methods (and interfaces) and includes a functional syntax.

You're going to run into issues when attempting to extend polymorphism for built-in functions to user-defined types—imagine trying to figure out how to sort an 'unknown' type without a way to compare them without modifying the method to be explicitly aware of the new type.

There does seem to be a method/interface system (Under the "Methods, Protocols, Implementations" header). And it seems to have some sort of dispatch system for tagged structures that can be later modified by the user.
Exactly! This is what the @method / @impl system is for - it's about equivalent to clojure's defprotocol. Future plans include named protocols consisting of multiple methods, and protocol-based matching.
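For readers unfamiliar with protocol-style dispatch, here is a hedged Python sketch of the general idea: an open method whose implementations are registered per tag, so a generic sort can handle a user-defined tag without being modified. This is not Tulip's `@method` / `@impl` machinery; the `method` and `impl` helpers and the tuple representation (tag first) are assumptions made for the example.

```python
from functools import cmp_to_key

_tables = {}  # method name -> {tag: implementation}


def method(name):
    """Declare an open method that dispatches on the tag of its first argument."""
    table = _tables.setdefault(name, {})

    def call(value, *rest):
        tag = value[0]
        if tag not in table:
            raise LookupError(f"no implementation of {name} for tag {tag!r}")
        return table[tag](value, *rest)

    return call


def impl(name, tag):
    """Register an implementation of a declared method for one tag."""
    def register(fn):
        _tables.setdefault(name, {})[tag] = fn
        return fn
    return register


compare = method("compare")


@impl("compare", "point")
def compare_points(a, b):
    # Compare the fields after the tag; returns -1, 0 or 1.
    return (a[1:] > b[1:]) - (a[1:] < b[1:])


points = [("point", 2, 0), ("point", 1, 5)]
sorted_points = sorted(points, key=cmp_to_key(compare))
```

The built-in `sorted` never learns about the `point` tag; it only calls `compare`, which looks up whatever implementation the user registered.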
"I’ve renamed the language from Unf to Tulip, because some folks pointed out that the old name created an unnecessarily sexualized environment"

Are fifth graders critiquing programming languages now? Seriously, who makes that association and then feels the need to comment on it?

It was a decision I made, partly because I realized they were right, and partly because I think tulips are pretty.

     ) (
    (  _)
      |/
unnecessarily sexualized logo

she's doing a high kick away from the viewer?

um, it's a flower
“Unf” is quite widely used as a spelling of a moan, to express sexual desire or gratification. While it can be used to express non-sexual enjoyment, the sexual connotation it evokes is just unnecessary when it comes to a programming language, regardless of the original intent.
While I've never seen this use personally, UrbanDictionary very strongly corroborates this.

http://www.urbandictionary.com/define.php?term=unf

It’s worth noting that “universal noise of fucking” is a backronym—the word was originally onomatopoeic.
It is, still, but urbandictionary users like making inaccurate definitions as an attempt at "humour"
As a counter datapoint, this is the first time I have ever heard of this.
Perhaps as "umph" or "umf"? No?
No, not even slightly.

FWIW, I grew up in Scotland, then moved to England, so I'm probably from a rather different cultural background than the poster.

As a non-native speaker, I have seen "umph" before.