by BradleyChatha 246 days ago
In short: I wanted to talk a bit about ASN.1, a bit about D, and a bit about the compiler itself, but couldn't think of any real cohesive format.

So I threw a bunch of semi-related ramblings together and I'm daring to call it a blog post.

Sorry in advance since I will admit it's not the greatest quality, but it's really not easy to talk about so much with such brevity (especially since I've already forgotten a ton of stuff I wanted to talk about more deeply :( )

5 comments

A small nitpick: I don’t think your intersection example does what you want it to do. Perhaps there’s some obscure difference in “PER-visibility” or whatnot, but at least set-theoretically,

  LegacyFlags2 ::= INTEGER (0 | 2 ^ 4..8) -- as in the article
is exactly equivalent to

  LegacyFlags2 ::= INTEGER (0) -- only a single value allowed
as (using standard mathematical notation and making precedence explicit) {0} ∪ ({2} ∩ {4,5,6,7,8}) = {0} ∪ ∅ = {0}.
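
To make the set arithmetic above concrete, here is a small sketch that models the constraint with Python sets. It assumes the usual reading where `|` is union, `^` is intersection, and intersection binds tighter than union; the variable names are made up for illustration.

```python
# ASN.1 constraint operators mapped onto Python set operators:
#   "|" (UNION) -> |      "^" (INTERSECTION) -> &
# In both notations intersection binds tighter than union.
zero = {0}
two = {2}
four_to_eight = set(range(4, 9))  # the ASN.1 range 4..8 is inclusive

# Parsed as zero | (two & four_to_eight), mirroring (0 | 2 ^ 4..8).
legacy_flags2 = zero | two & four_to_eight

print(two & four_to_eight)  # set()
print(legacy_flags2)        # {0}
```

Since 2 is not in 4..8, the intersection contributes nothing and only 0 survives.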
At least you might be summoning Walter Bright in talking about D. One of my favorite languages I wish more companies would use. Unfortunately for its own sake, Go and Rust are way more popular in the industry.
Unfortunately it lost the opportunity back when Remedy Games and Facebook were betting on it.

The various WIP features, and switching focus of what might bring more people into the ecosystem, have given way to other languages.

Even C#, Java and C++ have gotten many of the features that were only available in D when Andrei Alexandrescu's book came out in 2011.

I wouldn't say that it's unable to make a comeback, there is still a valid use case from my experience with it. The syntax, mixed-memory model, UFCS, and compilation speed are nice quality of life features compared to C++, and it's still a native binary compared to C# and Java. So if you're starting out with a new project from scratch there's not much reason not to beyond documentation reasons. And you can interface pretty easily to C/C++ as well as pretty much any other language designed for that sort of thing, but without a lot of syntax changes like Carbon.

I imagine that the scope of its uses has shrunk as other languages caught up, and I don't think it's necessarily a good language for general enterprise stuff (unless you're dealing with C++), but for new projects it's still valid IMO. I think that the biggest field it could be used in is probably games too, especially if you're already writing a new engine from scratch. You could start with the GC and then ease off of it as the project develops in order to speed up development, for example. And D could always add newer features again too, tbh.

You always have to compare ecosystems, not programming languages syntax on their own.

Another thing that Java and C# got to do since 2011, is that AOT is also part of the ecosystem and free of charge (Java has had commercial compilers for a while), so not even a native binary is an advantage as you imagine.

First D has to finish what is already there in features that are almost done but not quite.

TBH I don't necessarily think that ecosystem is what matters in every application, but it is necessary for most people, I agree. And I do agree with finishing a lot of the half-baked features too, but I'm unsure if the people maintaining the language have the will or the means to do that.

Do you have any other ideas about how D could stand out again?

It is what matters, as most companies pick languages based on SDKs, not the other way around, instead of being a one-trick pony and trying to solve everything with the same language.

That is why outside startups selling a specific product, most IT departments are polyglot.

For D to stand out, there must be a Rails- or Docker-like framework, something that is getting such a buzz that it makes early adopters want to go play with D.

However, I don't see it happening in the LLM age, where at the whim of a prompt thoughts can be generated in whatever language, which is only a transition step until we start having agent runtimes.

C#, Java and C++ have poor copies of the D features. For example, C++ constexpr is a bad design because a special keyword to signify evaluating at compile time is completely redundant. (Just trigger on constexpr in the grammar.)

C++ modules, well, they should have more closely copied D modules!

They might have; still, they are good enough to keep people in those ecosystems and not bother at all with D and the current adoption warts concerning IDE tooling and available libraries.

Worse is better approach tends to always win.

Good enough is fine until one discovers how much better it can be.
Unfortunately the opportunity window is gone now, and in the LLM age programming languages are becoming less relevant, when programming itself increasingly turns into coordinating agents.
I have a very naïve and maybe dumb question coming from someone who is used to scripting languages. It's about the `auto` keyword: while it's a nice feature, why is it necessary to write it down? Isn't it possible to basically say to the compiler: "Hey, you see this var declared with no type? Assume by yourself there is an `auto` keyword."
I feel like back when D might've been a language worth looking into, it was hampered by the proprietary compilers.

And still today, the first thought that comes to mind when I think D is "that language with proprietary compilers", even though there has apparently been some movement on that front? Not really worth looking into now that we have Go as an excellent GC'd compiled language and Rust as an excellent C++ replacement.

Having two different languages for those purposes seems like a better idea anyway than having one "optionally managed" language. I can't even imagine how that could possibly work in a way that doesn't just fragment the community.

D has been fully unproprietary since 2017. https://forum.dlang.org/post/oc8acc$1ei9$1@digitalmars.com

But that's only the reference compiler, DMD. The other two compilers were fully open source (including gcc, which includes D) before that.

Fully disagree on your position that having all possibilities with one language is bad. When you have a nice language, it's nice to write with it for all things.

Sounds like you should look into it instead of idly speculating! Also, the funny thing about a divisive feature is that it doesn't matter if it fragments the community if you can use it successfully. There are a lot of loud people in the D community who freak out and whine about the GC, and there are plenty more quiet ones who are happily getting things done without making much noise. It's a great language.
Are you saying that if I'm using D-without-GC, I can use any D library, including ones written with the assumption that there is a GC? If not, how does it not fracture the community?

> There are a lot of loud people in the D community who freak out and whine about the GC, and there are plenty more quiet ones who are happily getting things done without making much noise

This sounds like an admission that the community is fractured, except with a weirdly judgemental tone towards those who use D without a GC?

> Are you saying that if I'm using D-without-GC, I can use any D library, including ones written with the assumption that there is a GC? If not, how does it not fracture the community?

"Are you saying that if I'm using Rust in the Linux kernel, I can use any Rust library, including ones written with the assumption they will be running in userspace? If not, how does that not fracture the community?"

"Are you saying that if I'm using C++ in an embedded environment without runtime type information and exceptions, I can use any C++ library, including ones written with the assumption they can use RTTI/exceptions? If not, how does that not fracture the community?"

You can make this argument about a lot of languages and particular subsets/restrictions on them that are needed in specific circumstances. If you need to write GC-free code in D you can do it. Yes, it restricts what parts of the library ecosystem you can use, but that's not different from any other language that has wide adoption in a wide variety of applications. It turns out that in reality most applications don't need to be GC-free (the massive preponderance of GC languages is indicative of this) and GC makes them much easier and safer to write.

I think most people in the D community are tired of people (especially outsiders) constantly rehashing discussions about GC. It was a much more salient topic before the core language supported no-GC mode, but now that it does it's up to individuals to decide what the cost/benefit analysis is for writing GC vs no-GC code (including the availability of third-party libraries in each mode).

The RTTI vs no-RTTI thing and the exceptions vs no-exceptions thing definitely does fracture the C++ community to some degree, and plenty of people have rightly criticized C++ for it.

> If you need to write GC-free code in D you can do it.

This seems correct, with the emphasis. Plenty of people make it sound like the GC in D is no problem because it's optional, so if you don't want GC you can just write D without a GC. It's a bit like saying that the stdlib in Rust is no problem because you can just use no_std, or that exceptions in C++ is no problem because you can just use -fno-exceptions. All these things are naïve for the same reason; it locks you out of most of the ecosystem.

> This sounds like an admission that the community is fractured, except with a weirdly judgemental tone towards those who use D without a GC?

That's not what I'm saying, and who cares if it's fractured or not? Why should that influence your decision making?

There are people who complain loudly about the GC, and then there are lots of other people who do not complain loudly and also use D in many different interesting ways. Some use the GC, some don't. People get hyper fixated on the GC, but it isn't the only thing going on in the language.

> who cares if it's fractured or not? Why should that influence your decision making?

Because, if I want to write code in D without the GC, it impacts me negatively if I can't use most of the libraries created by the community.

Go is a GC language that has eaten a chunk of the industry (Docker, TypeScript, Kubernetes... Minio... and many more I'm sure) and only some people cry about it, but you know who else owns sizable chunks of the industry? Java and C# which are both GC languages. While some people waste hours crying about GCs the rest of us have built the future around it. Hell, all of AI is eaten by Python another GC language.
And in D, there's nothing stopping from either using or not using the GC. One of the key features of D is that it's possible to mix and match different memory management strategies. Maybe I have a low level computational kernel written C-style with memory management, and then for scripting I have a quick and dirty implementation of Scheme also written in D but using the GC. Perfectly fine for those two things to co-exist in the same codebase, and in fact having them coexist like that is useful.
> And in D, there's nothing stopping from either using or not using the GC.

Wait so are you, or are you not, saying that a GC-less D program can use libraries written with the assumption that there's a GC? The statement "there's nothing stopping [you] from not using the GC" implies that all libraries work with D-without-GC, otherwise the lack of libraries written for D-without-GC would be stopping you from not using the GC

I'm not strongly for or against (non-deterministic) GC. Deterministic GC in Rust or the (no true Scotsman) correctly written C++ has benefits, but often I don't care and go / java / c# / python are all fine.

I think you're really overstepping with "AI is eaten by Python". I can imagine an AI stack without Python: llama.cpp (for inference, not training) isn't completely that, but most of the core functionality is not Python, and not GC'd at all. I cannot imagine an AI stack without CUDA + C++. Even the premier Python tools (pytorch, vllm) would be non-functional without these tools.

While some very common interfaces to AI require a GC'd language, I think if you deleted the non-GC parts you'd be completely stuck and have years of digging yourself out, but if you deleted the 'GC' parts you could end up with a usable thing in very short order.

NVidia has decided that the market demand to do everything in Python justifies the development cost of making Python fast in CUDA.

Thus now you can use PTX directly from Python, and with the new cu Tiles approach, you can write CUDA kernels in a Python subset.

Many of these tools get combined because that is what is already there, and the large majority of us don't want to, or don't have the resources to, bootstrap a whole new world.

Until there is some monetary advantage in doing so.

For 99% of the industry use cases, some kind of GC is good enough, and even when that isn't the case, there is no need to throw the baby out with the bathwater; a two-language approach also works.

Unfortunately those that cry about GCs are still quite vocal; at least we can now throw Rust their way.

There's a place for GC languages, and there's a place for non-GC languages. I don't understand why you seem so angry towards people who write in non-GC languages.
And there's a place for languages that smoothly support both GC and non-GC. D is the best language at that.
"AI" is built on C, C++, and Fortran, not Python.
Once one realizes that the GC is just another way to allocate memory in D, it becomes quite wonderful to have a diverse collection of memory management facilities at hand. They coexist quite smoothly. Why should programs be all GC or no GC? Why should you have to change languages to switch between them?
Indeed the GC is just a library with some helpful language hooks to make the experience nice.

If you understand how it's hooked into, it's very easy to work with. There is only one area of the language related to closure context creation that can be unexpected.

I don't think the proprietary compilers were a true setback. Look at, for example, C# before it became as open as .NET has become today (MIT licensed!), and yet the industry took it. I think what D needed was what made Ruby mainly relevant: Rails. D needs a community framework that makes it a strong candidate for a specific domain.

I honestly think if Walter Bright (or anyone within D) invested in having a serious web framework for D, even if it's not part of the standard library, it could be worth its weight in gold. Right now there's only Vibe.d that stands out, but I have not seen it grow very much since its inception; it's very slow moving. Give me a feature-rich web framework in D comparable to Django or Rails and all my side projects will shift to D. The real issue is it needs to be batteries included since D does not have dozens of OOTB libraries to fill in gaps with.

Look at Go as an example: a built-in HTTP server library, production ready. It's not ultra fancy but it does the work.

C# has Microsoft behind it. D ... doesn't.

There are plenty of people who aren't interested in using languages with proprietary toolchains. Those people typically don't use C#. The people who don't mind proprietary toolchains typically write software for an environment where D isn't relevant, such as .NET or the Apple world.

I do agree with you that there needs to be a good framework though. Either in Web or Games. Web because it's more familiar than Go but also has Fibers, and Games because it's an easier C++. There is also Inochi2D which looks rather professional: https://inochi2d.com/

One of the issues I've seen in the community is just that there aren't enough people in the community with enough interest and enough spare time to spend on a large project. Everyone in the core team is focused on working on the actual language (and day-jobs), while everyone else is doing their own sort of thing.

From your profile you seem to have a lot of experience in the field and in software in general, so I'd like to ask you if you have any other advice for getting the language un-stuck, especially with regards to the personnel issues. I think I'd like to take up your proposal for a web framework as well, but I don't really have any knowledge of web programming beyond the basics. Do you have any advice on where to start or what features/use case would be best as well?

Getting a web framework into the standard library is something I want to get working, along with a windowing library.

Currently we need to get a stackless coroutine into the language, actors for windowing event handling, reference counting and a better escape analysis story to make the experience really nice.

This work is not scheduled for PhobosV3 but a subset such as a web client with an event loop may be.

Lately I've been working on some exception handling improvements and have started on the escape analysis DFA (but not on the escape analysis itself). So the work is progressing. The stackless coroutine proposal needs editing, but it is intended to be done at the start of next year for the approval process.

Go would be an excellent GC'd compiled language if it actually learnt from history of computing.

I'd rather give that to languages like C# with Native AOT, or Swift (see chapter 5 of GC Handbook).

D only lacks someone like Google to push it into mainstream no matter what, like Go got to benefit from Docker and Kubernetes.

As someone who had the displeasure of working with ASN.1 data (yes, certificates), I fully sympathise with the anguish you've gone through (that 6 months of Ansible HR comments cracked me up also :D ).
It makes me laugh that absolutely no one can say "I've worked with ASN.1" in a positive light :D
Bzzt! Wrong! I have worked with ASN.1 for many years, and I love ASN.1. :)

Really, I do.

In particular I like:

- that ASN.1 is generic, not specific to a given encoding rules (compare to XDR, which is both a syntax and a codec specification)

- that ASN.1 lets you get quite formal if you want to in your specifications

For example, RFC 5280 is the base PKIX spec, and if you look at RFCs 5911 and 5912 you'll see the same types (and those of other PKIX-related RFCs) with more formalisms. I use those formalisms in the ASN.1 tooling I maintain to implement a recursive, one-shot codec for certificates in all their glory.

- that ASN.1 has been through the whole evolution of "hey, TLV rules are all you need and you get extensibility for free!!1!" through "oh no, no that's not quite right is it" through "we should add extensibility functionality" and "hmm, tags should not really have to appear in modules, so let's add AUTOMATIC tagging" and "well, let's support lots of encoding rules, like non-TLV binary ones (PER, OER) and XML and JSON!".

Protocol Buffers is still stuck on TLV, all done badly by comparison to BER/DER.

Yeah I know I'm making fun of it a lot (mostly in jest) but it genuinely is a really interesting specification, and it's definitely sad - but not surprising - it's not a very popular choice outside of its few niche areas.

:) Glad to see someone else who's gone down this road as well.

I feel the experience of many people writing with ASN.1 is that of dealing with PKI or telecom protocols, which attempt to build worldwide interop between actually very different systems. The spec is one thing, but implementing it by the book is not sufficient to get something actually interoperable, there are a ton of quirks to work around.

If it was being used in homogenous environments the way protocol buffers typically are, where the schemas are usually more reasonable and both read and write side are owned by the same entity, it might not have gotten such a bad rap...

I also like ASN.1; I think it is better than JSON, XML, Protocol Buffers, etc, in many ways. I use it in some of my programs.

(However, like many other formats (including JSON, XML, etc), ASN.1 can be badly used.)

How do you feel about something like CBOR? In which stage would you say it's stuck in evolution compared to ASN.1 (since you said Protobuf is still TLV)?
CBOR and JSON are just encodings, not schema, though there are schemas for them. I've not looked at their schema languages but I doubt they support typed hole formalisms (though they could be added as it's just schema). And since CBOR and JSON are just encodings, they are stuck being what they are -- new encodings will have compatibility problems. For example, CBOR is mostly just like JSON but with a few new types, but then things like jq have to evolve too or else those new types are not really usable. Whereas ASN.1 has much more freedom to introduce new types and new encoding rules because ASN.1 is schema and just because you introduce a new type doesn't mean that existing code has to accept it since you will evolve _protocols_. But to be fair JSON is incredibly useful sans schema, while ASN.1 is really not useful at all if you want to avoid defining modules (schemas).
I was considering CBOR+CDDL heavily for a project a while so they're a tad intertwined in my head. I very much liked CBOR's capability of being able to define wholly new types and describe them neatly in CDDL. You could even add some basic value constraints (less than, greater equal, etc.). That seemed really powerful and lacking ASN.1 experience it sounds like a very lite JSON-like subset of that.
I worked with ASN.1 for a few years in the embedded space because it's used for communications between aircraft and air traffic control in Europe [1]. I enjoyed it. BER encoding is pretty much the tightest way to represent messages on the wire and when you're charged per-bit for messaging, it all adds up. When a messaging syntax is defined in ASN.1 in an international standard (ICAO 9880 anyone?), it's going to be around for a while. Haven't been able to get my current company to adopt ASN.1 to replace our existing homegrown serialization format.

[1] https://en.wikipedia.org/wiki/Aeronautical_Telecommunication...

Isn't PER or OER more compact? especially for the per-bit charging thing
Oh yeah, derp. I was thinking unaligned-PER, not BER.
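
For a rough feel of the size difference being discussed, here is a back-of-the-envelope sketch comparing one small constrained INTEGER under BER and under unaligned PER. The function names are hypothetical. It assumes a universal single-octet tag and short-form length on the BER side, and a finite, non-extensible range constraint on the PER side.

```python
def ber_integer_bits(value: int) -> int:
    """Bits used by a BER/DER INTEGER: 1 tag octet + 1 length octet + content."""
    magnitude_bits = value.bit_length() if value >= 0 else (~value).bit_length()
    content_octets = magnitude_bits // 8 + 1  # two's complement needs room for a sign bit
    return 8 * (2 + content_octets)


def uper_constrained_integer_bits(lower: int, upper: int) -> int:
    """Bits used by any value of INTEGER (lower..upper) in unaligned PER."""
    # The value is sent as an offset from `lower` in the fewest bits covering the range.
    return (upper - lower).bit_length()


# A value of type INTEGER (0..7), e.g. 5:
print(ber_integer_bits(5))                  # 24
print(uper_constrained_integer_bits(0, 7))  # 3
```

BER pays for the tag and length octets on every value, whereas unaligned PER relies on both sides knowing the schema and sends only the bits needed to tell the allowed values apart.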
of all the encoding i like BER the most as well

(i worked in telecommunications when ASN.1 was common thing)

As a former PKI enthusiast (tongue firmly in cheek with that description) I can say if you can limit your exposure to simply issuing certs so you control the data and thus avoid all edge cases, quirks, non-canonical encodings, etc, dealing with ASN.1 is “not too terrible.” But it is bad. The thing that used to regularly amaze me was the insane depths of complexity the designers went to … back in the 70’s! It is astounding to me that they managed to make a system that encapsulated so much complexity and is still in everyday use today.

You are truly a masochist and I salute you.

ASN.1 is from the mid-80s, and PKI is from the late 80s.

The problems with PKI/PKIX all go back to terrible, awful, no good, very bad ideas about naming that people in the OSI/European world had in the 80s -- the whole x.400/x.500 naming style where they expected people to use something like street addresses as digital names. DNS already existed, but it seems almost like those folks didn't get the memo, or didn't like it.

They got grant money to work on anything but TCP/IP. :-) A lot of European oral history about how "the Internet" got to a Uni talks about how they were supposed to only use ISO/OSI but eventually unofficially installed IP anyway.
But of course.
Organizational unit, location, etc all these concepts were pretty dumb to tie with digital identity in retrospect.
Unless you need things like ability to address groups in flexible ways, which is why X.400 survives in various places (in addition to actually supporting inline cryptography and binary attachments).

What people forget is that you do not have to use the whole set of schema attributes.

It's also amazing that we're basically using only a couple of free-form text fields in the WebPKI for the most crucial parts of validation.

Completely ignoring the ASN.1 support for complicated structures, with more than one CVE linked to incorrect parsing of these text fields.

No we're not. We're using dNSName subjectAlternativeName values. We used to use the CN attribute of the subject DN, and... there is still code for that, but it's obsolete.

We _are_ using subject DNs for linking certs to their issuers, but though that's "free-form", we don't parse them, we only check for equality.

CN is absolutely used everywhere. And it can contain wildcards. SANs are also free-form.
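
For context on what matching a dNSName with a wildcard involves, here is a simplified sketch. The function name is hypothetical, and the rule it implements (a `*` is honoured only as the entire leftmost label and covers exactly one label) is a deliberately narrow assumption; real validators apply more policy than this.

```python
def dns_name_matches(pattern: str, hostname: str) -> bool:
    """Compare a certificate dNSName against a hostname, label by label."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False  # a wildcard never spans more than one label
    if pattern_labels[0] == "*":
        # Wildcard accepted only as the entire leftmost label.
        return host_labels[0] != "" and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


print(dns_name_matches("*.example.com", "www.example.com"))  # True
print(dns_name_matches("*.example.com", "example.com"))      # False
print(dns_name_matches("*.example.com", "a.b.example.com"))  # False
print(dns_name_matches("www.example.com", "WWW.Example.com"))  # True
```

Issuer linking, by contrast, needs no such logic, since it only compares names for equality.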
There was an amusing chain of comments the last time protobuf was mentioned in which some people were arguing that it had been a terrible idea and ASN.1, as a standard, should have been used.

It was hilarious because clearly none of the people who were in favor had ever used ASN.1.

Cryptonector[1] maintains an ASN.1 implementation[2] and usually has good things to say about the language and its specs. (Kind of surprised he’s not in the comments here already :) )

[1] https://news.ycombinator.com/user?id=cryptonector

[2] https://github.com/heimdal/heimdal/tree/master/lib/asn1

Thanks for the shout-out! Yes, I do have nice things to say about ASN.1. It's all the others that mostly suck, with a few exceptions like XDR and DCE/Microsoft RPC's IDL.
Derail accepted! Is your approval of DCE based only on the serialization not being TLV or on something else too? I have to say, while I do think its IDL is tasteful, its only real distinguishing feature is AFAICT the array-passing/returning stuff, and that feels much too specialized to make sense of in anything but C (or largely-isomorphic low-level languages like vernacular varieties of Pascal).
You're likely to find my comments among those saying that. I've been using ASN.1 in some way for a couple of decades, and I've been an ASN.1 implementor for about half a decade.
It's not entirely horrible, parsing DER dynamically enough to handle interpreting most common certificates can be done in some 200-300 lines of C#, so I'd take that any day over XML.
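
To give a sense of why a dynamic DER reader stays small, here is a minimal sketch of the tag-length-value walk at its core. It is in Python rather than C#, it is not the commenter's code, and the names `read_tlv` and `walk` are made up. It assumes single-octet tags and definite lengths, and it does not interpret content (no OID or string decoding).

```python
from typing import Iterator, Tuple


def read_tlv(data: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Read one TLV at `offset`; return (tag octet, content, offset of next TLV)."""
    tag = data[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-octet tags are not handled in this sketch")
    first = data[offset + 1]
    offset += 2
    if first < 0x80:            # short form: the octet itself is the length
        length = first
    elif first == 0x80:
        raise ValueError("indefinite length is not allowed in DER")
    else:                       # long form: low 7 bits count the length octets
        count = first & 0x7F
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    end = offset + length
    if end > len(data):
        raise ValueError("truncated element")
    return tag, data[offset:end], end


def walk(data: bytes, depth: int = 0) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (depth, tag, content) for each element, descending into constructed ones."""
    offset = 0
    while offset < len(data):
        tag, content, offset = read_tlv(data, offset)
        yield depth, tag, content
        if tag & 0x20:          # 0x20 set means constructed (e.g. SEQUENCE, SET)
            yield from walk(content, depth + 1)


# SEQUENCE { INTEGER 5, BOOLEAN TRUE }
sample = bytes.fromhex("30060201050101ff")
for depth, tag, content in walk(sample):
    print("  " * depth + f"tag=0x{tag:02x} content={content.hex()}")
```

Most of the remaining work in a real certificate reader is mapping tags and object identifiers to meaning, which is the part the next paragraph is about.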

The main problem is that to work with the data you need to understand the semantics of the magic object identifiers and while things like the PKIX module can be found easily, the definitions for other more obscure namespaces for extensions can be harder to locate as it's scattered in documentation from various standardization organizations.

So, protobuf could very well have been transported in DER; the issue was probably more one of Google not seeing any value in interoperability and wanting to keep it simple (or worse, clashes caused by oblivious users re-using the wrong, less well documented namespaces).

ASN.1 seems like something that could have been good ... If it was less complicated, had more accessible documentation, and had better tooling.
I suspect that typical interactions with ASN.1 are benign because people are interested in reading and writing a few specific preexisting data structures with whatever encoding is required for interoperability, not in designing new message structures and choosing encodings for them.

For example, when I inherited a public key signature system (mainly retrieving certificates and feeding them to cryptographic primitives and downloading and checking certificate revocation lists) everything troublesome was left by dismissed consultants; there were libraries for dealing with ASN.1 and I only had to educate myself about the different messages and their structure, like with any other standard protocol.

(void) space. :P
Just wanted to say I enjoyed your post very much. Thank you for writing it. I love D but unfortunately I haven't touched it for several years. I also have some experience writing parsers and implementing protocols.
Thank you :)
Don't worry, it's your blog, and your way. Keep it up, if it makes you whole.