by edgyquant 871 days ago
Assembly is meant to be 1:1 with machine code, which makes writing an assembler extremely easy as long as you know the architecture. Machine code doesn’t have things like equal signs; it’s literally just a series of bytes (an opcode and operands).

If you want equal signs, use C.

Nonsense. Machine code doesn't have things like commas, brackets and letters, yet we use those in assembly. There's zero reason why you couldn't do my proposal.

Also assembly mnemonics aren't even 1:1 with instructions. Pseudoinstructions do pretty much anything, and even something like `add` can assemble to two different instructions depending on the arguments.
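To make the pseudoinstruction point concrete, here is a minimal Python sketch of the kind of expansion a RISC-V assembler performs. The `PSEUDO` table and the `expand` helper are names made up for this illustration, not any real assembler's API, and the table covers only a handful of well-known base-ISA pseudoinstructions.

```python
# Illustrative subset of RISC-V base-ISA pseudoinstruction expansions.
# PSEUDO and expand() are hypothetical names for this sketch only.
PSEUDO = {
    "nop": lambda ops: [("addi", ["x0", "x0", "0"])],
    "mv":  lambda ops: [("addi", [ops[0], ops[1], "0"])],
    "not": lambda ops: [("xori", [ops[0], ops[1], "-1"])],
    "j":   lambda ops: [("jal", ["x0", ops[0]])],
    "ret": lambda ops: [("jalr", ["x0", "0(x1)"])],
}


def expand(line):
    """Expand one assembly line into a list of real-instruction strings."""
    mnemonic, _, rest = line.strip().partition(" ")
    ops = [o.strip() for o in rest.split(",")] if rest.strip() else []
    if mnemonic in PSEUDO:
        return [f"{m} {', '.join(o)}" for m, o in PSEUDO[mnemonic](ops)]
    # Not a pseudoinstruction in our table: pass it through unchanged.
    return [f"{mnemonic} {', '.join(ops)}".strip()]
```

For example, `expand("mv a0, a1")` gives `["addi a0, a1, 0"]`, while a real instruction such as `add a0, a1, a2` passes through unchanged. The mnemonic the programmer types and the instruction that gets encoded are already two different things.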

As for writing an assembler being "extremely easy"... yeah no. There's no formal grammar so you're going to be reverse engineering LLVM and GCC's hilariously messy assemblers. Or more realistically, guessing and building an enormous test suite. Not easy at all.

I mean if you want to be 1:1 with machine code, then don't look at the "J" Jump or "B" Branch instruction formats where the constant value is split into pieces and packed in around the registers.
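For anyone who hasn't looked at the J format, here is a rough Python sketch of how the offset of `jal` gets scattered across the instruction word. The function name `encode_jal` is just for this example; the bit positions follow the J-type layout in the RISC-V base ISA.

```python
def encode_jal(rd, offset):
    """Encode RISC-V `jal rd, offset` (J-type) as a 32-bit word."""
    if offset % 2 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be even and fit in 21 signed bits")
    imm = offset & 0x1FFFFF  # two's-complement view of the 21-bit offset
    return (
        ((imm >> 20) & 0x1) << 31      # imm[20]    -> bit 31
        | ((imm >> 1) & 0x3FF) << 21   # imm[10:1]  -> bits 30:21
        | ((imm >> 11) & 0x1) << 20    # imm[11]    -> bit 20
        | ((imm >> 12) & 0xFF) << 12   # imm[19:12] -> bits 19:12
        | (rd & 0x1F) << 7             # rd         -> bits 11:7
        | 0x6F                         # JAL opcode -> bits 6:0
    )
```

`hex(encode_jal(1, 8))` is `0x8000ef`: the constant 8 does not appear anywhere as a contiguous field, so the text `jal x1, 8` is already a fair distance from the bits.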

With a '=' all OP is asking for is the wee-est bit of grammar (no more than in some addressing modes in CISC assemblers) and to change the order of the registers, which isn't crazy since immediates are already split.
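As a sketch of how little grammar that is: the proposal's exact syntax isn't shown in this thread, so the `rd = rs1 + rs2` form below is purely an assumption, and `desugar`, `OPS` and `PATTERN` are made-up names. It only handles register-register operations and leaves everything else alone.

```python
import re

# Hypothetical infix operators mapped to RISC-V R-type mnemonics.
OPS = {"+": "add", "-": "sub", "&": "and", "|": "or", "^": "xor"}

# Matches lines like "a0 = a1 + a2"; operands must look like register names.
PATTERN = re.compile(
    r"^\s*([A-Za-z]\w*)\s*=\s*([A-Za-z]\w*)\s*([-+&|^])\s*([A-Za-z]\w*)\s*$"
)


def desugar(line):
    """Rewrite an assumed `rd = rs1 op rs2` line into conventional syntax."""
    m = PATTERN.match(line)
    if not m:
        return line.strip()  # conventional syntax passes through untouched
    rd, rs1, op, rs2 = m.groups()
    return f"{OPS[op]} {rd}, {rs1}, {rs2}"
```

`desugar("a0 = a1 + a2")` returns `"add a0, a1, a2"`. One regular expression and a five-entry table is the whole front end; no AST involved.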

https://en.wikipedia.org/wiki/RISC-V#Design

PS- while the instruction formats look crazy in isolation, there is a nice symmetry between them and if you start thinking about instruction decoding in discrete logic, they are actually quite an elegant design.

PPS- there is also “High Level Assembly” https://en.m.wikipedia.org/wiki/High_Level_Assembly

Machine code doesn’t have mnemonics either. What’s your point? There are assembly syntaxes with equal signs (the Itanium, though ultimately a failure, was not obscure, and there are a handful of DSP ISAs that use this syntax). Not my cup of tea, but your argument is specious.

> Assembly is meant to be 1:1 with machine code,

And that is nonsense as well. RISC-V is a perfect example, as it has plenty of pseudo-ops. (Or do you actually think that the literal machine code bytes of an add instruction are 0x41 0x44 0x44, the ASCII codes for "ADD"? They’re not.)
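A quick Python sketch of the difference, assuming the standard RISC-V R-type layout (funct7, rs2, rs1, funct3, rd, opcode); `encode_add` is a name invented for this example.

```python
import struct


def encode_add(rd, rs1, rs2):
    """Encode RISC-V `add rd, rs1, rs2` (R-type) as a 32-bit word."""
    funct7, funct3, opcode = 0b0000000, 0b000, 0b0110011
    return (
        (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
        | (funct3 << 12) | (rd << 7) | opcode
    )


word = encode_add(1, 2, 3)               # add x1, x2, x3
machine_bytes = struct.pack("<I", word)  # little-endian, as stored in memory
ascii_bytes = b"ADD"                     # 0x41 0x44 0x44
```

Here `word` is `0x003100b3` and `machine_bytes.hex()` is `"b3003100"`, which has nothing in common with the mnemonic's spelling `"414444"`.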

Equal signs or not, macro assemblers have plenty of ergonomic conveniences layered above machine code.

How does an equal sign break any concordance with assembly and the underlying machine code anyway?

It isn’t my definition, so your entire rant here is just being argumentative and defensive for no reason. Some fringe ISAs having equal signs in their assembler doesn’t change anything either. Assembly is meant to map directly to the way the machine code is written and run, so [opcode] [operand(s)] makes perfect sense. Quality-of-life features and syntactic sugar beyond very simple things like variables (which don’t add some crazy AST or other abstractions that turn it into a compiler) do not make sense for the tool an assembler is.

> fringe ISAs

Itanium fringe? You’re clueless and have no credibility.

We’re talking about an = instead of a , … you’re needlessly bringing up “crazy ASTs”, so much for being argumentative.

And defensive?

You seem mixed up, I’m not the one even advocating for the damn things. But your argument is ignorant and foolish.

> Assembly is meant to map directly to the way the machine code is written

This is just false.

https://github.com/netwide-assembler/nasm/blob/master/asm/as... Simple?

> Itanium fringe? You’re clueless

Very fringe. A huge market failure. Hardware discontinued. Support removed in LLVM (2.6), and the Linux kernel (6.7). Still seems to be hanging on in GCC, though it was initially going to be deprecated in GCC 10.

One of the very few ISAs I've never actually seen a real machine of, let alone used. And I've worked professionally on i960 (Stratus fault-tolerant computer), which not many people can say. Not to mention of course PA-RISC and Alpha and Pr1me and DG Nova/Eclipse (and an M88000 PC) as well as common-as-mud (and which I own examples of) SPARC and MIPS.

That it was a failure is well known. But fringe, as in obscure (which is what I think is salient in this context)? I think that is misleading: that it has an official GCC backend to remove is a sufficient counterargument. Funny you mentioned PA-RISC and Alpha, as both were abandoned by HP and Compaq for it. Stupid (at least HP did backpedal, but not before burning a mountain of cash; Itanium was meant to replace PA-RISC), but not consistent with fringe. Getting it into the Linux kernel and GCC was a big deal at the time. Thousands of man-hours were expended on all of this, with the kind of investment and work that goes into the dominant architecture. Obviously Microsoft was hugely invested in it, but some there thankfully had better ideas and didn’t miss the x86-64 boat, since Itanium already stank years prior to its release. DEC Alpha was also, in the long run, a commercial failure. I wouldn't call it fringe though. PA-RISC was commercially middling, and I wouldn't call it fringe either (both were shitcanned for Itanic BTW, for which people are still bitter).

Well-known, yes, but fringe as in seldom seen.

Absolutely correct that mega-millions or billions went into it, both hardware and software, and that it FUD'ed many other ISAs to premature extinction.

It's not that hard to get an ISA into gcc and Linux -- you just need shipping hardware (or a convincing story that there will be) and someone to do the work. For example Andes (Taiwan) and C-Sky (China) both got their proprietary ISAs (nds32, c-sky) into GCC and the Linux kernel. And both switched to RISC-V not long after :_)

TI has an Assembly language for DSPs that feels like coding in C in SSA form.

It is an Assembly with equal signs.