Hacker News new | ask | show | jobs
by wallacoloo 3647 days ago
One-instruction computers intrigue me because of the possibility of making the processor out of a ridiculously low number of transistors. To that effect, I've always wanted to implement one that doesn't require a hardware adder for the instruction pointer. It seems like it should be possible. For example, each clock cycle the CPU would take the NAND of the bit pointed to by the IP and the bit adjacent to it (via flipping the LSB of the IP) and use the result to determine a new value of the IP using very simple logic (E.g. IP = IP<<1 | NAND(mem[IP], mem[IP ^ 1]) ). But surprisingly, I've never seen this done and the last time I approached it I didn't have the knowledge needed to choose an appropriate scheme and prove its usability.
2 comments

A few processors, such as the TI TMS1000 and the Sharp SM4, used a maximal-length LFSR for the program counter. This may be more practical than it sounds, if your memory subsystem doesn't penalize out-of-order accesses. You could write code normally, pretending that instructions are at contiguous addresses, and then scramble the object code into LFSR sequence at link time.

As another note, for low gate count processors, you may want to look into bit-serial techniques, as used in the PDP-8/S.

"used a maximal-length LFSR for the program counter."

I found Wikipedia entry on MLS-LFSR. How do those end up useful in program counters?

"you may want to look into bit-serial techniques"

You talking about the 1-bit ALU's like these?

https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Comb/oneb...

Making a comeback in another field:

http://www.caesjournals.org/uploads/IJCAES-CSE-2012-132.pdf

You wouldn't normally use an LFSR as a program counter, because it's more convenient to have the instructions execute from address 1, 2, 3, 4, etc rather than (say) 79, 30, 61, 122, etc. But an LFSR has fewer gates than an adder, which is why it was used in those old 4-bit processors.

The last diagram in the page you linked shows three 1-bit ALUs chained together to make one 3-bit ALU. As I understand it, the PDP-8/S used a single 1-bit ALU to sequentially calculate each bit in a 12-bit output word, taking 12 cycles to emulate a 12-bit ALU. For the 3-bit example, imagine there being a single ALU which acts first as the rightmost ALU in the diagram, and then once you have the carryout bit it acts as the middle ALU, and finally it acts as the leftmost ALU.

Appreciate the tip.
You might consider something like the IBM 650, where each instruction includes the address of the next instruction to execute.