by LuciOfStars 438 days ago
No AI or OCR here, just an idiot with too much free time. After seeing that the AltaBASIC source posted was a scan of paper tape, I thought having it in a digital document proper would be good for preservation and novelty's sake. I've never written Assembly, and I'm not a typography expert! Feel free to point and laugh at any silly mistakes. Let's see how many days it takes!
9 comments

It's not "paper tape" (that's a digital storage medium), just a printout. And the numbers in the left half are not actually part of the source file, they are line numbers and machine code output produced by the assembler. You'd probably better not waste time transcribing them.

Not trying to dissuade you, but here are some things you should consider:

• Turn off your spell checker, it will only make this more difficult! It certainly won't help with the code itself, and it seems like you want to reproduce everything perfectly, including typos in the comments.

• I'd strongly suggest at the very least becoming a bit familiar with 8080 assembly language before attempting this.

• The tools used to produce this output add another layer of complications. They used the PDP-10 system's assembler with a set of macros to adapt it to generate 8080 code, so it's using somewhat different syntax and directives than those of 8080-native assemblers (like the ones from Intel or Digital Research).

• Some characters are hard to read, and without knowledge of the context and at least some of the PDP-10-specific syntax it will be impossible to just guess. E.g. decimal numbers are sometimes prefixed with '^D', and octal numbers with '^O', which look quite similar in this scan. The 'RADIX' directive changes the default for when there is no such prefix. It should be 10 for most of the listing, but I think that it does start out as octal. Memory addresses will be octal (like 'RAMBOT==^O20000' in line 13); ASCII characters could be either, but they seem to prefer decimal for those ('^D13' is CR, '^D10' is LF).
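
To make the radix point concrete, here is a minimal Python sketch of how such tokens could be interpreted. The function name `parse_number` and its `default_radix` parameter are made up for illustration; this only models the '^D' / '^O' prefixes and the 'RADIX' default described above, not the real assembler's full syntax.

```python
def parse_number(token: str, default_radix: int = 10) -> int:
    """Interpret a numeric token with an optional ^D (decimal) or ^O (octal) prefix.

    Hypothetical helper: default_radix stands in for whatever the most
    recent RADIX directive selected.
    """
    prefix = token[:2].upper()
    if prefix == "^D":
        return int(token[2:], 10)
    if prefix == "^O":
        return int(token[2:], 8)
    # No prefix: fall back to the current default radix.
    return int(token, default_radix)


print(parse_number("^D13"))     # 13, carriage return
print(parse_number("^D10"))     # 10, line feed
print(parse_number("^O20000"))  # 8192
print(parse_number("^D20000"))  # 20000
```

The last two lines show why misreading a smudged 'O' as a 'D' matters: the same digits give 8192 in one case and 20000 in the other.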

There are PDP-10 emulators with well-maintained copies of the different operating systems for them, so someone could check that the typed up source can be assembled.
I don't think it would help. As far as I can tell, the source doesn't include the macros needed to actually perform an assembly.
I do have some (surface-level) experience with GB/GBC assembly, but other than that I'm new. As for the spell checker, I've figured out how to rid myself of that. And the paper tape mix-up was just my inexperience.

All super interesting info!

If you've spent 7 days so far and got as far as "F3", I'm wondering if this is actually some kind of elaborate troll. At the rate of one byte per week, you might finish transcribing the 4K ROM within 8 decades.
Oh haha, just seen the reply below mine that has a link to the photos of the printout...

Interestingly, the large F3 appears to be a batch number for the print job, so it's just coincidental that it's the same as the first byte in the ROM. The first byte starts on line 732 and is printed in octal, not hex, so it reads "000363" (the first byte of the output seems to be some kind of markup for addresses in multi-byte opcodes; immediate loads seem to have 000 for these).
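
For anyone who wants to check the octal/hex coincidence, a quick Python sketch follows. The helper name `octal_field_to_hex` is made up, and treating the whole six-digit field as one octal number and keeping only its low byte is my assumption about the layout, not something the listing documents.

```python
def octal_field_to_hex(field: str) -> str:
    """Convert an octal listing field to a two-digit hex byte.

    Assumption: the low 8 bits of the field hold the byte value and the
    leading digits are the markup mentioned above.
    """
    value = int(field, 8)
    low_byte = value & 0xFF
    return format(low_byte, "02X")


print(octal_field_to_hex("000363"))  # F3
```

Octal 363 is 3*64 + 6*8 + 3 = 243, which is F3 in hex.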

I'm just impressed nobody peeled away the perforations on this printout:

https://images.gatesnotes.com/12514eb8-7b51-008e-41a9-512542...

No kidding. That was always the first thing I did after picking up a printout.
Hard to be certain, but those edges look non-perforated to me.
I agree. I don't think they are perforated.
I had to do this in the 80s when I made a mistake that resulted in the file being erased. The fun doesn't begin until you start debugging it.

Actually this happened twice: once, someone else's code but I had the listing. Another, my own code but no listing.

You're not an idiot. I have transcribed several listings manually. I also tried OCR. In my experience, fixing OCR mistakes takes at least as much work as just typing the whole thing.

I recruited a team to transcribe a particularly important document, and I had us type each page to catch typos.

> Let's see how many days it takes!

I reckon there are ~40,000 lines in the printout. So: ~10 years.
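
The back-of-the-envelope numbers in this thread are easy to check. Here is a rough Python sketch, where `years_needed` is a made-up helper and the inputs are assumptions: ~40,000 lines at about 10 lines a day, and a 4K ROM taken as 4096 bytes at one byte per week.

```python
def years_needed(total_units: float, units_per_day: float) -> float:
    """Estimate the years required at a steady daily rate (365-day years)."""
    days = total_units / units_per_day
    return days / 365


# ~40,000 lines at roughly 10 lines a day.
print(round(years_needed(40_000, 10), 1))

# 4K ROM (assumed to be 4096 bytes) at one byte per week.
print(round(years_needed(4096, 1 / 7), 1))
```

Both results land close to the figures quoted in the comments: about a decade for the lines, and just under eight decades for the bytes.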

Best of luck. I hope that you don't fizzle out at some point, but I suppose 10 lines a day isn't too daunting a task.

I wonder if MS has this code somewhere in a digital vault.

I'm having a struggle with this computer, it loves to correct Altair to Alta and I keep forgetting to correct it. Whoops.
Let's hope it doesn't autocorrect too much of the asm...
You can add an exclusion or exception to your spell checker.
I figured it out! Never using a Mac again. Hopefully my school gives me a different computer next year.
If I believed in God, I'd say you're doing God's work.