Hacker News new | ask | show | jobs
by astrobe_ 2174 days ago
> Someone measured the dispatch overhead for fetching the next word in FIG-Forth for the 6502 at over 80 cycles. STC only needs 12 for a JSR/RTS pair and potentially 0 if the word is inlined.

I have implemented STC with cod inlining for 8086 a long time ago, as well as DTC and bytecode, in assembly, C and Forth (sic), so I did my share of cycle counting.

Sorry if I project my younger self on you, but cycle count is not the best approach to factoring. Factoring in Forth is about spotting redundancies. This is a bit different from what people mean today by "(re)factoring", which is more about how one splits a task, a program, into functions or classes. This should be called "restructuring" instead. Refactoring in Forth is really a compression process, even when you could not care less about the size of your object code.

The best approach is to write your definitions the way you like, mostly ignoring speed and cycles and byte count. This should result in clean, elegant definitions. This first step reveals some short words like your @+1 that could be good candidates for an implementation in assembler. Some other words can be inlined using IMMEDIATE. Some others can be de-factored as a result of this refactoring-for-speed process. Doing it that would have made you realize t0_ppp is always 1, just like t0 is in the C version.

The Python basis for this port needs some optimizations to begin with. If I am not mistaken, all tiles are square. No need for a t_height or a t_width. No need for both in the tiles structure either.

The C port of the Python program can be improved. Tiles are defined, then there's a big array of pointers to those tiles. This is necessary because the tile data does not have a constant length, but it seems to me preferable to have an array of structures defining the geometry and colors of the tiles, and a pointer to the pixel data. The size is the same, just a different layout, but I expect that delaying the indirection from fetching the tile to fetching the data of the tile to be more convenient for the code.

Spotting unneeded complexity is harder than one would expect. Past the point of complexity being the consequence of laziness or lack of skill or lack of time, you find out that complexity also results from (bad) habits. You need to configure a bunch of paths for your application, you start writing routines to read them from an initialization file. Then after a while you realize you are using an interpreted language, so you could have sourced the configuration file directly, duh. I'm still making that kind of mistake after years of practice.

Simplification is a progressive process. Jeff Fox explains it better than I would in the third chapter of his essay on Forth [0].

> we are stuck in the same morass where we can't tell what anything takes or returns

I am puzzled to hear this complain from someone who can program in assembly. Forth and assembly are both un-typed, "no declaration needed" languages. What makes those properties acceptable in asm but not in Forth? Your expectations of Forth being a higher level language maybe?

In any case, all I can do is tell you that with practice, it is not as a big deal as you think, just like the weird symbols in APL are not a big deal, just like parenthesis in Lisp are not a big deal.

> Would you like to rewrite it as an example?

Sorry but no. I have no motivation to do that - I dislike coding challenges and prefer to code things that are actually useful for me - and nothing to gain from it.

[0] http://www.ultratechnology.com/forth.htm

1 comments

> The best approach is to write your definitions the way you like, mostly ignoring speed and cycles and byte count. This should result in clean, elegant definitions. This first step reveals some short words like your @+1 that could be good candidates for an implementation in assembler.

Yikes! When I get into debates about this, writing part of the program in assembly usually comes up eventually. This is a good example of how inefficient the system is when you have to write assembly to do what would be *ptr++ in C. Another example is something like "swap 5 + swap" which burns all kinds of cycles. Someone recommended I rewrite this in assembly as well although it would only be something like "x+=5" in C. The absurdity of that speaks for itself.

>The Python basis for this port needs some optimizations to begin with. If I am not mistaken, all tiles are square. No need for a t_height or a t_width. No need for both in the tiles structure either.

Sure, but storing those as one value instead of two would also slightly speed up the C and assembly versions. Forth would still be just as slow and inefficient compared to the other languages if you made that change.

> The C port of the Python program can be improved. Tiles are defined, then there's a big array of pointers to those tiles. This is necessary because the tile data does not have a constant length, but it seems to me preferable to have an array of structures defining the geometry and colors of the tiles, and a pointer to the pixel data. The size is the same, just a different layout, but I expect that delaying the indirection from fetching the tile to fetching the data of the tile to be more convenient for the code.

No, the color data varies as well since some tiles have no color pairs and the rest have a varying number. Also, different color pairs can be applied to the same tile depending on the situation, so there is no 1:1 correspondence that would make your suggestion make sense here. In any case, adding one level of indirection to save memory and keep the system organized by passing an 8-bit index instead of a 16-bit pointer adds a few dozen cycles to a function that takes around 20,000 to draw the tile. Do you mean that if the C and Forth versions were reorganized, there wouldn't be such a large speed disparity between the two? That is highly doubtful.

>I am puzzled to hear this complain from someone who can program in assembly. Forth and assembly are both un-typed, "no declaration needed" languages. What makes those properties acceptable in asm but not in Forth? Your expectations of Forth being a higher level language maybe?

The original criticism is that you can't tell what anything takes or returns. If you look at the assembly, you'll see that the function calls there are just as clear as in C since we have named arguments and can instantly see what happens with the return value after the subroutine call. It's very telling that it's a lot easier to implement a usable scheme for local variables in assembly, primitive as it is, while Forth still lacks this.

> Sorry but no. I have no motivation to do that - I dislike coding challenges and prefer to code things that are actually useful for me - and nothing to gain from it.

Fair enough. How about an existing example you could point to then? There are a lot of unbelievable claims about the performance of Forth compared to C and assembly, but no one seems to be able to prove any of this. That's one of the main motivations for doing this project.