by johnwbyrd
1025 days ago
Seems like your problem is more with the venerable 6502 itself than with the compiler. Most of that assembly code is spent calculating offsets inside the Ball struct, which must be done at 16 bits of precision in every case. The compiler is using the 6502's indirect indexed addressing mode (a zero page pointer plus a Y offset) to get at all the fields in your struct. It has placed all the variables in zero page, so no instruction is more than two bytes long; additionally, the code in question is entirely linear, with no JSRs or other subroutine calls. Note in particular how it efficiently uses DEY/INY pairs of one-byte instructions to get at the low and high bytes of 16-bit values in memory. Hand-written assembly might be speedier, but not by much if it still has to deal with all the corner cases that your generated code handles. "While writing Apple BASIC for a 6502 microprocessor I repeatedly encountered a variant of Murphy's Law. Briefly stated, any routine operating on 16 bit data will require at least twice the code that it should." -Steve Wozniak
On the 6502, to get optimal assembly you'd have to lay out your data as a structure of arrays instead of an array of structures, and use indices instead of pointers as much as possible (at least when the number of Ball objects is < 256).
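To make the layout difference concrete, here is a rough C sketch. The original Ball struct isn't shown in this thread, so the field names, `MAX_BALLS`, and both `move_*` functions are made up for illustration. The idea is that with one byte array per field and a `uint8_t` index, a 6502 compiler has a chance to use plain absolute indexed addressing (something like `LDA ball_x_lo,X`) instead of building a 16-bit pointer for every field access. Whether a given compiler actually emits that is not guaranteed.

```c
#include <stdint.h>

#define MAX_BALLS 64 /* hypothetical count; must stay small enough for an 8-bit index */

/* Array-of-structures: every field access goes through a 16-bit pointer. */
struct Ball {
    uint16_t x;  /* hypothetical fields, not the original poster's struct */
    int8_t dx;
};

static void move_aos(struct Ball *balls, uint8_t count) {
    for (uint8_t i = 0; i < count; i++) {
        balls[i].x = (uint16_t)(balls[i].x + balls[i].dx);
    }
}

/* Structure-of-arrays: one byte array per field, with the 16-bit x
   split into low and high bytes so each array is indexed by i directly. */
static uint8_t ball_x_lo[MAX_BALLS];
static uint8_t ball_x_hi[MAX_BALLS];
static int8_t ball_dx[MAX_BALLS];

static void move_soa(uint8_t count) {
    for (uint8_t i = 0; i < count; i++) {
        /* Reassemble x, add the signed velocity, and split it again. */
        uint16_t x = (uint16_t)(ball_x_lo[i] | (ball_x_hi[i] << 8));
        x = (uint16_t)(x + ball_dx[i]);
        ball_x_lo[i] = (uint8_t)(x & 0xFF);
        ball_x_hi[i] = (uint8_t)(x >> 8);
    }
}
```

Both functions compute the same thing; only the memory layout differs. Splitting the 16-bit field into separate low and high arrays is what keeps the index a single unscaled byte.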