by Keyframe 3619 days ago
I wish you and your project all the best. Hardware, and especially CPUs and the like, is tough and rare. We haven't seen many new competitors (if any) in that area, especially relevant ones.

When you say you rest your high hopes on the toolchain, aren't you a bit scared of what happened to Itanium? Intel had the toolchain under their R&D and it failed because they couldn't deliver. I'm interested to hear more about the mythical "sufficiently smart compiler" and how it relates to your architecture.


Based on our software results so far, I wouldn't say I'm scared, but I am definitely anxious. Since our main focus up to this point has been building the first test chip along with prototyping the software tools, we are still fairly early in compiling "real" libraries and small applications, but we're happy with the results. Now that we've taped out, we can devote more resources, and once we have real hardware, we will be able to test our applications ~1000x faster than with the cycle-accurate software simulation we have right now.

All that being said, we have good reason to believe that our approach is valid and won't suffer from the flaws of "Itanic" that I've mentioned on this page and many times elsewhere. Unlike any prior VLIW (Intel called their bastardized version implemented in Itanium "EPIC"), our hardware was built with an emphasis on hard real-time guarantees and strict determinism at every level of the design, which allows for a level of optimization that is impossible on any other architecture.

Basically, if the compiler has to make worst-case assumptions almost all the time to prevent control and data hazards (as it did on Itanium, due to a very convoluted design), how can you expect any compiler-generated program to be at all performant or efficient?

Does this mean that users have to recompile the world for every CPU generation because of microarchitectural changes? I.e., is the pipeline exposed? Are you planning a Mill-like intermediate-level bytecode?
Yes, and in certain cases even within the same generation of chip (e.g. the same microarchitecture but with fewer cores and/or less memory per core), as the compiler would need to remap the program and data locations based on the global address map. There is no problem if you compiled for a smaller number of cores or less memory per core and the program is run on a "bigger" chip.
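To make the recompile rule above concrete, here is a minimal sketch of it as a compatibility check. `ChipConfig`, its fields, and `can_run_without_recompile` are hypothetical names invented for illustration; nothing in this thread describes how the actual toolchain represents targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipConfig:
    """Hypothetical description of a compile target or a physical chip."""
    microarch: str          # microarchitecture generation identifier
    cores: int              # number of cores
    mem_per_core_kib: int   # local memory per core, in KiB


def can_run_without_recompile(compiled_for: ChipConfig, chip: ChipConfig) -> bool:
    """Return True if a binary built for `compiled_for` can run on `chip`.

    Encodes only the rule stated in the reply: a different microarchitecture
    always needs a recompile; within one microarchitecture, a binary built
    for fewer cores / less memory per core runs on a "bigger" chip, but not
    the other way around.
    """
    if compiled_for.microarch != chip.microarch:
        return False
    return (compiled_for.cores <= chip.cores
            and compiled_for.mem_per_core_kib <= chip.mem_per_core_kib)
```

For example, under these assumptions a binary built for a 16-core part would pass the check on a 64-core part of the same generation, while the reverse would require a recompile so the compiler can remap code and data against the smaller global address map.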

It is a very simple pipeline, and we expose the exact latencies required for all operations, along with things like branches with delay slots. As I have mentioned ad infinitum, determinism is a key part of our architecture, and having a fixed pipeline is necessary. Plus, we want to give anyone crazy and skilled enough to hand-write assembly the freedom to be crazy ;)
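As a rough illustration of what exposed, fixed latencies buy the compiler, here is a toy static scheduler. It is a sketch under loud assumptions: the opcodes, the latency values in `LATENCY`, and the single-issue in-order model are all made up for the example (a real VLIW issues several operations per cycle, and delay slots are not modeled here).

```python
# Hypothetical latencies in cycles; the real values are not given in the thread.
LATENCY = {"load": 3, "mul": 2, "add": 1}


def schedule(ops, latency=LATENCY):
    """Assign an issue cycle to each op for an in-order, single-issue pipeline.

    `ops` is a list of (opcode, dest, sources) tuples in program order.
    Because every latency is fixed and known at compile time, the compiler
    (not the hardware) decides exactly how many NOPs precede each op.
    Returns a list of (cycle, opcode, dest) tuples with the NOPs filled in.
    """
    ready_at = {}   # register name -> first cycle its value can be read
    out = []
    cycle = 0
    for opcode, dest, sources in ops:
        # Earliest cycle at which all source operands are available.
        earliest = max([ready_at.get(s, 0) for s in sources], default=0)
        while cycle < earliest:
            out.append((cycle, "nop", None))
            cycle += 1
        out.append((cycle, opcode, dest))
        ready_at[dest] = cycle + latency[opcode]
        cycle += 1
    return out


program = [
    ("load", "r1", ["r0"]),
    ("load", "r2", ["r0"]),
    ("mul", "r3", ["r1", "r2"]),
    ("add", "r4", ["r3", "r1"]),
]
for cycle, opcode, dest in schedule(program):
    print(cycle, opcode, dest or "")
```

With these made-up latencies the four operations occupy seven cycles, three of them NOPs. The point is only that the stall count is known exactly at compile time, so a smarter scheduler could fill those slots with independent work instead of padding for a worst case.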

For the applications we are targeting (HPC and DSP-like stuff), source code is always available, there are very long periods between recompiles forced by source code changes, and optimization is a key factor. Our customers don't merely accept recompiling for every new generation of hardware; they expect it and want to take advantage of any new improvements the compiler is able to make.