| Thanks for the info these all certainly sound like promising developments though I still think there's a major hurdles to overcome. > good PPA with popular backend tools Getting good PPA for any given thing you can express in the language is only part of the problem. The other aspect is how easy does the language make it to express the thing you need to get the best PPA (discussed in example below)? > Think of it like a source map that allows you to jump back and forth between the final System Verilog and the source HDL. This definitely sounds useful (I wish synthesis tools did something similar!) but again it's only part of the puzzle here. It's all very well to identify the part of the HDL that relates to some physical part of the circuit but how easy is it to go from that to working out how to manipulate the HDL such that you get the physical circuit you want? As a small illustrative example here's a commit for a timing fix I did recently: https://github.com/lowRISC/opentitan/commit/1fc57d2c550f2027.... It's for a specialised CPU for asymmetric crypto. It has a call stack that's accessible via a register (actually a general stack but typical used for return addresses for function calls). The register file looks to see if you're accessing the stack register, in which case it redirects your access to an internal stack structure and when reading returns the top of the stack. If you're not accessing the stack it just reads directly from the register file as usual. The problem comes (as it often does in CPU design) in error handling. When an error occurs you want to stop the stack push/pop from happening (there's multiple error categories and one instruction could trigger several of them, see the documentation: https://opentitan.org/book/hw/ip/otbn/index.html for details). Whether you observed an error or not was factored into the are you doing a stack push or pop calculation and in turn factored into the mux that chose between data from the top of the stack and data from the register file. The error calculation is complex and comes later on in the cycle, so factoring it into the mux was not good as it made the register file data turn up too late. The solution, once the issue was identified, was simple, separate the logic deciding whether action itself should occur (effectively the flop enables for the logic making up the stack) from the logic calculating whether or not we had a stack or register access (which is based purely on the register index being accessed). The read mux then uses the stack or register access calculation without the 'action actually occurs' logic and the timing problem is fixed. To get to this fix you have two things to deal with, first taking the identified timing path and choosing a sensible point to target for optimization and second actually being able to do the optimization. Simply having a mapping saying this gate relates to this source line only gets you so far, especially if you've got abstractions in your language such that a single source line can generate complex structures. You need to be able to easily understand how all those source lines relate to one another to create the path to choose where to optimise something. Then there's the optimization itself, pretty trivial in this case as it was isolated to the register file which already had separate logic to determine whether we were actually going to take the action vs determine if we were accessing the stack register or a normal register. Because of SystemVerilog's lack of powerful abstractions making a tweak to get the read mux to use the earlier signal was easy to do but how does that work when you've got more powerful abstractions that deal with all the muxing for you in cases like this and the tool is producing the mux select signal for you. How about where the issue isn't isolated to a single module and spread around (e.g. see another fix I did https://github.com/lowRISC/opentitan/commit/f6913b422c0fb82d... which again boils down to separating the 'this action is happening' from the 'this action could happen' logic and using it appropriately in different places). I haven't spend much time looking at Chisel so it may be there's answers to this but if it gives you powerful abstractions you end up having to think harder to connect those abstractions to the physical circuit result. A tool telling you gate X was ultimately produced by source line Y is useful but doesn't give you everything you need. > the combination of Chisel and CIRCT offers a unique solution to a deeper problem than dealing with minor annoyances in System Verilog: capturing design intent beyond the RTL
> you could add information about bus interfaces directly in Chisel, and have a single source of truth generate both the RTL and other collateral like IP-XACT. Your example here certainly sounds useful but to me at least falls into the bucket of annoying and tedious tasks that won't radically alter how you design nor the final quality and speed of development. Sure if you need to generate IP-XACT for literally thousands of variations of some piece of IP this kind of things is essential but practically you have far fewer variations you actually want to work with and the manual work required is annoying busy work that will generate some issues but you can deal with it. Then for the thousand of variations case the good old pile o' python doing auto-generation can work. Certainly having a solution based upon a well designed language with a sound type system sounds great and I'll happily have it but not if this means things like timing fixes and ECOs become a whole lot harder. Thanks for the link to the video I'll check it out. Maybe I should make one of my new year's resolution to finally get around to looking at Chisel and CIRCT more deeply! Could even have a crack at toy HDL in the form of the fixed SystemVerilog with a decent type system solution I proposed above using CIRCT as an IR... |
This is the exact type of activity that CIRCT is trying to make easier! There are both enough core hardware dialects that new languages (generator-style embedded domain specific languages or actual languages) can be quickly built as well as the flexibility of MLIR to define _new_ dialects that represent the constructs and type system of the language you are trying to build while still inter-operating with or lowering to existing dialects.
This was the kind of thing that didn't work well with Chisel's FIRRTL IR as it was very closely coupled to Chisel and it's opinions. Now FIRRTL is just another CIRCT dialect and, even if you're not using Chisel and FIRRTL, you're benefitting from the shared development of the core hardware dialects and SystemVerilog emission that Chisel designs rely on.