Hacker News
by sfrigon 1948 days ago
What stops major SoC designers from coming up with a standard "programmable IOs" interface for all their IOs, a little bit like these PIOs, instead of shipping hundreds of flavors of the same CPU with different IO options? I guess it's more expensive to design & manufacture a truly general-purpose IO, but doesn't the cost of warehousing and the risk of not having a market for that specific SoC outweigh the initial cost? It would also lower the number of pins on the SoC. E.g. You only want HDMI & SATA? Here's the VHDL for it, and you can even individually select the pins you want to use.
3 comments

> You only want HDMI & SATA

SATA is a 6Gbps port, while HDMI is a 10Gbps port.

The PIO ports discussed here are on the order of 100kHz, roughly 100,000 times slower than HDMI and 60,000 times slower than SATA.

> HDMI is a 10Gbps port.

> The PIO ports discussed here are on the order of 100kHz

PIO runs at the system clock: 125 MHz by default, and overclocks of over 400 MHz have been reported stable. A single PIO can clock out 32 bits every cycle (with a DMA and a memory system that can feed this), giving you a total of 4 Gbps.

Running at full throttle like that, especially for a decent length of time, is tricky if you want to actually do anything other than blast out bits, but 16 or 8 bits per cycle is a lot more straightforward, so 2 or 1 Gbps.
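To make the arithmetic above concrete, here is a minimal sketch of the peak-rate calculation. The function name is made up for illustration, and it assumes the state machine really does shift out a fixed number of bits on every system clock cycle, which ignores DMA and bus contention.

```python
def pio_throughput_bps(sys_clk_hz: float, bits_per_cycle: int) -> float:
    """Peak bit rate if bits_per_cycle bits leave the chip on every clock.

    Hypothetical helper: an upper bound only, not a measured figure.
    """
    return sys_clk_hz * bits_per_cycle


SYS_CLK_HZ = 125e6  # default system clock quoted in the comment above

for bits in (32, 16, 8):
    gbps = pio_throughput_bps(SYS_CLK_HZ, bits) / 1e9
    print(f"{bits:2d} bits/cycle at 125 MHz -> {gbps:.0f} Gbps")
```

The same function with a 400 MHz clock shows what the reported overclocks would buy you on paper, though sustaining that is a different matter.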

DVI output has already been demonstrated, running two displays at 480p: https://github.com/Wren6991/picodvi

The DVI is maybe more of a party trick than something you'd do in production hardware but it does demonstrate how capable the PIO can be. You could happily implement the same concept in a more performant device and reach 10 Gbps or more in a reasonable way.
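For a sense of what 480p DVI asks of the pins, here is a rough estimate. It assumes the standard 640x480 at 60 Hz timing (25.175 MHz pixel clock), three TMDS data lanes, and 10 line bits per 8-bit symbol; the exact clock a given project runs at may differ slightly, and the function name is hypothetical.

```python
TMDS_BITS_PER_SYMBOL = 10  # TMDS encodes each 8-bit value as 10 line bits
TMDS_DATA_LANES = 3        # one lane per colour channel (clock lane excluded)


def dvi_line_rate_bps(pixel_clock_hz: float) -> tuple[float, float]:
    """Return (per-lane, total) serial bit rate for a DVI link.

    Hypothetical helper based on the assumptions stated above.
    """
    per_lane = pixel_clock_hz * TMDS_BITS_PER_SYMBOL
    return per_lane, per_lane * TMDS_DATA_LANES


# Assumed pixel clock for 640x480 at 60 Hz.
per_lane, total = dvi_line_rate_bps(25.175e6)
print(f"per lane: {per_lane / 1e6:.2f} Mbps, one display: {total / 1e6:.2f} Mbps")
print(f"two displays: {2 * total / 1e6:.2f} Mbps")
```

So each display needs roughly 250 Mbps per lane under these assumptions, which is well within the 1 to 4 Gbps range discussed above but a long way from 10 Gbps.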

I was skeptical at first, but read the GitHub and yeah, not only does it work, it passes eye mask tests. Crazy.

> The PIO ports discussed here are on the order of 100kHz

That's not what the datasheet[1] says:

> When outputting DPI, PIO can sustain 360 Mb/s during the active scanline period when running from a 48 MHz system clock. In this example, one state machine is handling frame/scanline timing and generating the pixel clock, while another is handling the pixel data, and unpacking run-length-encoded scanlines.

Still not SATA speeds though.

[1]: https://datasheets.raspberrypi.org/rp2040/rp2040-datasheet.p...
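A quick back-of-the-envelope check of those figures, as a sketch only: it takes the 360 Mb/s at 48 MHz number from the datasheet quote and the 6 Gbps SATA figure from upthread, and the helper names are invented for illustration.

```python
def bits_per_clock(bit_rate_bps: float, sys_clk_hz: float) -> float:
    """Average number of bits moved per system clock cycle."""
    return bit_rate_bps / sys_clk_hz


def shortfall_factor(target_bps: float, achieved_bps: float) -> float:
    """How many times faster the target link is than the achieved rate."""
    return target_bps / achieved_bps


DPI_RATE_BPS = 360e6  # datasheet figure quoted above
SYS_CLK_HZ = 48e6     # system clock in that datasheet example
SATA_RATE_BPS = 6e9   # SATA line rate as stated earlier in the thread

print(f"{bits_per_clock(DPI_RATE_BPS, SYS_CLK_HZ):.1f} bits per clock")
print(f"SATA is ~{shortfall_factor(SATA_RATE_BPS, DPI_RATE_BPS):.1f}x faster")
```

That works out to 7.5 bits per clock in the DPI example, and a gap of under 20x to SATA's line rate rather than tens of thousands.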

It actually is possible to get HDMI on the RP2040, if you're willing to have lower resolution.

https://hackaday.com/2021/02/12/bitbanged-dvi-on-a-raspberry...

Wow that's nice.

Even though I did not intend to say we should have SATA and HDMI on the RP2040 itself (I didn't know if it was possible), it still proves that having realtime control over the IOs opens the door to far more functionality than SPI/I2C/UART-specific ports, all of it using the same SoC and potentially fewer pins.

Having the same level of control on any device would be beneficial in my opinion.

Right, but I didn't mean for this specific device. I meant for higher-end SoCs, as in the Raspberry Pi 4/BeagleBone Black or even higher-end SoCs. The RP2040 could be used as an inspiration to provide the same level of freedom to other SoCs.

My overall point is that GHz-speed decoding in a flexible manner seems... difficult... to say the least.

Your discussion point of "here's a VHDL block" seems to understand the general issue. You need a non-trivial amount of FPGA-magic (LUTs) to implement logic and routing at those speeds. SATA has some kind of error-correction code if I remember correctly... so it's not exactly easy to parse those messages.

I absolutely agree that this would be non-trivial and a lot of magic is required to make it happen. But it would need to be done only once; after that it can be shared with all users, a little bit like a GPU firmware/driver.

I am just surprised that this is not more widespread among major players as a way to reduce costs and increase flexibility. Though I'm pretty sure I'm overlooking the core of the issue here haha

I think one major factor is that it would increase unit costs in many cases. We are talking cheap chips that are sold in high volume. So any small increase in cost gets multiplied quickly. Combine that with competition (your competitor provides a less flexible chip, but it is $0.20 cheaper and has the IO ports you need), and you can see why we have the mess we do.

I think in many cases the flexibility is great during the prototype phase. But those are used at lower volume. When you move to production, you'd want to have the cheapest BOM possible.

Well actually, it would be nice to replace any specific interface with a generic FPGA-like interface. But of course what you can implement with it would be limited by the speed of the CPU / peripheral.

What stops them? Mostly business considerations.

They differentiate their prices according to features, and so they extract more money. And once you've written your code for specific peripherals, it's harder to switch to another MCU.

And they have large libraries of proven hardware peripherals and code, which make it harder for competitors to enter. Why would they want to compete with open-source PIO libraries?

The Raspberry Pi Foundation doesn't care about all that. So they created this chip.

You sort of have that with generic SERDES blocks.