by sennight
1871 days ago
It was eating characters, which is the second worst thing it could do. Digging into the source, I found a really wonky software buffer... which kind of blew my mind - since hardware buffers and flow control exist. I don't remember if that was a case where python was somehow in the pipeline, but that was unfortunately very common in OpenBMC. I also had a lot of trouble getting the thing to honor the most basic settings, like baud rate. But it is harder to confidently place the blame on that one. Is it OpenBMC's fault that the software provides no clue as to where things are going wrong in the serial plumbing? Is it the AST2500's fault that designers are routing every serial line through it and making it impossible to eliminate as the cause while troubleshooting? Is it IBM's fault, because the design of OPAL (OpenPower Abstraction Layer) encourages construction of this kind of Rube Goldberg machine? If anyone from IBM is listening: don't route all your serial lines through the BMC on your reference designs... system implementers will just copy it.
Okay, that's interesting. Did you file a bug with your system's firmware provider or with upstream?
You can file it against obmc-console at https://github.com/openbmc/obmc-console/issues
> Digging into the source, I found a really wonky software buffer... which kind of blew my mind - since hardware buffers and flow control exist.
Yep, it's how OpenBMC provides the console via IPMI, Redfish, SSH and on the command line, as well as routing the console out to the connector on the rear of the chassis.
> I don't remember if that was a case where python was somehow in the pipeline, but that was unfortunately very common in OpenBMC.
Generally there hasn't been any Python in the console handling pipeline. That said, Python, while especially slow on a BMC, was pretty important for getting the project off the ground.
At this point OpenBMC has been Python-free for several years.
> I also had a lot of trouble getting the thing to honor the most basic settings, like baud rate
There's some nuance to this, as depending on your system design the console may be coming from the host to the BMC via Aspeed's Virtual UARTs (VUARTs). The way the VUARTs work is the BMC and host are connected to either end of the two FIFOs between the UARTs' APB and LPC/eSPI interfaces. As such there's no baud rate as no data is clocked out in RS-232 fashion - the data is transferred as quickly as either side can access their register interfaces.
This has caused issues in the past with flow control that led to data integrity issues:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
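To make the "no baud rate, so flow control matters" point concrete, here's a toy model of one direction of such a FIFO. It is only a sketch: the `Fifo` class, the `transfer` helper, the depth of 16 and the `writes_per_read` ratio are all made up for illustration and don't reflect the real Aspeed hardware or the kernel driver. The writer pushes bytes as fast as it likes; if it ignores the "FIFO full" condition, bytes get dropped, which is what "eating characters" looks like from the other end.

```python
from collections import deque


class Fifo:
    """Bounded byte FIFO; a hypothetical stand-in for one VUART direction."""

    def __init__(self, depth=16):  # depth is an assumed value, not a hardware fact
        self.depth = depth
        self.buf = deque()
        self.dropped = 0

    def has_room(self):
        return len(self.buf) < self.depth

    def push(self, byte):
        # A write into a full FIFO is lost; nothing paces the writer.
        if not self.has_room():
            self.dropped += 1
            return False
        self.buf.append(byte)
        return True

    def pop(self):
        return self.buf.popleft() if self.buf else None


def transfer(data, depth, writes_per_read, honor_flow_control):
    """Move data through a Fifo where the writer is writes_per_read (>= 1)
    times faster than the reader. Returns (received bytes, dropped count)."""
    fifo = Fifo(depth)
    pending = deque(data)
    received = bytearray()
    while pending or fifo.buf:
        for _ in range(writes_per_read):
            if not pending:
                break
            if honor_flow_control and not fifo.has_room():
                break  # writer backs off until the reader drains a byte
            fifo.push(pending.popleft())
        byte = fifo.pop()  # reader drains one byte per step
        if byte is not None:
            received.append(byte)
    return bytes(received), fifo.dropped


if __name__ == "__main__":
    data = bytes(range(100))
    for honor in (True, False):
        got, dropped = transfer(data, depth=4, writes_per_read=3,
                                honor_flow_control=honor)
        print(f"flow control={honor}: received {len(got)} bytes, dropped {dropped}")
```

With flow control honored, everything arrives intact regardless of how much faster the writer is. Without it, the loss depends only on the relative access speeds of the two sides, since there is no line rate to pace anything.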
> But it is harder to confidently place the blame on that one. Is it OpenBMC's fault that the software provides no clue as to where things are going wrong in the serial plumbing? Is it the AST2500's fault that designers are routing every serial line through it and making it impossible to eliminate as the cause while troubleshooting?
I agree it can be difficult to debug where things are going wrong in the console pipeline.
In concept a serial console should be simple, but when you bring in requirements like various kinds of Serial over LAN (SoL), it starts to get more complex.