by lunfard000 772 days ago
Was it a secret? You could have guessed that something advertised [0] for "AI" had some kind of SIMD. Even ChatGPT 3.5 can give relevant code to use "AI" features [1].

0: https://www.espressif.com/en/products/socs/esp32-s3

1: https://chat.openai.com/share/3e1f990d-e8eb-4e56-acbb-ad5a33...


Not a secret - just not documented very well if at all.

We all knew there were SIMD instructions, but if there’s no information on how to use them or what they do…

IIRC, they have 128-bit alignment requirements, so they're tricky to autovectorize.
True - load and store mask off the bottom 4 bits of the address. They try to help the situation by including an instruction which can shift a pair of 128-bit registers by bytes.
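To make that concrete, here is a toy Python model of the behaviour described above, not real ESP32-S3 code. The function names are hypothetical, and "memory" is just a bytes object. It shows how two aligned loads plus a byte shift across the register pair can emulate an unaligned load.

```python
# Toy model only: a "register" is 16 bytes, memory is a bytes object.
# All names here are hypothetical and do not map to real instruction mnemonics.
VEC_BYTES = 16


def aligned_load(mem: bytes, addr: int) -> bytes:
    """Load 16 bytes, silently ignoring the bottom 4 bits of the address."""
    base = addr & ~(VEC_BYTES - 1)  # mask off the low 4 address bits
    return mem[base:base + VEC_BYTES]


def shift_pair(lo: bytes, hi: bytes, nbytes: int) -> bytes:
    """Treat lo:hi as one 32-byte value and take 16 bytes starting at nbytes."""
    return (lo + hi)[nbytes:nbytes + VEC_BYTES]


def unaligned_load(mem: bytes, addr: int) -> bytes:
    """Emulate an unaligned load with two aligned loads and a byte shift."""
    lo = aligned_load(mem, addr)
    hi = aligned_load(mem, addr + VEC_BYTES)
    return shift_pair(lo, hi, addr & (VEC_BYTES - 1))


mem = bytes(range(64))
masked = aligned_load(mem, 5)     # address 5 is treated as address 0
wanted = unaligned_load(mem, 5)   # the 16 bytes that actually start at address 5
```

The extra load and shift per vector is the cost a compiler has to account for when it cannot prove alignment, which is presumably part of why autovectorizing is awkward.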
That sounds really familiar. Maybe AltiVec did that? I remember it doing something similar, but I wish it would just fault instead.
And the author is not documenting them either, just announcing his new niche library. It is not like disassembling a few functions to prove that they exist is dark magic. I just don't see any value in the article.
I’m not sure I’d call a JPEG decoding library “niche”.

There are some numbers here on the performance improvements he’s managed to make.

https://atomic14.substack.com/p/even-faster-jpeg-decoding

Maybe I am missing something, but isn't it barely faster than the official ESP32_JPG? But fair enough, I didn't know that JPEG decoding on MCUs is a widespread thing.
The "official" version used in that blog post decodes the JPG all in one go, so it's pretty memory hungry. With JPEG decoders that decode sections of the image at a time, you can minimise the amount of RAM that needs to be allocated. It's also possible to stream the display data out to the screen using DMA while the next chunk of image data is being decoded.

It's explained in a bit more detail in this original blog post, written before the library was optimised: https://atomic14.substack.com/p/the-fastest-esp32-jpeg-decod....

It's very easy to forget what a range of MCUs there are, from very puny, to very capable. For example the Espressif range of MCUs - which you'll find in all sorts of consumer products - are very powerful. Couple that with a lot of cheap SPI based display modules and you very quickly start wanting to show images.
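As a rough illustration of the strip-at-a-time idea mentioned above, here is a small Python model. It is only a sketch: `decode_in_strips` and `stream_to_display` are hypothetical names, the "decoder" just slices pre-made rows, and `send` stands in for kicking off a DMA transfer. On real hardware the transfer would run in the background while the next strip is decoded into the other buffer.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Row = bytes


def decode_in_strips(rows: List[Row], strip_height: int) -> Iterator[List[Row]]:
    """Stand-in for a decoder that produces a few rows of pixels at a time."""
    for top in range(0, len(rows), strip_height):
        yield rows[top:top + strip_height]


def stream_to_display(strips: Iterable[List[Row]],
                      send: Callable[[List[Row]], None]) -> int:
    """Ping-pong between two strip buffers; return the peak rows held at once."""
    buffers: List[Optional[List[Row]]] = [None, None]
    peak_rows = 0
    for i, strip in enumerate(strips):
        buffers[i % 2] = strip  # fill the buffer the previous transfer is not using
        send(strip)             # hypothetical: would start a DMA transfer and return
        held = sum(len(b) for b in buffers if b is not None)
        peak_rows = max(peak_rows, held)
    return peak_rows


# A fake 320x240 single-channel image, one bytes object per row.
image = [bytes([y % 256]) * 320 for y in range(240)]
sent: List[Row] = []
peak = stream_to_display(decode_in_strips(image, 16), sent.extend)
```

In this model only two 16-row strips are ever held at once, rather than all 240 rows, which is the RAM saving being described.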

You need to go back and read it again. I provide links to the relevant Espressif documents and in my next article I provide a simple example to get started. Would you rather have me copy the hundreds of pages of PDF into my blog post instead of providing a link?
> Even ChatGPT 3.5 can give relevant code to use "AI" features

I've seen ChatGPT invent its own functions and commands ...

I've also definitely seen it reference invented methods on APIs (that would have been very nice if they existed) - that no past or future version implemented.
if problem: solve_problem()

There, problem solved!

# rest of problem-solving code goes here
I love doing engineering based off of advertising material…