by miki123211 151 days ago
This wouldn't be easy due to Eloquence's internal architecture. eci.[dll|so|dylib] only contains the low-level platform abstraction layer (threads, queues, mutexes, etc.), as well as utility classes for .ini file handling and such. It then loads a language module from a path specified in eci.ini. The actual speech stack is statically linked separately into each language module (possibly with modifications, not sure about that), which means you'd have to reverse-engineer it separately for each language. In theory, if you reverse-engineered the API between the main and language libraries, you could write an Eloquence wrapper for any arbitrary speech synthesizer.
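To make the loading step concrete, here is a rough Python sketch of a small core library that looks up a language module's path in an .ini file and then loads it. This is only an illustration of the pattern: the section names, the `Path` key, the file paths, and both function names are hypothetical, and the real eci.ini layout may well differ.

```python
import configparser
import ctypes

# Hypothetical eci.ini-style contents; the real file's sections and keys may differ.
SAMPLE_INI = """
[english]
Path = /opt/eloquence/enu.so

[german]
Path = /opt/eloquence/deu.so
"""


def resolve_language_module(ini_text: str, language: str) -> str:
    """Return the module path configured for `language` (hypothetical layout)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(language):
        raise KeyError(f"no language module configured for {language!r}")
    return parser.get(language, "Path")


def load_language_module(ini_text: str, language: str) -> ctypes.CDLL:
    """Load the shared library for `language`.

    Each module would carry its own statically linked speech stack, so
    nothing loaded here is shared between languages.
    """
    return ctypes.CDLL(resolve_language_module(ini_text, language))


if __name__ == "__main__":
    # Only resolve the path; the sample paths above do not exist on disk.
    print(resolve_language_module(SAMPLE_INI, "english"))
```

A wrapper for another synthesizer would sit on the far side of `load_language_module`, exporting whatever functions the core library expects from a language module; those are exactly the part that would need reverse engineering.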

From what we know, Eloquence was compiled in two stages: the first stage compiled a proprietary language called Delta (for text-to-phoneme rules) to C++, which was then compiled to machine code. A lot of the existing code is likely autogenerated from a much more compact representation, probably via finite-state transducers or some such.

2 comments

I'm bullish on LLMs being able to help with this kind of reverse engineering effort, if not with current models, then in a few more years. I've talked to people who managed to get Claude to help reverse engineer old, weird binaries with very little input. I wouldn't hype it up as a magical tool that'll definitely work, but it can't hurt to try.
I gather decompiling Mario 64 wasn't easy either. Just having C++ that can be recompiled for other architectures would seem useful. The original ELIZA chatbot was converted to modern C++ in a similar way recently, and that used a compact representation for its logic as well.