The Xiaomi 14 with the new Snapdragon 8 Gen 3 has a 32 to 64-bit translator (twitter.com)
120 points by sharkski 971 days ago
6 comments

Hello HN! I'm the main developer behind the Tango binary translator.

I'm happy to answer any questions you may have about the technology.

Hi! I'm interested in what has to happen for a (presumably) pretty small company to supply such an integral piece.

It seems to me like everyone from Qualcomm to Android to manufacturers would want this. Why did you have to build it instead of them? And is Xiaomi the entity licensing it from you? Does this mean that there will likely be implementations of the Snapdragon 8 Gen 3 that don't support 32-bit apps?

Is this just a niche problem because most phones have been on 64-bit processors for so many years that vendors expect the compatibility breakage won't be too big?

How did you anticipate this issue and how are you using the situation? i.e. do you already have experience/contacts in the industry?

Phew, that's a lot of questions and I get it if you don't want to answer all of them. Thanks for your time and stopping by!

When ARM released AArch64, it was a completely new instruction set rather than an extension of the existing 32-bit ISA (as x86-64 was for x86). AArch64 is a much better-designed ISA than AArch32, and supporting both in hardware effectively doubles the complexity of the instruction decoder, so it was clear that eventually there would be AArch64-only CPUs (as there are today).

The technology behind Tango started as a research project while I was doing my PhD. After finishing my dissertation in 2016, I looked into opportunities to commercialize it by contacting every company that was building or planning to build an AArch64-only CPU. There are sufficiently few that it is easily doable even for a small company.

Building a production-ready binary translator is technically challenging and requires a lot of work. The difficult parts are achieving high performance (Tango scores within 10% of native 32-bit execution on benchmarks), low latency (using AOT translation to accelerate startup times) and compatibility (Tango was tested against the top 1000 Android apps and works with all of them).

Already by 2017, Tango was capable of translating AArch32 Android applications. At that point it made more sense for companies to license our technology than to develop their own implementation from scratch.

Congrats!

What was the main technical challenge of this project and what was the solution?

The first commit in what would eventually become Tango was all the way back in 2014, so this project has been in development for a long time. As such there have been many technical challenges.

One challenge that particularly comes to mind was dealing with anti-emulating/anti-debugging code in various Android applications. These apps would do all sorts of crazy things like attaching to themselves with ptrace, installing bizarre seccomp filters which check for specific 32-bit syscalls and using self-modifying code without proper cache flushing to check for the presence of an instruction cache.

The solution for each of those was to emulate the relevant functionality well enough to trick these apps into thinking they were running natively. In the case of self-modifying code, though, there was no good solution, and we ended up hard-coding some particular instruction sequences in the translator for special handling.

One thing that really made the above possible is that for Tango v2.0 we re-wrote a large part (~half) of the codebase in Rust, which was previously entirely written in C. In particular, the ptrace emulation code needs to maintain a lot of internal state about traced threads. This requires maintaining complex data structures, and the ability to easily use enums, Option, HashMap, etc, is a huge help for this.
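To illustrate the kind of state tracking described above, here is a minimal sketch of how enums, Option and HashMap might be combined to model traced threads. It is not Tango's actual code: `PtraceEmulator`, `Tracee`, `TraceState`, `PtraceError` and the thread IDs are all hypothetical names invented for this example, and real ptrace semantics are far more involved.

```rust
use std::collections::HashMap;

/// Hypothetical thread-id type (an assumption for this sketch).
type Tid = i32;

/// SIGSTOP's number on Linux for ARM and x86.
const SIGSTOP: i32 = 19;

/// Simplified run state of an emulated tracee.
#[derive(Debug, Clone, PartialEq)]
enum TraceState {
    Running,
    /// Stopped and waiting for its tracer; carries the stopping signal.
    SignalStop { signal: i32 },
}

#[derive(Debug)]
struct Tracee {
    tracer: Tid,
    state: TraceState,
}

/// Errors loosely modeled on what ptrace(2) would report.
#[derive(Debug, PartialEq)]
enum PtraceError {
    AlreadyTraced, // roughly EPERM
    NoSuchTracee,  // roughly ESRCH
    NotStopped,    // roughly ESRCH
}

/// Emulated bookkeeping: which thread is traced by whom, and in what state.
#[derive(Default)]
struct PtraceEmulator {
    tracees: HashMap<Tid, Tracee>,
}

impl PtraceEmulator {
    /// Emulates an attach request: a thread can have only one tracer.
    fn attach(&mut self, tracer: Tid, target: Tid) -> Result<(), PtraceError> {
        if self.tracees.contains_key(&target) {
            return Err(PtraceError::AlreadyTraced);
        }
        let state = TraceState::SignalStop { signal: SIGSTOP };
        self.tracees.insert(target, Tracee { tracer, state });
        Ok(())
    }

    /// Returns the tracer of `target`, if it is being traced.
    fn tracer_of(&self, target: Tid) -> Option<Tid> {
        self.tracees.get(&target).map(|t| t.tracer)
    }

    /// Emulates a continue request: only the tracer may resume a stopped tracee.
    fn cont(&mut self, tracer: Tid, target: Tid) -> Result<(), PtraceError> {
        let tracee = self
            .tracees
            .get_mut(&target)
            .filter(|t| t.tracer == tracer)
            .ok_or(PtraceError::NoSuchTracee)?;
        if matches!(tracee.state, TraceState::SignalStop { .. }) {
            tracee.state = TraceState::Running;
            Ok(())
        } else {
            Err(PtraceError::NotStopped)
        }
    }
}
```

In this sketch, an app that attaches to itself as an anti-debugging measure would see a second attach attempt fail with `AlreadyTraced`, which is the behavior such checks typically rely on.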

Sounds similar to the effort that went into supporting legacy DOS and Win3.1 applications on NT, back in the day.
> One challenge that particularly comes to mind was dealing with anti-emulating/anti-debugging code in various Android applications

While I don't want to digress into the ethics of this suggestion, if you had the expertise, wouldn't it be significantly more valuable to author automation APIs for TikTok, Snap, Instagram, Messenger etc.?

The "proper" solution would be for apps to support 64-bit, at which point they can just run natively on the device. The whole reason Tango is required is that there is still a large number of apps that use 32-bit native libraries. This is particularly true in markets such as China which don't use the Google Play Store.
TBH, it has always been surprising to me that this wasn't a default feature in Android. It couldn't be that difficult to add.

Then again, they want to support small storage sizes, so maybe discouraging the OS from growing too much with redundant 32 and 64 bit versions of base libraries is the real reason?

I don't think Android puts a big priority on storage size. Every APK bundles all of its dependencies (except for ART), so if multiple apps use the exact same library and version, that library's code will not be shared among them.
Reminds me of libhoudini (ARM -> x86 translation layer) from the days when x86 Android phones were still a thing. Performance wasn't amazing but compatibility was surprisingly good.
One sneaky thing Intel did was that their Atoms targeting tablets of the same era used the same PowerVR chipset as the iOS devices of the time, which were what everyone had optimised for. The consequence was that many games could run on such things surprisingly well, since the GPU tended to be the limiting factor anyway.
Not just phones: I recently set up waydroid on an old netbook so my son could play the Khan Academy Kids app. Since the Android container is x86-64 and KAK is ARM-only, the libhoudini translation layer was required, and it worked perfectly.
One thing I'm curious about with translators like these: are they smart enough to convert software polyfill functions to hardware instructions? For example, baseline ARMv7-A doesn't guarantee an integer divide instruction, so a software polyfill function is used (like __aeabi_idiv). Do translators recognize these common software functions and translate them to native instructions (i.e. sdiv in this example)?
That's a good question, I don't know.

I suspect the specific case of sdiv may already be handled in practice for ARMv7-A cores with hardware divide and for ARMv8-A AArch32 on operating systems with dynamic loading, either by the dynamic loader or by the compiler support library choosing to swap in a different implementation of __aeabi_idiv according to the available capabilities.
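The idea raised in this subthread, recognizing a call to a known helper such as __aeabi_idiv and emitting a native divide instead, could be sketched roughly as follows. Nothing in the thread says whether Tango or libhoudini actually does this. `GuestOp`, `HostOp` and `translate` are hypothetical names, and the sketch assumes symbol names are available, which is often not the case for stripped binaries.

```rust
/// Hypothetical, heavily simplified guest (AArch32) operations.
#[derive(Debug, Clone, PartialEq)]
enum GuestOp {
    /// A call (`bl`) to a named function.
    Call(String),
    /// Any other instruction, kept as opaque text in this sketch.
    Other(String),
}

/// Hypothetical, heavily simplified host (AArch64) operations.
#[derive(Debug, Clone, PartialEq)]
enum HostOp {
    /// Native signed divide, e.g. `sdiv w0, w0, w1`.
    Sdiv,
    /// Native unsigned divide, e.g. `udiv w0, w0, w1`.
    Udiv,
    /// A generic translated call.
    Call(String),
    /// Any other translated instruction.
    Other(String),
}

/// Translates one guest op, replacing known division helpers
/// with native divide instructions.
fn translate(op: &GuestOp) -> HostOp {
    match op {
        GuestOp::Call(sym) => match sym.as_str() {
            // Both helpers take their operands in r0 and r1 and return in r0.
            "__aeabi_idiv" => HostOp::Sdiv,
            "__aeabi_uidiv" => HostOp::Udiv,
            _ => HostOp::Call(sym.clone()),
        },
        GuestOp::Other(text) => HostOp::Other(text.clone()),
    }
}
```

One caveat a real translator would have to handle: AArch64 sdiv returns 0 on division by zero, whereas the software helper may call __aeabi_idiv0, so a naive substitution can change observable behavior.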

I guess the Google Play Store can restrict the downloadable set of apps to what is natively compiled for the available hardware.

So why this over-the-top feature then?

I don't know if it's over the top. ARM isn't too complex to convert. There's some register mangling that needs to be managed for SIMD, data struct alignment checks and overbounding behavior that needs to be accounted for; but it's all relatively (keyword on relatively, it's still a lot of work) simple and orthogonal.

I would assume it's provided because in Xiaomi's two primary markets (India and China), there are a ton of 32-bit apps that users need, and those users won't upgrade phones (a lot more common in those markets) if they can't keep them. Google Play is also less commonly used in those markets.

https://www.amanieusystems.com/technology explains how the emulator works. Large parts of the translation are done ahead of time, before the app is first run.

The paper https://dl.acm.org/doi/10.1145/3140587.3062371 is also related to the emulator that they are using.
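As a rough illustration of the ahead-of-time idea, a translator might keep a cache of pre-translated blocks keyed by guest address and only translate at run time on a miss. This is a sketch under that assumption, not a description of Tango's design. `CodeCache`, `TranslatedBlock` and the run-time fallback are hypothetical.

```rust
use std::collections::HashMap;

/// Guest (32-bit) code address.
type GuestAddr = u32;

/// Placeholder for a block of translated host code.
#[derive(Debug, Clone, PartialEq)]
struct TranslatedBlock {
    guest_addr: GuestAddr,
    /// True if the block was produced ahead of time.
    from_aot: bool,
}

/// Cache of translated blocks, pre-populated by an ahead-of-time pass.
struct CodeCache {
    blocks: HashMap<GuestAddr, TranslatedBlock>,
    /// Number of blocks that had to be translated at run time.
    runtime_translations: usize,
}

impl CodeCache {
    /// Builds a cache from the addresses an AOT pass already translated.
    fn with_aot(addrs: &[GuestAddr]) -> Self {
        let blocks = addrs
            .iter()
            .map(|&a| (a, TranslatedBlock { guest_addr: a, from_aot: true }))
            .collect();
        CodeCache { blocks, runtime_translations: 0 }
    }

    /// Returns the translated block for `addr`, translating on a miss.
    fn lookup(&mut self, addr: GuestAddr) -> &TranslatedBlock {
        let counter = &mut self.runtime_translations;
        self.blocks.entry(addr).or_insert_with(|| {
            // Cache miss: stand-in for translating the block at run time.
            *counter += 1;
            TranslatedBlock { guest_addr: addr, from_aot: false }
        })
    }
}
```

With most blocks already present, startup avoids the cost of translating them on first execution, which matches the stated motivation of using AOT translation to reduce startup time.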

While India is certainly a big market, to the best of my knowledge it’s nowhere close to China as far as using legacy apps or non-Play Store distribution. (Also expensive phones unsurprisingly don’t sell very well in India but that’s not very unusual.)
Not only that: since 2019 Google has required apps published on the Play Store to have 64-bit variants, so this should not be a problem.

However, I suspect that Xiaomi has to contend with the Chinese app ecosystem, which may not be as strictly controlled, so there are probably a decent number of legacy 32-bit apps floating about.

Because Xiaomi cares about software backwards compatibility on its devices? Just because many people are now used to being told "this app developer hasn't updated their app in a few years so now you can't use it, tough shit" doesn't mean it should be the standard.
I know people who lost paid-for games when Apple switched to 64-bit.
>So why this over-the-top feature then?

The industry has been deprecating 32 bit ARM and we have finally reached the point where support is being dropped from the CPUs themselves. This means that if you still want to support 32 bit ARM, then you will need to do it in software.

To enable the device to use 32-bit programs? The same reason Apple includes a translator for amd64 binaries on its M1/M2 platforms?
Always better to have a choice. Why needlessly snatch away the user's freedom to install and use whatever app he wants?
Wrong title!

It has a 64- to 32-bit translator!

Not the other way around!

No, the title is right
I think a better name would have been

AArch32 to AArch64 translator, because the current title may imply that the apps are suddenly running as 64-bit applications