by mwcampbell
3904 days ago
I notice that no off-the-shelf web view wrappers (e.g. PhoneGap/Cordova) or non-web-view JavaScript frameworks (Titanium, NativeScript) are in this list. Would Xamarin, RoboVM, or RubyMotion be detectable with this method? Or can we safely assume that all of the top 100 apps are native ObjC or Swift?
We actually scan millions of apps from the app stores. Here's a public view of the top 500 in the US; if you choose the cross-platform category, you can see apps built with frameworks like those. For example, "Pac-Man 256" is rank 28 (Unity), Amazon is rank 30 (Cordova), etc.
https://sourcedna.com/stats/
It's actually very difficult to accurately match code written in so many different languages, because compilers discard a lot of the info you need. I spent a lot of time researching and evaluating different ways to fingerprint libraries, as well as ways to reconstruct the boundaries of internal modules when there weren't any symbols.
We match code by using a similarity search across all components we've ever seen. Since code written in C can be compiled to x86 or ARM, we disassemble the code into an intermediate language. Then we reconstruct control-flow graphs, data dependencies, and other platform-independent features. We index these in a custom search engine, which allows quick lookup and matching.
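To make the idea concrete, here is a toy sketch in Python. It is not our actual engine, just an illustration under simplifying assumptions: the code is assumed to be already lifted to a made-up intermediate language, each basic block becomes a feature built from its in-degree, out-degree, and set of IL opcodes, and lookup is a plain inverted index scored with Jaccard similarity. All names here (`cfg_features`, `ComponentIndex`, the opcode strings, `libfoo`, `libbar`) are hypothetical.

```python
from collections import defaultdict


def cfg_features(blocks, edges):
    """Turn a control-flow graph into a set of platform-independent features.

    blocks: dict mapping block id -> list of (hypothetical) IL opcode names
    edges:  list of (src_block, dst_block) pairs
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    features = set()
    for block_id, ops in blocks.items():
        # Block ids and instruction order are dropped; only graph shape
        # and the kinds of operations in each block are kept.
        features.add((in_deg[block_id], out_deg[block_id], tuple(sorted(set(ops)))))
    return frozenset(features)


def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class ComponentIndex:
    """Tiny inverted index: feature -> names of components containing it."""

    def __init__(self):
        self._features = {}
        self._postings = defaultdict(set)

    def add(self, name, features):
        self._features[name] = features
        for feature in features:
            self._postings[feature].add(name)

    def query(self, features, threshold=0.5):
        # Only components sharing at least one feature are scored.
        candidates = set()
        for feature in features:
            candidates |= self._postings.get(feature, set())
        scored = [(jaccard(features, self._features[n]), n) for n in candidates]
        return sorted([(s, n) for s, n in scored if s >= threshold], reverse=True)


# Two known components, described by toy CFGs.
index = ComponentIndex()
libfoo = cfg_features(
    {0: ["load", "cmp", "branch"], 1: ["add", "store"], 2: ["ret"]},
    [(0, 1), (0, 2), (1, 2)],
)
libbar = cfg_features({0: ["call", "call"], 1: ["mul", "ret"]}, [(0, 1)])
index.add("libfoo", libfoo)
index.add("libbar", libbar)

# An unknown function with the same shape as libfoo but one changed block.
unknown = cfg_features(
    {0: ["load", "cmp", "branch"], 1: ["sub", "store"], 2: ["ret"]},
    [(0, 1), (0, 2), (1, 2)],
)
matches = index.query(unknown)
print(matches)
```

In this sketch the unknown function shares two of its three block features with `libfoo` and none with `libbar`, so only `libfoo` comes back as a candidate. A real system needs far more robust features than this, since compilers reorder, inline, and merge blocks freely.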
It's very difficult, but ultimately a really fun problem to solve. Most of our engineers got started with exactly the exercise Ryan did here. :-)