Just playing around with it, do typical strategies for using these tools include "bad" data? I drew a '-' and got '4' as the guess, which feels very wrong.
That's just a function of our simple demo: it returns a confidence score for each digit 0-9 and we pick the top one, so even a '-' gets mapped to whichever digit scores highest. Not perfect by any means, but it shows how easily we can compare the libraries against each other.
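To make the "pick the top one" step concrete, here is a minimal sketch in plain Python. It is not the demo's actual code: the `top_digit` helper, the `min_confidence` parameter, and the example scores are all hypothetical, and it assumes the model hands back one raw score per digit. It shows why a stroke like '-' still comes back as a digit, and how an optional cutoff could reject low-confidence guesses instead.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_digit(scores, min_confidence=0.0):
    """Return (digit, confidence) for the highest-scoring digit.

    min_confidence is a hypothetical cutoff, not part of the demo:
    if the best probability falls below it, the digit is None.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return None, probs[best]
    return best, probs[best]


# Made-up scores for a drawing of '-': no digit stands out,
# but index 4 happens to edge ahead of the rest.
dash_scores = [0.1, 0.3, 0.2, 0.1, 0.6, 0.2, 0.1, 0.4, 0.1, 0.2]

digit, confidence = top_digit(dash_scores)
print(digit, confidence)  # a digit is always returned, however weak the score

rejected, _ = top_digit(dash_scores, min_confidence=0.5)
print(rejected)  # with a cutoff, the weak guess is rejected
```

With no cutoff the classifier has to answer with one of the ten digits, which is why non-digit input gets a confident-looking but meaningless label. A cutoff like the one sketched here is one common way to handle that; training on explicit "not a digit" examples is another.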