by beau 1835 days ago
We've released the underlying Rust implementation here: https://github.com/InstantDomain/instant-distance with Python bindings at https://pypi.org/project/instant-distance — feedback welcome!
3 comments

For Linux, in the Makefile change the copy command to

cp target/release/libinstant_distance.so instant-distance-py/test/instant_distance.so

and it works: built and running. The main tree was macOS only.
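For anyone who wants the copy step to work on more than one OS, here is a minimal sketch of how the artifact name could be derived instead of hard-coded. It is not what the repository's Makefile does. The function names `cdylib_file_name` and `cdylib_file_name_for` are made up for illustration; the only library items used are the standard `std::env::consts::DLL_PREFIX` and `DLL_SUFFIX` constants.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};

/// Hypothetical helper: joins a platform prefix, a crate name, and a platform
/// suffix into the file name of a dynamic library build artifact.
pub fn cdylib_file_name_for(prefix: &str, crate_name: &str, suffix: &str) -> String {
    // Cargo replaces hyphens with underscores in library artifact names.
    format!("{}{}{}", prefix, crate_name.replace('-', "_"), suffix)
}

/// Hypothetical helper: the same rule, using the values for the platform this
/// code is compiled for ("lib" + ".so" on Linux, "lib" + ".dylib" on macOS).
pub fn cdylib_file_name(crate_name: &str) -> String {
    cdylib_file_name_for(DLL_PREFIX, crate_name, DLL_SUFFIX)
}
```

On Linux, `cdylib_file_name("instant_distance")` gives the `libinstant_distance.so` name used in the copy command above; on macOS the same call gives the `.dylib` name the original Makefile expected.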

Here's resource consumption in a sample run.

Time: 4.49 s, Memory: 1552 MB.

Single word, three languages including English.

How did you figure this out? I've done lots of Linux software build troubleshooting as a result of using Gentoo, Buildroot, and pacaur, but this doesn't ring any bells as a common issue.
They probably tried it, saw that it couldn't find a .dylib (the macOS shared library file), opened the three-line Makefile, and changed it to copy a .so instead.
Did you try spaCy's most_similar method? It's written in Cython, so it is presumably quite fast as well. Thanks for the Rust implementation though; I will most likely use this.
I don't have much to say about the library itself; it seems great! However, don't feel compelled to put all your Rust code into a single lib.rs. You can split your work into several files and use ‘mod’ and ‘pub use’ in lib.rs to re-export your functions and types as a public API of your choosing.

cargo check and formatting times might also improve slightly!
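A minimal sketch of the re-export pattern, for readers who haven't seen it. None of these names come from instant-distance: `metric`, `search`, `squared_distance`, and `nearest` are all hypothetical. Inline modules stand in for separate files here so the snippet compiles on its own; in a real crate, `mod metric;` in lib.rs would load src/metric.rs.

```rust
// Stand-in for src/metric.rs.
mod metric {
    /// Squared Euclidean distance between two slices (extra elements in the
    /// longer slice are ignored by `zip`).
    pub fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
    }
}

// Stand-in for src/search.rs.
mod search {
    /// Brute-force scan: index of the point closest to `query` under
    /// `distance`, or `None` if `points` is empty.
    pub fn nearest<F>(query: &[f32], points: &[Vec<f32>], distance: F) -> Option<usize>
    where
        F: Fn(&[f32], &[f32]) -> f32,
    {
        points
            .iter()
            .enumerate()
            .map(|(i, p)| (i, distance(query, p.as_slice())))
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(i, _)| i)
    }
}

// The public API: callers see these two names at the crate root and never
// need to know which module (or file) they live in.
pub use metric::squared_distance;
pub use search::nearest;
```

The modules themselves stay private, so the file layout can change later without breaking anyone who imports `squared_distance` or `nearest` from the crate root.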

Funny, I often say the opposite. Don't feel compelled to split up your lib.rs. It's really refreshing to see a nice, compact library in one or two files. Much easier to follow, especially compared to "type per file". Of course, there are limits, but for a small lib like this, I personally would keep it in a single file, or maybe two.
I have a fair bit of experience writing Rust code and the current status is totally deliberate. I find module file sizes of about 400-800 lines of code optimal in terms of my ability to find things vs the unnecessary complexity of having to skip around files when changing something that touches an API boundary.