Hacker News
by westurner 1557 days ago
Isn't there already a good way to push computation closer to the data?

GmailFS and pyfilesystem (userspace FUSE) and rclone are neat as well.

https://stackoverflow.com/questions/1960799/how-to-use-git-a... explains the `git push` step that git-remote-dropbox enables: https://github.com/anishathalye/git-remote-dropbox

1 comments

GitHub also has a code search now: https://cs.github.com
Tying into a specific API (like codesearch) couples you to a specific storage backend (GitHub). If you build your software to operate on a POSIX-y file system, you can support anything that shows up as a file system: for example, a local working tree of files, an NFS share, or now a remote git repository.
Running the code where the data already is saves network transfer: with data locality, you don't need to download each file before grepping.
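
To make the file-system point concrete, here is a minimal sketch of a grep that only assumes a directory path. `grep_tree` is a hypothetical helper name, not part of any tool mentioned here; the root could be a local working tree, an NFS mount, or a FUSE mount, and the code would not know the difference.

```python
import re
from pathlib import Path


def grep_tree(root, pattern):
    """Yield (path, line_number, line) for every matching line under root.

    Hypothetical helper: it relies only on ordinary file-system calls,
    so any backend that can be mounted as a directory works.
    """
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # errors="replace" keeps binary or oddly encoded files from aborting the walk
            with path.open("r", encoding="utf-8", errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, line_number, line.rstrip("\n")
        except OSError:
            # Unreadable file (permissions, stale mount entry): skip it
            continue
```

Usage would be something like `for path, n, line in grep_tree("/mnt/repo", r"TODO"): print(path, n, line)`, where `/mnt/repo` is a placeholder mount point. Whether each read crosses the network depends on where this process runs relative to the storage, which is the locality point above.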

The Wikipedia article on locality of reference has a section on matrix multiplication that explains how the cache miss penalty applies when optimizing e.g. matrix multiplication: https://en.wikipedia.org/wiki/Locality_of_reference#Matrix_m...
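
A rough sketch of the loop-ordering idea, assuming row-major storage. The function names are made up for illustration. Python lists of lists are not contiguous arrays, and interpreter overhead dominates in CPython, so this only shows the access pattern; the cache effect itself would need to be measured in something like C or with a flat array.

```python
def matmul_ijk(a, b):
    """Naive order: the inner loop walks b down a column (strided access)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                # b[k][j] jumps to a different row on every iteration
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_ikj(a, b):
    """Reordered: the inner loop walks one row of b and one row of c in sequence."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(m):
            a_ik = a[i][k]
            row_b = b[k]
            for j in range(p):
                # Consecutive j touch neighbouring elements of row_b and row_c
                row_c[j] += a_ik * row_b[j]
    return c
```

Both orderings accumulate each output element over `k` in the same order, so they should produce the same result; only the memory access pattern differs.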