by tanujjain
2442 days ago
Scale is unfortunately not the focus of the current implementation, though we plan to address it in future releases.
In terms of speed and memory, the current picture is:
1. Hashing methods: Generating hashes is quick (a couple of seconds for about 10K images). The tricky part is retrieving duplicates, which takes a few minutes on the same 10K dataset. (I would refrain from giving exact numbers, since this was run on a local system, which is not a good environment for benchmarking.)
2. CNN: Here the time-consuming part is generating the encodings, which takes much longer without a GPU (a couple of minutes for 10K images). Retrieval is pretty quick but requires memory.
So at this point, running this package locally on more than a couple of thousand images is not a good idea. Thanks for your question.
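To make the two costs above more concrete, here is a rough sketch. It is not the package's actual code. It assumes that duplicate retrieval for hashes is a brute-force all-pairs Hamming comparison over integer hashes, and that CNN retrieval holds a dense n x n similarity matrix of 4-byte floats in memory. The function names and the `max_distance` threshold are hypothetical, chosen only for illustration.

```python
def hamming_duplicates(hashes, max_distance=10):
    """Brute-force all-pairs comparison of integer hashes (assumed approach)."""
    pairs = []
    n = len(hashes)
    for i in range(n):
        for j in range(i + 1, n):
            # XOR leaves a 1 bit wherever the two hashes differ
            distance = bin(hashes[i] ^ hashes[j]).count("1")
            if distance <= max_distance:
                pairs.append((i, j, distance))
    return pairs


def pair_count(n):
    """Number of comparisons the brute-force loop above performs."""
    return n * (n - 1) // 2


def dense_similarity_bytes(n, bytes_per_value=4):
    """Size of a dense n x n similarity matrix (assumes 4-byte floats)."""
    return n * n * bytes_per_value


print(pair_count(10_000))              # comparisons for 10K images
print(dense_similarity_bytes(2_000))   # bytes for ~2000 images
print(dense_similarity_bytes(10_000))  # bytes for 10K images
```

Under these assumptions, 10K images mean roughly 50 million pairwise comparisons, which would be consistent with retrieval taking far longer than hash generation. A dense similarity matrix would take about 16 MB for 2,000 images and about 400 MB for 10K, not counting the encodings themselves or the model.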
How much memory would you need for ~2000 images, how slow does it get, etc.?
Thx