Okay, I got ANNoy nearest-neighbor search working for my meme search project. Results are promising. I tested it locally on multiple image folders, against GPU-enabled matrix-vector multiplication as the baseline.
Over a 50-image dataset, ANNoy is 3× slower building the index, and no faster at inference. Over a 6,000-image dataset, it’s 2× slower building the index but over 30× faster at inference. Over a 26,000-image dataset: 5× slower to build, but over 100× faster to infer (50 ms vs. 7 s)!
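For context on why the baseline falls behind as the dataset grows: brute-force search scores the query against every single image embedding, so the cost grows linearly with dataset size. Here is a rough sketch of that idea in plain Python (no GPU, no matrix library). The function names and toy 2-D vectors are made up for illustration and are not the project's actual code.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def brute_force_top_k(embeddings, query, k=3):
    """Rank every row by cosine similarity to the query: O(N) work per query."""
    q = normalize(query)
    # This loop is the "matrix-vector multiplication": one dot product per row.
    scores = [sum(a * b for a, b in zip(normalize(row), q)) for row in embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy embeddings standing in for image features (hypothetical data).
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
top = brute_force_top_k(embeddings, [1.0, 0.1], k=2)
print(top)  # indices of the two most similar rows
```

A tree-based approximate index avoids touching every row, which is presumably where the speedup at 6,000+ images comes from.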
It’s also much faster to load once cached. It sucks to take 2.5 minutes to process and index 26,000 images, but you only have to do it once (unless you add or remove images from the dataset, lol).
Thanks to [names] for much help with the sublinear ranking problem! I am but a wee lad on the shoulders of giants.